[Vm-dev] bytecodes (was: stack vm questions)

Jecel Assumpcao Jr jecel at merlintec.com
Sun May 17 21:39:11 UTC 2009


Eliot,

> Have you got an URL for that bytecode set?  If not, can you mail
> me some code or documentation that describes it?

I have a very messy page with several different instruction set designs,
including the two used for Self, at:

http://www.merlintec.com:8080/software/3

The Self stuff is near the bottom and easy to miss, but it is simple
enough that I can describe it here so you don't have to bother with that
page. The original (Self 4.1 and earlier) bytecodes had a 3 bit op field
followed by a 5 bit data field. The 8 operations were: extend, pushSelf,
pushLiteral, nonLocalReturn, setDirectee, send, selfSend and resend. The
extend is a prefix that will add its 5 bits of data to the following
bytecode. Note that the pushSelf and nonLocalReturn bytecodes ignored
their data field. The resend was like super in Squeak and since we have
multiple inheritance it could be preceded by setDirectee to limit lookup
to a specific parent. One important detail is that the pushLiteral
bytecode checked for blocks and actually created a closure and pushed
that instead. All variable access is done through the selfSend
bytecodes. Note that selfSend #self does the same thing as pushSelf
because all Contexts have a slot named "self", so this latter bytecode
is not really needed (but does save an entry in the literal array).
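
To make the 3+5 bit layout concrete, here is a minimal Python sketch of a
decoder for it. It is not taken from the Self sources. It assumes that the
op field is the high 3 bits of each byte, that opcodes are numbered in the
order listed above, and that an extend prefix supplies the high-order bits
of the next bytecode's data. The name decode_self_old is made up for this
illustration.

```python
# Assumed numbering: the order in which the eight operations are listed above.
OLD_OPS = ["extend", "pushSelf", "pushLiteral", "nonLocalReturn",
           "setDirectee", "send", "selfSend", "resend"]


def decode_self_old(code):
    """Decode a bytes object into a list of (op_name, index) pairs."""
    result = []
    prefix = 0                      # bits accumulated from extend prefixes
    for byte in code:
        op = byte >> 5              # assumed: op is the high 3 bits
        data = byte & 0x1F          # assumed: data is the low 5 bits
        index = (prefix << 5) | data
        if OLD_OPS[op] == "extend":
            prefix = index          # carry the bits over to the next bytecode
        else:
            result.append((OLD_OPS[op], index))
            prefix = 0
    return result
```

Under these assumptions, decode_self_old(bytes([0x01, 0x42])) reads an
extend with data 1 followed by a pushLiteral with data 2, and yields a
single pushLiteral with index 34.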

For Self 4.1.2 the bytecode set was changed to make it more interpreter
friendly. It now has a 4 bit op field followed by a 4 bit data field.
The 16 operations are: extend, pushLiteral, send, selfSend, extra,
readLocal, writeLocal, lexicalLevel, branchAlways, branchIfTrue,
branchIfFalse, branchIndexed, delegatee, undefined, undefined and
undefined. Only the first four extra operations are defined: pushSelf,
pop, nonLocalReturn, undirectedResend. The lexicalLevel bytecode would
change the meaning of the following readLocal or writeLocal. Resending
is a bit different, with the delegatee bytecode not only setting the
parent for lookup but also changing the following send into a resend.
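
The prefix-style bytecodes in this second set (extend, lexicalLevel,
delegatee) can be sketched the same way. Again this is only an
illustration under stated assumptions, not the Self VM's code: opcodes are
numbered in the order listed above, the op field is the high 4 bits, the
data field of extra selects the sub-operation, and each prefix affects
only the next bytecode that can use it. The name decode_self_new is
hypothetical.

```python
# Assumed numbering: the order in which the sixteen operations are listed above.
NEW_OPS = ["extend", "pushLiteral", "send", "selfSend", "extra",
           "readLocal", "writeLocal", "lexicalLevel",
           "branchAlways", "branchIfTrue", "branchIfFalse", "branchIndexed",
           "delegatee", "undefined", "undefined", "undefined"]
EXTRA_OPS = ["pushSelf", "pop", "nonLocalReturn", "undirectedResend"]


def decode_self_new(code):
    """Decode a bytes object into a list of tuples, one per real operation."""
    result = []
    prefix = 0          # bits accumulated from extend prefixes
    level = 0           # set by lexicalLevel, used by the next local access
    delegatee = None    # set by delegatee, turns the next send into a resend
    for byte in code:
        name = NEW_OPS[byte >> 4]
        index = (prefix << 4) | (byte & 0x0F)
        prefix = 0
        if name == "extend":
            prefix = index
        elif name == "lexicalLevel":
            level = index
        elif name == "delegatee":
            delegatee = index
        elif name == "extra":
            sub = EXTRA_OPS[index] if index < len(EXTRA_OPS) else "undefined"
            result.append((sub,))
        elif name in ("readLocal", "writeLocal"):
            result.append((name, index, level))
            level = 0
        elif name == "send" and delegatee is not None:
            result.append(("resend", index, delegatee))
            delegatee = None
        else:
            result.append((name, index))
    return result
```

With this numbering, the two bytes 0x71 0x52 (lexicalLevel 1, readLocal 2)
decode to one readLocal of slot 2 at lexical level 1, and 0xC3 0x25
(delegatee 3, send 5) decode to one resend of selector literal 5 directed
at parent literal 3.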

The original bytecodes are described in various papers and theses but
the new ones are not well documented except in the code itself. I
mentioned that these are similar to the current Little Smalltalk
bytecodes, which are a bit different from the original LST instruction
set described in the book. The new LST bytecodes are only described in
the source code for the VM and also have a 4 bit op field and 4 bit data
field, with the ops: extended, pushInstance, pushArgument,
pushTemporary, pushLiteral, pushConstant, assignInstance,
assignTemporary, markArguments, sendMessage, sendUnary, sendBinary,
pushBlock, doPrimitive and doSpecial. The first 12 special instructions
are defined: selfReturn, stackReturn, blockReturn, duplicate, popTop,
branch, branchIfTrue, branchIfFalse, sendToSuper and breakPoint. The
branch bytecodes don't have a data field but seem to use the following
byte as their data.
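
The point about branches taking their operand from the following byte can
be sketched like this. The opcode numbers are hypothetical: they are just
positions in the two lists above, starting at 0, and the real values are in
the LST VM source. The handling of the extended op is not described here,
so the sketch treats it like any other op. The name decode_lst is made up.

```python
# Hypothetical numbering: list positions from the text, starting at 0.
LST_OPS = ["extended", "pushInstance", "pushArgument", "pushTemporary",
           "pushLiteral", "pushConstant", "assignInstance", "assignTemporary",
           "markArguments", "sendMessage", "sendUnary", "sendBinary",
           "pushBlock", "doPrimitive", "doSpecial"]
LST_SPECIALS = ["selfReturn", "stackReturn", "blockReturn", "duplicate",
                "popTop", "branch", "branchIfTrue", "branchIfFalse",
                "sendToSuper", "breakPoint"]
BRANCHES = {"branch", "branchIfTrue", "branchIfFalse"}


def name_at(table, i):
    """Look up a name, reporting unlisted slots as undefined."""
    return table[i] if i < len(table) else "undefined"


def decode_lst(code):
    """Decode a bytes object into a list of tuples, one per instruction."""
    result = []
    pc = 0
    while pc < len(code):
        high, low = code[pc] >> 4, code[pc] & 0x0F
        pc += 1
        name = name_at(LST_OPS, high)
        if name != "doSpecial":
            result.append((name, low))
            continue
        special = name_at(LST_SPECIALS, low)
        if special in BRANCHES:
            target = code[pc]       # operand is the byte after the bytecode
            pc += 1                 # (raises IndexError if it is missing)
            result.append((special, target))
        else:
            result.append((special,))
    return result
```

The design difference from the Self decoders is that the operand byte is
consumed by the branch itself rather than supplied by a prefix, so the
decoder needs an explicit program counter instead of a simple loop over
the bytes.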

This is far more understandable when I am not trying to be so compact,
of course. I always like to see instruction sets in a table form, for
example.

-- Jecel


