[Vm-dev] stack vm questions

Igor Stasenko siguctua at gmail.com
Thu May 14 01:18:12 UTC 2009

2009/5/14 Eliot Miranda <eliot.miranda at gmail.com>:
> On Wed, May 13, 2009 at 5:05 PM, Jecel Assumpcao Jr <jecel at merlintec.com> wrote:
>> Eliot,
>> > [JIT code uses call which pushes PC first]
>> Ok, so this can't be helped.
>> > So while one could I don't see that its worth-while.  Even if one did
>> > keep the arguments and temporaries together one would still have the
>> > stack contents separate from the arguments and temporaries and
>> > temporary access bytecodes can still access those so arguably one
>> > would still have to check the index against the temp count.
>> Really? I wouldn't expect the compiler to ever generate such bytecodes
>> and so wasn't too worried if the VM did the wrong thing in this
>> situation.
> There's a tension between implementing what the current compiler produces and implementing what the instruction set defines.  For example should one assume arguments are never written to?  I lean on the side of implementing the instruction set.
>> > In the JIT the flag is a bit in the method reference's LSBs and is set for free
>> > on frame build.
>> That sounds like a neat trick. Are the stack formats for the interpreted
>> stack vm and the jit a little different?
> Yes.  In the JIT an interpreted frame needs an extra field to hold the saved bytecode instruction pointer when an interpreted frame calls a machine code frame because the return address is the "return to interpreter trampoline" pc.  There is no flag word in a machine code frame.  So machine code frames save one word w.r.t. the stack vm and interpreted frames gain a word.  But most frames are machine code ones so most of the time one is saving space.
>> Thanks for the explanations. I haven't figured out how to do this in
>> hardware in a reasonable way and so might have to go with a different design.
> I guess that in hardware you can create an instruction that will load a descriptor register as part of the return sequence in parallel with restoring the frame pointer and method so one would never indirect through the frame pointer to fetch the flags word; instead it would be part of the register state.  But that's an extremely uneducated guess :)
>> -- Jecel
One thing you mentioned is the separation of the data stack and the
code stack (use one space to push args & temps, and another space to
push VM/JIT-specific state, such as context pointers & return addresses).

There is no big issue with this separation, because to access the
state in both variants you still have to track two pointers, namely
the stack pointer and the frame pointer.
But such a separation could be very helpful for handling uninitialized temps.
To call a method you usually push the args (sp decreases, if the
stack grows downward),
then push the return address and save sp & other required context state
(fp decreases).
And then you are ready to activate a new method.
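The call sequence above can be sketched as a toy model. This is only an illustration under my own assumptions: the class name, the two-word control record (return address plus saved sp) and the fixed stack size are hypothetical, not taken from any real VM.

```python
class TwoStackMachine:
    """Toy model: separate data and control stacks, both growing downward."""

    def __init__(self, size=64):
        self.data = [None] * size      # args and temps
        self.control = [None] * size   # return addresses and saved state
        self.sp = size                 # data stack pointer (stack is empty)
        self.fp = size                 # control stack pointer (stack is empty)

    def push_data(self, value):
        self.sp -= 1
        self.data[self.sp] = value

    def call(self, args, return_address):
        # Push the arguments on the data stack (sp decreases).
        for arg in args:
            self.push_data(arg)
        # Push the return address and the current sp on the control
        # stack (fp decreases). The two-word record is an assumption.
        self.fp -= 2
        self.control[self.fp] = return_address
        self.control[self.fp + 1] = self.sp

    def ret(self, nargs):
        return_address = self.control[self.fp]
        # Restore sp to its value before the arguments were pushed,
        # which also drops any temps the callee pushed.
        self.sp = self.control[self.fp + 1] + nargs
        self.fp += 2
        return return_address
```

After `call([1, 2], 100)` on a fresh machine both pointers have moved down by two words, and `ret(2)` brings them back to where they started.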
Now there are 2 variants:
at the time you enter the new method, you can simply reserve the
stack space for its temps, OR leave the sp unchanged, but organize
the code in such a way that each time a temp gets initialized, you
decrease sp.
Then, at any point of execution, if the debugger needs to determine
whether some of the method's temps are uninitialized, it can simply
check whether the context's saved sp is <= the temp's pointer; if it
is not, the temp is still uninitialized.
This of course makes things more complex, because you have to
lay out temps in order of their initialization, i.e.
| a b |
b := 5.
a := 6.

so that the pointer to a is < the pointer to b.
Likewise, you should not alter the sp when "pushing" arguments
(otherwise the debugger will assume that you have already initialized
some temps).
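Here is a minimal sketch of the "push on first assignment" variant and the debugger check. Everything in it is an assumption made for illustration: the `Frame` class, its method names and the list used as stack memory are hypothetical, and `None` stands in for nil.

```python
class Frame:
    """Toy frame on a downward-growing data stack.

    Temps get a slot only when they are first assigned.
    """

    def __init__(self, memory, sp, temps_in_init_order):
        self.memory = memory
        self.sp = sp  # sp at method entry, after the args were pushed
        # The i-th temp to be initialized lives at address sp - 1 - i,
        # so a later-initialized temp has a lower address.
        self.address = {
            name: sp - 1 - i for i, name in enumerate(temps_in_init_order)
        }

    def assign(self, name, value):
        addr = self.address[name]
        if addr < self.sp:
            # First assignment: it must take the very next slot.
            assert addr == self.sp - 1, "temps must follow the layout order"
            self.sp -= 1
        self.memory[addr] = value

    def debugger_read(self, name):
        addr = self.address[name]
        if self.sp <= addr:
            return self.memory[addr]   # slot is live
        return None                    # uninitialized: report nil


memory = ["stale"] * 16                # garbage left by earlier calls
frame = Frame(memory, 16, ["b", "a"])  # b := 5 comes before a := 6
frame.assign("b", 5)
print(frame.debugger_read("b"))        # 5
print(frame.debugger_read("a"))        # None, not the stale garbage
```

The point of the check is visible in the last line: the slot for `a` still contains garbage, but since it lies below sp the debugger never looks at it.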

But it allows you to conserve stack space across method calls:

| a b |

"1" b := 5.
"2" self foo.
"3" a := 6.

at 1) you decrement sp by pushing the initialized value of b
at 2) you save the sp, then increase the sp by the number of arguments
for the method (foo) and activate the new method; on return you restore
the saved sp
at 3) you push a newly initialized value, which leads to another
decrement of sp.
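To make the saving concrete, here is a small trace of sp through the three steps. It is only a sketch: the stack top of 32 and the assumption that foo needs 3 words of data stack are made-up numbers, and foo takes no arguments here.

```python
def trace_lazy(stack_top=32, callee_words=3):
    """Trace sp when temps are pushed on first assignment."""
    sp = stack_top
    trace = []

    # "1" b := 5.   First assignment of b pushes its slot.
    sp -= 1
    b_addr = sp
    trace.append(("b := 5", sp))

    # "2" self foo. Save sp, let the callee use the space below it,
    # then restore sp on return.
    saved_sp = sp
    trace.append(("inside foo", saved_sp - callee_words))
    sp = saved_sp
    trace.append(("after foo", sp))

    # "3" a := 6.   First assignment of a pushes another slot.
    sp -= 1
    a_addr = sp
    trace.append(("a := 6", sp))
    return trace, a_addr, b_addr


def deepest_sp_eager(stack_top=32, num_temps=2, callee_words=3):
    """Lowest sp reached if all temps are reserved at method entry."""
    return stack_top - num_temps - callee_words


trace, a_addr, b_addr = trace_lazy()
for step, sp in trace:
    print(step, sp)
print("eager variant reaches", deepest_sp_eager())
```

With these numbers the lazy variant reaches 28 during foo while the eager one reaches 27: the word that later becomes the slot of `a` is used by foo in the meantime.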

and, as I said, if you suspend the process in the middle of the
"self foo" call and inspect the context of someMethod,
the debugger can easily see that its saved sp does not yet cover all
temps, and if the user wants to see the current (uninitialized) value
of 'a', then the answer is nil, because the offset of 'a' lies beyond
the currently saved sp.

There's another cost, of course: if a method has 10 temps, then the
generated code needs 10 pushes, one for each initialization, in
contrast to a single instruction for reserving space at the method's
activation, when you simply decrease the sp by a known constant.

But I think there is not much difference: in both variants you have to
write to some memory location, and there is just a small difference in
how you calculate the address at the initial store, i.e. doing
push reg
instead of
mov [fp + offset] <- reg
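The difference in generated code can be sketched with two tiny emitters. This is purely illustrative: the function names, the `sub sp, N` mnemonic and the numbering of offsets are my assumptions, and only the first store to each temp is shown (later stores would be a plain mov in both variants).

```python
def emit_reserve_at_entry(temps_in_init_order):
    """One sp adjustment at activation, then a store per initialization."""
    code = [f"sub sp, {len(temps_in_init_order)}"]
    for offset, name in enumerate(temps_in_init_order, start=1):
        code.append(f"mov [fp + {offset}] <- reg   ; {name}")
    return code


def emit_push_on_init(temps_in_init_order):
    """No adjustment at activation; the first store to each temp is a push."""
    return [f"push reg   ; {name}" for name in temps_in_init_order]


for line in emit_reserve_at_entry(["b", "a"]):
    print(line)
for line in emit_push_on_init(["b", "a"]):
    print(line)
```

Either way there is one memory write per initialized temp; the eager variant just adds the single sp adjustment up front.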

So the major pros are:
 - conserving stack space
 - easy for the debugger to determine uninitialized temps

and the con is:
  - this could complicate the code generation

Best regards,
Igor Stasenko AKA sig.
