[Vm-dev] CogVM Execution Flow

Eliot Miranda eliot.miranda at gmail.com
Mon Jun 13 18:41:31 UTC 2016


Hi Ben,

    the diagram below shows the trees, but the wood is arguably more
important.  The diagram below is focussing on the transitions, but doesn't
clearly show what is being transitioned between.  I imagine a diagram which
shows the structures and has what you have in the yellow boxes as
transitions.  So...

The essential structures are six-fold: three execution state structures
and three bodies of code.  In fact one of the former overlaps with one of
the latter.

These are the execution state structures:

1. the C stack.
2. the Smalltalk stack zone.
3. the Smalltalk heap (which includes contexts that overflow the Smalltalk
stack zone).

These are the bodies of code:
4. the run-time, the code comprising the VM interpreter, JIT, garbage
collector, and primitives
5. the jitted code living in the machine code zone, comprising methods,
polymorphic inline caches, and the glue routines (trampolines and
enilopmarts) between that machine code and the run-time
6. Smalltalk "source" code, the classes and methods in the Smalltalk heap
that constitute the "program" under execution

So 3. and 6. overlap, since code is data, and 2. overflows into 3.: the
stack zone is a "cache", keeping the most recent activations in the most
efficient form for execution.
Further, 4. (the run-time) executes solely on 1. (the C stack), and 5. (the
jitted code) runs only on 2. (the stack zone).  Code in 6., when executed
(interpreted) by the interpreter and primitives in 4., also runs on 2.
(the stack zone).

Your diagram names some of the surface transitions, but not the deeper when
and why.  Here they are:

a) execution begins on the C stack as the program is launched.  Once the
heap is loaded, swizzling pointers as required, the interpreter is
entered.  On first entry it
  a1) allocates space for the stack zone on the C stack
  a2) "marries" the context in the image that invoked the snapshot
primitive (a stack frame in the stack zone is built for the context, and
the context is changed to become a proxy for that stack frame).
  a3) captures the top of the C stack (CStackPointer & CFramePointer) as
interpret is about to be invoked, including creating a "landing pad" for
jumping back into the interpreter
    a3 vm) the landing pad is a jmpbuf created via setjmp, and jumped to
via longjmp
    a3 sim) the landing pad is an exception handler for
the ReenterInterpreter notification
  a4) calls interpret to start interpreting the method that executed the
snapshot primitive


Invoking the run-time:
Machine code calls into the run-time for several facilities: adding an
object to the remembered table if a store check indicates this must happen,
running a primitive in the run-time, and entering the run-time to look up
and bind a machine code send or a linked send that has missed.  To invoke
the run-time, the machine code saves the native stack and frame pointers
(those of the current Smalltalk stack frame) in stackPointer and
framePointer, sets the native stack and frame pointers to CStackPointer and
CFramePointer, passes parameters (pushing them on x86, loading them into
registers on ARM and x64) and calls the run-time routine.  Simple routines
(such as adding an element to the remembered table) simply perform the
operation and return.  The code returned to then switches back to the
Smalltalk stack pointers and continues.  Routines that change the Smalltalk
frame (send-linking routines, complex primitives such as perform:) reenter
via an enilopmart.
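
As a rough illustration of that pointer switch, here is a toy Python model.
It is only a sketch: stackPointer, framePointer, CStackPointer and
CFramePointer are the names used above, but ToyVM, native_sp, native_fp,
remember, call_runtime and all the addresses are made up for the example.

```python
class ToyVM:
    """Toy model of a machine-code call into the run-time (hypothetical)."""

    def __init__(self):
        # Native registers; they start out pointing into the Smalltalk stack zone.
        self.native_sp = 0x2000
        self.native_fp = 0x2010
        # Top of the C stack, as captured when interpret was first entered.
        self.CStackPointer = 0x9000
        self.CFramePointer = 0x9010
        # Where the Smalltalk pointers are parked during a run-time call.
        self.stackPointer = None
        self.framePointer = None
        self.remembered = []
        self.trace = []

    def remember(self, obj):
        # A "simple" run-time routine: perform the operation and return.
        self.trace.append(("in run-time, native sp", self.native_sp))
        self.remembered.append(obj)

    def call_runtime(self, routine, *args):
        # 1. Save the Smalltalk stack and frame pointers.
        self.stackPointer, self.framePointer = self.native_sp, self.native_fp
        # 2. Switch the native pointers to the C stack.
        self.native_sp, self.native_fp = self.CStackPointer, self.CFramePointer
        # 3. Pass parameters and call the run-time routine.
        result = routine(*args)
        # 4. The code returned to switches back to the Smalltalk stack.
        self.native_sp, self.native_fp = self.stackPointer, self.framePointer
        return result


vm = ToyVM()
vm.call_runtime(vm.remember, "anObject")
```

The routine always runs at the same C stack address, however deep the
Smalltalk stack happens to be at the time of the call.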


Transition to the interpreter:
Any time the machine code wants to transition to the interpreter (not
simply to call a routine in the run-time, but to interpret an
as-yet-unjitted/unjittable method, either via send or return), the machine
code switches the frame and stack pointers to those captured in a3) and
longjmps (in the simulator, raises the ReenterInterpreter notification).  It
does this by
calling a run-time routine (as in "Invoking the run-time") that actually
performs the longjmp.  Any intervening state on the C stack will be
discarded, and execution will be in the same state as when the interpret
routine was entered immediately after initialising the stack zone.
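
The landing pad can be sketched in Python by letting an exception play the
role of the longjmp, much as the simulator does with ReenterInterpreter.
This is a toy under stated assumptions: depth, machine_code,
reenter_interpreter and the nesting count are invented for illustration,
and the Python call stack stands in for the C stack.

```python
import sys


class ReenterInterpreter(Exception):
    """Hypothetical stand-in for the longjmp to the landing pad."""


def depth():
    # Number of Python frames currently on the host stack.
    frame, count = sys._getframe(), 0
    while frame is not None:
        frame, count = frame.f_back, count + 1
    return count


depths = []


def reenter_interpreter():
    # Plays the run-time routine that actually performs the longjmp.
    raise ReenterInterpreter()


def machine_code(nesting):
    # Pretend jitted code: nest a few calls, then reach an unjitted method.
    if nesting == 0:
        reenter_interpreter()
    machine_code(nesting - 1)


def interpret(transitions):
    while True:
        try:
            # Landing pad: every re-entry resumes here, at the same depth.
            depths.append(depth())
            if transitions == 0:
                return
            transitions -= 1
            machine_code(5)
        except ReenterInterpreter:
            # Everything above the landing pad has been discarded.
            continue


interpret(3)
```

Every value recorded in depths is the same, which is the point: the
intervening frames are thrown away on each transition instead of piling up.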


*N.B.* If the interpreter merely called the machine-code, and the
machine-code merely called the run-time, instead of substituting the stack
and frame pointers with CStackPointer and CFramePointer set up on initial
invocation of interpret, then the C stack would grow on each transition
between machine code execution and interpreter/run-time execution and the C
stack would soon overflow.


Call-backs:

The C stack /can/ grow however.  If a call-out calls back then the
call-back executes lower down the C stack.  A call out will have been made
from some primitive invoked either from the interpreter or machine-code,
and that primitive will run on the C stack.  On calling back, the VM saves
the current CStackPointer, CFramePointer and "landing-pad" jmpbuf in state
associated with the call-back, and then reenters the interpreter, saving
new values for the CStackPointer, CFramePointer and "landing-pad" jmpbuf.
Execution now continues in this new part of the C stack below the
original.  On the call-back returning (again via a primitive), the
CStackPointer, CFramePointer and "landing-pad" jmpbuf are restored before
returning to the C code that invoked the call-back.  Once this C code
returns, the stack is unwound back to the state before the call-out was
invoked.
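
A sketch of that save/restore discipline, assuming nothing about the real
data structures: CallbackVM, call_back, the saved list and the "jmpbuf#n"
strings are all hypothetical, and only CStackPointer and CFramePointer are
names taken from the description above.

```python
class CallbackVM:
    """Toy model of the landing state saved around a call-back (hypothetical)."""

    def __init__(self):
        self.CStackPointer = 0x9000
        self.CFramePointer = 0x9010
        self.landing_pad = "jmpbuf#0"
        self.saved = []  # state associated with each active call-back
        self.log = []

    def call_back(self, c_stack_top, body):
        # Save the current values in state associated with this call-back.
        self.saved.append(
            (self.CStackPointer, self.CFramePointer, self.landing_pad))
        # Reenter the interpreter lower down the C stack with new values.
        self.CStackPointer = c_stack_top
        self.CFramePointer = c_stack_top + 0x10
        self.landing_pad = "jmpbuf#%d" % len(self.saved)
        self.log.append((self.CStackPointer, self.landing_pad))
        try:
            return body(self)
        finally:
            # On call-back return, restore before going back to the C caller.
            (self.CStackPointer,
             self.CFramePointer,
             self.landing_pad) = self.saved.pop()


def inner(vm):
    return "inner result"


def outer(vm):
    # A call-back whose Smalltalk code calls out and is called back again.
    return vm.call_back(0x7000, inner)


vm = CallbackVM()
result = vm.call_back(0x8000, outer)
```

Nested call-backs each get their own landing pad further down the stack, and
the original one is back in place once the outermost call-back has returned.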


Transition to machine-code:
The interpreter uses the simple policy of jitting a method if it is found
in the first-level method lookup cache, effectively hitting methods that
are used more than once.  If the jitted method contains a primitive, that
primitive routine will be invoked just as if it were an interpreted
method.  If the method doesn't have a primitive, the interpreter will jump
into machine code immediately.  It jumps into machine code by pushing any
parameters (the state of the machine code registers, such as
ReceiverResultReg, and the machine code address to begin execution) onto
the top of the Smalltalk stack, and calling an enilopmart that switches
from the C to the Smalltalk stack, loads the registers and jumps to the
machine code address via a return instruction that pops the entry point off
the Smalltalk stack.
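
The jit-on-cache-hit policy can be sketched as follows.  This is a
deliberately simplified toy: ToyInterpreter and its fields are hypothetical,
the cache is an unbounded dictionary rather than a fixed-size hashed cache,
and primitives are ignored.

```python
class ToyInterpreter:
    """Toy model of jitting on a first-level lookup cache hit (hypothetical)."""

    def __init__(self, methods):
        self.methods = methods  # (class, selector) -> method, the full lookup
        self.cache = {}         # stands in for the first-level lookup cache
        self.jitted = set()

    def send(self, cls, selector):
        key = (cls, selector)
        method = self.cache.get(key)
        if method is not None:
            # Cache hit: the method has been used before, so jit it.
            self.jitted.add(method)
        else:
            # Cache miss: do the full lookup and remember the result.
            method = self.methods[key]
            self.cache[key] = method
        return "machine code" if method in self.jitted else "interpreted"


interp = ToyInterpreter({("Point", "x"): "Point>>x", ("Point", "y"): "Point>>y"})
first = interp.send("Point", "x")
second = interp.send("Point", "x")
other = interp.send("Point", "y")
```

Here first is "interpreted" and second is "machine code", so a method used
only once never pays the cost of being jitted.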


Simulating these transitions in the Simulator:
In the Simulator, the C run-time (4.) is a set of Smalltalk objects, and
1., 2., 3., 5., & 6. live in the memory inst var of the object memory, a large
ByteArray.  The machine code lives in the bottom of this memory byte array
(MBA), and has no direct access to the Smalltalk objects.  In the real VM,
the correlates of these objects all exist at specific addresses and may be
accessed directly from machine code.  In the simulator this is not
possible.  Instead, these objects are all assigned out-of-bounds addresses,
and a dictionary maps from the specific out-of-bounds address to the
specific object being accessed, e.g. stackPointer, an inst var of
InterpreterPrimitives, the superclass of StackInterpreter, has an address
in simulatedAddresses that maps to a block that does a perform to access
stackPointer's value.  See CoInterpreter>>stackPointerAddress.
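
The address mapping can be pictured with a small Python sketch.  It assumes
a dictionary of getter/setter pairs; ToyCoInterpreter, ToyCogit,
address_for, MEMORY_SIZE and the address values are hypothetical and not the
real simulator's code.

```python
MEMORY_SIZE = 0x10000  # size of the toy memory byte array


class ToyCoInterpreter:
    def __init__(self):
        self.stackPointer = 0x1234
        self.framePointer = 0x5678


class ToyCogit:
    """Toy model of out-of-bounds addresses for run-time variables."""

    def __init__(self, interpreter):
        self.interpreter = interpreter
        self.memory = bytearray(MEMORY_SIZE)
        self.simulated_addresses = {}
        self.next_address = MEMORY_SIZE  # first address beyond the memory

    def address_for(self, variable_name):
        # Hand out an address that can never be a valid index into memory.
        address = self.next_address
        self.next_address += 8
        # The accessors play the role of the block that does a perform.
        self.simulated_addresses[address] = (
            lambda: getattr(self.interpreter, variable_name),
            lambda value: setattr(self.interpreter, variable_name, value),
        )
        return address


interpreter = ToyCoInterpreter()
cogit = ToyCogit(interpreter)
stack_pointer_address = cogit.address_for("stackPointer")
getter, setter = cogit.simulated_addresses[stack_pointer_address]
value_before = getter()
setter(0x2000)
```

Machine code is generated with stack_pointer_address embedded in it, and any
access to that address is redirected to the Smalltalk-side variable.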

Machine code is executed by one of the processor aliens via
the primitiveRunInMemory:minimumAddress:readOnlyBelow:
or primitiveSingleStepInMemory:minimumAddress:readOnlyBelow: primitives.
These primitives will fail when they encounter an illegal instruction,
including an instruction that tries to fetch from, store to, or jump to an
out-of-bounds address.  The primitive failure code analyses the instruction
that failed and, when appropriate, creates an instance of the
ProcessorSimulationTrap exception and raises it.  (The instruction may
actually be illegal, the result of some bug in the system, but it is
typically an intended access of some run-time object.)  The handler then
services the exception by fetching, storing or invoking Smalltalk objects
in the simulation, after which execution can continue.
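
A toy version of that trap-and-resume loop, in Python.  Only the name
ProcessorSimulationTrap comes from the text; ToyProcessor, its one-register
instruction set, simulate and the addresses are invented for the example.

```python
MEMORY_SIZE = 16


class ProcessorSimulationTrap(Exception):
    def __init__(self, kind, address, pc, value):
        super().__init__(kind, address, pc)
        self.kind, self.address, self.pc, self.value = kind, address, pc, value


class ToyProcessor:
    """One-register processor that fails on out-of-bounds accesses."""

    def __init__(self):
        self.memory = [0] * MEMORY_SIZE
        self.acc = 0
        self.pc = 0

    def run(self, program):
        # Stand-in for the run primitive: runs until the program ends or
        # an instruction touches an address outside the memory.
        while self.pc < len(program):
            op, address = program[self.pc]
            if not 0 <= address < MEMORY_SIZE:
                raise ProcessorSimulationTrap(op, address, self.pc, self.acc)
            if op == "load":
                self.acc = self.memory[address]
            elif op == "store":
                self.memory[address] = self.acc
            self.pc += 1


def simulate(processor, program, simulated_objects):
    # The handler: service intended accesses, then let execution continue.
    while True:
        try:
            processor.run(program)
            return
        except ProcessorSimulationTrap as trap:
            if trap.address not in simulated_objects:
                raise  # genuinely illegal, i.e. a bug somewhere
            if trap.kind == "load":
                processor.acc = simulated_objects[trap.address]
            else:
                simulated_objects[trap.address] = trap.value
            processor.pc = trap.pc + 1


objects = {1000: 42, 1008: 0}  # out-of-bounds address -> simulated variable
cpu = ToyProcessor()
simulate(cpu, [("load", 1000), ("store", 3), ("store", 1008)], objects)
```

After the run, the value read from the simulated variable at 1000 has been
stored both into real memory and into the simulated variable at 1008.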

Hence in the simulator primitiveRunInMemory:minimumAddress:readOnlyBelow:
or primitiveSingleStepInMemory:minimumAddress:readOnlyBelow: (actually
their wrappers singleStepIn:minimumAddress:readOnlyBelow: &
runInMemory:minimumAddress:readOnlyBelow:) are always invoked in the context
of Cogit>>simulateCogCodeAt:, which provides the handler and tests for
machine code break-points using the breakBlock.

In the same way that the VM must avoid C stack growth when transitioning
between machine code and the interpreter/run-time above, so the simulator
must avoid uncontrolled stack growth when the simulated machine code
invokes Smalltalk code which again invokes simulated machine code.  So the
code that invokes the run-time from simulateCogCodeAt:
(Cogit>>#handleCallOrJumpSimulationTrap:) includes a handler for the
ReenterMachineCode notification.  Whenever the Smalltalk run-time wants to
reenter machine code via an enilopmart it
sends Cogit>>#simulateEnilopmart:numArgs: which raises the notification
before sending Cogit>>simulateCogCodeAt:.  So the first entry into machine
code via an enilopmart starts Cogit>>simulateCogCodeAt:, but subsequent
ones end up returning to that first Cogit>>simulateCogCodeAt: to continue
execution.
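
The re-entry scheme can be sketched in Python as below.  This is a model
under assumptions, not the simulator's code: a Smalltalk notification that
finds no handler simply returns, which the toy imitates with a
handler_active flag, and ToyCogit, routines, bouncing_routine and the
addresses 100 and 200 are all hypothetical.

```python
class ReenterMachineCode(Exception):
    def __init__(self, address):
        super().__init__(address)
        self.address = address


class ToyCogit:
    """Toy model of re-entering simulated machine code without stack growth."""

    def __init__(self):
        self.routines = {}  # address -> run-time routine called from there
        self.handler_active = False
        self.entries = 0

    def simulate_enilopmart(self, address):
        if self.handler_active:
            # A handler exists further up: unwind to it.
            raise ReenterMachineCode(address)
        # No handler yet, so this is the first entry into machine code.
        self.simulate_cog_code_at(address)

    def simulate_cog_code_at(self, address):
        self.entries += 1
        while address is not None:
            address = self.handle_call_or_jump_trap(address)

    def handle_call_or_jump_trap(self, address):
        was_active, self.handler_active = self.handler_active, True
        try:
            return self.routines[address](self)
        except ReenterMachineCode as reentry:
            # Continue execution at the requested machine code address.
            return reentry.address
        finally:
            self.handler_active = was_active


visited = []


def bouncing_routine(here, there, budget):
    # A run-time routine that reenters machine code until the budget runs out.
    def routine(cogit):
        visited.append(here)
        budget["left"] -= 1
        if budget["left"] == 0:
            return None
        cogit.simulate_enilopmart(there)
    return routine


budget = {"left": 6}
cogit = ToyCogit()
cogit.routines[100] = bouncing_routine(100, 200, budget)
cogit.routines[200] = bouncing_routine(200, 100, budget)
cogit.simulate_enilopmart(100)
```

However many times the run-time bounces back into machine code,
simulate_cog_code_at is entered exactly once and the host stack stays flat.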


Ben, given the above, can you now see how your yellow boxes name specific
transitions amongst the structures explained above?  I hope I've encouraged
you, not discouraged you, to revise and bifurcate your diagram into two
state transition diagrams for the real and simulated VM.  It would be great
to have really good diagrammatic representations of the above.

And once we have that, we can build the relatively simple extension that
allows the interpreter and machine code to interleave interpreted and
machine code frames on the Smalltalk stack (2.), and so allows the VM to
freely switch between interpreted and jitted code, and to fall back on the
interpreter whenever convenient.

On Mon, Jun 13, 2016 at 6:24 AM, Ben Coman <btc at openinworld.com> wrote:

>
> In trying to understand the flow of execution (and in particular the
> jumps in the jitted VM), I made a first rough pass to map it in the
> attached chart.
>
> I am trying to colourize it to distinguish between paths that can
> return to the interpreter, those that circulate in jitted code, and
> the transitions.  I'm sure I've missed the mark a bit but it's a start.
> Of course corrections welcome, even scanned pen sketches.
>
> cheer -ben
>
>


-- 
_,,,^..^,,,_
best, Eliot