[Vm-dev] CogVM Execution Flow

Ben Coman btc at openinworld.com
Mon Jun 13 18:51:13 UTC 2016


It will take me a while to digest this, but I'll be happy to give it a go.
cheers -ben

On Tue, Jun 14, 2016 at 2:41 AM, Eliot Miranda <eliot.miranda at gmail.com> wrote:
>
> Hi Ben,
>
>     the diagram below shows the trees, but the wood is arguably more important.  The diagram below is focussing on the transitions, but doesn't clearly show what is being transitioned between.  I imagine a diagram which shows the structures and has what you have in the yellow boxes as transitions.  So...
>
> The essential structures are six-fold: three execution state structures and three bodies of code, and in fact one of each overlaps.
>
> These are the execution state structures:
>
> 1. the C stack.
> 2. the Smalltalk stack zone.
> 3. the Smalltalk heap (which includes contexts that overflow the Smalltalk stack zone).
>
> These are the bodies of code:
> 4. the run-time, the code comprising the VM interpreter, JIT, garbage collector, and primitives
> 5. the jitted code living in the machine code zone, comprising methods, polymorphic inline caches, and the glue routines (trampolines and enilopmarts) between that machine code and the run-time
> 6. Smalltalk "source" code, the classes and methods in the Smalltalk heap that constitute the "program" under execution
>
> So 3. and 6. overlap; code is data, and 2. overflows into 3., the stack zone is a "cache", keeping the most recent activations in the most efficient form for execution.
> Further, 4. (the run-time) executes solely on 1. (the C stack), and 5. (the jitted code) runs only on 2. (the stack zone).  Code in 6., when executed (interpreted) by the interpreter and primitives in 4., also runs on 2. (the stack zone).
>
> Your diagram names some of the surface transitions, but not the deeper when and why.  Here they are:
>
> a) execution begins on the C stack as the program is launched.  Once the heap is loaded, swizzling pointers as required, the interpreter is entered.  On first entry it
>   a1) allocates space for the stack zone on the C stack
>   a2) "marries" the context in the image that invoked the snapshot primitive (a stack frame in the stack zone is built for the context, and the context is changed to become a proxy for that stack frame).
>   a3) captures the top of the C stack (CStackPointer & CFramePointer) as interpret is about to be invoked, including creating a "landing pad" for jumping back into the interpreter
>     a3 vm) the landing pad is a jmpbuf created via setjmp, and jumped to via longjmp
>     a3 sim) the landing pad is an exception handler for the ReenterInterpreter notification
>   a4) calls interpret to start interpreting the method that executed the snapshot primitive
>
>
> Invoking the run-time:
> Machine code calls into the run-time for several facilities: adding an object to the remembered table if a store check indicates this must happen, running a primitive in the run-time, or entering the run-time to look up and bind a machine code send or a linked send that has missed.  To invoke the run-time, the machine code saves the native stack and frame pointers (those of the current Smalltalk stack frame) in stackPointer and framePointer, sets the native stack and frame pointers to CStackPointer and CFramePointer, passes parameters (pushing them on x86, loading them into registers on ARM and x64) and calls the run-time routine.  Simple routines (such as adding an element to the remembered set) simply perform the operation and return.  The code returned to then switches back to the Smalltalk stack pointers and continues.  Routines that change the Smalltalk frame (send-linking routines, complex primitives such as perform:) reenter via an enilopmart.
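
[Editor's sketch] The pointer switch described above can be modelled in a few lines of Python. This is only a toy under stated assumptions: the names stackPointer, framePointer, CStackPointer and CFramePointer come from the text, while Registers, ToyVM, call_runtime and remember are hypothetical names invented for illustration, and the addresses are arbitrary.

```python
from dataclasses import dataclass, field


@dataclass
class Registers:
    """Hypothetical model of the native stack and frame pointer registers."""
    sp: int
    fp: int


@dataclass
class ToyVM:
    # Captured once, as interpret is first entered (step a3 in the text).
    CStackPointer: int
    CFramePointer: int
    # Where the Smalltalk stack pointers are parked during a run-time call.
    stackPointer: int = 0
    framePointer: int = 0
    remembered_set: list = field(default_factory=list)

    def call_runtime(self, regs, routine, *args):
        # 1. Save the native pointers of the current Smalltalk frame.
        self.stackPointer, self.framePointer = regs.sp, regs.fp
        # 2. Switch the native pointers to the C stack.
        regs.sp, regs.fp = self.CStackPointer, self.CFramePointer
        # 3. Call the run-time routine with its parameters.
        result = routine(self, regs, *args)
        # 4. A simple routine returns here; switch back to the Smalltalk stack.
        regs.sp, regs.fp = self.stackPointer, self.framePointer
        return result


def remember(vm, regs, obj):
    """A 'simple' routine: do the work on the C stack, then return."""
    assert (regs.sp, regs.fp) == (vm.CStackPointer, vm.CFramePointer)
    vm.remembered_set.append(obj)
    return len(vm.remembered_set)
```

Routines that change the Smalltalk frame would not take the step-4 path; per the text they reenter machine code through an enilopmart, which this sketch does not model.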
>
>
> Transition to the interpreter:
> Any time the machine code wants to transition to the interpreter (not simply to call a routine in the run-time, but to interpret an as-yet-unjitted/unjittable method, either via send or return), the machine code switches the frame and stack pointers to those captured in a3) and longjmps (or, in the simulator, raises the ReenterInterpreter exception).  It does this by calling a run-time routine (as in "Invoking the run-time") that actually performs the longjmp.  Any intervening state on the C stack will be discarded, and execution will be in the same state as when the interpret routine was entered immediately after initialising the stack zone.
>
>
> N.B. If the interpreter merely called the machine code, and the machine code merely called the run-time, instead of substituting the stack and frame pointers with CStackPointer and CFramePointer set up on initial invocation of interpret, then the C stack would grow on each transition between machine code execution and interpreter/run-time execution, and the C stack would soon overflow.
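
[Editor's sketch] The effect of the landing pad on stack depth can be demonstrated with Python exceptions, which is roughly how the simulator variant (a3 sim) is described. This is a toy model, not VM code: ReenterInterpreter is the name from the text, every function name here is hypothetical, and sys._getframe is a CPython-specific way of measuring the host stack depth.

```python
import sys


class ReenterInterpreter(Exception):
    """Stand-in for the longjmp / ReenterInterpreter landing-pad jump."""


def stack_depth():
    # Count the frames on the host (Python) stack; CPython-specific.
    depth, frame = 0, sys._getframe()
    while frame is not None:
        depth += 1
        frame = frame.f_back
    return depth


def interpret(state, depths):
    depths.append(stack_depth())
    if state["remaining"] == 0:
        return
    state["remaining"] -= 1
    machine_code(state)          # the interpreter enters machine code


def machine_code(state):
    cede_to_interpreter(state)   # machine code needs the interpreter again


def cede_to_interpreter(state):
    # Run-time routine that "performs the longjmp".
    raise ReenterInterpreter()


def run_with_landing_pad(transitions):
    state, depths = {"remaining": transitions}, []
    while True:
        try:                     # the landing pad
            interpret(state, depths)
            return depths
        except ReenterInterpreter:
            pass                 # intervening frames were discarded


def run_naively(transitions):
    depths = []

    def interpret_naive(remaining):
        depths.append(stack_depth())
        if remaining:
            machine_code_naive(remaining - 1)

    def machine_code_naive(remaining):
        interpret_naive(remaining)   # plain nested call: the stack grows

    interpret_naive(transitions)
    return depths
```

With the landing pad, every entry to interpret sees the same host stack depth; with plain nested calls the depth increases on each round trip, which is the overflow the N.B. warns about.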
>
>
> Call-backs:
>
> The C stack /can/ grow however.  If a call-out calls back then the call-back executes lower down the C stack.  A call-out will have been made from some primitive invoked either from the interpreter or machine code, and that primitive will run on the C stack.  On calling back, the VM saves the current CStackPointer, CFramePointer and "landing-pad" jmpbuf in state associated with the call-back, and then reenters the interpreter, saving new values for the CStackPointer, CFramePointer and "landing-pad" jmpbuf.  Execution now continues in this new part of the C stack below the original.  On the call-back returning (again via a primitive), the CStackPointer, CFramePointer and "landing-pad" jmpbuf are restored before returning to the C code that invoked the call-back.  Once this C code returns, the stack is unwound back to the state before the call-out was invoked.
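
[Editor's sketch] The save and restore of the landing-pad state around a call-back might look like the following. CStackPointer and CFramePointer are names from the text; ToyCallbackState, its methods, the use of a Python list as the per-call-back storage, and the string standing in for the jmpbuf are all assumptions made for illustration.

```python
class ToyCallbackState:
    def __init__(self, c_sp, c_fp, landing_pad):
        self.CStackPointer = c_sp
        self.CFramePointer = c_fp
        self.landing_pad = landing_pad   # stand-in for the jmpbuf
        self._saved = []                 # state associated with each call-back

    def enter_callback(self, new_sp, new_fp, new_landing_pad):
        # Save the current triple with the call-back being entered...
        self._saved.append(
            (self.CStackPointer, self.CFramePointer, self.landing_pad))
        # ...then record the values for the new, lower part of the C stack.
        self.CStackPointer = new_sp
        self.CFramePointer = new_fp
        self.landing_pad = new_landing_pad

    def return_from_callback(self):
        # Restore the triple before returning to the C code that called back.
        (self.CStackPointer,
         self.CFramePointer,
         self.landing_pad) = self._saved.pop()
```

Because call-backs nest, the most recently saved triple is the first one restored, which is why a last-in, first-out container is used in this sketch.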
>
>
> Transition to machine-code:
> The interpreter uses the simple policy of jitting a method if it is found in the first-level method lookup cache, effectively jitting methods that are used more than once.  If the jitted method contains a primitive, that primitive routine will be invoked just as if it were an interpreted method.  If the method doesn't have a primitive, the interpreter will jump into machine code immediately.  It jumps into machine code by pushing any parameters (the state of the machine code registers, such as ReceiverResultReg, and the machine code address to begin execution) onto the top of the Smalltalk stack, and calling an enilopmart that switches from the C to the Smalltalk stack, loads the registers and jumps to the machine code address via a return instruction that pops the entry point off the Smalltalk stack.
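
[Editor's sketch] The "jit on a lookup-cache hit" policy can be illustrated with a toy model. Everything named here (ToyInterpreter, send, the dictionary used as the cache, the returned strings) is hypothetical; a real first-level cache is a fixed-size structure whose entries can be evicted, which this sketch ignores, and the enilopmart entry sequence is not modelled.

```python
class ToyInterpreter:
    def __init__(self, methods):
        self.methods = methods    # (class name, selector) -> method description
        self.lookup_cache = {}    # stand-in for the first-level lookup cache
        self.jitted = set()

    def send(self, class_name, selector):
        key = (class_name, selector)
        method = self.lookup_cache.get(key)
        if method is None:
            # Cache miss: do the full lookup, fill the cache, and interpret.
            method = self.methods[key]
            self.lookup_cache[key] = method
            return "interpret"
        # Cache hit: the method has been sent before, so jit it.
        self.jitted.add(key)
        if method["primitive"] is not None:
            # The primitive routine is invoked as for an interpreted method.
            return "run primitive"
        # No primitive: jump into machine code immediately.
        return "enter machine code"
```

The first send of a selector to a class is interpreted; the second send finds the method in the cache and so triggers jitting, which is what "used more than once" amounts to under this policy.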
>
>
> Simulating these transitions in the Simulator:
> In the Simulator, the C run-time (4.) consists of Smalltalk objects, and 1., 2., 3., 5., & 6. live in the memory inst var of the object memory, a large ByteArray.  The machine code lives in the bottom of this memory byte array (MBA), and has no direct access to the Smalltalk objects.  In the real VM, the correlates of these objects all exist at specific addresses and may be accessed directly from machine code.  In the simulator this is not possible.  Instead, these objects are all assigned out-of-bounds addresses, and a dictionary maps from the specific out-of-bounds address to the specific object being accessed.  For example, stackPointer, an inst var of InterpreterPrimitives, the superclass of StackInterpreter, has an address in simulatedAddresses that maps to a block that does a perform to access stackPointer's value.  See CoInterpreter>>stackPointerAddress.
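
[Editor's sketch] The out-of-bounds address mapping can be pictured as follows. This is a loose Python analogue, not the Smalltalk code: ToySimulator, address_of_variable and read_word are hypothetical names, getattr stands in for the perform mentioned in the text, and the 8-byte little-endian word is an arbitrary choice.

```python
class ToySimulator:
    WORD_SIZE = 8

    def __init__(self, memory_size):
        self.memory = bytearray(memory_size)   # the "memory byte array"
        self.stackPointer = 0                  # a run-time variable
        self.simulated_addresses = {}          # out-of-bounds address -> accessor
        self._next_address = memory_size       # first out-of-bounds address

    def address_of_variable(self, name):
        # Assign the variable an address that lies outside the byte array.
        address = self._next_address
        self._next_address += self.WORD_SIZE
        # The accessor plays the role of the block that does a perform.
        self.simulated_addresses[address] = lambda: getattr(self, name)
        return address

    def read_word(self, address):
        if 0 <= address <= len(self.memory) - self.WORD_SIZE:
            # In bounds: an ordinary read from the byte array.
            word = self.memory[address:address + self.WORD_SIZE]
            return int.from_bytes(word, "little")
        # Out of bounds: look up the object that address stands for.
        return self.simulated_addresses[address]()
```

For instance, after `addr = sim.address_of_variable("stackPointer")`, a read of `addr` returns whatever `sim.stackPointer` currently holds, while reads of in-bounds addresses still come from the byte array.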
>
> Machine code is executed by one of the processor aliens via the primitiveRunInMemory:minimumAddress:readOnlyBelow: or primitiveSingleStepInMemory:minimumAddress:readOnlyBelow: primitives.  These primitives will fail when they encounter an illegal instruction, including an instruction that tries to fetch from, store to, or jump to an out-of-bounds address.  The primitive failure code analyses the instruction that failed and, when appropriate, creates an instance of the ProcessorSimulationTrap exception and raises it.  (The instruction may actually be illegal, the result of some bug in the system, but is typically an intended access of some run-time object.)  The handler then handles the exception to either fetch, store or invoke Smalltalk objects in the simulation, and once handled execution can continue.
>
> Hence in the simulator primitiveRunInMemory:minimumAddress:readOnlyBelow: and primitiveSingleStepInMemory:minimumAddress:readOnlyBelow: (actually their wrappers runInMemory:minimumAddress:readOnlyBelow: & singleStepIn:minimumAddress:readOnlyBelow:) are always invoked in the context of Cogit>>simulateCogCodeAt:, which provides the handler and tests for machine code break-points using the breakBlock.
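
[Editor's sketch] The run, trap, handle, resume cycle can be shown with a toy processor. Only the name ProcessorSimulationTrap comes from the text; ToyProcessor, its three-instruction "machine code", simulate, and the two dictionaries are assumptions made for illustration and do not reflect the real processor aliens.

```python
class ProcessorSimulationTrap(Exception):
    def __init__(self, kind, address, next_pc):
        super().__init__(f"{kind} at {address:#x}")
        self.kind, self.address, self.next_pc = kind, address, next_pc


class ToyProcessor:
    def __init__(self, memory_size):
        self.memory = [0] * memory_size
        self.acc = 0                      # single accumulator register

    def run(self, program, pc):
        # Execute until the program ends or an out-of-bounds access traps.
        while pc < len(program):
            op, operand = program[pc]
            if op == "add":
                self.acc += operand
            elif op in ("load", "store"):
                if operand >= len(self.memory):
                    raise ProcessorSimulationTrap(op, operand, pc + 1)
                if op == "load":
                    self.acc = self.memory[operand]
                else:
                    self.memory[operand] = self.acc
            else:
                raise ValueError(f"illegal instruction {op!r}")
            pc += 1


def simulate(processor, program, addresses, runtime):
    # Plays the role of the handler wrapped around machine-code execution.
    pc = 0
    while True:
        try:
            processor.run(program, pc)
            return
        except ProcessorSimulationTrap as trap:
            if trap.address not in addresses:
                raise                     # a genuine bug, not a run-time access
            name = addresses[trap.address]
            if trap.kind == "load":
                processor.acc = runtime[name]
            else:
                runtime[name] = processor.acc
            pc = trap.next_pc             # continue after the trapped instruction
```

Running the program `[("load", 0x1000), ("add", 8), ("store", 0x1000)]` with 0x1000 mapped to "stackPointer" reads the run-time variable, bumps it by 8, and writes it back, with each out-of-bounds access serviced by the handler.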
>
> In the same way that the VM must avoid C stack growth when transitioning between machine code and the interpreter/run-time above, so the simulator must avoid uncontrolled stack growth when the simulated machine code invokes Smalltalk code which again invokes simulated machine code.  So the code that invokes the run-time from simulateCogCodeAt: (Cogit>>#handleCallOrJumpSimulationTrap:) includes a handler for the ReenterMachineCode notification.  Whenever the Smalltalk run-time wants to reenter machine code via an enilopmart it sends Cogit>>#simulateEnilopmart:numArgs: which raises the notification before sending Cogit>>simulateCogCodeAt:.  So the first entry into machine code via an enilopmart starts Cogit>>simulateCogCodeAt:, but subsequent ones end up returning to that first Cogit>>simulateCogCodeAt: to continue execution.
>
>
> Ben, given the above, can you now see how your yellow boxes name specific transitions amongst the structures explained above?  I hope I've encouraged you, not discouraged you, to revise and bifurcate your diagram into two state transition diagrams for the real and simulated VM.  It would be great to have really good diagrammatic representations of the above.
>
> And once we have that, we can build the relatively simple extension that allows the interpreter and machine code to interleave interpreted and machine code frames on the Smalltalk stack (2.), allowing the VM to freely switch between interpreted and jitted code, and to fall back on the interpreter whenever convenient.
>
> On Mon, Jun 13, 2016 at 6:24 AM, Ben Coman <btc at openinworld.com> wrote:
>>
>>
>> In trying to understand the flow of execution (and in particular the
>> jumps in the jitted VM), I made a first rough pass to map it in the
>> attached chart.
>>
>> I am trying to colourize it to distinguish between paths that can
>> return to the interpreter, those that circulate in jitted code, and
>> the transitions.  I'm sure I've missed the mark a bit but it's a start.
>> Of course corrections welcome, even scanned pen sketches.
>>
>> cheers -ben
>>
>
>
>
> --
> _,,,^..^,,,_
> best, Eliot
>

