[Vm-dev] Question about process preemption during longJump

Eliot Miranda eliot.miranda at gmail.com
Mon Apr 22 17:42:23 UTC 2013


On Mon, Apr 22, 2013 at 2:37 AM, Guillermo Polito <guillermopolito at gmail.com
> wrote:

>
>
>
>
> On Tue, Apr 16, 2013 at 8:12 PM, Eliot Miranda <eliot.miranda at gmail.com> wrote:
>
>>
>> Hi Guillermo,
>>
>> On Tue, Apr 16, 2013 at 8:27 AM, Guillermo Polito <
>> guillermopolito at gmail.com> wrote:
>>
>>>
>>> Hi! I'm looking at some code from StackInterpreter, and the given piece
>>> of code raised a question I cannot fully answer:
>>>
>>> longUnconditionalJump
>>>         | offset |
>>>         offset := (((currentBytecode bitAnd: 7) - 4) * 256) + self fetchByte.
>>>         localIP := localIP + offset.
>>>         (offset < 0 "backward jump means we're in a loop; check for possible interrupts"
>>>          and: [localSP < stackLimit]) ifTrue:
>>>                 [self externalizeIPandSP.
>>>                  self checkForEventsMayContextSwitch: true.
>>>                  self browserPluginReturnIfNeeded.
>>>                  self internalizeIPandSP].
>>>         self fetchNextBytecode
>>>
>>> What does the (localSP < stackLimit) condition stand for? What is its
>>> intention when deciding to make (or not) a context switch?
>>>
>>
>> The Cog and Stack VMs steal a technique from Peter Deutsch's PS and HPS
>> VMs which combines the stack page overflow check on method entry with the
>> check for events.  Internally these VMs run Smalltalk on a small set of
>> stack pages (this is to do with going fast by using a
>> call-instruction-based inline cache, and needing to implement contexts,
>> mapping stack frames to contexts as needed).
>>
>
> Yeap, that I understood (given my limitations, haha) from your blog :).
>
>
>> Periodically the VM needs to break out of executing Smalltalk and check
>> for events (the VM's heartbeat), and this is done by setting the stackLimit
>> to the highest possible value (all ones)
>>
>
> Then, my question is: When does the VM decide it needs to do a context
> switch? In other words, when does it decide to set the stack limit to
> 0xffffffff?
>

These two things are separate.  The VM decides to do a context switch when
either the current process blocks (Semaphore wait) or another
higher-priority process becomes runnable.  Another process becomes
runnable either when a higher-priority process is resumed (via the resume
primitive) or when a semaphore a higher-priority process is waiting on gets
signalled, either via an event (the active delay expiring in the VM, an I/O
event, etc) or via a primitive (Semaphore signal).
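
For illustration, here's the semaphore-signal case at the image level, as a
minimal sketch in plain Smalltalk (nothing VM-specific; it assumes it runs
from a process below the maximum priority, e.g. a workspace doit):

    | sema order |
    sema := Semaphore new.
    order := OrderedCollection new.
    "A process one priority level above the active one blocks on the semaphore."
    [sema wait.
     order add: #highPriorityRan] forkAt: Processor activePriority + 1.
    "Signalling from the lower-priority active process makes the waiter
     runnable; being higher priority, it preempts us before #signal returns."
    order add: #beforeSignal.
    sema signal.
    order add: #afterSignal.
    order   "an OrderedCollection(#beforeSignal #highPriorityRan #afterSignal)"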

The stack limit is set to 0xffffffff by the heartbeat at about 500Hz (see
calls on forceInterruptCheckFromHeartbeat in platforms/*/vm/*), or by other
activities in the VM, such as an allocation primitive pushing the allocation
pointer past a threshold, or an attempt to compile a method to machine
code running out of code memory (see senders of forceInterruptCheck).  This
causes the VM to check for a whole series of possible events, including
performing a garbage collection, reclaiming code memory, or signalling the
Delay semaphore because the active delay has expired, etc.  See
checkForEventsMayContextSwitch:.
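
The effect is visible from the image even though the stackLimit itself is
not: a pure compute loop that never sends #yield or #wait still gets
preempted when a higher-priority process becomes runnable, because a
backward jump eventually sees the smashed limit and enters
checkForEventsMayContextSwitch:.  A small, runnable sketch in plain
Smalltalk (the loop bound is arbitrary; it only needs to keep the loop busy
for longer than the 100ms delay, and it takes a few seconds on an
interpreter VM):

    | counter snapshot |
    counter := 0.
    "Higher-priority process: sleeps 100ms, then samples the counter.  It can
     only run mid-loop if the VM breaks out of the busy loop below, which it
     does via heartbeat -> smashed stackLimit -> checkForEventsMayContextSwitch:."
    [(Delay forMilliseconds: 100) wait.
     snapshot := counter] forkAt: Processor activePriority + 1.
    "Lower-priority busy loop: no explicit yield anywhere."
    [counter < 500000000] whileTrue: [counter := counter + 1].
    { snapshot. counter }   "snapshot ends up well below counter"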


>
> My case (triggering the question) is the following:
>
> I'm experimenting with switching between special objects arrays containing
> different processes, and am therefore debugging the process switching.
>
> - I resume a process with the following code:
>
>  | finished times |
>  times := 0.
>  [ times := times + 1.
>   times < 30000 ] whileTrue.
>  finished := true.
>
> with an empty processor
>
> - a stack page is created (or taken, not sure about the exact terminology
> here) for the process because it was not married
> - it enters #checkForEventsMayContextSwitch: after a long run (let's say
> ~16000 loops), and gets preempted (and I move to another special objects
> array with lots of processes)
> - then, every time I resume it, it enters #checkForEventsMayContextSwitch:
> after looping only one time
>

Something seems wrong.  The heartbeat is smashing the stack limit at 500Hz,
so you'd expect checkForEventsMayContextSwitch: to get called no more often
than that.  But then checkForEventsMayContextSwitch: resets the stack
limit (see its send of restoreStackLimit towards the start of the method).
So what should happen is however many loop iterations fit in 2ms, followed
by a call to checkForEventsMayContextSwitch:, followed by another 2ms worth
of iterations, and so on.
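
A rough cross-check from the image side (a sketch only: it times the same
kind of empty counting loop against the wall clock, so it gives no more
than an order-of-magnitude figure for how many iterations should fit
between two heartbeat-driven checks):

    | times start elapsedMs |
    times := 0.
    start := Time millisecondClockValue.
    [times := times + 1.
     times < 30000] whileTrue.
    elapsedMs := Time millisecondClockValue - start.
    "approximate iterations per 2ms heartbeat period"
    30000 * 2 // (elapsedMs max: 1)

If that number is much larger than one, then entering
checkForEventsMayContextSwitch: after a single iteration means something
other than the heartbeat is forcing the interrupt check (see the senders of
forceInterruptCheck mentioned above).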


> I know that I'm playing with a non-standard VM, but any clue that helps me
> understand why there is a difference is appreciated :).
>
>> so that all stack overflow checks fail, and on the next method entry or
>> backward jump the VM will check for stack overflow.  If there really is a
>> stack overflow then it is dealt with, but if the stack limit has been
>> changed to all ones then the VM checks for events.
>>
>> This is explained here:
>> http://www.mirandabanda.org/cogblog/2009/01/14/under-cover-contexts-and-the-big-frame-up/
>>
>> "We have a linked list of StackPage objects, one for each page, that we
>> use to keep stack pages in order of usage, along with free pages,
>> referenced by a variable called the mostRecentlyUsedPage. Each StackPage
>> keeps track of whether a stack page is in use (baseFP is non-null) and what
>> part of the page is in use (from the first slot through to the headSP) and
>> what the frames are in the page (the list from headFP chained through
>> caller saved fp to the baseFP). The interpreter’s current stack page is
>> called stackPage. On stack switch we load stackLimit from stackPage’s
>> stackLimit. Peter cleverly realised that one can use the stackLimit check
>> to cause the VM to break out of execution to process input events. The VM
>> is set up to respond to potential input with an interrupt handler that sets
>> the stackLimit to all ones (the highest possible address) so that the next
>> stack overflow check will fail. We also check for stack overflow on
>> backward branch so that we can break out of infinite loops:
>>
>> StackInterpreter methods for jump bytecodes
>> longUnconditionalJump
>>         | offset |
>>         offset := (((currentBytecode bitAnd: 7) - 4) * 256) + self fetchByte.
>>         localIP := localIP + offset.
>>         (offset < 0 “backward jump means we’re in a loop; check for possible interrupts”
>>          and: [localSP < stackLimit]) ifTrue:
>>                  [self externalizeIPandSP.
>>                   self checkForEventsMayContextSwitch: true.
>>                   self browserPluginReturnIfNeeded.
>>                   self internalizeIPandSP].
>>         self fetchNextBytecode
>> "
>>
>>
> Another silly question. The blog says:
>
> "But a naive implementation has to allocate a context on each send, move
> the receiver and arguments from the stack of the caller context to that of
> the callee, and assign the callee’s sender with the caller. For essentially
> every return the garbage collector eventually has to reclaim, and every
> return has to nil the sender and instruction pointer fields of, the context
> being returned from."
>
> When talking about the cost of moving the receiver and arguments to the
> new context, don't you have to push all that onto the stack too? Or are the
> values pushed onto the stack by the sender the ones used by the new
> activation (thus pushing only once)? (that's what I kind of understand from
> the post, but better to double check :))
>

Yes.  The outgoing receiver and arguments become the incoming receiver and
arguments just by building a frame, which is: push the return pc, push the
frame pointer, assign the stack pointer to the new frame pointer.  It's all
in the blog post.
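
Schematically (a sketch showing only the fields mentioned above; the real
Stack/Cog frame has more fields, described in the blog post):

        ...  caller's part of the stack page ...
        receiver         \
        arg1              |  pushed by the caller; reused in place as the
        arg2             /   callee's receiver and arguments (no copying)
        saved return pc      pushed when the frame is built
        saved frame ptr      pushed when the frame is built; the new frame
                             pointer is set to the stack pointer here
        ...                  the callee's temporaries and working stack follow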


> Thanks!
> Guille
>
>
>> --
>> cheers,
>> Eliot
>>
>>
>
>


-- 
best,
Eliot