Hi! I'm looking at some code from StackInterpreter, and the given piece of code raised a question I cannot fully answer:
longUnconditionalJump
	| offset |
	offset := (((currentBytecode bitAnd: 7) - 4) * 256) + self fetchByte.
	localIP := localIP + offset.
	(offset < 0 "backward jump means we're in a loop; check for possible interrupts"
	 and: [localSP < stackLimit]) ifTrue:
		[self externalizeIPandSP.
		 self checkForEventsMayContextSwitch: true.
		 self browserPluginReturnIfNeeded.
		 self internalizeIPandSP].
	self fetchNextBytecode
What does the (localSP < stackLimit) condition stand for? What is its intention when deciding to make (or not) a context switch?
Tx, Guille
I don't know the answer to your question, Guille, but I'd like to propose a practice for all VM developers: every time a question like this arises, the code should be factored out into a separate method and given a corresponding comment, to make the intent clear and serve as documentation.
Like in this case:
localSP < stackLimit
could be
self checkThatLocalSPIsXYZ
On 16 April 2013 17:27, Guillermo Polito guillermopolito@gmail.com wrote:
Hi! I'm looking at some code from StackInterpreter, and the given piece of code raised a question I cannot fully answer:
longUnconditionalJump | offset | offset := (((currentBytecode bitAnd: 7) - 4) * 256) + self fetchByte. localIP := localIP + offset. (offset < 0 "backward jump means we're in a loop; check for possible interrupts" and: [localSP < stackLimit]) ifTrue: [self externalizeIPandSP. self checkForEventsMayContextSwitch: true. self browserPluginReturnIfNeeded. self internalizeIPandSP]. self fetchNextBytecode
What does the (localSP < stackLimit) condition stand for? What is its intention when deciding to make (or not) a context switch?
Tx, Guille
A fairly wild guess based on old memories and speculation about code I haven't actually read yet: this is using stackLimit as a flag for an interrupt having occurred. If stackLimit is a globally accessible value that an interrupt or thread handler can write, then one would set it to 0 as a way of saying "at the very next opportunity, force an interrupt check (or even a process switch check)".
You might find a similar check in other places like primitive calls etc.
tim -- tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim Useful random insult:- Ought to have a warning label on his forehead.
Hi Guillermo,
On Tue, Apr 16, 2013 at 8:27 AM, Guillermo Polito <guillermopolito@gmail.com> wrote:
Hi! I'm looking at some code from StackInterpreter, and the given piece of code raised a question I cannot fully answer:
longUnconditionalJump | offset | offset := (((currentBytecode bitAnd: 7) - 4) * 256) + self fetchByte. localIP := localIP + offset. (offset < 0 "backward jump means we're in a loop; check for possible interrupts" and: [localSP < stackLimit]) ifTrue: [self externalizeIPandSP. self checkForEventsMayContextSwitch: true. self browserPluginReturnIfNeeded. self internalizeIPandSP]. self fetchNextBytecode
What does the (localSP < stackLimit) condition stand for? What is its intention when deciding to make (or not) a context switch?
The Cog and Stack VMs steal a technique from Peter Deutsch's PS and HPS VMs which combines the stack page overflow check on method entry with the check for events. Internally these VMs run Smalltalk on a small set of stack pages (this is to do with going fast by using a call-instruction-based inline cache, and needing to implement contexts, mapping stack frames to contexts as needed). Periodically the VM needs to break out of executing Smalltalk and check for events (the VM's heartbeat), and this is done by setting the stackLimit to the highest possible value (all ones) so that all stack overflow checks fail, and on the next method entry or backward jump the VM will check for stack overflow. If there really is a stack overflow then it is dealt with, but if the stack limit has been changed to all ones then the VM checks for events.
This is explained here: http://www.mirandabanda.org/cogblog/2009/01/14/under-cover-contexts-and-the-...
"We have a linked list of StackPage objects, one for each page, that we use to keep stack pages in order of usage, along with free pages, referenced by a variable called the mostRecentlyUsedPage. Each StackPage keeps track of whether a stack page is in use (baseFP is non-null) and what part of the page is in use (from the first slot through to the headSP) and what the frames are in the page (the list from headFP chained through caller saved fp to the baseFP). The interpreter’s current stack page is called stackPage. On stack switch we load stackLimit from stackPage’s stackLimit. Peter cleverly realised that one can use the stackLimit check to cause the VM to break out of execution to process input events. The VM is set up to respond to potential input with an interrupt handler that sets the stackLimit to all ones (the highest possible address) so that the next stack overflow check will fail. We also check for stack overflow on backward branch so that we can break out of infinite loops:
StackInterpreter methods for jump bytecodes

longUnconditionalJump
	| offset |
	offset := (((currentBytecode bitAnd: 7) - 4) * 256) + self fetchByte.
	localIP := localIP + offset.
	(offset < 0 “backward jump means we’re in a loop; check for possible interrupts”
	 and: [localSP < stackLimit]) ifTrue:
		[self externalizeIPandSP.
		 self checkForEventsMayContextSwitch: true.
		 self browserPluginReturnIfNeeded.
		 self internalizeIPandSP].
	self fetchNextBytecode "
On Tue, Apr 16, 2013 at 8:12 PM, Eliot Miranda eliot.miranda@gmail.com wrote:
Hi Guillermo,
On Tue, Apr 16, 2013 at 8:27 AM, Guillermo Polito <guillermopolito@gmail.com> wrote:
Hi! I'm looking at some code from StackInterpreter, and the given piece of code raised a question I cannot fully answer:
longUnconditionalJump | offset | offset := (((currentBytecode bitAnd: 7) - 4) * 256) + self fetchByte. localIP := localIP + offset. (offset < 0 "backward jump means we're in a loop; check for possible interrupts" and: [localSP < stackLimit]) ifTrue: [self externalizeIPandSP. self checkForEventsMayContextSwitch: true. self browserPluginReturnIfNeeded. self internalizeIPandSP]. self fetchNextBytecode
What does the (localSP < stackLimit) condition stand for? What is its intention when deciding to make (or not) a context switch?
The Cog and Stack VMs steal a technique from Peter Deutsch's PS and HPS VMs which combines the stack page overflow check on method entry with the check for events. Internally these VMs run Smalltalk on a small set of stack pages (this is to do with going fast by using a call-instruction-based inline cache, and needing to implement contexts, mapping stack frames to contexts as needed).
Yeap, that I understood (given my limitations, haha) from your blog :).
Periodically the VM needs to break out of executing Smalltalk and check for events (the VM's heartbeat), and this is done by setting the stackLimit to the highest possible value (all ones)
Then, my question is: when does the VM decide it needs to do a context switch? In other words, when does it decide to set the stack limit to 0xffffffff?
My case (triggering the question) is the following:
I'm experimenting on switching the special objects array with different processes, and therefore debugging the process switching.
- I resume a process with the following code:
| finished times |
times := 0.
[ times := times + 1. times < 30000 ] whileTrue.
finished := true.
with an empty processor
- a stack page is created (or taken; I'm not sure about the exact terminology here) for the process, because it was not married
- it enters #checkForEventsMayContextSwitch: after a long run (let's say ~16000 loops), and gets preempted (and I move to another special objects array with lots of processes)
- then, every time I resume it, it enters #checkForEventsMayContextSwitch: after looping only one time
I know that I'm playing with a non-standard VM, but any clue that helps me understand why there is a difference is appreciated :).
so that all stack overflow checks fail, and on the next method entry or backward jump the VM will check for stack overflow. If there really is a stack overflow then it is dealt with, but if the stack limit has been changed to all ones then the VM checks for events.
Another silly question. The blog says:
"But a naive implementation has to allocate a context on each send, move the receiver and arguments from the stack of the caller context to that of the callee, and assign the callee’s sender with the caller. For essentially every return the garbage collector eventually has to reclaim, and every return has to nil the sender and instruction pointer fields of, the context being returned from."
When talking about the cost of moving the receiver and arguments to the new context, don't you have to push all that on the stack too? Or are the values pushed on the stack by the sender the ones used by the new activation (thus pushing only once)? (That's what I kind of understand from the post, but better to double check :))
Thanks! Guille
-- cheers, Eliot
On Mon, Apr 22, 2013 at 2:37 AM, Guillermo Polito <guillermopolito@gmail.com> wrote:
On Tue, Apr 16, 2013 at 8:12 PM, Eliot Miranda eliot.miranda@gmail.com wrote:
Hi Guillermo,
On Tue, Apr 16, 2013 at 8:27 AM, Guillermo Polito <guillermopolito@gmail.com> wrote:
Hi! I'm looking at some code from StackInterpreter, and the given piece of code raised a question I cannot fully answer:
longUnconditionalJump | offset | offset := (((currentBytecode bitAnd: 7) - 4) * 256) + self fetchByte. localIP := localIP + offset. (offset < 0 "backward jump means we're in a loop; check for possible interrupts" and: [localSP < stackLimit]) ifTrue: [self externalizeIPandSP. self checkForEventsMayContextSwitch: true. self browserPluginReturnIfNeeded. self internalizeIPandSP]. self fetchNextBytecode
What does the (localSP < stackLimit) condition stand for? What is its intention when deciding to make (or not) a context switch?
The Cog and Stack VMs steal a technique from Peter Deutsch's PS and HPS VMs which combines the stack page overflow check on method entry with the check for events. Internally these VMs run Smalltalk on a small set of stack pages (this is to do with going fast by using a call-instruction-based inline cache, and needing to implement contexts, mapping stack frames to contexts as needed).
Yeap, that I understood (given my limitations, haha) from your blog :).
Periodically the VM needs to break out of executing Smalltalk and check for events (the VM's heartbeat), and this is done by setting the stackLimit to the highest possible value (all ones)
Then, my question is: when does the VM decide it needs to do a context switch? In other words, when does it decide to set the stack limit to 0xffffffff?
These two things are separate. The VM decides to do a context switch when either the current process blocks (Semaphore wait) or another higher-priority process becomes runnable. Another process becomes runnable either when a higher-priority process is resumed (via the resume primitive) or when a semaphore a higher-priority process is waiting on gets signalled, either via an event (the active delay expiring in the VM, an I/O event, etc.) or via a primitive (Semaphore signal).
The stack limit is set to 0xffffffff by the heartbeat at about 500Hz (see calls on forceInterruptCheckFromHeartbeat in platforms/*/vm/*), or by other activities in the VM such as an allocation primitive causing the allocation pointer to cross a threshold, or an attempt to compile a method to machine code running out of code memory (see senders of forceInterruptCheck). This causes the VM to check for a whole series of possible events, including performing a garbage collect, reclaiming code memory, signalling the Delay semaphore because the active delay has expired, etc. See checkForEventsMayContextSwitch:.
My case (triggering the question) is the following:
I'm experimenting on switching the special objects array with different processes, and therefore debugging the process switching.
- I resume a process with the following code:
| finished times |
times := 0.
[ times := times + 1. times < 30000 ] whileTrue.
finished := true.
with an empty processor
- a stack page is created (or taken; I'm not sure about the exact terminology here) for the process, because it was not married
- it enters #checkForEventsMayContextSwitch: after a long run (let's say ~16000 loops), and gets preempted (and I move to another special objects array with lots of processes)
- then, every time I resume it, it enters #checkForEventsMayContextSwitch: after looping only one time
Something seems wrong. The heartbeat is smashing the stack limit at 500Hz so you'd expect checkForEventsMayContextSwitch: to get called no more than that often. But then checkForEventsMayContextSwitch: resets the stack limit (see its send of restoreStackLimit towards the start of the method). So what should happen is however many loops take 2ms followed by a call to checkForEventsMayContextSwitch:, followed by however many loops take 2ms and so on.
I know that I'm playing with a non-standard VM, but any clue that helps me understand why there is a difference is appreciated :).
so that all stack overflow checks fail, and on the next method entry or backward jump the VM will check for stack overflow. If there really is a stack overflow then it is dealt with, but if the stack limit has been changed to all ones then the VM checks for events.
Another silly question. The blog says:
"But a naive implementation has to allocate a context on each send, move the receiver and arguments from the stack of the caller context to that of the callee, and assign the callee’s sender with the caller. For essentially every return the garbage collector eventually has to reclaim, and every return has to nil the sender and instruction pointer fields of, the context being returned from."
When talking about the cost of moving the receiver and arguments to the new context, don't you have to push all that on the stack too? Or are the values pushed on the stack by the sender the ones used by the new activation (thus pushing only once)? (That's what I kind of understand from the post, but better to double check :))
Yes. The outgoing receiver and arguments become the incoming receiver and arguments just by building a frame, which is: push the return pc, push the frame pointer, assign the stack pointer to the new frame pointer. It's all in the blog post.
Thanks! Guille
-- cheers, Eliot
vm-dev@lists.squeakfoundation.org