[squeak-dev] [BUG] Cannot compile cascade sent to block

Thiede, Christoph Christoph.Thiede at student.hpi.uni-potsdam.de
Sun Dec 29 19:50:55 UTC 2019


Hi Eliot, thank you very much for the detailed and interesting explanation! I think I got a rough idea of the blockExtent concept; certainly, it will become more clear as I continue to work with it.


Concretely to this bug, I suppose that optimized blocks don't need their own blockExtent, because they store all their temps directly on their declaring context, don't they?
This would explain the following outputs:

[|x| x:=true] whileFalse. "will be optimized"
thisContext tempNames. #('x')
  compared to:
[|x| x:=true] yourself whileFalse. "wont be optimized"
thisContext tempNames #()

So I would say that blockExtent may indeed be nil, I would at least note this fact in BlockNode >> #blockExtent. However, #computeCopiedValues: is called by #constructClosureCreationNode:, and a closure should be only created of non-optimized BlockNodes. Equally, the senders of #reindexingLocalsDo:encoder: should be only activated in case of a non-optimized BlockNode ... So better forget the rest of my first changeset, sigh.


It is interesting that in the example, the receiver BlockNode keeps flagged as optimized (whereas the byte code produces a separate closure):

[self confirm: #foo]
whileTrue;
whileFalse

First, the compiler reads the first message only ("[self confirm: #foo] whileTrue") and optimizes it, and only then finds the second message and discards the optimized bytecode. (See Parser >> #cascade, #messagePart:repeat, MessageNode >> #receiver:selector:arguments:precedence:from:, #transform:.)
But it does not reset the optimized flag of the BlockNode! To me, this appears to be the actual bug!
There is #ensureCanCascade: which should specifically solve this problem by resetting the optimized flag. However, during creation of MessageNode, originalReceiver is assigned a copy of the actual receiver. This leads to the problem that while in each temporary MessageNode, the receiver block is deoptimized, the tempvar rcrv in #cascade holds the copied original receiver which never got deoptimized. For what purpose do we copy the receiver into originalReceiver?
Did I explain this understandable? :-)

By applying the following single patch, I can fix the problem:

Parser >> #cascade
" {; message} => CascadeNode."

- | rcvr msgs |
+ | rcvr cascadeRcvr msgs|
parseNode canCascade ifFalse:
[^self expected: 'Cascading not'].
parseNode ensureCanCascade: encoder.
rcvr := parseNode cascadeReceiver.
+  cascadeRcvr := rcvr copy.
msgs := OrderedCollection with: parseNode.
[self match: #semicolon]
whileTrue:
[parseNode := rcvr.
(self messagePart: 3 repeat: false)
ifFalse: [^self expected: 'Cascade'].
parseNode canCascade ifFalse:
[^self expected: '<- No special messages'].
parseNode ensureCanCascade: encoder.
parseNode cascadeReceiver.
msgs addLast: parseNode].
- parseNode := CascadeNode new receiver: rcvr messages: msgs
+  parseNode := CascadeNode new receiver: cascadeRcvr messages: msgs

All relevant tests pass (but I don't know how good their coverage is).
Is there any need for respecting possible side-effects on rcvr that might occur while creating the next cascade message (via #messagePart:repeat:)?
If not, how would you think about this change? :-)

(Sorry for the long text, I did my best to maximize the compression ratio!)

Best,
Christoph

________________________________
Von: Squeak-dev <squeak-dev-bounces at lists.squeakfoundation.org> im Auftrag von Eliot Miranda <eliot.miranda at gmail.com>
Gesendet: Sonntag, 29. Dezember 2019 16:59 Uhr
An: The general-purpose Squeak developers list
Betreff: Re: [squeak-dev] [BUG] Cannot compile cascade sent to block

Hi Christoph,

On Dec 28, 2019, at 6:53 AM, Thiede, Christoph <Christoph.Thiede at student.hpi.uni-potsdam.de> wrote:



Hi all! :-)


Steps to reproduce:

Do it:

[true]
whileTrue;
whileFalse

Expected behavior:
The script is compiled and executed (and an infinite loop occurs).

Actual behavior:
<pastedImage.png>


Further investigations:
In #analyseTempsWithin:rootNode:assignmentPools:, the inner BlockNode [true] is not computed a blockExtent, because it is optimized.
This leads to a nil key in the ByteCodeEncoder's blockExtentsToLocals.
A fast solution would be to skip nil blockExtents in #noteBlockExtent:hasLocals:, but that's a dirty workaround.

Similar bug:
Try to do:
Object newSubclass compile:  'iWontCompile: foo
foo.
[true]
whileTrue;
whileFalse'
Fails with:
<pastedImage.png>

Again, the problem is that blockExtent isNil.

Workaround:
Send #yourself to the block before sending the cascade. This prevents the receiver from being optimized.

Please note that this problem does not only occur in case of special blocks. For example, [Sensor anyButtonPressed] whileTrue; whileFalse fails as well.


How to fix this appropriately? I'm not sure whether I understand the whole idea of blockExtent so far.

blockExtents are the model of blocks that the compiler uses to analyze temporary references so that temporaries are either local or remote.  Local temporaries are temporaries on a Context’s own stack.  Remote temporaries are temporaries stored in an array which is on a Context’s own stack.

The local/remote distinction is to ensure that a Context _never_ has to reference another Context’s stack to access an outer temporary.  Why? In the VM Contexts serve as proxies for active stack frames, with most stack frames not having a context unless the programmer accesses it by accessing some other context’s sender. This optimization is key to the VM’s performance. Cog is fast because it does context-to-stack mapping and creates contexts lazily.

If the execution model (defined by the bytecode set) for temporary access implements access to outer temporaries via direct access to outer contexts stacks then there are two states an outer context can be in. The common case is that an outer context is associated with a stack frame and the temporaries exist in the stack frame; any slots in the context object are stale. But if the block is long lived and runs after its enclosing context has been returned from then the temporaries will be in the context.  To keep the slots in the vm text up-to-date the stack frame slots must be copied to the context in return, and *before* the stack frame is torn down. Every outer temp access must check what state the outer context is in. This is what Peter Deutsch’s first bytecode set for HPS (objectworks 2.5) did (with the optimization that an outer context would be converted into having a stack frame if it didn’t, so that temporary access only deals with the common case). And the cost and complexity is very high.  (Forgive the deep dive).

The alternative, that I introduced in VW 5i, and that has been in Cog from the start is to never have contexts access outer temps directly.  This is done using a standard closure implementation model from lisp.  Any outer temporary which cannot possibly change during the execution of an inner block is copied into the closure and copied onto the block context/frame stack where it is effectively read only (an r-value). These are called “copied values”.

Any outer temp which is written to in the block or possibly modified after being closed over (ie assigned to lexically after the block) must be put in an array (an indirection vector). Since the temporary holding the indirection vector itself is not written to after initialization indirection vectors are always copied values.

So for example in inject:into: nextValue ends up in an indirection vector but aBlock is a copied value (look at the bytecode for the method; I’m on my phone).  Now the VMs context-to-stack mapping is greatly simplified (no need to intercept returns, no need to check outer temp access) and _much_ faster (and less buggy, known from hash experience with HPS).

But to do this the bytecode compiler must identify when closed over temporaries can be treated as copied values or put in indirection vectors.  Block extents model block nesting and allow the bytecode compiler to check whenever a temporary is read if it is being read in an inner block (ie is being closed over), and check whenever a temporary is being written if it is being written to in an inner block, or, if it has been closed over is being written to after a block that closed over it.  In either of these two cases, written to during the block and that write needing to be visible in the outer context, or written to after being closed over and that write needing to be visible within the block, the temp must live in an indirection vector.

Christoph, you *have* to understand this to be able to be effective in working in the execution machinery (eg the runUntil... debugger machinery).  If you do not understand this then the fault is mine (ENTIRELY!), and the fault is that my explanation doesn’t make sense. So take as long as you need and ask yourself if you do or do not understand it.  If you don’t, please ask questions and I’ll try my best to explain things better.  I introduced this machinery and it is my responsibility for it not to be a burden.  This is a case of moving complexity out of the VM into the image, and is justified on performance and reliability grounds.  But it is a severe cognitive burden on those using the bytecode compiler where, yo at the Smalltalk level, it seems they accessing outer temporaries directly is simple and all the block extent machinery horribly complex.  I assure you that the block extent machinery is trivial compared to the support required in the vm to implement direct access to outer temps as efficiently as possible ;which is way less efficiently than our current architecture allows).

According to the documentation comment, it should be okay to be nil; however, the #blockExtent comment does not mention this fact.
Please see the attached changeset for an approach to fix the issue. All tests from KernelTests-Methods and most tests from Test-Compiler pass it, except of the testAllNodePCs* which time out in my fresh image. It would be great to hear if this is the right way to solve the bug, or just extending a workaround :-)

I’ll take a look some time today.  But I’d be much happier if you can understand the long winded explanation above and answer your own question.  Great bug though :-)

Best,
Christoph
<bugfix compiling block cascades.2.cs>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.squeakfoundation.org/pipermail/squeak-dev/attachments/20191229/91e4c858/attachment-0001.html>


More information about the Squeak-dev mailing list