[squeak-dev] [BUG] Cannot compile cascade sent to block

Nicolas Cellier nicolas.cellier.aka.nice at gmail.com
Tue Dec 31 22:59:40 UTC 2019


Hi Christoph,
what about removing the copy in originalReceiver:

receiver: rcvr arguments: args precedence: p
     receiver := rcvr.
     originalReceiver := rcvr.
     arguments := args.
     originalArguments := arguments copy.
     sizes := Array new: arguments size.
     precedence := p

Le mar. 31 déc. 2019 à 02:08, Thiede, Christoph <
Christoph.Thiede at student.hpi.uni-potsdam.de> a écrit :

> Ooooh, my patch was erroneous ... I discovered this while working at a
> total different thing and it took me several hours to trace recurring VM
> crashes back to this change ...
>
>
> Compare the following before and after loading my commit:
>
> PreferenceWizardMorph compile: (PreferenceWizardMorph sourceCodeAt:
> #updateWindowBounds).
> (PreferenceWizardMorph >> #updateWindowBounds) symbolic edit
>
> There is a difference!
>
> (blue is correct)
> I don't get the full explanation at the moment, but apparently, there *are*
>  made some relevant transformations to the receiver, which is offsetting
> the temp index because of the nested closure or something.
> Transformations are more than optimization, I guess.
>
> The following change appears to fix the current problem:
>
> But I won't commit this to the inbox until I will have made sure that I
> understand whether and why this is (hopefully) correct. So long, please
> move Compiler-ct.414 to the Treated Inbox. So sorry!
>
>
> > cose over -> close over
> > throguh -> through
>
> Thank you, will fix them with my next commit.
>
> Best,
> Christoph
>
> ------------------------------
> *Von:* Squeak-dev <squeak-dev-bounces at lists.squeakfoundation.org> im
> Auftrag von Nicolas Cellier <nicolas.cellier.aka.nice at gmail.com>
> *Gesendet:* Dienstag, 31. Dezember 2019 01:04 Uhr
> *An:* The general-purpose Squeak developers list
> *Betreff:* Re: [squeak-dev] [BUG] Cannot compile cascade sent to block
>
> Ah, yes, here is why:
>
> Name: Compiler-bf.293
> Author: bf
> Time: 19 February 2015, 6:22:49.349 pm
> UUID: 737af1fb-e8f0-4653-851b-0c1b3ed0f65f
> Ancestors: Compiler-topa.292
>
> Fix deoptimization of ifNil: etc.
>
> --------
>
> Forgot to say, your changes in Compiler-ct.414 are correct.
> But there are other typos in BlockNode comments, I see at least 2:
>
> cose over -> close over
> throguh -> through
>
> Le mar. 31 déc. 2019 à 00:56, Nicolas Cellier <
> nicolas.cellier.aka.nice at gmail.com> a écrit :
>
>> Oups, I probably missunderstood the question.
>> You mean why you need to copy the cascadeReceiver...
>>
>> Well, it's simple: the rcvr is shared by all cascade message, because we
>> do:
>>
>>     parseNode := rcvr.
>>     (self messagePart: 3 repeat: false)
>>
>> So [true] whileTrue is optimized.
>> Then in #cascade, after #ensureCanCascade: it is #deoptimize(d).
>> That's when we get the deoptimized rcvr := parseNode cascadeReceiver.
>>
>> Then in second message [true] whileFalse, it gets optimized again.
>> The receiver of second message is deoptimized again, but it's the
>> originalReceiver that gets deoptimized (a copy) not the receiver!
>> That's a change Bert made to #deoptimize in 2015 for some reason (I guess
>> for some transform, the receiver was changed).
>>
>> So your rcvr copy is preserved from re-optimization.
>>
>> Le lun. 30 déc. 2019 à 19:36, Nicolas Cellier <
>> nicolas.cellier.aka.nice at gmail.com> a écrit :
>>
>>> Hi Christoph,
>>> Because sending a message unstack receiver, and replace it with returned
>>> object. So we need dup for each cascade.
>>>
>>> Le lun. 30 déc. 2019 à 19:22, Thiede, Christoph <
>>> Christoph.Thiede at student.hpi.uni-potsdam.de> a écrit :
>>>
>>>> > As long as you add a test to capture the issue, one that tests both
>>>> compilation and, if it succeeds, execution, you should add what you see
>>>> fit.  And yes, trunk is the right place, which means inbox until after the
>>>> release.
>>>>
>>>> Alright, done, hope you like it :)
>>>>
>>>> I still don't see why cascadeReceiver is a copy, though ...
>>>>
>>>> ------------------------------
>>>> *Von:* Squeak-dev <squeak-dev-bounces at lists.squeakfoundation.org> im
>>>> Auftrag von Eliot Miranda <eliot.miranda at gmail.com>
>>>> *Gesendet:* Montag, 30. Dezember 2019 17:51:36
>>>> *An:* The general-purpose Squeak developers list
>>>> *Betreff:* Re: [squeak-dev] [BUG] Cannot compile cascade sent to block
>>>>
>>>> Hi Christoph, Hi Nicolas,
>>>>
>>>> On Dec 30, 2019, at 5:28 AM, Thiede, Christoph <
>>>> Christoph.Thiede at student.hpi.uni-potsdam.de> wrote:
>>>>
>>>> 
>>>>
>>>> Hi Eliot,
>>>>
>>>>
>>>> Of course, a changeset that defers the optimization of ParseNodes until
>>>> a cascade can be precluded would be perfect - also with respect to the
>>>> compiler performance. I would be very interested to read such a changeset!
>>>>
>>>>
>>>> However, if we can't patch this quickly, would you consider my proposal
>>>> an appropriate workaround? Should we maybe include it into Trunk, just for
>>>> 5.3, until we have the perfect solution from Nicolas?
>>>>
>>>>
>>>> As long as you add a test to capture the issue, one that tests both
>>>> compilation and, if it succeeds, execution, you should add what you see
>>>> fit.  And yes, trunk is the right place, which means inbox until after the
>>>> release.
>>>>
>>>> That suggests we switch over to a variation of inbox for the release
>>>> process.  We could provide releasebox or done such (I would call it
>>>> release) and only commit things good for the release.  But in thinking
>>>> about it it’s a bad idea.  It means a lot of work to change ones working
>>>> images from trunk to releasebox and back.  However, if in 5.3 we provided a
>>>> bulk change operation for the Monticello browser I’d consider it.  Has
>>>> anything like this been discussed? (we should change subject to continue...)
>>>>
>>>>
>>>> Best,
>>>>
>>>> Christoph
>>>> ------------------------------
>>>> *Von:* Squeak-dev <squeak-dev-bounces at lists.squeakfoundation.org> im
>>>> Auftrag von Eliot Miranda <eliot.miranda at gmail.com>
>>>> *Gesendet:* Sonntag, 29. Dezember 2019 21:01:04
>>>> *An:* The general-purpose Squeak developers list
>>>> *Betreff:* Re: [squeak-dev] [BUG] Cannot compile cascade sent to block
>>>>
>>>> Hi Christoph,
>>>>
>>>> On Sun, Dec 29, 2019 at 11:51 AM Thiede, Christoph <
>>>> Christoph.Thiede at student.hpi.uni-potsdam.de> wrote:
>>>>
>>>>> Hi Eliot, thank you very much for the detailed and interesting
>>>>> explanation! I think I got a rough idea of the blockExtent concept;
>>>>> certainly, it will become more clear as I continue to work with it.
>>>>>
>>>>>
>>>>> Concretely to this bug, I suppose that optimized blocks don't need
>>>>> their own blockExtent, because they store all their temps directly on their
>>>>> declaring context, don't they?
>>>>>
>>>>
>>>> Exactly.  They're just basic blocks and jumps, within either proper
>>>> methods or proper blocks.
>>>>
>>>>
>>>>> This would explain the following outputs:
>>>>>
>>>>> [|x| x:=true] whileFalse. "will be optimized"
>>>>> thisContext tempNames. #('x')
>>>>>
>>>>>   compared to:
>>>>>
>>>>> [|x| x:=true] yourself whileFalse. "wont be optimized"
>>>>> thisContext tempNames #()
>>>>>
>>>>>
>>>> Exactly.
>>>>
>>>>
>>>>> So I would say that blockExtent may indeed be nil, *I would at least
>>>>> note this fact in BlockNode >> #blockExtent.*
>>>>>
>>>>
>>>> And feel free to take as much of my explanation, with rewrites, and add
>>>> it to the relevant parts of the bytecode compiler.  I like discursive doc
>>>> like this in class comments, but that implies that a pointer to the class
>>>> comment from a method is the right thing to do (as in "See the class
>>>> comment in ...").
>>>>
>>>>
>>>>> However, #computeCopiedValues: is called by #constructClosureCreationNode:,
>>>>> and a closure should be only created of non-optimized BlockNodes. Equally,
>>>>> the senders of #reindexingLocalsDo:encoder: should be only activated in
>>>>> case of a non-optimized BlockNode ... So better forget the rest of my first
>>>>> changeset, sigh.
>>>>>
>>>>
>>>> Right.  So what maybe happening is that the system is realizing that it
>>>> can't inline the block because it is being used in a cascade.  So the
>>>> attempt to optimize it in the first place should fail.  We should only
>>>> consider for optimization blocks that are sent one of the inline messages,
>>>> *but not in a cascade*.  This is probably why Nicolas expects we should
>>>> have already fixed this bug. Nicolas if I neglected to integrate your fix I
>>>> apologize.  I invite you to commit straight to trunk; maybe ask Christoph
>>>> for review?.
>>>>
>>>> It is interesting that in the example, the receiver BlockNode keeps
>>>>> flagged as optimized (whereas the byte code produces a separate closure):
>>>>>
>>>>> [self confirm: #foo]
>>>>> whileTrue;
>>>>> whileFalse
>>>>>
>>>>>
>>>> Right.  I would guess that the block is marked as being optimizable
>>>> when the whileTrue is processed, but everything goes pear shaped when the
>>>> whileFalse is processed.  What we need to do is forestall the optimization
>>>> caused by processing the whileTrue send iff the whileTrue is part of a
>>>> cascade (which is probably exactly what Nicolas' as-yet-unintegrated
>>>> changes effect).
>>>>
>>>>
>>>>>
>>>>> First, the compiler reads the first message only ("[self confirm:
>>>>> #foo] whileTrue") and optimizes it, and only then finds the second
>>>>> message and discards the optimized bytecode. (See Parser >> #cascade, #messagePart:repeat,
>>>>> MessageNode >> #receiver:selector:arguments:precedence:from:,
>>>>> #transform:.)
>>>>> But it does not reset the optimized flag of the BlockNode! To me, this
>>>>> appears to be the actual bug!
>>>>>
>>>>
>>>> You have it.
>>>>
>>>>
>>>>> There is #ensureCanCascade: which should specifically solve this
>>>>> problem by resetting the optimized flag. However, during creation of
>>>>> MessageNode, originalReceiver is assigned a copy of the actual receiver.
>>>>> This leads to the problem that while in each temporary MessageNode, the
>>>>> receiver block is deoptimized, the tempvar rcrv in #cascade holds the
>>>>> copied original receiver which never got deoptimized. For what purpose do
>>>>> we copy the receiver into originalReceiver?
>>>>> Did I explain this understandable? :-)
>>>>>
>>>>> *By applying the following single patch, I can fix the problem:*
>>>>>
>>>>> Parser >> #cascade
>>>>> " {; message} => CascadeNode."
>>>>>
>>>>> - | rcvr msgs |
>>>>> + | rcvr cascadeRcvr msgs|
>>>>> parseNode canCascade ifFalse:
>>>>> [^self expected: 'Cascading not'].
>>>>> parseNode ensureCanCascade: encoder.
>>>>> rcvr := parseNode cascadeReceiver.
>>>>> +  cascadeRcvr := rcvr copy.
>>>>> msgs := OrderedCollection with: parseNode.
>>>>> [self match: #semicolon]
>>>>> whileTrue:
>>>>> [parseNode := rcvr.
>>>>> (self messagePart: 3 repeat: false)
>>>>> ifFalse: [^self expected: 'Cascade'].
>>>>> parseNode canCascade ifFalse:
>>>>> [^self expected: '<- No special messages'].
>>>>> parseNode ensureCanCascade: encoder.
>>>>> parseNode cascadeReceiver.
>>>>> msgs addLast: parseNode].
>>>>> - parseNode := CascadeNode new receiver: rcvr messages: msgs
>>>>> +  parseNode := CascadeNode new receiver: cascadeRcvr messages: msgs
>>>>>
>>>>>
>>>>> All relevant tests pass (but I don't know how good their coverage is).
>>>>> Is there any need for respecting possible side-effects on rcvr that
>>>>> might occur while creating the next cascade message
>>>>> (via #messagePart:repeat:)?
>>>>> If not, how would you think about this change? :-)
>>>>>
>>>>> (Sorry for the long text, I did my best to maximize the compression
>>>>> ratio!)
>>>>>
>>>>
>>>> Um, it's a pleasure to read your writing, so don't worry about being
>>>> prolix.  I can hardly complain ;-)
>>>>
>>>>
>>>>
>>>>>
>>>>> Best,
>>>>> Christoph
>>>>>
>>>>> ------------------------------
>>>>> *Von:* Squeak-dev <squeak-dev-bounces at lists.squeakfoundation.org> im
>>>>> Auftrag von Eliot Miranda <eliot.miranda at gmail.com>
>>>>> *Gesendet:* Sonntag, 29. Dezember 2019 16:59 Uhr
>>>>> *An:* The general-purpose Squeak developers list
>>>>> *Betreff:* Re: [squeak-dev] [BUG] Cannot compile cascade sent to block
>>>>>
>>>>> Hi Christoph,
>>>>>
>>>>> On Dec 28, 2019, at 6:53 AM, Thiede, Christoph <
>>>>> Christoph.Thiede at student.hpi.uni-potsdam.de> wrote:
>>>>>
>>>>> 
>>>>>
>>>>> Hi all! :-)
>>>>>
>>>>>
>>>>> *Steps to reproduce:*
>>>>>
>>>>> Do it:
>>>>>
>>>>> [true]
>>>>> whileTrue;
>>>>> whileFalse
>>>>>
>>>>>
>>>>> *Expected behavior:*
>>>>> The script is compiled and executed (and an infinite loop occurs).
>>>>>
>>>>> *Actual behavior:*
>>>>> <pastedImage.png>
>>>>>
>>>>>
>>>>> *Further investigations:*
>>>>> In #analyseTempsWithin:rootNode:assignmentPools:, the inner BlockNode
>>>>> [true] is not computed a blockExtent, because it is optimized.
>>>>> This leads to a nil key in the ByteCodeEncoder's blockExtentsToLocals.
>>>>> A fast solution would be to skip nil blockExtents in #noteBlockExtent:hasLocals:,
>>>>> but that's a dirty workaround.
>>>>>
>>>>> *Similar bug:*
>>>>> Try to do:
>>>>>
>>>>> Object newSubclass compile:  'iWontCompile: foo
>>>>> foo.
>>>>> [true]
>>>>> whileTrue;
>>>>> whileFalse'
>>>>>
>>>>> Fails with:
>>>>> <pastedImage.png>
>>>>>
>>>>> Again, the problem is that blockExtent isNil.
>>>>>
>>>>> *Workaround:*
>>>>> Send #yourself to the block before sending the cascade. This prevents
>>>>> the receiver from being optimized.
>>>>>
>>>>> Please note that this problem does not only occur in case of special
>>>>> blocks. For example, [Sensor anyButtonPressed] whileTrue; whileFalse
>>>>> fails as well.
>>>>>
>>>>>
>>>>> How to fix this appropriately? I'm not sure whether I understand
>>>>> the whole idea of blockExtent so far.
>>>>>
>>>>>
>>>>> blockExtents are the model of blocks that the compiler uses to analyze
>>>>> temporary references so that temporaries are either local or remote.  Local
>>>>> temporaries are temporaries on a Context’s own stack.  Remote temporaries
>>>>> are temporaries stored in an array which is on a Context’s own stack.
>>>>>
>>>>> The local/remote distinction is to ensure that a Context _never_ has
>>>>> to reference another Context’s stack to access an outer temporary.  Why? In
>>>>> the VM Contexts serve as proxies for active stack frames, with most stack
>>>>> frames not having a context unless the programmer accesses it by accessing
>>>>> some other context’s sender. This optimization is key to the VM’s
>>>>> performance. Cog is fast because it does context-to-stack mapping and
>>>>> creates contexts lazily.
>>>>>
>>>>> If the execution model (defined by the bytecode set) for temporary
>>>>> access implements access to outer temporaries via direct access to outer
>>>>> contexts stacks then there are two states an outer context can be in. The
>>>>> common case is that an outer context is associated with a stack frame and
>>>>> the temporaries exist in the stack frame; any slots in the context object
>>>>> are stale. But if the block is long lived and runs after its enclosing
>>>>> context has been returned from then the temporaries will be in the
>>>>> context.  To keep the slots in the vm text up-to-date the stack frame slots
>>>>> must be copied to the context in return, and *before* the stack frame is
>>>>> torn down. Every outer temp access must check what state the outer context
>>>>> is in. This is what Peter Deutsch’s first bytecode set for HPS (objectworks
>>>>> 2.5) did (with the optimization that an outer context would be converted
>>>>> into having a stack frame if it didn’t, so that temporary access only deals
>>>>> with the common case). And the cost and complexity is very high.  (Forgive
>>>>> the deep dive).
>>>>>
>>>>> The alternative, that I introduced in VW 5i, and that has been in Cog
>>>>> from the start is to never have contexts access outer temps directly.  This
>>>>> is done using a standard closure implementation model from lisp.  Any outer
>>>>> temporary which cannot possibly change during the execution of an inner
>>>>> block is copied into the closure and copied onto the block context/frame
>>>>> stack where it is effectively read only (an r-value). These are called
>>>>> “copied values”.
>>>>>
>>>>> Any outer temp which is written to in the block or possibly modified
>>>>> after being closed over (ie assigned to lexically after the block) must be
>>>>> put in an array (an indirection vector). Since the temporary holding
>>>>> the indirection vector itself is not written to after initialization
>>>>> indirection vectors are always copied values.
>>>>>
>>>>> So for example in inject:into: nextValue ends up in an indirection
>>>>> vector but aBlock is a copied value (look at the bytecode for the method;
>>>>> I’m on my phone).  Now the VMs context-to-stack mapping is greatly
>>>>> simplified (no need to intercept returns, no need to check outer temp
>>>>> access) and _much_ faster (and less buggy, known from hash experience with
>>>>> HPS).
>>>>>
>>>>> But to do this the bytecode compiler must identify when closed over
>>>>> temporaries can be treated as copied values or put in indirection vectors.
>>>>> Block extents model block nesting and allow the bytecode compiler to check
>>>>> whenever a temporary is read if it is being read in an inner block (ie is
>>>>> being closed over), and check whenever a temporary is being written if it
>>>>> is being written to in an inner block, or, if it has been closed over is
>>>>> being written to after a block that closed over it.  In either of these two
>>>>> cases, written to during the block and that write needing to be visible in
>>>>> the outer context, or written to after being closed over and that write
>>>>> needing to be visible within the block, the temp must live in an
>>>>> indirection vector.
>>>>>
>>>>> Christoph, you *have* to understand this to be able to be effective in
>>>>> working in the execution machinery (eg the runUntil... debugger
>>>>> machinery).  If you do not understand this then the fault is mine
>>>>> (ENTIRELY!), and the fault is that my explanation doesn’t make sense. So
>>>>> take as long as you need and ask yourself if you do or do not understand
>>>>> it.  If you don’t, please ask questions and I’ll try my best to explain
>>>>> things better.  I introduced this machinery and it is my responsibility for
>>>>> it not to be a burden.  This is a case of moving complexity out of the VM
>>>>> into the image, and is justified on performance and reliability grounds.
>>>>> But it is a severe cognitive burden on those using the bytecode compiler
>>>>> where, yo at the Smalltalk level, it seems they accessing outer temporaries
>>>>> directly is simple and all the block extent machinery horribly complex.  I
>>>>> assure you that the block extent machinery is trivial compared to the
>>>>> support required in the vm to implement direct access to outer temps as
>>>>> efficiently as possible ;which is way less efficiently than our current
>>>>> architecture allows).
>>>>>
>>>>> According to the documentation comment, it should be okay to be nil;
>>>>> however, the #blockExtent comment does not mention this fact.
>>>>> Please see the attached changeset for an approach to fix the issue.
>>>>> All tests from KernelTests-Methods and most tests from Test-Compiler pass
>>>>> it, except of the testAllNodePCs* which time out in my fresh image. It
>>>>> would be great to hear if this is the right way to solve the bug, or just
>>>>> extending a workaround :-)
>>>>>
>>>>>
>>>>> I’ll take a look some time today.  But I’d be much happier if you can
>>>>> understand the long winded explanation above and answer your own question.
>>>>> Great bug though :-)
>>>>>
>>>>> Best,
>>>>> Christoph
>>>>> <bugfix compiling block cascades.2.cs>
>>>>>
>>>>>
>>>>>
>>>>
>>>> --
>>>> _,,,^..^,,,_
>>>> best, Eliot
>>>>
>>>>
>>>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.squeakfoundation.org/pipermail/squeak-dev/attachments/20191231/78e14a76/attachment-0001.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: pastedImage.png
Type: image/png
Size: 12558 bytes
Desc: not available
URL: <http://lists.squeakfoundation.org/pipermail/squeak-dev/attachments/20191231/78e14a76/attachment-0002.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: pastedImage.png
Type: image/png
Size: 44509 bytes
Desc: not available
URL: <http://lists.squeakfoundation.org/pipermail/squeak-dev/attachments/20191231/78e14a76/attachment-0003.png>


More information about the Squeak-dev mailing list