[squeak-dev] modifying copied context vars

Sat Mar 26 12:02:05 UTC 2016

2016-03-26 1:58 GMT+01:00 Eliot Miranda <eliot.miranda at gmail.com>:

> Hi Nicolai,
>
> On Fri, Mar 25, 2016 at 2:51 AM, Nicolai Hess <nicolaihess at gmail.com>
> wrote:
>
>>
>>
>> 2016-03-25 6:26 GMT+01:00 Eliot Miranda <eliot.miranda at gmail.com>:
>>
>>> Hi Nicolai,
>>>
>>> > On Mar 24, 2016, at 1:53 AM, Nicolai Hess <nicolaihess at gmail.com>
>>> wrote:
>>> >
>>> > Hello,
>>> >
>>> > we have a failing test (OCClosureCompilerTest>>#testDebuggerTempAccess)
>>> > (fails since spur, but not related - I think).
>>> >
>>> > A simple example to reproduce the behavior:
>>> > | local1 remote1|
>>> > remote1:=4.
>>> > local1 :=3.
>>> > [:i | |c|
>>> >     c := local1.
>>> >     remote1 := i.
>>> >     i halt.
>>> >     "local1:=25.  <-- evaluate during debugging"
>>> >      ] value:1.
>>> > Transcript show:local1;cr.
>>> > Transcript show:remote1;cr.
>>> > (Executing this code and evaluating "local:=25" after the debugger
>>> halts for
>>> > "i halt" will modify the var "remote1" instead of "local1", as this is
>>> a vector tempvar , proceeding
>>> > the execution will crash at accessing the remote1 value).
>>> >
>>> >
>>> > The purpose of the testDebuggerTempAccess test case is, evaluating code
>>> > that changes the value of a copied temp var *does not* change the
>>> value of
>>> > this var in the outer context. (see comment in the test case:
>>> > "this is not 25 as the var is a local, non escaping variable that was
>>> copied into the block,
>>> >     If the compiler would have known about the write, it would have
>>> made the var escaping".
>>> > )
>>> >
>>> > But the implementation (OCCopyingTempVariable>>#writeFromContext:
>>> aContext scope: contextScope value: aValue)
>>> > explicitly iterates through all outer context(s) and changes the vars
>>> as well.
>>> >
>>> > 1. Question: Who is right?
>>>
>>> What do you mean?  Because the closure model is as it is (for very good
>>> reason) the computer copies temporaries whose values cannot change during
>>> the execution of a block into that block if the block references the
>>> variable.  If the debugger is to support updating such copied variables
>>> then it must create the illusion of "the variable being updated" because
>>> there isn't just one variable.
>>>
>>
>> I know that it only operates on a copy of this var (if the var is only
>> read (in the original code)), but you *can* evaluate code
>> during debugging, that can modify the var.
>> I don't know why or if we want to support that. I just see a failing test
>> case that tries to do exactly that.
>> And the supposed behavior is, that the outer context variable value does
>> not change.
>>
>
> But it would be much nicer if the debugger /did/ change all the copies of
> the variable it could find, right?  What behaviour do we want to support?
> What behaviour can we afford to support?
>

I think I would prefer if the debugger only changes the local copy. For me,
it is like changing a local variable that shadows a variable from outer
scope. Of course, you can not
easily see that this variable is a copy resp. shadows the one / the others
from outer scope.

My main concern is that some tests fail, and you may crash pharo by
changing the var from within debugger.

>
>
>
>> But this test fails.
>>
>> I tried similar code in squeak, and there the outer context var does not
>> change:
>>
>> |local remote outerContext|
>> local :=1.
>> remote:=2.
>> outerContext := thisContext.
>> (1 to:1) do:[:i |
>>     remote := i.
>>     Transcript show:local;cr.
>>     i halt.
>>     "remote:=20"
>>     "local:=10"
>>     "outerContext tempAt:1 put:30"
>>     Transcript show:local;cr.
>>     ].
>> Transcript show:local;cr.
>> Transcript show:remote;cr.
>>
>>
>> If the debugger halts at "i halt" and you evaluate the code
>>
>>     "remote:=20"
>> the remote value changes for t he block context and for its outer context
>> - OK
>>     "local:=10"
>> the local value changes for the block context but not for its outer
>> context - OK
>>     "outerContext tempAt:1 put:30"
>> the local value only changes for the outer context - OK
>>
>> Is this right?
>>
>
> I don't think it's either right or wrong; I think it's what one would
> expect from a naive debugger and the current compilation semantics.  I
> think you need to think about the problem at a tool level.
>
> The right question to ask is "what behaviour would you like in the
> debugger?"  In a debugger there are typically two interfaces that allow
> modifying the variables in a context. One is via a doit in which the
> variables are in scope, and one is an inspector on the variables embedded
> in a debugger.  It would be nice if the two were consistent.  But it is
> easier to make them consistent if only local values are changed.  The issue
> is really how powerful you want to map DebuggerMap (or whatever the Pharo
> equivalent).
>
> Here are some issues to think about.  It's easy for a debugger map to work
> out what the right indices are for a varable that is copied in various
> scopes; it just asks the compiler for the relevant info.  It is /not/ easy
> to find out all the potentially copied variables.  For example, the
> following creates many block and many activations of those blocks, and a
> copy of "copied" is in all of them.
>
> | copied |
> copied := 7.
> ^(1 to: copied) collect:
>     [:i| [Semaphore new wait. copied] fork. [Semaphore new wait. copied]]
>
> Just before the collect: terminates execution there are 9 contexts
> accessing copied; the context declaring it, the context for the block [:i|
> [Semaphore...copied]], and the 7 contexts waiting on a new semaphore.
> There are 15 blocks whose copiedValues includes copied, the block [:i|
> [Semaphore...copied]], the 7 copies of the first [Semaphore new wait.
> copied] block, which all have contexts whose closureOrNil points to them,
> and seven 7 copies of the second [Semaphore new wait. copied] block.  So
> the only way to reliably change copy is to do allInstances on Context and
> BlockClosure and collect all whose home context is the context that
> declares copied and to change the 0th temp or the 0th copied value in each
> context or block.  Is it worth it?  Is it perhaps too expensive in some
> circumstances, and hence needs to be a preference?  Is the preference too
> hard to understand or explain?
>
> An alternative might be to warn the debugger user, saying something like
> "warning, not all references to temporary variable foo can be updated." and
> update only in activations on the stack.
>
> Personally I would implement the all instances approach if I had the
> time.  But I haven't had time for 8 years ;-)
>

Now I really prefer to only change the local value :-)

>
>
> > But the reason why the test actually fails is different. It is because
>>> evaluating
>>> > the code that modifies the var "local1:=25" ends up to a call
>>> > aContext tempAt: self indexFromIR put: aValue
>>> > but the index (indexFromIR) can be different for every context (method
>>> context / inner block
>>> > context(s)).
>>>
>>> Right.
>>>
>>> >
>>> > 2. Question: should the index be always the same?
>>>
>>> How can it be?  If, for example, a nullary block makes use of a copied
>>> variable and has no local temps then that variable will end up with index
>>> 0, no matter its indices in outer scopes.  Surely you're not proposing
>>> padding the block with unused variables just so copied variables can have
>>> the same index?
>>>
>>
>> Yes, makes sense.
>>
>>
>>>
>>> >
>>> > thanks inadvance
>>> >
>>> > Nicolai
>>> >
>>> >
>>> > ps: since spur (resp. compiler changes that were done for new spur
>>> images), the index of tempvars can be different.
>>> > In pre-spur, this testcase modifies an outer context var, but one that
>>> isn't checked (the argument "two"), therefore
>>> > the test succeed. In spur, this testcase modifes a different outer
>>> context var "remote1", and the test
>>> > fails.
>>>
>>>
>>> Since Sour hasn't changed the compiler there looks to be a bug.  Sour
>>> has changed the identityHash size a lot from IIRC 11 bits to 22.  So hashed
>>> collections can end up with different enumerations.  But that shouldn't
>>> affect the ordering assigned to temps in the compiler.  I suggest tracking
>>> down why the code is different in Spur is a very important thing to do.
>>>
>>
>> Yes, we merged some code for spur support (opals compiler code) and lost
>> some intermediate changes that were made for opal. Debugging this bugs was
>> a mess.
>>
>
> Ouch.  Debugging the debugger.  Always frustrating.  Does Opal put the
> indirection vectors in the same place in each declaring scope?  Also what
> rule does Opal use for ordering copied values?  Here's the rule for
> Squeak's compiler:
>

It is not about debugging the debugger, but finding the change that was
responsible for this changed index, and to find out if it was changed on
purpose

>
> ParseNode class>>*tempSortBlock*
> "Answer a block that can sort a set of temporaries into a stable
> order so that different compilations produce the same results."
> ^[:t1 :t2| | be1 be2 bs1 bs2 |
>   t1 index < t2 index "simple sort by index."
>   or: [t1 index = t2 index "complex tie break"
>  and: [t1 isRemote ~= t2 isRemote
> ifTrue: [t2 isRemote] "put direct temps before indirect temps"
> ifFalse:
> [((be1 := t1 definingScope blockExtent) isNil
>  or: [(be2 := t2 definingScope blockExtent) isNil])
> ifTrue: [t1 name < t2 name] "only have the name left to go on"
> ifFalse: "put temps from outer scopes before those from inner scopes"
> [(bs1 := be1 first) < (bs2 := be2 first)
> or: [bs1 = bs2 and: [t1 name < t2 name]]]]]]] "only have the name left to
> go on"
>
> Hence in the following:
>
> | c1 r1 c2 r2 |
> c1 := #(copied 1). c2 := #(copied 2).
> r1 := #(remote 1). r2 := #(remote 2).
> [ | c3 r3 |
>   c3 := #(copied 3).
>   r3 :=  #(remote 3).
>   [r1 := r1 copy. r2 := r2 copy. r3 := r3 copy.
>    { c1. c2. c3. thisContext shallowCopy}] value] value
>
> Numbering the scopes 1, 2 & 3, then
>
> Scope 1:
> 0: c1
> 1: c2
> 2: indirection1, r1 at 1, r2 at 2
>
> Copied values in block for scope 2:
> 1: c1
> 2: c2
> 3: indirection1
>
> Scope 2:
> 1: c1
> 2: c2
> 3: indirection1
> 4: c3
> 5: indirection2
>
> Copied values in block for scope 3:
> 1: c1
> 2: c2
> 3: indirection1
> 4: c3
> 5: indirection2
>
> Scope 3:
> 1: c1
> 2: c2
> 3: indirection1
> 4: c3
> 5: indirection2
>
>
> Intuitively I expected
>
> Scope 1:
> 0: c1
> 1: c2
> 2: indirection1, r1 at 1, r2 at 2
>
> Copied values in block for scope 2:
> 1: c1
> 2: c2
> 3: indirection1
>
> Scope 2:
> 1: c1
> 2: c2
> 3: indirection1
> 4: c3
> 5: indirection2
>
> Copied values in block for scope 3:
> 1: c1
> 2: c2
> 3: c3
> 4: indirection1
> 5: indirection2
>
> Scope 3:
> 1: c1
> 2: c2
> 3: c3
> 4: indirection1
> 5: indirection2
>
>
> :-/
>

I need more time to step through this.

>
>
> HTH
>>>
>>> Eliot
>>>
>>
>> thanks
>>
>>
>>> _,,,^..^,,,_ (phone)
>>>
>>> PS if you want to understand why copied variables are so important in
>>> the closure model you need to understand context-to-stack mapping and the
>>> overhead that not copying temps adds to returns in a context-to-stack
>>> mapping VM.  It's not obvious but I hope my blog does an ok job of
>>> explaining something alas complex but extremely important for performance.
>>>
>>
>> I read your blog
>>
>
> Good.  So then ask yourself what the right thing to do is, and what the
> affordable thing to do is.  You understand the issue, now you have to make
> some decisions...
>

I do not yet see use cases in which one or the other would be better or
right.
Therefore I would go for the solution that is easier to implement resp. fix.

>
> _,,,^..^,,,_
> best, Eliot
>
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.squeakfoundation.org/pipermail/squeak-dev/attachments/20160326/1caeaa39/attachment-0001.htm