[squeak-dev] Solving multiple termination bugs - summary & proposal

mail at jaromir.net mail at jaromir.net
Sun Nov 28 21:57:25 UTC 2021


Hi Eliot,

thanks very much for reviewing this.

> This looks good. The one thing I find concerning is running terminate in a different process.  Various idioms could legitimately require an unwind block to run in the process in which it was created.  

Frankly, I haven't explicitly considered that... In the worst case we could revert active process termination bit like it was. Terminating an active process via a different process was just supposed to unify (and extend) the termination semantics.

> At the very least shouldn't the effectiveProcess be set to be that of the process that is terminating while its unwind block executes?

That's what I though I did in Process>>#complete: topContext to: aContext 

[...]
	pair := Processor activeProcess
				evaluate: [topContext runUnwindUntilErrorOrReturnFrom: aContext]
				onBehalfOf: self.
[...]

where self is the process being terminated. #runUnwindUntilErrorOrReturnFrom: jumps to the stack of the process being terminated and executes the unwind block with effective process set to the process being terminated. 

Or am I mistaken?

>  For example, what if I did something like this:
> doWhileMonitoring: aBlock
>     monitoredProcesses add: Processor activeProcess.
>     ^aBlock ensure: [monitoredProcesses remove:: Processor activeProcess ifAbsent: []]
>
> That unwind blocks are run in the same process as they were created seems to me to be a legitimate expectation. Can this be reengineered maintaining this invariant, or at least setting the effective process to be that of the process being terminated?

This is a very interesting example - thanks! Running terminate in a different process allows to relatively easily identify all unwind blocks half-way through their execution and complete them. In case setting the effective process in #complete:to: doesn't work as you requested, then I'm afraid unwinding blocks half-way through their execution would be very hard to achieve from the process being terminated itself (not impossible though - I tried but then doubted it was worth the effort). Or theoretically, it might be possible to maintain two stacks for the process being terminated - one "original" that needs to be unwound and a second "helper" one running terminate - but I haven't gotten beyond the idea, it seemed too unchartered :)

Besides, when an error happens during unwind, a debugger opens and the identity of the process running terminate may change, so the invariant may not survive the first error anyway (at least in what I was considering).

> So instead of
>
>     self isActiveProcess ifTrue: [
>         [self terminate] fork.
>         ^self suspend].
>
> we would use
>
>     self isActiveProcess ifTrue:
>         [| arnold |
>         arnold := [arnold evaluate: [self terminate] onBehalfOf: self] newProcess.
>         arnold resume.
>         ^self suspend].
>

No, that wouldn't work because the Terminator identifies itself as the process being terminated when executing [self terminate] resulting in an infinite loop.

At the moment I can't see why setting the effective process in #complete:to: wouldn't suffice but I'll get there :) I'll try to come up with a test.

I'm intrigued: why did you use '[arnold evaluate:...' and not '[Processor activeProcess evaluate:...'; I can't figure out the difference :)

> I might also consider discarding the return of suspend, because it shouldn't be executed normally, and using, say
>
> ```
> self isActiveProcess ifTrue:
>     [| arnold |
>     arnold := [arnold evaluate: [self terminate] onBehalfOf: self] newProcess.
>     arnold resume.
>     self suspend.
>     self error: 'error in terminate; execution resumed after suspend for termination'].
> ```

No, this wouldn't work if Processor activeProcess terminate was inside an unwind block. Terminate would attempt to proceed after self suspend and would always raise the error.

Thanks again,

Jaromir

PS: I used this example to test your suggestions:

```
| p |
p := [
	[
		[ ] ensure: [
			[Processor activeProcess terminate] ensure: [
				Transcript show: 'x1']. 
			Transcript show: 'x2']
	] ensure: [
		Transcript show: 'x3'].
	Transcript show: 'x4'
] newProcess.
p resume.
"Two yields necessary: terminate active is a two-step procedure"
Processor yield. Processor yield.
Transcript show: p isTerminated printString

"prints x1 x2 x3"
```


^[^ Jaromir
  --

Sent from Squeak Inbox Talk

On 2021-11-28T10:52:41-08:00, eliot.miranda at gmail.com wrote:

> Hi Jaromir, Hi Christoph,
> 
> On Sat, Nov 27, 2021 at 12:12 PM <mail at jaromir.net> wrote:
> 
> > Hi all, Christoph,
> >
> > I finally managed to separate #terminate and the 'block cannot return
> > issue' so that we can discuss them independently :) And thanks to Eliot's
> > solution of the infinite recursion in #doesNotUnderstand the #terminate
> > code is now cleaner. I haven't included Eliot's modification of
> > #doesNotUnderstand in my changeset but paste it below instead (because I'm
> > not the author; I hope it'll be merged though):
> >
> > The #terminate code presented in Kernel-jar.1426 is the final version of
> > my effort to get rid of all the bugs reported in this thread (and make
> > #terminate semantics more consistent). It passes all standard tests and the
> > tests complementing this changeset:
> >
> > KernelTests-jar.406 (Terminator test)
> > KernelTests-jar.407 (McClure test)
> > Tests-jar.466 (unwind tests)
> > ToolsTests-jar.105 (debugger tests)
> >
> > In addition, it's fully compatible with Christoph's solution of the 'block
> > cannot return issue' presented in Kernel-ct.1405 (
> > http://lists.squeakfoundation.org/pipermail/squeak-dev/2021-May/215526.html).
> >
> >
> > Summary and discussion about the bugs and changes in #terminate:
> > http://forum.world.st/Solving-multiple-termination-bugs-summary-amp-proposal-td5128285.html
> >
> > I think it's ready for merging... Please review.
> >
> 
> This looks good. The one thing I find concerning is running terminate in a
> different process.  Various idioms could legitimately require an unwind
> block to run in the process in which it was created.  At the very least
> shouldn't the effectiveProcess be set to be that of the process that is
> terminating while its unwind block executes?
> 
> For example, what if I did something like this:
> doWhileMonitoring: aBlock
>     monitoredProcesses add: Processor activeProcess.
>     ^aBlock ensure: [monitoredProcesses remove:: Processor activeProcess
> ifAbsent: []]
> 
> That unwind blocks are run in the same process as they were created seems
> to me to be a legitimate expectation. Can this be reengineered maintaining
> this invariant, or at least setting the effective process to be that of the
> process being terminated?
> 
> So instead of
> 
>     self isActiveProcess ifTrue: [
>         [self terminate] fork.
>         ^self suspend].
> 
> we would use
> 
>     self isActiveProcess ifTrue:
>         [| arnold |
>         arnold := [arnold evaluate: [self terminate] onBehalfOf: self]
> newProcess.
>         arnold resume.
>         ^self suspend].
> 
> I might also consider discarding the return of suspend, because it
> shouldn't be executed normally, and using, say
> 
>     self isActiveProcess ifTrue:
>         [| arnold |
>         arnold := [arnold evaluate: [self terminate] onBehalfOf: self]
> newProcess.
>         arnold resume.
>         self suspend.
>         self error: 'error in terminate; execution resumed after suspend
> for termination'].
> 
> 
> > --
> > Eliot's fix of infinite recursion in doesNotUnderstand:
> >
> >
> > doesNotUnderstand: aMessage
> >          "Handle the fact that there was an attempt to send the given
> >           message to the receiver but the receiver does not understand
> >           this message (typically sent from the machine when a message
> >          is sent to the receiver and no method is defined for that
> > selector)."
> >
> >         "Testing: (3 activeProcess)"
> >
> >         | exception resumeValue |
> >         (exception := MessageNotUnderstood new)
> >                 message: aMessage;
> >                 receiver: self.
> >         resumeValue := exception signal.
> >         ^exception reachedDefaultHandler
> >                 ifFalse: [resumeValue]
> > "----------> this wrapper detecting recursion is added:  -------------->"
> >
> >                 ifTrue: [
> >                         [aMessage sentTo: self]
> >                                 on: MessageNotUnderstood
> >                                 do: [:ex| | args |
> >                                        args := ex message arguments.
> >                                        (ex receiver == self
> >                                        and: [ex message selector ==
> > aMessage selector
> >                                         and: [(1 to: aMessage numArgs)
> > allSatisfy: [:i| (args at: i) == (aMessage argumentAt: i)]]]) ifFalse:
> >                                        [ex pass].
> >                                     self error: 'infinite recursion in
> > doesNotUnderstand:']]
> >
> > (see discussion
> > http://lists.squeakfoundation.org/pipermail/squeak-dev/2021-November/217031.html
> > )
> >
> > best,
> >
> > ^[^ Jaromir
> >   --
> >
> > Sent from Squeak Inbox Talk
> >
> > On 2021-11-17T18:49:44+01:00, mail at jaromir.net wrote:
> >
> > > Hi Christoph,
> > >
> > > Once more, apologies for taking so long to respond.
> > >
> > > > > [...] there really are two distinct unwind semantics : one "light"
> > for regular returns and one "heavy" for termination. Both are very similar
> > yet each require a slightly different behavior - that's why the duality
> > #runUntilErrorOrReturnFrom / #runUnwindUntilErrorOrReturnFrom or #complete:
> > / #complete:to: and #unwindTo: / #terminate.
> > > >
> > > > But they are still pretty similar ... Couldn't you just add some extra
> > parameter to these message? E.g., "Context runUntilErrorOrReturnFrom:
> > aggressive"? I have the feeling that we could eliminate significant
> > duplication that way by inserting some ifs and elses ... Duplicated code
> > very often tends to be harder to maintained.
> > >
> > > Yes, I understand your concern and agree two ifs would allow to pack the
> > two twin beasts into one can (I refer to "rueorf" & "ruueorf"). Also, we
> > could consider Process >> complete:to: as a generalized version of Process
> > >> complete and rewrite it accordingly. This is doable, maybe even
> > desirable but for the time being I'd very much prefer to keep them separate
> > to facilitate troubleshooting of potential problems (or localizing them).
> > Each of the two similar methods belong to a different execution path after
> > all. And besides, #runUntilErrorOrReturnFrom is already crazy enough and I
> > don't feel like making it even more complex now :)
> > >
> > > > > With regards to #unwindTo: - I haven't tested it yet but I'm
> > wondering whether it wouldn't have the same unwind problem with non-local
> > returns as the original #terminate and require a similar fix?
> > > >
> > > > Hm, do we need to implement both modi - (ii) and (iii) as described in
> > [1] - in Context >> #unwindTo: as well? Something like
> > #unwindTo:aggressive:?
> > >
> > > I put #unwindTo: aside for the moment; it contains the same flaw and
> > deserves its own thread.
> > >
> > > > Unfortunately I've lost track of these infinite loops [...]
> > >
> > > Yeah, tell me about it :)
> > >
> > > > [...]- could you maybe point me to some concrete examples that lead to
> > an infinite recursion without this special check? :-)
> > >
> > > Try:
> > >
> > > ```
> > > [self error: 'error'] ensure: [self gotcha. Transcript show: 'all
> > cleaned up! ']
> > > ```
> > >
> > > 'gotcha' represents a method unknown to the system. The code will print
> > 'all cleaned up! ' after abandoning the error and the debugger - meaning
> > the system will survive this code and execute all cleanups (aka unwinds).
> > In case you use your #terminateAggressively as the default action for
> > abandoning the Debugger, you won't have any problem of course because you
> > skip completing the halfway through unwind. What I'm saying is: even if you
> > chose the classic heavy #terminate to exit the Debugger, the code will
> > execute remaining unwinds as one might expect (or hope for).
> > >
> > > > One more question about your #runUnwindUntilErrorOrReturnFrom: Are you
> > maybe missing something like "cxt terminate" in the "No error was raised"
> > case? Just wondering.
> > >
> > > I guess I know what you mean... 'ctxt' would get garbage collected soon
> > anyway so I left it as is to keep the code as simple as possible.
> > >
> > >
> > >
> > > Christoph, I'll also address your comments regarding the
> > BlockCannotReturn behavior here:
> > >
> > > > Isn't that ProceedBlockCannotReturn tautologous? I think that by
> > actively proceeding from a BlockCannotReturn error users already accept
> > that they are going to resume execution in another way.
> > >
> > > Well, the very proceeding from a BlockCannotReturn error sort of
> > violates common sense but during our lengthy discussion you convinced me it
> > makes a very good sense when troubleshooting :) The idea is by no means
> > trivial - unlike hitting Proceed :) So an extra warning can't hurt...
> > >
> > > But more importantly, I need something to know the user let the process
> > continue after reaching the BlockCannotReturn error - thus the new
> > ProceedBlockCannotReturn exception which allows Process >> #complete:to: to
> > deal with the new course of events.
> > >
> > > > Apart from that, the message text of your new warning is not correct
> > if self pc <= self endPC. :-)
> > >
> > > Yes, and I'd like to make the warning message more verbose so even if
> > someone hit Proceed without much thinking they could get an idea what's
> > about to happen :) Should the warning interfere with some potential
> > automation efforts we could come up with some alternative way.
> > >
> > > I'll copy my reply regarding BCR to the "The Inbox: Kernel-ct.1405.mcz"
> > thread and we can continue discussing this particular issue there. It's
> > really a separate issue and I included your patch here along with the main
> > termination code because it perfectly complements it and prevents the
> > disastrous crashes caused by Proceeding the BCR error.
> > >
> > > > [...] the computation terminated is also wrong, IMO, you should get a
> > BlockCannotReturn here.
> > >
> > > Yes, I agree, there's definitely more to it which deserves to be
> > discussed thoroughly and separately in the "The Inbox: Kernel-ct.1405.mcz"
> > thread.
> > >
> > > If you agree I'd really appreciate if your fix could be accepted here in
> > including the ProceedBlockCannotReturn exception I need to make it work
> > together with the code I'm presenting here. If we come up with a better
> > idea in the other discussion we can make amends here as well; I take it as
> > a patch, not a definitive solution :)
> > >
> > > Thanks again so much for all your suggestions and examples! It's always
> > a pleasure :) Plus - the Inbox Talk is priceless... I use it exclusively
> > now and look forward to any improvements you might come up in the future ;)
> > >
> > > Best,
> > >
> > >
> > > ^[^ Jaromir
> > >   --
> > >
> > > Sent from Squeak Inbox Talk
> > >
> > > On 2021-08-22T16:49:10+02:00, christoph.thiede at
> > student.hpi.uni-potsdam.de wrote:
> > >
> > > > Hi Jaromir,
> > > >
> > > > > Yes, I was wondering why I couldn't get rid of the duplication and
> > now I think it's because there really are two distinct unwind semantics :
> > one "light" for regular returns and one "heavy" for termination. Both are
> > very similar yet each require a slightly different behavior - that's why
> > the duality #runUntilErrorOrReturnFrom / #runUnwindUntilErrorOrReturnFrom
> > or #complete: / #complete:to: and #unwindTo: / #terminate.
> > > >
> > > > But they are still pretty similar ... Couldn't you just add some extra
> > parameter to these message? E.g., "Context runUntilErrorOrReturnFrom:
> > aggressive"? I have the feeling that we could eliminate significant
> > duplication that way by inserting some ifs and elses ... Duplicated code
> > very often tends to be harder to maintained.
> > > >
> > > > > With regards to #unwindTo: - I haven't tested it yet but I'm
> > wondering whether it wouldn't have the same unwind problem with non-local
> > returns as the original #terminate and require a similar fix?
> > > >
> > > > Hm, do we need to implement both modi - (ii) and (iii) as described in
> > [1] - in Context >> #unwindTo: as well? Something like
> > #unwindTo:aggressive:?
> > > >
> > > > > But in general - yes, any method/exception purposefully (or not)
> > written to create a loop will break this patch (I admit it is just a patch
> > really). I extracted it to #complete:to: to make #terminate clean; this is
> > a WIP; I wish there was a holistic solution to this - maybe checking for
> > exception recursion by default? :)
> > > >
> > > > Sounds better already, if feasible! But how would you detect this?
> > Unfortunately I've lost track of these infinite loops - could you maybe
> > point me to some concrete examples that lead to an infinite recursion
> > without this special check? :-)
> > > >
> > > > One more question about your #runUnwindUntilErrorOrReturnFrom: Are you
> > maybe missing something like "cxt terminate" in the "No error was raised"
> > case? Just wondering.
> > > >
> > > > Best,
> > > > Christoph
> > > >
> > > > [1]
> > http://forum.world.st/The-semantics-of-halfway-executed-unwind-contexts-during-process-termination-tp5129800p5130110.html
> > > >
> > > > ---
> > > > Sent from Squeak Inbox Talk
> > > >
> > > > On 2021-05-31T17:01:46-05:00, m at jaromir.net wrote:
> > > >
> > > > > Jaromir Matas wrote
> > > > > > Hi All,
> > > > > > I've sent an updated version of #teminate integrating Christoph's
> > solution
> > > > > > of BlockCannotReturn recursion problem (in [1]), along with a
> > battery of
> > > > > > tests exploring termination of nested ensure and cascading errors
> > behavior
> > > > > > (Debugger tests are for info and a final version can wait until
> > releasing
> > > > > > Christoph's proposal in [2]).
> > > > > >
> > > > > > It's pretty much final, I hope...
> > > > > >
> > > > > > Any complaints about #terminate - please shout ;)
> > > > > >
> > > > > > [1]
> > http://forum.world.st/The-Inbox-Kernel-ct-1405-mcz-td5129706.html
> > > > > > [2]
> > > > > >
> > http://forum.world.st/The-semantics-of-halfway-executed-unwind-contexts-during-process-termination-tp5129800p5130110.html
> > > > > >
> > > > > > best,
> > > > >
> > > > > Here's the link:
> > > > > http://forum.world.st/The-Inbox-Kernel-jar-1414-mcz-td5130198.html
> > > > >
> > > > >
> > > > >
> > > > > -----
> > > > > ^[^ Jaromir
> > > > > --
> > > > > Sent from: http://forum.world.st/Squeak-Dev-f45488.html
> > > > >
> > > > >
> > > > -------------- next part --------------
> > > > An HTML attachment was scrubbed...
> > > > URL: <
> > http://lists.squeakfoundation.org/pipermail/squeak-dev/attachments/20210822/9adf7f0d/attachment.html
> > >
> > > >
> > > >
> > >
> > >
> >
> >
> 
> -- 
> _,,,^..^,,,_
> best, Eliot
> -------------- next part --------------
> An HTML attachment was scrubbed...
> URL: <http://lists.squeakfoundation.org/pipermail/squeak-dev/attachments/20211128/ceb22524/attachment-0001.html>
> 
> 


More information about the Squeak-dev mailing list