[squeak-dev] Solving multiple termination bugs - summary & proposal

Jaromir Matas m at jaromir.net
Mon May 10 18:49:19 UTC 2021


Hi Christoph!

I apologize for not responding earlier to your great comments. I had to
educate myself in error handling first :)

> 1. Regarding issue no. #5 in your list above ("Bug in Process>>#terminate |
> Returning from unwind contexts" [1]): Do you consider this thread resolved
> by now or is my answer to it still being expected? At the moment, this
> snippet you mentioned fails to unwind completely:
>
> x := nil.
> [self error: 'x1'] ensure: [
>     [self error: 'x2'] ensure: [
>         [self error: 'x3'] ensure: [
>             x:=3].
>         x:=2].
>     x:=1].

I reread your remarks on how to interpret a situation like the one above:
what do we actually abandon when new errors appear during termination and we
abandon the nested debuggers? I've enclosed a changeset that makes
abandoning a debugger equivalent to terminating the debugged process
(including unwind) - i.e. in the example above we get the first debugger and
abandon it, which terminates the process; the unwind hits the second error
and opens the second debugger; abandoning that one causes another
termination, etc. As a result all assignments are executed (imagine a
`stream close` instead of `x:=1`, so I guess it's justified).
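
To illustrate the unwind semantics relied on here, a minimal workspace sketch
(not code from the changeset; `log` and `#closed` are made-up stand-ins for a
real resource such as `stream close`): terminating a process that is blocked
inside the receiver of #ensure: is expected to still run the ensure block.

```smalltalk
| log p |
log := OrderedCollection new.
"Fork at a higher priority so the process reaches the wait before we continue."
p := [[Semaphore new wait] ensure: [log add: #closed]]
	forkAt: Processor activePriority + 1.
"p is now blocked inside the protected block; terminating it should unwind."
p terminate.
log includes: #closed
```

The last line is only there to inspect the result; the point is that the
cleanup in the ensure block is not skipped when the process is terminated.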

However, because this is happening in the ensure block during unwind, it
seems that abandoning is almost equivalent to proceeding :) (Not entirely
though: proceed would continue after unwinding, abandon only proceeds within
the unwind scope). This poses a new challenge however - how to kill a
debugger if we deliberately want or have to stop debugging a process
immediately, i.e. without unwinding? Consider this example:

`[] ensure: [self gotcha]`

We'd get a debugger with an MNU (Message Not Understood) error, abandon it,
and get another debugger with the same error, creating an infinite recursion
(due to how #doesNotUnderstand: is written). This particular example is taken
care of in the changeset, but in general I miss a Kill button - has this
ever been considered? Note: the danger of infinite recursion is present even
in the current implementation, but it is neutralized by allowing just one
error while unwinding ensure blocks that are halfway through :)

There's also a file Kernel-jar.1403 in the Inbox:
http://forum.world.st/The-Inbox-Kernel-jar-1403-mcz-td5129607.html

It contains some additional changes to #terminate - mostly cleanup and
simplification of the code, plus more comments.



> 2. What is the current state of this thread [2]? If all issues are resolved
> from your perspective, there is no need to discuss anything further -
> otherwise, I guess it's your turn to answer again. :)

No progress on my side but I look forward to getting to it and responding :)



> 3.1 Consider the following snippet:
>
> | p |
> p := Processor activeProcess.
> Transcript showln: p == Processor activeProcess.
> [Transcript showln: p == Processor activeProcess] ensure: [
>     Transcript showln: p == Processor activeProcess].
> p
>
> Debug it, then step into the first block, and abandon the debugger. We would
> expect to see another "true" in the Transcript, but instead, we see a
> "false". This is because #runUntilErrorOrReturnFrom: does not honor
> process-faithful debugging. The protocol on Process, on the other hand, does
> so. So probably we would want to wrap these sends into
> #evaluate:onBehalfOf:.

This is not a new issue; if you step further - into the ensure block (i.e.
the argument block) - and then abandon the debugger, you will see false
instead of true even in images before the change I introduced. The reason is
precisely what you described - the use of #runUntilErrorOrReturnFrom:, which
operates on a context stack belonging to another process and this way
guarantees correct execution of non-local returns on that process's
context stack (at the price of losing process-faithful debugging).

I'm aware of the process-faithful debugging issue and I'd love to fix it,
but I'm afraid my knowledge of the debugger implementation is presently next
to none; I'll have to put it on my to-do list ;) I'd expect, though, that
simply wrapping the sends into #evaluate:onBehalfOf: may reintroduce the
original nasty non-local return bug. Would you have an idea how to wrap it so
that non-local returns still work? That would be awesome.
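
For reference, a minimal workspace sketch of what #evaluate:onBehalfOf: does
for process identity, as I understand it (assuming the current Squeak
behavior where `Processor activeProcess` answers the effective process;
`other` and `seen` are made-up names and nothing here is from the changeset):

```smalltalk
| other seen |
"A process that is never scheduled; it only provides an identity to impersonate."
other := [] newProcess.
"Evaluate the block in the running process, but on behalf of other."
seen := Processor activeProcess
	evaluate: [Processor activeProcess]
	onBehalfOf: other.
seen == other
```

This only covers the identity seen by code inside the block; it says nothing
about which context stack a non-local return unwinds, which is the open
question above.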



> 3.2 As I think I mentioned somewhere else already, the result of
> #runUntilErrorOrReturnFrom: *must* be checked to make sure that the
> execution or the unwinding has not halted halfway. I don't see this in
> Process >> #terminate either. This might be the cause of the bug I mentioned
> in #1 of this post. Probably it's the best idea to discuss this in [1], too.
> :-)

The bug in [1] (i.e. the disastrous behavior of `[self error] ensure: [^2]`)
was caused by executing the non-local return (`^2`) on the wrong context
stack, which happened as a result of using #popTo: (and consequently
#evaluate:onBehalfOf:) to evaluate the said non-local return.

Checking the return value of #runUntilErrorOrReturnFrom: is the main idea
behind the fix presented in this changeset. If the execution of the unwind
block gets halted by an error, #runUntilErrorOrReturnFrom: returns from the
"foreign" stack and reports the error. Because we are operating in an unwind
block, I suggest that execution of the unwind block continues by opening a
new debugger window for the reported error, and the user decides what to do
next. To achieve this, the implementation of #runUntilErrorOrReturnFrom: must
be modified slightly to resignal the caught exception rather than resume it
- see the enclosed changeset. No other methods use
#runUntilErrorOrReturnFrom:, so let's either accept the suggested
modification or create a version of #runUntilErrorOrReturnFrom: with the
modified behavior.



> 4. What's the current state of tests? Have you contributed tests for all the
> issues you mentioned above? This would be awesome. :-)

There's a set of basic semantics tests, Tests-jar.448, in the Inbox. I
realized I don't know how to "simulate" pressing the debugger's Abandon
button in a test, but I'll add more when I figure it out :) More tests will
also come if the change proposed here is accepted.



> Organizational notes: For sake of overview, I propose to keep this thread of
> the current level of abstraction and discuss all the implementation details
> in separate threads such as [1]. Ideally, we should also "close" all the
> different threads about Process issues by adding another message to each of
> them in order to help our future selves to keep an overview of their
> solutions ...

Absolutely, I'll go through all relevant discussions and update them.

Thanks again very much for all your comments,

best,



-----
^[^ Jaromir