[squeak-dev] Code simulation error (was Re: I broke the debugger?)

Thiede, Christoph Christoph.Thiede at student.hpi.uni-potsdam.de
Tue Jan 28 08:17:37 UTC 2020


Hi Tim, excellent work!


Coincidentally, I studied the same problem yesterday, but I had not yet finished reporting my observations to you. So let me do that here:


After many hours of fun debugging, I was finally able to create this minimal failing example:


Processor activeProcess
evaluate: [self error. self inform: #foo]
onBehalfOf: [] newProcess


Expected behavior: First, a debugger is shown, and only after proceeding from it is a dialog window shown.

Actual behavior: Both the debugger and the dialog window are shown asynchronously!

Suspicion from someone who has not yet dug deeply into the activeProcess concept: The debugger resumes the wrong process, because the activeProcess concept makes the error appear to come from a different process than the one actually running, and this fools even the debugger.

If my theory is correct, we would need to find a way to look behind the scenes of the activeProcess and use it in the debugging code. But first, I really need to learn more about this concept.
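
A minimal workspace sketch of what I mean by "simulates a different running process". It assumes that Processor activeProcess answers the effective process while the block runs, which is how I read the comment of Process >> #evaluate:onBehalfOf:. The temporary names other and seen are made up for illustration:

```smalltalk
| other seen |
"A fresh, never-scheduled process that only serves as the stand-in identity."
other := [] newProcess.
"Ask for the active process from inside the simulated environment."
seen := Processor activeProcess
	evaluate: [Processor activeProcess]
	onBehalfOf: other.
"If the assumption holds, the block saw the stand-in, not the process that really ran it."
seen == other
```

If this answers true, then any code inside the block that looks up the active process, presumably including the code that opens a debugger for an error, gets a process that is not the one actually executing.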


(Connection to our Context >> #at: problems: Probably not a primitive issue at all, just the fact that #at: calls itself recursively after the error has been proceeded, similar to what #doesNotUnderstand: does.)
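
To illustrate the recursion, here is a workspace sketch, not the real Context >> #at: code. The block lookup is a hypothetical stand-in that mimics the shape of #at:, a resumable Warning plays the role of the subscript error, and the handler plays the user pressing Proceed. The depth limit of 3 exists only so that the sketch terminates:

```smalltalk
| depth lookup result |
depth := 0.
lookup := [:index |
	depth := depth + 1.
	depth > 3
		ifTrue: [#gaveUp]
		ifFalse: [
			"Stands in for the failed primitive followed by the subscript error."
			index isInteger ifTrue: [Warning signal: 'subscript out of bounds'].
			"Proceeding lands here, and the retry uses the same bad index again."
			index isNumber
				ifTrue: [lookup value: index asInteger]
				ifFalse: [nil]]].
"The handler resumes every warning, like pressing Proceed each time."
result := [lookup value: 0]
	on: Warning
	do: [:warning | warning resume: nil].
```

After evaluation, depth should be 4 and result should be #gaveUp: every proceed falls through into another call with the same bad index, which would match the many #at: frames on the stack.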


This is a work-in-progress message; I just did not want us to do any duplicate or redundant work. I will have a closer look at this disgusting problem ASAP!

And vice versa, it would be very nice if you could keep me/us up-to-date!


(Oh, what fun it is to debug a self-simulating system ...)


Best,

Christoph


________________________________
From: Squeak-dev <squeak-dev-bounces at lists.squeakfoundation.org> on behalf of tim Rowledge <tim at rowledge.org>
Sent: Tuesday, January 28, 2020 03:08
To: The general-purpose Squeak developers list
Subject: [squeak-dev] Code simulation error (was Re: I broke the debugger?)

Something pretty weird is happening when the break is hit. I *finally* got a debugger open on a backtrace that includes the problem with Context>at: failing because the argument is 0. It's all a bit strange and unless I managed to do something very odd it looks like a fairly serious bug.

I actually caught this because something went wrong in code I added to try to log the damn initial error. Despite that, it does appear to be a trace on the #break problem.


debugger>doStep called from #stepOver.
#handleLabelUpdatesIn:whenExecuting: used and does [interruptedProcess completeStep: currentContext] which uses...
Process>>evaluate:onBehalfOf:
Context>runUntilErrorOrReturnFrom:

Context>jump
 - *we are checking for stackp = 0 which is the very thing that causes problems later with the #pop*

In the #stepToSendOrReturn we use interpretNextInstructionFor: which leads to InterpretV3ClosuresExtension: 7 in: (Object>>break) for: ( aContext sender #on:do:, pc 24 stackp 0 method Object>>break, etc)
 -> doPop
 -> pop  (presumably stackp was 0 here? See above re: #jump)
 -> at: ... but if so why did the error code appear to skip over the first two tests of it?
        <primitive: 210>
        index = 0 ifTrue:[FileStream newFileNamed: 'squeakBreak.log' do:[:f| self errorReportOn: f]].
        index isInteger ifTrue:
                [self errorSubscriptBounds: index].
        index isNumber
                ifTrue: [^self at: index asInteger]"<--- it went here and on the second go around it picked up that index = 0 properly."
                ifFalse: [self errorNonIntegerIndex]


So I *think* that there is an issue in Context>jump where we explicitly check for stackp = 0 but then call code that carefully does a pop via #at:. Something about the primitive: 210 (maybe?) does something weird and the index is both 0 and not 0 - nor even an Integer.

As an interesting bonus, the clause I added to log things when 'index = 0' went very wrong because the 'f' getting passed to the block is apparently the MultiByteFileStream *class* rather than the opened file!

The bit bothering me at the moment is just how this can be a problem that hasn't whacked us before. I haven't been able to cause it with 'normal' code but all I was doing to fall over this was loading & testing the old Plumbing demo code.

Oh, I did just try to see if a newer vm (I am running the 201912311458 ARMv6) would have a fix for the prim 210 but no newer ARM VM will run at all. The latest Mac vm runs but fails with the same huge list of notifiers reporting the error in Context>at: - the fact that there is a *lot* of #at: on the stack is a little more odd.

tim
--
tim Rowledge; tim at rowledge.org; http://www.rowledge.org/tim
Useful random insult:- If you stand close enough to him, you can hear the ocean


