Hi all,
I would like to report a bug I found related to debugging the debugger. I actually found the issue when debugging the context simulation of "Over", but let's start with a simple example.
Scenario 1
Steps to reproduce:
* (Save your image first if it contains anything valuable.)
* Type the following into a Workspace:
x := 5.
thisContext insertSender: (Context contextEnsure: [x := 4]).
* Debug it and step:
  * (into) #contextEnsure:
  * (through) #ensure:
  * (into) #jump
  * step over the implementation, and press (into) while returning.
Expected behavior:
The same should happen as when you run the code normally, without debugging: it should run without raising any error, and x should finally contain the value 4.
Actual behavior:
* In Trunk:
  * An infinite chain of debuggers appears, making it impossible to reuse the image. If you get the chance to look at any of the debuggers, or to obtain a non-empty debug log, you can see that #errorSubscriptBounds: is called from within a context. However, it is not possible to find out the complete stack of this error. (At least, I did not manage to after a few attempts.)
* In Squeak 5.1 (Note: You will need to replace Context with MethodContext here, for compatibility):
  * The same error is raised, but instead of endless debuggers, you get endless emergency evaluators:
[cid:2c2ee09f-1f81-4969-a2af-0fac1c093119]
Workaround:
Modify the #contextEnsure: implementation in the following way:
[cid:02411c12-9a78-40fe-a3b3-c7f8357c133d]
Reason:
As the comment in Context>>#jump states, the receiver context "MUST BE a top context (ie. a suspended context or a abandoned [sic] context that was jumped out of)", which, in particular, "already has its return value on its stack". [1]
However, in the #contextEnsure: example, ctxt is not a top context because it is still running, and the top of its stack does not hold a return value but the closure vector [2] {ctxt. chain}. As a consequence, #jump pops this closure vector off the stack, so any later attempt to access the vector causes Context>>#at: to call #errorSubscriptBounds:. I'm not sure where this attempt is actually made, but I suppose each debugger's ContextVariableInspector could do this and so raise another debugger, and so forth ...
Considerations:
A simple solution would be the workaround presented above. However, we would not only need to patch #contextOn:do: in the same way; we also wonder why the same problem has not had any visible impact on the other senders of #jump in the past, namely #restart and #runUntilErrorOrReturnFrom:. The former refreshes the context and then jumps into it, and in theory, I think it should be possible to call it from a foreign sender in a similar minimal example in order to invalidate the context register. However, I have not yet managed to construct such an example. The same goes for #runUntilErrorOrReturnFrom:, because when "here jump" is executed, "here" is again neither a sender of the block context, nor does it have its return value on its stack.
In a nutshell, it remains an open question for us whether the other senders of #jump are legitimate.
Should we maybe add a method like "Context >> jumpWith: aReturnValue" which would first push aReturnValue onto the receiver and then perform the jump? Or could this have any negative impact if there is indeed already a return value on the receiver's stack?
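To make the question more concrete, here is a minimal sketch of what such a method might look like. Note that #jumpWith: is hypothetical and does not exist in any image; the sketch only assumes that Context>>#push: and Context>>#jump behave as described above, and it is untested.

```smalltalk
Context >> jumpWith: aReturnValue
	"HYPOTHETICAL sketch, not part of Trunk and untested.
	Make the receiver look like a top context by pushing aReturnValue,
	so that #jump finds a return value to pop instead of whatever
	the receiver happens to have on top of its stack."

	self push: aReturnValue.
	^ self jump
```

If the receiver already is a top context, the extra push would leave its original return value on the stack below aReturnValue after the jump, which is exactly the case I am unsure about. Also, as far as I can tell, delegating to #jump means that the context abandoned by #jump is the #jumpWith: frame rather than its caller, so a real implementation would probably have to inline the body of #jump instead.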
Scenario 2
Steps to reproduce:
The same as Scenario 1, but instead of stepping into #jump, just step over it.
Expected behavior:
The same expected behavior as in Scenario 1.
Actual behavior (compare with Scenario 1; the reported error differs):
* In Trunk:
  * An infinite chain of debuggers appears, making it impossible to reuse the image. If you get the chance to look at any of the debuggers, or to obtain a non-empty debug log, you can see that #cannotReturn:to: is called from within a new process.
* In Squeak 5.1 (Note: You will need to replace Context with MethodContext here, for compatibility):
  * The same error is raised, but instead of endless debuggers, you get endless emergency evaluators:
[cid:f7412be3-907b-441a-8a7e-889cbc88f1cd]
Note: This issue continues to exist after applying the workaround patch from Scenario 1.
Reason:
No idea. After applying the workaround from above, I tried to debug #doStep before stepping over the #jump via the button, but this led me into some kind of infinite loop: When I debug the whole debugging logic via Process>>#completeStep: and Process>>#complete: down to Context>>#runUntilErrorOrReturnFrom: and step *into* the "self jump" statement there, the error does not occur. When I step *over* this line instead, the same error occurs. I also tried to debug that #doStep again in the second debugger window, but this bug seems to be a real Heisenbug: it only occurs when you do *not* step into the relevant #jump call.
Considerations:
As mentioned above, we have also not yet been able to determine whether all calls to #jump in #runUntilErrorOrReturnFrom: are valid with regard to the "return value on stack" precondition. Maybe there is a connection?
We will keep investigating the issue, but any help is appreciated, of course!
[1] I'm afraid the example method "Interpreter>>primitiveSuspend" that the comment goes on to mention no longer exists.
[2] Is "closure vector" an official term? If not, what would be the official term? :)
Best,
Christoph