Hi Christoph, all,
I've studied your proposal.
I've sent Kernel-jar.1551 to share an idea how to fix the stepOver bug: instead of recomputing the first unwind context in #resume:through: I suggest to simply adjust the position of the simulation guard contexts in the sender chain that would reflect the target of the non-local return that is about to happen.
It uses your #nextRunUntilErrorOrReturnFromCalleeContext to do the heavy lifting - to detect the simulation guard contexts (this is great!) and then I just cut the sender chain and sew it together :)
I think this works nicely; however, as you noted, I not sure what would be the impact on the runtime performance. In the next step I'll try to think about a way to optimize it a bit if possible - e.g. using simulation flags.
Regarding the general approach including jumps, sender swaps - could you please give me an example (or two) of situations you have in mind? I'm not sure #jump needs such a protection but I haven't though about it much yet and any example would be very appreciated.
I tried your original approach but the disruptions while debugging felt a bit too drastic a solution. I often lost focus when the debugger stopped unexpectedly. Sometimes I had to trudge through lots of contexts I didn't care about to get where I wanted to be in the first place :)
So my idea is to check what other situations you may have considered as dangerous and try to deal with them one by one or possibly in a more general manner if a pattern emerges :)
Thanks again, I look forward to your thoughts.
best, Jaromir
On 30-Dec-23 5:17:14 PM, christoph.thiede@student.hpi.uni-potsdam.de wrote:
Hi Jaromir,
thanks for the reply!
I like about your new solution that it restores the original value of the #aboutToReturn:through:/#resumeEvaluating:through: second argument and thus fixes my SimulationStudio tests. Stepping over the non-local return in your example still seems to work, which is great. :-) I am a bit confused about the added complexity in BlockClosure>>ensure: and BlockClosure>>ifCurtailed:. IMO none of such logic should be aware of the existence of a possible simulator. (Besides, just a small comment, I believe #isSimulationFlagSet is an underspecific name. thisContext sender isSimulationFlagSet in a do-it evaluates to true, which is surprising. Maybe #isUnwindContextSimulationg or something similar would be better. Or just avoid this extra message as you send it only once.)
I still wonder whether we need to fix this bug specifically to non-local returns, since the problem occurs for all kinds of irregular context switches while a simulation guard is present. Have you had a chance to take a look at runUntilErrorOrReturnFrom.8.cs? It is similar to your patch in that i recognizes an existing simulation guard and halts the return earlier when present. However, it is more generic and works for #jump and #swapSender too. While yours assumes an existing guard context in #return:from:, mine scans the current context stack for guard contexts: I think your assumption does not always hold true (e.g., Context>>#runSimulated:... does not install a guard context); on the other hand, my approach might be obviously slower (though I did not run benchmarks yet. I might draw inspiration from your approach and store a simulation flag in the Process rather than walking up the entire stack).
tl;dr: I like your new solution (Kernel-jar.1545) better than what's currently in the trunk, but I still wonder whether we can refine and merge runUntilErrorOrReturnFrom.8.cs instead, for which my main concern remains performance. And as always, if anyone else is reading this (Eliot? Marcel?), further perspectives would be very welcome. :-)
Looking forward to your feedback!
Best, Christoph
Sent from Squeak Inbox Talk https://github.com/hpi-swa-lab/squeak-inbox-talk
On 2023-12-28T18:37:24+00:00, mail@jaromir.net wrote:
Hi Christoph,
Try this please: Kernel-jar.1545
Now the simulation for stepOver works as expected even without a
dirty
trick :)
Please check whether this solution is consistent with your Simulation Studio.
Here's an FYI:
The whole issue emerged probably in 2012 when #aboutToReturn:through: started using the `firstUnwindContext` argument, I guess as an optimization, and forced a starting point to searching for unwind contexts in the resumtion logic (#resumeEvaluating:through:) - which
was
inconsistent with inserting the ensure guard context **before** this starting point during simulation (by #runUntilErrorOrReturnFrom:
called
by stepping methods). This resulted in erroneous skipping over the inserted guard ensure context during stepOver simulation and the unexpected behavior.
The idea is for the resumption logic (#resumeEvaluating:through:) to recognize a simulated execution and search for a potential guard
(i.e.
#ensure:) context inserted during the simulation. I suggest using a `simulationFlag` tempVar in #ensure: which would be set in #return:from:, i.e. if, and only if, the `firstUnwindContext` is supplied in simulaton (i.e. potentially dangerous). When the
resumption
logic recognizes a simulation supplied `firstUnwindContext` it recomputes the real first unwind context at the moment and proceeds
as
expected.
I look forward to your reply.
Best, Jaromir
On 27-Dec-23 10:41:45 AM, "Jaromir Matas" <mail(a)jaromir.net> wrote:
Hi Christoph,
yes, I know it's not a clean solution; I understand your objection
and
agree with it: nilling the firstUnwindContext argument during simulation departs from mimicking the VM behavior but sticking with
the
firstUnwindContext breaks the simulation. Please give me a few days
to
revisit the issue again.
Coincidentally, I found another simulation bug in #return:from: -
see
Kernel-jar.1538 (it's a part of a series of changesets Kernel-jar.1537-1539). I'd appreciate your feedback; the plan is to doublecheck and consolidate the three changesets for final review.
Thanks and best regards, Jaromir
On 27-Dec-23 1:53:47 AM,
christoph.thiede(a)student.hpi.uni-potsdam.de
wrote:
Hi Jaromir,
sorry to bring this up again with so much delay, but honestly I
don't
think this is a good patch and we should revert it in the trunk.
One core assumption about the simulation engine is that it mimics
the
behavior of the VM interpreter in every utmost detail. This
includes
the arguments sent through special messages sent by the VM such as #aboutToReturn:through:. Just because the current implementation of Context>>aboutToReturn:through: does not require the value of firstUnwindContext, this does not mean that we can just break the existing contract of the VM and provide nil for that argument from
the
simulator.
Besides that theoretical perspective, your change also broke a
couple
of tests in SimulationStudio as I discovered today, because SimulationStudio relies on many of these assumptions about the code execution. If you would like to, you can also explore the failing
test
yourself: SimulationContextTest>>#testNonLocalReturn. SimulationContext>>#runSimulated:contextAtEachStep: manually checks the arguments passed to #aboutToReturn:through: (or more precisely,
of
#resume:through:) to detect when the simulator is unwinding the
code
provided to a custom simulator. As you niled out that value, I can
no
longer check that.
I think that your patch is just a hack that fixes a symptom rather than resolving the original problem, which is still the guard
contexts
inserted by Context>>#runUntilErrorOrReturnFrom: that are not fully transparent to the executed process. See
https://lists.squeakfoundation.org/archives/list/squeak-dev(a)lists.squeakfo....
The changeset that I proposed there
(runUntilErrorOrReturnFrom.8.cs),
which basically ensures before every irregular context switch that
the
guard contexts are removed and the debugger is halted again, fixes
the
original issue in the debugger as well. Try reverting your patch
from
Kernel-jar.1503 and loading my change set instead and you will end
up
in Context>>#resume:through: after stepping over the non-local
return,
which I would consider even favorable to the behavior of your patch which makes the logic from the unwind blocks unaccessible in that situation.
So, tl;dr, I see three issues with this patch: (i) it violates the
VM
contract of #aboutToReturn:through:, which (ii) breaks SimulationStudio due to a missing piece of information, and (iii) stepping over non-local returns no longer allows programmers to observe or step into unwind blocks. Given that, I would ask you for your agreement to reverting that patch and working on fixing the underlying issue with non-transparent guard contexts in general instead.
Despite that, please feel not discouraged to continue contributing
to
this important part of Squeak: It's just that you're sharing so
many
interesting ideas and Eliot and me and others lack the time to read all of them but certain changes nevertheless should require a
proper
review ... Have a nice end of Christmas! :-)
Best, Christoph
Sent from Squeak Inbox Talk https://github.com/hpi-swa-lab/squeak-inbox-talk
On 2023-02-14T22:08:39+00:00, mail(a)jaromir.net wrote:
Hi,
I'm enclosing a changeset that entirely removes unwind code
duplicity in Context AND at the same time fixes the "stepOver bug" still persisting in the system.
To refresh what the bug is: You cannot #stepOver '^2' when you
debug;
[^2] ensure: [] "step through to ^2 and try step over --> you
get
BCR"
The root cause is #return:from: supplies 'firstUnwindContext' to
#aboutToReturn:through: but before this gets evaluated the real
first
unwind context changes during simulation and that causes the bug.
Originally (prior 2012), #aboutToReturn:through: just called
'self
methodReturnContext return: result' and didn't use the 'firstUnwindContext' but in 2012 the call has been replaced with
'self
methodReturnContext return: result through: firstUnwindContext' and the bug was born.
In the changeset I suggest to use one common method replacing
#resume:through: and #resumeEvaluating: which solves the bug as a side-effect with only a tiny change in #return:from: (simply supply nil instead of 'firstUnwindContext').
I will post the changeset to the Inbox soon.
Best, Jaromir
--
Jaromír Matas
mail at jaromir.net
From: Jaromir Matas<mailto:mail at jaromir.net> Sent: Thursday, February 2, 2023 15:07 To: Squeak Dev<mailto:squeak-dev at lists.squeakfoundation.org> Subject: [squeak-dev] Code duplication around unwinding
Hi all,
When I started learning Squeak I wondered why so much code
duplication around unwinding: Context>>#restart, #resume:through:, #resumeEvaluating:, #terminate and #unwindTo: all implementing the exact same algorithm. Kernel-jar.1499 suggests removing all this duplicity while keeping the current unwind semantics. All Process
and
Exceptions tests are green.
This changeset is the first step in my effort to reduce the
"noise"
around the "collection" of resume/return methods in Context and Exception. One idea is to return calling the unwind mechanism back
to
the Exception class, as originally implemented in Squeak 2.x up to 3.5; please let me know if you find something fundamentally wrong
with
this idea.
Example: I find this call sequence begging for simplification: Exception>>resume: --> resumeUnchecked: --> resumeEvaluating:
-->
Context>>returnEvaluating: --> resumeEvaluating:
Best,
Jaromir
--
Jaromír Matas
mail at jaromir.net