[squeak-dev] Improve termination behavior when in #critical section

Jaromir Matas mail at jaromir.net
Wed Feb 1 15:22:54 UTC 2023

Hi Eliot, all,

I'd like to expand a bit on how critical sections are released during termination, as presented in Kernel-jar.1498. I *think* the second part of the procedure can be further simplified (and generalized) as follows.

The point is that we do not need to check that the process is immediately beyond the wait/lock primitive (by checking selectorJustSentOrSelf). Simply still being inside the #critical context should suffice to conclude that the process has left the condition variable's primitive but hasn't made progress, so an intervention is required.

    "Figure out if we are terminating a process that is in the ensure: block of a critical section.
    If it hasn't made progress but is beyond the wait (which we can tell by the oldList being
    one of the runnable lists, i.e. a LinkedList, not a Semaphore or Mutex), then the ensure:
    block needs to be run."

    | oldList |
    "Suspend and unblock the receiver from a condition variable using the suspend primitive #88.
    It answers the list the receiver was on before the suspension."
    oldList := self suspendAndUnblock.
    suspendedContext ifNotNil: [:context |
        (context method pragmaAt: #criticalSection) ifNil: [^self].

        (oldList isNil or: [oldList class == LinkedList]) ifFalse: [
            "If still blocked at the condition variable of a critical section, skip the rest of the current context."
            ^suspendedContext := context pc: context endPC].

        "Now we know we are somewhere beyond the wait/lock primitive in a critical section but
        haven't made progress into the ensure block; the only point of the following code is to identify
        the type of condition variable we're in and let the process enter the ensure block when applicable."

        "If the process still hasn't made progress into the ensure block then it has not been activated, so step into it."
        (context methodClass == Semaphore or: [

            "If the process still hasn't made progress into the ensure block and the lock primitive just acquired
            ownership (indicated by it answering false) then the ensure block has not been activated, so step into it."
            (context methodClass == Mutex and: [
                context stackPtr > 0 and: [context top == false]])]) ifTrue: [

            "In such cases we need to step into the ensure block and let the unwind execute the ensure argument
            block and, if the critical section itself is already inside an unwind block, also the ensure receiver block."
            suspendedContext := context stepToCallee]]

The first part deals with the situation where the process was still blocked at the condition variable and has only just been released from it by suspendAndUnblock (prim 88) so that termination can proceed. In this case we do need to skip the critical block, because the critical section itself can be inside some outer ensure argument block.
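
A minimal workspace sketch of this first case, assuming an image with Kernel-jar.1498 loaded; the variable names are mine and are not taken from the tests. The process blocks in the wait of Semaphore>>critical: while inside an ensure argument block and is then terminated. If the reasoning above holds, criticalRan should still be false afterwards.

```smalltalk
| semaphore criticalRan process |
semaphore := Semaphore new.	"no excess signals, so #critical: blocks in its wait"
criticalRan := false.
process := [
	[] ensure: [
		"the critical section sits inside an ensure argument (unwind) block"
		semaphore critical: [criticalRan := true]]] fork.
Processor yield.	"let the forked process run until it blocks on the semaphore"
process terminate.
{process isTerminated. criticalRan}	"inspect or print this to see the outcome"
```

The Processor yield relies on the forked process having the same priority as the caller, which is what plain #fork gives you.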

I hope it's a bit clearer now. Any ideas would be greatly appreciated.


PS: here's Eliot's original explanation of #releaseCriticalSection: http://forum.world.st/Solving-termination-of-critical-sections-in-the-context-of-priority-inversion-was-SemaphoreTest-fail-td5082184.html
What has changed since then is that we can now safely unwind critical sections and non-local returns inside ensure argument blocks.

From: Jaromir Matas<mailto:mail at jaromir.net>
Sent: Monday, January 30, 2023 17:40
To: Squeak Dev<mailto:squeak-dev at lists.squeakfoundation.org>; Eliot Miranda<mailto:eliot.miranda at gmail.com>
Subject: [squeak-dev] Improve termination behavior when in #critical section

Hi Eliot, Nicolas, or anyone interested in context juggling,

A few Process/Semaphore/Mutex tests have been left as expected failures in the current image. It's time to fix them. I'd very much appreciate it if you could review/merge the improved versions of #terminate and *especially* #suspendAndReleaseCriticalSection, which finally make all currently failing tests pass. They were sent in Kernel-jar.1498 and KernelTests-jar.443.

In addition, I've added a few Mutex and Semaphore tests to further illustrate the behavior when terminating a process stuck in a #critical section, especially when the critical section itself is inside an unwind block (a case that was not fully addressed previously). The steps in #suspendAndReleaseCriticalSection should, hopefully, be sufficiently commented.
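
To give an idea of the kind of scenario involved, here is a simplified workspace sketch. It is not one of the tests from KernelTests-jar.443, all names in it are hypothetical, and it covers only the plain case where the critical section is not inside an unwind block. One process holds a Mutex while a second one blocks trying to enter it and is then terminated.

```smalltalk
| mutex park holder waiter waiterEntered |
mutex := Mutex new.
park := Semaphore new.	"never signaled here; it only parks the holder"
waiterEntered := false.
holder := [mutex critical: [park wait]] fork.
Processor yield.	"holder acquires the mutex and blocks on park"
waiter := [mutex critical: [waiterEntered := true]] fork.
Processor yield.	"waiter blocks trying to acquire the mutex"
waiter terminate.
holder terminate.	"unwinding the holder should release the mutex again"
{waiter isTerminated. waiterEntered. holder isTerminated}
```

I'd expect waiterEntered to stay false here, since the waiter never got past the lock primitive.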

If you'd like more detailed comments or explanations, I'll be happy to elaborate.

The changes have also been tested in Cuis, which gives me more confidence that they're solid.

Thanks for any comments.



Jaromír Matas

mail at jaromir.net

-------------- next part --------------
A non-text attachment was scrubbed...
Name: Process-suspendAndReleaseCriticalSection.st
Type: application/octet-stream
Size: 2067 bytes
Desc: Process-suspendAndReleaseCriticalSection.st
URL: <http://lists.squeakfoundation.org/pipermail/squeak-dev/attachments/20230201/b9bdd0ce/attachment.obj>