[Vm-dev] Simple versus Recursive Mutexes

Tue Feb 2 18:35:00 UTC 2016

Hi Ben,

On Tue, Feb 2, 2016 at 9:54 AM, Ben Coman <btc at openinworld.com> wrote:

>
> While we are in the process of revising the Mutex infrastructure [1]
> to use dedicated primitives rather than rely on counting via a
> semaphore, I want to raise the question of simple versus recursive
> mutexes.
>
> This isn't something I know much about, except that some people
> strongly believe recursive mutexes are evil, including David Butenhof
> who apparently was the guy who added recursive mutexes to POSIX "on a
> dare [...] but nobody was supposed to use recursive mutexes." [1].
>
> There are two interesting discussion on stockoverflow [2][3]. But btw
> I disagree with Tall Jeff [3] says "The difference between a recursive
> and non-recursive mutex has to do with ownership. " I think he
> confuses a simple-mutex and binary-semaphore, as do a some commenters.
>
> "Prefer Simple Mutex Over Recursive Mutex" [4] demonstrates some
> performance considerations and advises "a recursive mutex is dangerous
> because you lose sense of locking scope. It costs more than a simple
> mutex."
>
> "Recursive Locks Will Kill You!" [5] provides some thread safety
> guidelines.
>
> Finally, there is a paper "Ad Hoc Synchronization Considered harmful"
> [6], which I haven't read yet since I wanted to get this post out
> before heading to bed, but I hope to tomorrow.
>
>
> [1] http://www.zaval.org/resources/library/butenhof1.html
> [2] http://stackoverflow.com/questions/2415082/when-to-use-recursive-mutex
> [3]
> http://stackoverflow.com/questions/187761/recursive-lock-mutex-vs-non-recursive-lock-mutex
> [4]
> http://askldjd.com/2009/10/26/prefer-simple-mutex-over-recursive-mutex/
> [5]
> http://www.fieryrobot.com/blog/2008/10/14/recursive-locks-will-kill-you/
> [6] https://www.usenix.org/legacy/event/osdi10/tech/full_papers/Xiong.pdf
>
>
> So our existing Mutex implementation happens to be recursive, and I'm
> not suggesting we change that. However if we should consider *not*
> baking the *assumption* of recursion into the primitives, so the same
> primitives could *also* be used for a SimpleMutex class. The logic to
> provide recursion for Mutex is only [ owner = activeProcess ] which is
> easily done in-image.
>
> So I propose introducing aquire/release primitives based off the
> existing critical section primitives with the recursion removed.
>

We have two sets of primitives, the queueing semaphore ones, signal and
wait, #85 & #86, which are not recursive, and new ones, #185, #186 & #187
that support ownership, and hence support recursion.  It makes no sense to
remove the recursion support.  That's what the primitives are for, to
provide ownership support as efficiently as possible.  Please don't be led
by the problems others may have in other systems, and instead be led by how
we're doing.  Squeak and Pharo (and Smalltalk) have had recursive owned
critical sections for decades and no one (e.g. from demanding industrial
control applications) has complained that these recursive critical sections
are problematic.  Remember we have closures and hence ensure: and good
unwinding support.  The new primitives give us a streamlined and hack-free
implementation which doesn't depend on the niceties of what are suspension
points as do the pre-primitive implementations.  So I see no rationale for
gelding these primitives.

Show me a paper on a Smalltalk app that says "recursive locks will kill
you" and I'll reconsider, but in thirty years I've not seen a complaint.
The complaints I have seen are

- queueing Semaphores should or should not be priority queues, depending on
requirements.  In VW #85 & #86 were changed so that Semaphore is a priority
queue.  This has been good for some folks but bad for others.  One should
have the choice, but the current Squeak/Pharo/Smalltalk-80 default that
they are /not/ priority queues is the right default

- we suffer from deadlock due to priority inversion.  A high-priority
process spinning trying to gain access to a lock held by a low-priority
process can shut out the low-priority process from releasing its lock.
This is minimised by the new recursive lock primitives, but may still be a
possibility.  We should work on creating test cases.  Positively the test
cases that Andreas developed for me at Qwaq when I developed the recursive
locking primitives no longer fail.  I've not had time to investigate.

- preempting a lower-priority process by a higher-priority process causes
the preempted process to yield to others at the same priority.  This is a
bug which has been fixed in VW and in Cog.  It is an option in Cog but it
should be enabled.  See SmalltalkImage>>processPreemptionYields.
Smalltalk processPreemptionYields should answer false.

cheers -ben
>

HTH
_,,,^..^,,,_
best, Eliot
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.squeakfoundation.org/pipermail/vm-dev/attachments/20160202/925d1a17/attachment-0001.htm