[squeak-dev] Re: Suspending process fix

Michael van der Gulik mikevdg at gmail.com
Tue Apr 28 22:29:28 UTC 2009


On 4/28/09, Andreas Raab <andreas.raab at gmx.de> wrote:
> Igor Stasenko wrote:
>> Ask yourself, why a developer, who may want to suspend any process
>> (regardless of his intents) to resume it later, should make any
>> assertions like "what will be broken if i suspend it?".
>
> Thus my question about use cases. I haven't seen many uses of suspend
> outside of the debugger. And I don't think that's by accident - suspend
> is very tricky to deal with in a realistic setting that needs to deal
> with asynchronous signals. Most of the time it is a last resort solution
> (i.e., don't care too much about what happens afterwards) not something
> that you would do casually and expect to be side-effect free.

IMHO Process>>suspend should never be used in "normal" code. It should
only be used from debuggers and system tools. But it should still be
implemented to exhibit the correct behaviour.

The above is exactly the use case where this bug has bitten me. When I
was trying to debug concurrent code, the debugger would simply ignore
Semaphore>>wait and step right over it! The debugger quickly became
useless when I had to manually keep track of which semaphores were
signalled and which weren't.

Of course, the debugger would also need improvement to make sure that
it doesn't suspend the entire GUI every time its simulated process
waits on a semaphore, but that's another issue.

> The problem is that in a "real" environment signals are asynchronous.
> Unless you have some way of stopping time and other external interrupts
> at the same time you simply cannot guarantee that after the #suspend
> there isn't an external signal which causes some other process waiting
> on the semaphore to execute before the process that "ought" to be released.

By my understanding of how suspending a process should work, if a
Process is suspended (by calling >>suspend) then no force on Earth
other than calling >>resume on it should resume it again. Any events or
signals on it should accumulate until it is resumed.

> For example, just consider a mutex where for some reason ordering
> matters like in Tweak (which does break if processes are not put back in
> the same order in which they were taken off the list): You have a
> process which holds the mutex, two more are waiting. You send #suspend
> to the first (waiting) one, it is off-list. Now the current mutex owner
> leaves that mutex. What should happen? Should the entire mutex stall
> because the process that was supposed to go next was suspended? If it
> proceeds it changes the ordering and that can cause all sorts of
> problems (as I found out when testing some earlier versions of the
> semaphore fixes that weren't quite correct ;-)

What list? Are you referring to the linked list that a Semaphore
maintains? I would consider the linked list of Processes that
Semaphores maintain to be an implementation detail of Processes and
Semaphores. Your code should be written to be completely oblivious to
it.

I believe the correct behaviour of Semaphore>>signal should be that
the next process to be run would be either the process doing the
signalling, or any other process waiting on that semaphore. Assuming a
multi-core capable VM, two processes might end up concurrently
continuing execution. There shouldn't be any guaranteed ordering in
the resuming of processes; that's an implementation detail in the VM
that could potentially change.

And yes, I believe that in your example, it is correct that the entire
mutex should "stall", meaning that the process that entered the mutex
has entered a "suspended" state and all processes still waiting on
that mutex remain in their "waiting" state. If the mutex didn't
"stall", the debugger wouldn't be particularly helpful.
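
To make the "stall" behaviour concrete, here is a rough toy model in
Python. It is not Squeak code and not how any VM implements this. The
class and attribute names (Process, Semaphore, excess_signals, waiters,
waiting_on) are made up for illustration, and the FIFO waiter list is
only a convenience of the sketch, not an ordering guarantee. The one
rule it encodes is that suspension is a flag separate from waiting: a
signal is still delivered to a suspended process, but only resume makes
it runnable again.

```python
class Process:
    """Hypothetical stand-in for a Smalltalk Process (illustration only)."""

    def __init__(self, name):
        self.name = name
        self.suspended = False   # set by suspend(), cleared only by resume()
        self.waiting_on = None   # semaphore this process is blocked on, if any

    def suspend(self):
        self.suspended = True

    def resume(self):
        self.suspended = False

    @property
    def state(self):
        if self.suspended:
            return "suspended"
        return "waiting" if self.waiting_on is not None else "runnable"


class Semaphore:
    """Hypothetical counting semaphore; FIFO order is an assumption."""

    def __init__(self, signals=0):
        self.excess_signals = signals
        self.waiters = []

    def wait(self, process):
        if self.excess_signals > 0:
            self.excess_signals -= 1          # signal available, no blocking
        else:
            process.waiting_on = self
            self.waiters.append(process)

    def signal(self):
        if self.waiters:
            process = self.waiters.pop(0)
            process.waiting_on = None         # delivered even if suspended
        else:
            self.excess_signals += 1


# The scenario from the quoted example: one owner, two waiters.
mutex = Semaphore(signals=1)
owner, first, second = Process("owner"), Process("first"), Process("second")
for p in (owner, first, second):
    mutex.wait(p)

first.suspend()                               # e.g. the debugger stops it
mutex.signal()                                # the owner leaves the mutex
states_after_signal = (first.state, second.state)

first.resume()                                # only now may it run
state_after_resume = first.state
```

In this model the first waiter receives the signal while suspended, so
it owns the mutex but does not run, and the second waiter stays in its
"waiting" state until the first is resumed and leaves the mutex itself.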

Gulik.

-- 
http://gulik.pbwiki.com/