[squeak-dev] Re: Suspending process fix

Igor Stasenko siguctua at gmail.com
Tue Apr 28 08:08:44 UTC 2009


2009/4/28 Andreas Raab <andreas.raab at gmx.de>:
> Igor Stasenko wrote:
>>
>> 2009/4/28 Andreas Raab <andreas.raab at gmx.de>:
>>>
>>> One thing I'm curious about: what use case are you looking at? I
>>> have never come across this particular problem in practice since explicit
>>> suspend and resume operations are exceptionally rare outside of the
>>> debugger.
>>
>> Well, I discovered this issue when I wanted to make the current
>> process the only process that can run during a certain period of
>> time, without any chance of being interrupted by another,
>> higher-priority process.
>
> I'm missing some context here. How does this issue relate to sending a
> process suspend; resume and expecting it to keep waiting on a semaphore?
> If I had to solve this problem I would just bump the process' priority
> temporarily.
>

Process suspension is a STRONG guarantee that a given process will not
perform any actions until it receives #resume.
Priority is the wrong way to ensure this: different VMs could easily
break that contract, while breaking the #suspend contract is something
that doesn't fit in my mind :)
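
To make the contract concrete, here is a toy model in Python (not
Squeak code and not how any VM implements it; Proc, Sem, suspend and
resume are hypothetical names made up for this sketch). It assumes that
"suspended" and "waiting on a semaphore" are two independent flags, so
a suspend/resume pair leaves the wait untouched:

```python
from collections import deque

class Proc:
    """Toy process: runnable only if neither suspended nor waiting."""
    def __init__(self, name):
        self.name = name
        self.suspended = False
        self.waiting_on = None  # semaphore this process is blocked on

    def can_run(self):
        return not self.suspended and self.waiting_on is None

class Sem:
    """Toy counting semaphore with a FIFO queue of waiters."""
    def __init__(self):
        self.excess = 0
        self.waiters = deque()

    def wait(self, proc):
        if self.excess > 0:
            self.excess -= 1            # a signal was pending: no blocking
        else:
            proc.waiting_on = self      # block and remember why
            self.waiters.append(proc)

    def signal(self):
        if self.waiters:
            self.waiters.popleft().waiting_on = None  # release oldest waiter
        else:
            self.excess += 1

def suspend(proc):
    # Assumption of this sketch: suspension never touches the wait state.
    proc.suspended = True

def resume(proc):
    proc.suspended = False

sem = Sem()
p = Proc("worker")
sem.wait(p)           # p blocks on the semaphore
suspend(p)
resume(p)
print(p.can_run())    # False: p is still waiting after suspend/resume
sem.signal()
print(p.can_run())    # True: only the signal releases it
```

In this model #resume can never act as a hidden #signal, which is the
behaviour I am arguing for.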

>> IMO, suspend/resume should guarantee that a suspended process will
>> not perform any actions under any circumstances and will be able to
>> continue normally after the corresponding #resume is issued.
>> As my issue illustrates, this is not true for processes which are
>> waiting on a semaphore.
>
> Yes, but even with your proposal this wouldn't be true since a process
> suspended from some position in the list wouldn't be put back on in the same
> position. In practice, there are *severe* limits to that statement about how
> the system "ought" to behave when you run hundreds of processes with some
> 50,000 network interrupts per second behind a Tweak UI ;-) I think I can
> prove that your implied definition of "continuing normally" is impossible to
> achieve in any system that has to deal with asynchronous signals.
>

Do not try to scare me with numbers: if things work correctly for
2-3 processes, why should they fail for 50000? ;)
Certainly, the problem is to correctly identify the set of operations
which require atomicity (on the language side, and on the VM side if it
is using many native threads). But if that is done right, then who cares
about numbers?

>> Ask yourself why a developer who wants to suspend any process
>> (regardless of his intent) and resume it later should have to ask
>> questions like "what will be broken if I suspend it?".
>
> Thus my question about use cases. I haven't seen many uses of suspend
> outside of the debugger. And I don't think that's by accident - suspend is
> very tricky to deal with in a realistic setting that needs to deal with
> asynchronous signals. Most of the time it is a last resort solution (i.e.,
> don't care too much about what happens afterwards) not something that you
> would do casually and expect to be side-effect free.
>

Suspending a process is an explicit way to control what happens in your
system. Many facilities can benefit from it, if we guarantee that
certain contracts are fulfilled.
Actually, we are using suspend/resume every day, even without noticing
it - consider an image snapshot/startup :)
Do processes which were waiting on a semaphore and were saved in the
image in that state start working after startup as if the semaphore had
been signalled? Do such processes lose their 'wait' state?

>> Right. My proposal doesn't deal correctly with cases where there are
>> multiple processes waiting on a single semaphore.
>> It is correct only for a single process.
>>
>> If we suppose that we are running in an ideal environment, where
>> processes run in parallel, then nothing prevents us from
>> implementing waiting as follows:
>
> The problem is that in a "real" environment signals are asynchronous. Unless
> you have some way of stopping time and other external interrupts at the same
> time you simply cannot guarantee that after the #suspend there isn't an
> external signal which causes some other process waiting on the semaphore to
> execute before the process that "ought" to be released.
>
If you are speaking about the Squeak VM and its green threading model,
then this is certainly doable, because at the primitive (VM) level there
is no activity on the language side other than what the VM itself does.

> For example, just consider a mutex where for some reason ordering matters
> like in Tweak (which does break if processes are not put back in the same
> order in which they were taken off the list): You have a process which holds
> the mutex, two more are waiting. You send #suspend to the first (waiting)
> one, it is off-list. Now the current mutex owner leaves that mutex. What
> should happen? Should the entire mutex stall because the process that was
> supposed to go next was suspended? If it proceeds it changes the ordering
> and that can cause all sorts of problems (as I found out when testing some
> earlier versions of the semaphore fixes that weren't quite correct ;-)
>

It should stall, of course, because the first process waiting on the
mutex should obtain it first. The fact that it is suspended is not
relevant.
This is what I'm trying to say: waiting semantics should be kept
separate from scheduling.
A proof case is:

mutex critical: [
  | proc |
  proc := Processor activeProcess.
  "The forked process suspends the mutex owner, works, then resumes it."
  [ proc suspend.  "do something here"  proc resume ] fork.
  Processor yield.
].

It shows that a process which has obtained a mutex can be suspended at
any point in time. And it gives you the right answer: any other
processes waiting on the same mutex will stall until the suspended
process is eventually resumed and releases the mutex.
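
The "it should stall" answer to the three-process example can be
sketched the same way. This is again a toy Python model, not Squeak
code; FifoMutex, hand_over and the two-argument resume are hypothetical
names, and the hand-over rule is my assumption about the semantics I am
proposing, not what any existing VM does. Only the head of the queue
may take the mutex, and a suspended head blocks everyone behind it:

```python
from collections import deque

class Proc:
    def __init__(self, name):
        self.name = name
        self.suspended = False

class FifoMutex:
    """Toy mutex that hands ownership over strictly in arrival order."""
    def __init__(self):
        self.owner = None
        self.waiters = deque()

    def acquire(self, proc):
        # Take the mutex only if it is free AND nobody is queued ahead.
        if self.owner is None and not self.waiters:
            self.owner = proc
            return True
        self.waiters.append(proc)
        return False

    def release(self):
        self.owner = None
        self.hand_over()

    def hand_over(self):
        # A suspended head keeps its place, so the mutex stalls
        # instead of skipping ahead to the next waiter.
        if (self.owner is None and self.waiters
                and not self.waiters[0].suspended):
            self.owner = self.waiters.popleft()

def resume(proc, mutex):
    # Modelling shortcut: resuming re-checks the mutex the process
    # is queued on.
    proc.suspended = False
    mutex.hand_over()

a, b, c = Proc("a"), Proc("b"), Proc("c")
m = FifoMutex()
m.acquire(a); m.acquire(b); m.acquire(c)   # a owns; b, c queue up
b.suspended = True                         # suspend the first waiter
m.release()                                # a leaves the mutex
print(m.owner)                             # None: stalled, c did not jump ahead
resume(b, m)
print(m.owner.name)                        # b
m.release()
print(m.owner.name)                        # c
```

The ordering b, c is preserved no matter when the suspend happens; the
price is that everyone behind b waits until b is resumed.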

> That is the main problem with the whole idea of trying to have fine-grained
> control of processes externally - you cannot know the precise circumstances
> when these things happen and whether issues such as ordering matter which is
> why it is generally better to do this either implicitly (using only
> priorities) or by messaging (send a signal/message to the process and have
> it pick it up later). Outside of that transparent external control of
> process execution ranges somewhere between very tricky and plain impossible
> when asynchronous signals are involved.
>
I do not agree. As I said before, priority is a fluid notion which
simply tells the VM which process has a better chance of taking
control of computing resources (other VMs can treat priority
differently - for example, as a percentage of computing resources which
can be allocated to a given process, with a guarantee that no active
process will starve during a certain period of time).
I don't like implicit control; explicit is much better, because
it guarantees that under any circumstances your code will work the same
as before.

I will try to implement VM-side primitives which guarantee
atomicity for Semaphore wait/signal operations. Then we can continue
our discussion using more grounded arguments. :)

> Cheers,
>  - Andreas
>
>


-- 
Best regards,
Igor Stasenko AKA sig.


