[squeak-dev] Problem with fork

Mariano Martinez Peck marianopeck at gmail.com
Fri Jul 3 14:08:16 UTC 2009


Hi all,
  this is my first post here, just joined this group so let me do a
quick introduction.
I have 10+ years of experience doing full-time development with
VisualWorks (creating trading platforms for the European energy
exchanges).
I also have near-zero experience with Squeak or with how its
development process works. I'll slowly cut my teeth, starting by
interacting on this mailing list.


Multithreading is something I have spent a lot of quality time with,
so I want to share some thoughts on the following:

|semaphores tr|
semaphores := Array new: 10.
tr := ThreadSafeTranscript new.
tr open.
1 to: 10 do: [ :index | semaphores at: index put: Semaphore new ].

    1 to: 10 do: [:i |
        [
         tr nextPutAll: i printString, ' fork '; cr.
         (semaphores at: i) signal.
        ] fork
    ].

    semaphores do: [:each | each wait ].
    tr show: 'all forks processed'; cr.


I have seen this pattern often (allocating a semaphore for every
forked process), and I usually interpret it as a sign that such code
is still in its first 'make it work/make it right' stages.
What a lot of people don't realize is that at its heart a semaphore is
a thread-safe counter/register (and if you look at the class hierarchy
it is implemented in you wouldn't guess that either, since the
hierarchy stresses the implementation part that manages waiting
processes rather than the counter aspect).
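
To make the counter view concrete, here is a minimal workspace sketch
(my own illustration, using only the standard #signal and #wait):
signals sent while nobody is waiting are not lost but counted, and each
later #wait consumes one of them without blocking.

```smalltalk
| sem passed |
sem := Semaphore new.
passed := 0.
"Three signals with no waiter: the semaphore remembers them as a count."
3 timesRepeat: [sem signal].
"Each wait consumes one remembered signal and returns immediately."
3 timesRepeat: [sem wait. passed := passed + 1].
Transcript show: passed printString; cr.   "shows 3"
```

A fourth #wait at this point would block, because the count is back
at zero.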

So trying to take the code snippet toward 'make it abstract' territory
this could be refactored to lean more on the counter aspect of
semaphores and use only a single semaphore:


|count sem tr|
tr := ThreadSafeTranscript new.
tr open.
count := 10.
sem := Semaphore new.

1 to: count do: [:i |
       [       tr nextPutAll: (i printString, ' fork\') withCRs.
               sem signal.
       ] fork].

count timesRepeat: [sem wait].
tr show: 'all forks processed'; cr.



Now the above is about as far as you can go with the current Squeak
and VisualWorks implementations, so you can take it as a simple piece
of refactoring advice.




However I want to press on a bit more (and go a bit off-topic for this
list ;-) because I feel it still has a big problem: we need to
maintain a 'count' and pass that between the two loops in the above
example.
In the current example this is not much of a problem but in more
complex applications where the forking is done by yet other forked
processes we will need to make 'count' thread-safe as well -- I find
this very ugly, because you will need an extra semaphore just to make
the original semaphore work as required.
Furthermore you cannot add new forked processes once the second loop
has started running.
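
A rough sketch of what that looks like on an unmodified image (an
illustration only: 'countMutex', 'startJob' and 'expected' are names I
made up, and the jobs do nothing but yield). The count gets its own
mutual-exclusion semaphore, and the waiting side has to take a snapshot
of it:

```smalltalk
| count countMutex sem startJob expected |
count := 0.
countMutex := Semaphore forMutualExclusion.   "the extra semaphore, only there to guard count"
sem := Semaphore new.

"Hypothetical helper: register the job under the lock, then fork it."
startJob := [:work |
    countMutex critical: [count := count + 1].
    [work value. sem signal] fork].

1 to: 10 do: [:i | startJob value: [Processor yield]].

"Read the count under the same lock, then wait that many times.
Any job registered after this snapshot is not waited for."
expected := countMutex critical: [count].
expected timesRepeat: [sem wait].
```

The snapshot in the last two statements is where the second limitation
shows up: once 'expected' has been read, newly started jobs no longer
take part in the wait.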

So here is an experiment I did a couple of years ago with VisualWorks:
I altered the VM (just one line of its source ;-) so it would react
properly to semaphores that have negative values in the
'excessSignals' instance variable, and I added a method #unsignal to
Semaphore that would decrease the value of that ivar.

In my experiments that yielded many opportunities to simplify
multiprocessing code (not only for thread synchronization but also for
passing around counts in a thread-safe register!).

In the above code that would allow us to 'pre-load' the semaphore at
the place where the threads are created, with the result that the
'count' variable and the bottom loop can both be removed:


|sem tr|
tr := ThreadSafeTranscript new.
tr open.
sem := Semaphore forMutualExclusion. "We need one excessSignal to
balance the #wait below"

1 to: 10 do: [:i |
        sem unsignal. "outside the forked code"
       [       tr nextPutAll: (i printString, ' fork\') withCRs.
               sem signal. "balance the unsignal"
       ] fork].

sem wait. "no loop, no need to know the count!"
tr show: 'all forks processed'; cr.





The above was just a simple refactoring, but here is how I actually needed it:

|sem tr|
tr := ThreadSafeTranscript new.
tr open.
sem := Semaphore new. "no excessSignal this time"

"set up a monitoring system first(!)"
[       sem wait.
        tr show: 'all forks processed'; cr
] fork.

"then create jobs (in my case I had only a single first job that would
recursively create more jobs, not shown here)"

1 to: 10 do: [:i |
        sem unsignal.
       [       tr nextPutAll: (i printString, ' fork\') withCRs.
               sem signal.
       ] fork].
"Now that we are sure at least one job has been entered, balance the
#wait we started out with"
sem signal.


Since we elided 'count', I can move the code that relied on it up in
front of the thread creation code. I very much like this flavor of
decoupling.


I guess this illustrates that Semaphore has been stuck in the 'make it
work/make it right' phase for thirty years now, and that moving it into
'make it abstract' territory will make lots of hairy multi-threading
code much simpler to express...

(And for those thinking this through: yes I did implement a thread-
safe #add: and #valueWithReset on semaphore too ;-)


I hope I didn't bore y'all and stray too far off-topic, but I did want
to share this bit of insight I gained by tinkering with the semaphore
implementation: semaphores are thread-safe counters at their heart.



Cheers,

Reinout
-------

PS: big congrats on the license cleaning milestone, this is what
finally pulled me into this project :-)

