threads

Richard A. O'Keefe ok at atlas.otago.ac.nz
Fri Feb 15 02:48:55 UTC 2002


My news service is so _bad_ I gave up using it a couple of years ago.
Good thing, or I'd never have time to read the Squeak list.
"Lex Spoon" <lex at cc.gatech.edu> cited an article he posted to
comp.lang.erlang about threads.

	Others are strongly in favor of using multiple threads.  I have the
	opposite opinion, mainly because the errors in multi-threaded programs
	are so difficult to cope with.
	
Threads in Erlang are different.

(1) Erlang was designed from the beginning to handle tens to hundreds of
    thousands of threads.  Implementors (and there are a surprising number
    of Erlang implementations) do take this seriously.

(2) Locking is needed so that shared data structures aren't smashed
    incoherently.  Erlang doesn't have mutable data structures (well, there's
    a thing called the "process dictionary" which snuck in by mistake, but
    it's process-local, so there'd never be any point in locking it).  And
    strictly speaking it doesn't have any shared data structures either.

    This really makes a MAJOR difference.  The ONLY way that Erlang threads
    can interact is by sending messages to each other.

(3) Of course as soon as you interact with a database (like ETS, DETS, or
    the Mnesia distributed database that's built on top of ETS and DETS) there is
    mutable state somewhere, and boy is it shared.  Locking a replicated
    database would be hard if the database code didn't do that for you.
    Which it does.

(4) Above all, the average Erlang programmer is expected to be a telecoms
    engineer, NOT a hot programmer.  S/he isn't expected to be able to
    develop highly robust patterns of communicating processes first day on
    the job.  That's why Erlang has "behaviours", design patterns if you
    like, provided as components that set up a communication structure for
    you.  Quite a bit of real Erlang code seems to be plugging fairly
    simple functional code into a framework.
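
A rough sketch of point (2), in Python rather than Erlang.  One thread owns
a counter and nobody else can reach it; the other threads can only put
immutable tuples into its mailbox.  The names `counter` and `run_demo` are
made up for illustration.  Note that Python threads really do share memory,
so the isolation here is a convention the code follows, whereas Erlang
enforces it.

```python
import queue
import threading


def counter(mailbox):
    """Owns `count` privately; the only way to affect it is to send a message."""
    count = 0
    while True:
        msg = mailbox.get()              # "receive"
        if msg[0] == "add":
            count += msg[1]
        elif msg[0] == "get":
            msg[1].put(count)            # reply goes to the asker's own mailbox
        elif msg[0] == "stop":
            return


def run_demo(senders=8, per_sender=1000):
    """Start one counter thread, hammer it from several senders, return the total."""
    mailbox = queue.Queue()
    server = threading.Thread(target=counter, args=(mailbox,))
    server.start()

    def client():
        for _ in range(per_sender):
            mailbox.put(("add", 1))      # "send"; no lock anywhere in user code

    clients = [threading.Thread(target=client) for _ in range(senders)]
    for t in clients:
        t.start()
    for t in clients:
        t.join()

    reply = queue.Queue()
    mailbox.put(("get", reply))          # queued behind every "add" sent above
    total = reply.get()
    mailbox.put(("stop",))
    server.join()
    return total
```

The only synchronisation is inside the queue, which plays the part of the
mailbox; the code that does the arithmetic never sees a lock.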

In short, threads are fine, and assignment statements are fine, but put
them together and you have trouble.

My prejudice is that the assignment-free nature of Erlang is what makes
threading work so simply in it, but sometimes I wonder whether the _real_
answer isn't the way Joe Armstrong & co packaged up common working patterns
as "behaviours".  If that's right, then that may be good news for Squeak:
perhaps suitable "behaviours" could be provided in Squeak.
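
To make the "behaviours" idea a little more concrete, here is a minimal
sketch in Python (not Erlang, and not Squeak).  The generic half owns the
thread, the mailbox and the receive loop; the specific half is two plain
functions that know nothing about threads.  The names `Server`, `init` and
`handle_call` are assumptions for illustration, loosely modelled on the
shape of a server behaviour; this is not the actual Erlang API.

```python
import queue
import threading

_STOP = object()  # private sentinel asking the loop to exit


class Server:
    """Generic part: thread, mailbox and receive loop, written once."""

    def __init__(self, init, handle_call):
        self._mailbox = queue.Queue()
        self._handle_call = handle_call
        self._thread = threading.Thread(target=self._loop, args=(init(),))
        self._thread.start()

    def _loop(self, state):
        while True:
            request, reply_to = self._mailbox.get()
            if request is _STOP:
                reply_to.put("stopped")
                return
            # The callback returns a reply and the next state; it never
            # mutates anything the caller can see.
            reply, state = self._handle_call(request, state)
            reply_to.put(reply)

    def call(self, request):
        """Send a request and block until the server replies."""
        reply_to = queue.Queue()
        self._mailbox.put((request, reply_to))
        return reply_to.get()

    def stop(self):
        self.call(_STOP)
        self._thread.join()


# Specific part: simple functional code plugged into the framework.
def init():
    return ()                            # an empty stack, as an immutable tuple


def handle_call(request, stack):
    if request[0] == "push":
        return "ok", stack + (request[1],)
    if request[0] == "pop":
        if not stack:
            return "empty", stack
        return stack[-1], stack[:-1]
    return "unknown", stack


def demo():
    """Drive the stack server with a few calls and collect the replies."""
    server = Server(init, handle_call)
    replies = [server.call(("push", 1)), server.call(("push", 2)),
               server.call(("pop",)), server.call(("pop",)),
               server.call(("pop",))]
    server.stop()
    return replies
```

The sketch leaves out everything to do with failure: if `handle_call`
raises, the caller waits forever.  Handling that properly is exactly the
sort of thing one would want the framework, not each programmer, to get
right.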

	There are two reasons for this difficulty.  First, the errors are
	harder to find in multi-threaded programs.  In both design approaches,
	you must reason about intermediate states between events in order to
	find the errors, but in multi-threaded programs the intermediate
	states are implicit.  In the event-driven design, those implicit
	states are coded explicitly and can be examined directly.
	
I used to think I liked theory, but "bisimulation" drives me up the wall.

	Second, multi-threaded programs are very difficult to test.  It's hard
	to generate a test case that makes one thread finish a certain step
	just before another thread finishes a certain step; it's much easier
	to generate a test case that makes a certain sequence of events
	happen.
	
In Erlang, the notion of "finish a certain step" is not terribly meaningful.
The only events that matter for process interaction are "send message" and
"receive message".  This makes a lot of the CS theory about processes
(CCS and so on) applicable.
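
One practical consequence, sketched in Python under some loud assumptions:
if a process is written as a pure function from (state, message) to
(new state, reply), then "what orders could things happen in?" reduces to
"what orders could the messages arrive in?", and for small cases those can
simply be enumerated.  The account process and the names `step`, `run`,
`interleavings` and `final_balances` are hypothetical, invented for this
example.

```python
def interleavings(a, b):
    """Yield every merge of tuples a and b that keeps each sender's own order."""
    if not a or not b:
        yield tuple(a) + tuple(b)
        return
    for rest in interleavings(a[1:], b):
        yield (a[0],) + rest
    for rest in interleavings(a, b[1:]):
        yield (b[0],) + rest


def step(balance, msg):
    """Pure transition of a hypothetical account process."""
    kind, amount = msg
    if kind == "deposit":
        return balance + amount, "ok"
    if kind == "withdraw":
        if amount > balance:
            return balance, "refused"
        return balance - amount, "ok"
    return balance, "unknown"


def run(messages, balance=0):
    """Feed one arrival order to the process; return final state and replies."""
    replies = []
    for msg in messages:
        balance, reply = step(balance, msg)
        replies.append(reply)
    return balance, tuple(replies)


def final_balances(a, b):
    """Set of final balances over every possible arrival order."""
    return {run(order)[0] for order in interleavings(a, b)}


alice = (("deposit", 10), ("withdraw", 5))
bob = (("withdraw", 8), ("deposit", 3))
outcomes = final_balances(alice, bob)
```

Two senders with two messages each give only six arrival orders, so a test
can check an invariant (say, that the balance never ends up negative)
against all of them without any timing tricks.  The number of orders grows
quickly with longer sequences, which is where the model checkers come in.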
	
	In another response, someone suggested using formal tools such as
	model checkers to cope with these difficulties.  If a programmer is
	using such tools, then I'm sure a lot of these problems go away.

I note that a verifier for Erlang exists and is under active development,
although I personally don't have a copy and haven't a clue how it works.

	However, the question remains interesting for people who don't have
	such tools.  For such people, multi-threaded designs seem to be
	an elegant way to introduce insidious errors.
	
Well, yes.  Although I note that the people who used Concurrent Pascal
back in the old days tended to complain about the language being restrictive
rather than complaining about errors in their programs.

My favourite example of thread problems was using Interlisp-D back in 1984.
I had a FileBrowser open, and saw something interesting, so I popped up
a second FileBrowser.  It came about half-way up and then hung the machine.
(That was actually fixed fairly quickly.)



