Lots of concurrency

Ken Kahn kenkahn at toontalk.com
Mon Oct 29 02:45:18 UTC 2001


David Chase wrote:
>
> At 10:01 AM 10/25/2001 -0700, Ken Kahn wrote:
> >Thinking sequentially about problems that are inherently concurrent is
> >suboptimal.
>
> I disagree.  My favorite examples, not necessarily relevant
> to Squeak, are:
>
> 1) network (web) servers.
>
> The task is inherently parallel; there can be many
> connections in progress at one time.  However, the
> task is greatly simplified by organizing it as a
> bunch of sequential threads.
>

Maybe. We seem to agree that the problem should be broken down into many
concurrent activities, so we aren't thinking sequentially at the top level.
The question that remains is how to think and program at the lower level. I
have been arguing for a uniform and simpler computation model that has no
support for sequentiality (or sequential threads). But there is still
dataflow synchronization - it isn't a chaotic concurrency. So one would need
to look closely at examples to see whether dataflow is natural or awkward.
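
To make this a bit more concrete, here is a rough sketch in Java (far from
the model I'm advocating, but concrete) of handling one connection as a
chain of dataflow dependencies rather than as a dedicated sequential
thread. CompletableFuture is only a stand-in for single-assignment dataflow
variables, and the stages and names are invented for illustration.

  import java.util.concurrent.CompletableFuture;

  public class DataflowRequest {
      public static void main(String[] args) {
          // Each stage runs when its input value becomes available; no
          // stage is written as a thread sequentially blocking on the
          // previous one.
          CompletableFuture<String> rawRequest =
              CompletableFuture.supplyAsync(() -> "GET /index.html");
          CompletableFuture<String> path =
              rawRequest.thenApply(req -> req.split(" ")[1]);
          CompletableFuture<String> response =
              path.thenApply(p -> "200 OK: contents of " + p);
          response.thenAccept(System.out::println).join();
      }
  }

Real I/O, error handling, and many simultaneous connections are where such
a comparison gets interesting.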

> 2) finite element models (game of life, cellular automata).
>
> The problem is *described* from the point-of-view of a single
> cell, and the parallelism is then added.  The differential
> equation itself describes the behavior *of a point*.  There
> are cases where it is more efficient to think of the aggregate
> acting in parallel (for instance, the physical properties of
> an I-beam) but often the most interesting problems are those
> where we *do not know the aggregate behavior* -- there, we
> must work from first principles.  (It depends, very much,
> on what we are using the computer/program for -- designing
> a house, we work with standard aggregate-behavior-understood
> materials, but designing a scramjet or a high-performance
> propeller, we must work from finite elements.)
>

You seem to be arguing that this is another example of parallelism or
concurrency in the large and sequentiality in the small. But the behavior of
a cell in the game of life or a cellular automaton is not like a sequential
thread. There is no need for a stack of procedure calls. Right?
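
To illustrate, the entire behavior of a Life cell can be written as a
single rule from its inputs to its next state. A tiny sketch in Java
(illustrative only):

  public class LifeCell {
      // The whole "behavior" of a cell: given its current state and the
      // number of live neighbors, produce the next state. Nothing here
      // needs a call stack, and the rule can be applied to every cell at
      // once.
      static boolean nextState(boolean alive, int liveNeighbors) {
          return liveNeighbors == 3 || (alive && liveNeighbors == 2);
      }
  }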

>
> The use of (synchronized) time steps also helps avoid the
> non-uniform progress problem (which is the one that most
> often catches me when I write parallel code).
>

I'll admit to being sloppy about time. And naive concurrent programs can
have a problem of non-uniform progress. As I see it, the question is whether
the underlying computation model needs to "know about" time or whether, when
appropriate, user code can just communicate and synchronize with clock
processes. I prefer to keep the model simple, but one needs to look at lots
of examples to see how clumsy some programs might become as a result.
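
For example, a "clock process" can itself be ordinary user code that sends
tick messages, and anything that cares about time just synchronizes on
those messages. A rough sketch in Java, with a thread and a queue standing
in for a process and a channel (the details are invented for illustration):

  import java.util.concurrent.BlockingQueue;
  import java.util.concurrent.LinkedBlockingQueue;

  public class ClockProcess {
      public static void main(String[] args) throws InterruptedException {
          BlockingQueue<Long> ticks = new LinkedBlockingQueue<>();

          // The clock is just another program that sends tick messages.
          Thread clock = new Thread(() -> {
              try {
                  for (long t = 0; t < 5; t++) {
                      ticks.put(t);
                      Thread.sleep(100);   // roughly 100 ms per tick
                  }
              } catch (InterruptedException e) {
                  Thread.currentThread().interrupt();
              }
          });
          clock.start();

          // User code that wants uniform progress waits for a tick before
          // each step; the language itself knows nothing about time.
          for (int i = 0; i < 5; i++) {
              System.out.println("step at tick " + ticks.take());
          }
          clock.join();
      }
  }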

>
> In contrast, working on parallel programs, I have seen/heard of
> programmers across a wide range of skill levels (everyone from
> the more academic members of the Modula-3 design team, to a
> tenured professor at MIT, to Los Alamos rocket scientists, to
> just ordinary programmers)  get wedged numerous times with
> mistaken assumptions about atomicity, progress of computing,
> progress of communication, and equivalence of different
> concurrent programs.
>
I believe much of the complexity that causes these problems is that the
languages they are using mix concurrent and sequential programming. I'm
proposing eliminating the sequential part.

> Even something as simple as producers-and-consumers is easy
> to get wrong, if the wrong thread does the updates.
>

Young children program producers and consumers in ToonTalk, and they can't
get such a thing wrong. They program a robot to keep giving things to a bird
(the producer) and another robot to pick things off a nest and process them
(the consumer). Multiple senders or receivers don't make it harder. In the
model I'm advocating I can't make sense of "the wrong thread does the
updates".

>
> >I think her problem is that her model of programming is sequential as a
> >result of her year of Logo programming. And the world isn't sequential.
> >Think of sports teams. Think of traffic. Think of the internal
> >concurrency
> >in walking. In games. Think about bank account transfers. The Incredible
> >Machine. Cooking. An orchestra. And so on.
>
> Think, in general, of how we massage these things to make them
> more sequential.  Traffic -- we have lanes, we have rules, we
> have signals.  Places like Massachusetts, this is less true, but
> I find the driving in Massachusetts is less efficient and more
> stressful than in other places (and to count the dents and dings
> on cars I see, more error-inducing).  If you look at how an
> orchestra is organized, it is designed to present the problem
> as a collection of sequential tasks; only the conductor is
> responsible for the "big picture", and the most difficult
> music is that which lacks familiar timing cues (like a regular
> beat -- tiny time steps, again).  In addition, the musicians
> all generally have their own copies of the music (replication -- a
> common trick for reducing interdependence in concurrent problems)
> even when playing the same part.  Or, consider a marching band --
> with certain exceptions (Stanford, Rice) it's time-step city,
> with each actor running on his own program taking cues from
> his neighbor (he turns left, two ticks, I turn left).  I've
> seen buggy marching bands, too :-).
>

Isn't a programmer more like the composer than the musicians or the
conductor? But maybe you are right that this is an example of "programming"
lots of sequential threads.

Here's how I think of the traffic issue. A concurrent view might give a rule
to each car that it can enter the space in front of it if it is empty. There
is a dataflow (or traffic flow) synchronization between autonomous
activities. A sequential view would be a car that moves forward because it
finished its previous step regardless of whether the space is occupied. I.e.
more dents. ;-)
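
As a sketch of the concurrent view (again in Java, with semaphores standing
in for the occupancy of road cells; the setup is invented for
illustration): each car acquires the cell ahead before it releases the cell
it is in, so occupancy itself does the synchronizing.

  import java.util.concurrent.Semaphore;

  public class RoadCells {
      public static void main(String[] args) throws InterruptedException {
          int length = 6;
          Semaphore[] road = new Semaphore[length];
          for (int i = 0; i < length; i++) {
              road[i] = new Semaphore(1);      // each cell holds one car
          }

          Runnable car = () -> {
              try {
                  road[0].acquire();                 // enter the road
                  for (int pos = 0; pos < length - 1; pos++) {
                      road[pos + 1].acquire();       // wait for an empty cell ahead
                      road[pos].release();           // leave the cell behind
                  }
                  road[length - 1].release();        // drive off the end
              } catch (InterruptedException e) {
                  Thread.currentThread().interrupt();
              }
          };

          Thread a = new Thread(car), b = new Thread(car);
          a.start(); b.start();
          a.join(); b.join();
          System.out.println("both cars got through without sharing a cell");
      }
  }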

>
> My working definition for "asynchronous" is "at any time" -- in
> particular,
> an "asynchronous interrupt" (or "asynchronous method invocation") could
> occur at any time during the execution of any other method.  This is
> incredibly problematic from the POV of data structure consistency, even
> with locks/critical sections.  The worst parts of Java can occur
> "at any time" -- Thread.stop() delivers an exception (in theory) at
> any point in a another thread's execution, and Thread.suspend() can
> stop a thread from making progress no matter what it is executing,
> no matter what locks it holds, etc.
>
I don't think one needs the concept of an interrupt if there are no
sequential threads. One might want an object to give priority to messages
arriving on certain channels, but that doesn't cause all these problems. I
haven't thought much about hardware interrupts, but can't they be dealt with
by low-level system drivers, so that the concept of an interrupt needn't go
any further? And maybe there are better ways to deal with hardware
interrupts.
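
To sketch the "priority to certain channels" idea (in Java, with a
priority-ordered mailbox as a crude stand-in for prioritized channels; the
names are invented for illustration): the receiver simply takes the most
urgent pending message first, and nothing ever interrupts it in the middle
of handling one.

  import java.util.concurrent.PriorityBlockingQueue;

  public class PriorityMailbox {
      // A message tagged with a priority; lower numbers are handled first.
      static class Msg implements Comparable<Msg> {
          final int priority;
          final String body;
          Msg(int priority, String body) {
              this.priority = priority;
              this.body = body;
          }
          public int compareTo(Msg other) {
              return Integer.compare(priority, other.priority);
          }
      }

      public static void main(String[] args) throws InterruptedException {
          PriorityBlockingQueue<Msg> mailbox = new PriorityBlockingQueue<>();

          // Messages arrive in one order...
          mailbox.put(new Msg(5, "routine status report"));
          mailbox.put(new Msg(1, "urgent: device needs attention"));
          mailbox.put(new Msg(5, "another routine report"));

          // ...but the receiver handles the most urgent one first, one at
          // a time, with no notion of being interrupted mid-message.
          while (!mailbox.isEmpty()) {
              System.out.println("handling: " + mailbox.take().body);
          }
      }
  }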

> Asynchrony presents a particular problem if you wish to test a
> program for the presence of bugs (never mind proving the damn
> thing correct in that sort of a world); in practice, you can
> never know that you have tested for every possible asynchronous
> interrupt/invocation that might occur.  In a (compiled) Java system,
> we dealt with this problem (for garbage collection) by defining
> compiler-inserted safe points where threads are preempted,
> including by another thread's need for GC.  By reducing (drastically)
> the number of places where the asynchronous event (GC) could occur,
> we made it actually possible to test that the correct GC
> bookkeeping was maintained for every single place where a GC
> could occur.  Testing was still tedious and slow, but orders of
> magnitude less so than testing at every instruction.
>
Some would argue that it is better to be able to reason about your program
than to try to test it exhaustively. And having a simpler, cleaner underlying
model of computation facilitates reasoning. It is harder to reason about
programs that are both concurrent and sequential.

> There is a further problem in all of this if you are the least
> bit worried about performance.  Communication is expensive.
> Synchronization is expensive.  Adding layers of abstraction can
> become incredibly expensive if each layer adds its own separate
> layer of locking -- no longer is it a matter of "just dynamically
> inline it to make it fast" -- the synchronization itself (on
> a multiprocessor) can cost an order of magnitude more than
> the procedure call/return that was inlined.  It's interesting
> to look at how unsynchronized versions of various library
> classes were added between Java 1.0 and Java 1.1; even with
> the fastest possible implementations, synchronization is too
> slow to make it the default.

Performance can be high for highly concurrent programs in languages whose
implementations are designed to support such concurrency well. See for
example
www.sics.se/~andreas/papers/JavaVsOz.ps

and

http://www.klic.org/software/klic/index.en.html

Enough for now. Thanks for the thought-provoking comments.

Best,

-ken kahn ( www.toontalk.com )




