several threads (was: 3D physical simulation)
Jecel Assumpcao Jr
jecel at merlintec.com
Tue Sep 11 00:57:34 UTC 2001
On Saturday 08 September 2001 17:46, Andreas Raab wrote:
> [....] We're actually going to do more work in
> this direction with the plan being to build kind of a mini-APL
> sublanguage in Squeak that can be used to do exactly this kind of
> physical simulations (and a lot of other stuff).
I would like to comment on this, as well as on the thread about
multiprocessors, future PDAs, BioSqueak and teaching people to think
about alternatives to low level loops.
The thing that made me switch from Lisp to Smalltalk was #select: and
friends. For the first time I could work with whole objects instead of
individual elements. APL shares that power, but is too limited in all
other ways. I hope Tim Budd (of Little Smalltalk fame) won't mind if I
cite a long example from his Leda book (from the first chapter,
available at http://www.cs.orst.edu/~budd/vita/ledatoc.html)
An example will help illustrate this relationship between computer
language and problem solution. Several years ago a student working in
genetic research was faced with a task involving the analysis of
DNA sequences. The problem could be reduced to a relatively simple
form. The DNA is represented as a vector of N integer values, where N
is very large (on the order of tens of thousands). The problem was to
discover whether any pattern of length M, where M was a fixed and
small constant (say five or ten) is ever repeated in the vector of
values.
The programmer dutifully sat down and wrote a simple and
straightforward FORTRAN program, something like the following:
      DO 10 I = 1, N-M
        DO 10 J = 1, N-M
          FOUND = .TRUE.
          DO 20 K = 1, M
   20       IF (X(I+K-1) .NE. X(J+K-1)) FOUND = .FALSE.
          IF (FOUND) THEN ....
   10 CONTINUE
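For comparison, here is a minimal Python sketch of the same brute-force scan. The function name and the i == j guard (to skip the trivial self-match, which the simplified FORTRAN above does not bother with) are my own additions:

```python
def has_repeated_pattern_naive(x, m):
    """Brute-force version of the FORTRAN program above:
    compare every length-m window against every other one.
    Roughly O(m * n^2) comparisons, hence the ten-to-fifteen
    hour runs on a vector of tens of thousands of elements."""
    n = len(x)
    for i in range(n - m + 1):
        for j in range(n - m + 1):
            if i == j:
                continue  # a window trivially matches itself
            if x[i:i + m] == x[j:j + m]:
                return True
    return False
```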
The student was somewhat disappointed when trial runs indicated his
program would need ten or fifteen hours to complete. He discussed his
problem with a second student, who happened to be proficient in the
programming language APL. The second student said that she would like
to try writing a program for his problem.
The first student was dubious. After all, FORTRAN was known to be one
of the most "efficient" programming languages. It was compiled, after
all, and APL was only interpreted. Therefore it was with a certain
amount of incredulity that he discovered the APL programmer was able
to write an algorithm that worked in a matter of minutes, not hours.
What the APL programmer had done was to rearrange the problem. Rather
than working with a vector of N elements, she reorganized the data
into a matrix with roughly N rows and M columns:
X1      X2      ...  Xm
X2      X3      ...  Xm+1
...
Xn-m    Xn-m+1  ...  Xn-1
Xn-m+1  Xn-m+2  ...  Xn
The APL programmer then sorted this matrix by rows. If any pattern
was repeated, then two adjacent rows in the sorted matrix would have
identical values. It was a trivial matter to check for this condition.
The reason the APL program was faster had nothing to do with the speed
of APL versus FORTRAN, but was simply because the FORTRAN program
employed an algorithm that was O(M x N^2), whereas the sorting
solution used by the APL programmer required approximately O(M x N
log N) operations.
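The sorting trick translates directly into any language with a built-in sort; here is a minimal Python sketch (the function name is mine, not from the book):

```python
def has_repeated_pattern(x, m):
    """Sketch of the APL programmer's approach: build the list of
    all overlapping length-m windows (the rows of the matrix above),
    sort it, and check whether any two adjacent rows are identical.
    Roughly O(m * n log n) instead of the naive O(m * n^2)."""
    rows = [tuple(x[i:i + m]) for i in range(len(x) - m + 1)]
    rows.sort()  # repeated windows become adjacent after sorting
    return any(a == b for a, b in zip(rows, rows[1:]))
```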
The point of this story is not to indicate that APL is in any way a
"better" programming language than FORTRAN, but to ask why it was that
the APL programmer was naturally led to discover a better solution.
The reason, in this case, is that loops are very difficult to write
in APL, whereas sorting is trivial; it is a built-in operator defined
as part of the language. Thus, because the operation is so easy to
perform, good APL programmers tend to look for novel applications for
sorting. It is in this manner that the programming language in which
the solution is to be written directs the programmer's mind to view
the problem in one fashion or another.
I have had a few similar experiences. The first programmer was "penny
wise, pound foolish", and we might make the same mistake if we look
exclusively to linking with C libraries to make Squeak fast at number
crunching.
One significant problem is that this kind of low-level optimized code
is nearly impossible to parallelize. Imagine that we built a great PDA
and added Chuck Moore's neat $1, 60,000 MIPS, 500mW chip
(http://www.colorforth.com/25x.html). Such code wouldn't be able to use
it!
Now you might be thinking that neither pure Squeak nor a mini-APL in
Squeak would be able to run on something like this either. But I think
the work done on MIT's ConcurrentSmalltalk holds the answer: Concurrent
Aggregates
(http://www-csag.ucsd.edu/projects/concert/ca.html). These are parallel
versions of the familiar Collection classes which distribute their
elements among all available processors. This is very different from
proxies for accessing remote objects: the same OID refers to the same
object on each processor, but to a different subset of its elements. So
when you write
you write
x := aConcurrentOrderedCollection select: [ :e | e > 9 ]
and you have 25 processors, each one only goes through its local
elements and you get your result 25 times faster. The result, x, is
also a ConcurrentOrderedCollection and will be similarly speedy when
executing messages sent to it.
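To make the idea concrete, here is a toy Python model of a distributed collection. All class and method names are illustrative, not the actual Concurrent Aggregates API, and the shards are filtered sequentially here where real hardware would process them in parallel:

```python
class ConcurrentOrderedCollection:
    """Toy model of a Concurrent Aggregate: one logical collection
    whose elements are partitioned across several 'processors'
    (shards). The same object identity covers all shards."""

    def __init__(self, elements, n_procs=4):
        self.n_procs = n_procs
        # Round-robin distribution of elements among processors.
        self.shards = [elements[i::n_procs] for i in range(n_procs)]

    def select(self, predicate):
        # Each shard filters only its local elements; on a real
        # multiprocessor the shards would run concurrently, so the
        # wall time is roughly 1/n_procs of a sequential scan.
        kept = [[e for e in shard if predicate(e)]
                for shard in self.shards]
        result = ConcurrentOrderedCollection([], self.n_procs)
        result.shards = kept  # the result stays distributed, too
        return result

    def as_list(self):
        # Gather the elements for inspection.
        return [e for shard in self.shards for e in shard]

x = ConcurrentOrderedCollection(list(range(20)), n_procs=4)
big = x.select(lambda e: e > 9)
```

Note that `select` returns another distributed collection, so further messages sent to the result keep the same speedup, just as described above.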
This isn't perfect, of course. Many times the elements will be
distributed the wrong way and the operation might even be slower than
if executed on a single processor. But it is certainly a powerful tool
and if used well it can let us go where our C and Fortran number
crunching friends could never even imagine.
BTW, back in 1997 I had two programmers work on writing a mini-APL in
Self to make writing a 3D GUI easier, but they weren't quite up to the
task. It is certainly a good idea.
-- Jecel