squeak pages and new architecture (was: oo hardware)

Jecel Assumpcao Jr jecel at merlintec.com
Thu Mar 27 00:46:20 UTC 2003


In order not to bother people here with lots of details they may not 
care about, I have updated http://www.merlintec.com:8080/hardware to 
include more current information about what I am doing.

That way the list can focus on more exciting stuff like licenses ;-)

One thing that is totally missing on the page I mentioned above is 
information about a massively parallel, non-von Neumann computer for 
Smalltalk that I call "RNA". Rather than trying to explain such a 
different idea with boring text on a swiki page, I think it would be 
better to do a proper presentation in Squeak and will try to do so this 
weekend.

Looking at the tools available for this I came across the SqueakPages. 
They were introduced in Squeak 2.3 and I seemed to vaguely remember 
something about them being obsolete, but searching through the web I 
didn't find anything. I noticed that SqueakLand uses Projects 
exclusively and that the browser plug-in can't load .sp files. It can 
load .sqo (Squeak object) files and I am wondering if there is any 
relation.

On a related note, I will soon be replacing my server (with a screaming 
233MHz Pentium II :-) and want to move the static pages from Apache to 
Comanche (which has been doing such a good job with the swiki side of 
the site). People have been complaining for years that my site is ugly 
and boring, so it would be nice to replace HTML with Squeak content. I 
am trying to figure out the best way to do that (while still making it 
usable to people without the plug-in, like search engines)...

On Thursday 20 March 2003 21:40, Alan Kay wrote:
> I would only add one thing. At PARC we estimated that we could get
> about a factor of 5 from special low level (HW+firmware) design.  If
> Moore's Law is doubling every 18 months, then this is almost 4 years
> of being ahead if you can compete with ordinary silicon (your factor
> of 8 would be 3 turns of Moore's Law, or about 4.5 to 5 years). The
> Alto was a big success because it took Chuck Thacker and two
> technicians only about 3.5 *months* to make the first machine, and it
> only took another month to move the first Smalltalk over from the
> NOVA to the Alto. So we were off and running.

I used to be that fast. Now it takes me a week to write a three 
paragraph email :-(

Perhaps a vacation would help...

> If we believe Chuck's estimate that we've lost about a factor of a
> thousand in efficiency from the poor design choices (in many
> dimensions) of Intel and Motorola (and Chuck is a very conservative
> estimator), then this is 10 doublings lost, or about 180 months, or
> about *15 years* for Moore's Law to catch up to a really good scheme.
> This is a good argument for trying very different architectures that
> allow a direct target to be highly efficient VHLL *system* execution.

Speaking of "Chuck" and "Moore", the MISC Forth chips implemented in 0.8 
micron technology ran up to 700MHz while the Intel chips built using 
the same technology were limited to 66MHz.

Now Chuck wants to run at 2.4GHz in 0.13 microns while Intel is already 
past that. On the other hand, he can fit 25 processors in just 7 square 
mm so the difference in MIPS compared to a Pentium IV is still 
interesting.

But a factor of 1000 seems a bit high.
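
The arithmetic behind the quoted estimates can be checked with a short
sketch. It assumes, as the quote does, one doubling every 18 months; the
function name years_ahead is made up for this illustration.

```python
import math

DOUBLING_MONTHS = 18  # assumption taken from the quoted text


def years_ahead(speedup, doubling_months=DOUBLING_MONTHS):
    """Years of Moore's Law growth equivalent to a given speedup factor."""
    doublings = math.log2(speedup)
    return doublings * doubling_months / 12.0


# The three factors mentioned in the quoted message.
for factor in (5, 8, 1000):
    print(f"factor {factor}: {math.log2(factor):.2f} doublings, "
          f"{years_ahead(factor):.1f} years")
```

Under that assumption a factor of 8 is exactly 3 doublings (4.5 years),
a factor of 5 is about 3.5 years, and a factor of 1000 is just under 10
doublings, or roughly 15 years.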

> A small group approaching this should try to do everything with
> modern CAD and avoid getting messed up with intricate packaging
> problems of many different types. So I would look at one of the
> modern processes that allows CPU and memory logic to be on the same
> die and try to make what is essentially an entire machine on that die
> (especially with regard to how processing, memories and switching
> relate). Just how the various parallelisms trade off these days and
> what is on and off chip would be interesting to explore. A good
> motivator from some years ago (so it would be done a little
> differently today) is Henry Fuchs's "pixel planes" architecture for
> making a renderer as a smart memory system that has a processor for
> each pixel. Such a system can have a slower clock and still beat
> the pants off a faster single processor von Neumann type
> architecture.

That was a SIMD architecture, like a Connection Machine optimized for 
graphics. Pixel Planes was really interesting and the design is 
described in detail in chapter 9 section 5 (pages 448 to 480) of

 Principles of CMOS VLSI Design - A Systems Perspective
 Neil Weste and Kamran Eshraghian
 1985, Addison-Wesley Publishing Company, ISBN 0-201-08222-5

This is very much the kind of thing I am trying to do with RNA (Ring 
Network Architecture). I have decided to publish all the details so 
that anybody who is interested can work on it. That will take a little 
while, as I wrote above, but here are some of its features:

- the "basic block" has 17 transistors and stores 3 bits
- a number of basic blocks (32, for example) and some control logic form 
a "cell"
- each cell has unidirectional connections to two neighbors, so that the 
whole machine forms a giant ring (hence the name)
- there is no clock
- an extra hypercube-like network can help messages skip most of the 
ring in order to reduce latency
- objects are stored in a set of cells and interact with messages sent 
to them (there are no processors or memory - just cells)
- primitive blocks are spread out through the ring and handle special 
messages to do things like integer addition
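
Why an extra network helps with latency can be seen in a rough sketch.
The actual RNA topology is not described here, so everything below is an
assumption: the ring has a power-of-two number of cells, and each cell
also gets hypothetical forward shortcuts of 2, 4, 8, ... cells. The
function names are invented for the illustration.

```python
def ring_hops(src, dst, n):
    """Hops on a unidirectional ring of n cells (messages only move forward)."""
    return (dst - src) % n


def skip_hops(src, dst, n):
    """Hops when each cell also has forward shortcuts of 2, 4, 8, ... cells.

    Greedy routing: always take the longest link that does not overshoot.
    Assumes n is a power of two (an assumption of this sketch, not of RNA).
    """
    remaining = (dst - src) % n
    hops = 0
    while remaining:
        # Largest power of two that is <= remaining distance.
        step = 1 << (remaining.bit_length() - 1)
        remaining -= step
        hops += 1
    return hops


n = 1024
worst_ring = max(ring_hops(0, d, n) for d in range(n))
worst_skip = max(skip_hops(0, d, n) for d in range(n))
print(worst_ring, worst_skip)  # prints "1023 10"
```

With 1024 cells the worst case drops from 1023 hops on the plain ring to
10 hops with the shortcuts, which is the kind of saving a hypercube-like
network is meant to give.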

It isn't possible to build such a machine unless you can fit at least 
some 200 million transistors on a single chip. And it isn't practical 
to simulate this with FPGAs.
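
For a sense of scale, here is a back-of-the-envelope sketch using the
numbers above (17 transistors and 3 bits per basic block, 32 blocks per
cell, 200 million transistors). The size of the per-cell control logic
is not given, so control_overhead is a hypothetical parameter; leaving
it at 0 gives only an upper bound on the number of cells.

```python
TRANSISTORS_PER_BLOCK = 17
BITS_PER_BLOCK = 3
BLOCKS_PER_CELL = 32            # "32, for example"
CHIP_TRANSISTORS = 200_000_000


def cells_on_chip(chip=CHIP_TRANSISTORS, blocks=BLOCKS_PER_CELL,
                  control_overhead=0):
    """Upper bound on cells per chip.

    control_overhead is a hypothetical per-cell transistor count for the
    control logic; its real value is not stated in the text.
    """
    per_cell = blocks * TRANSISTORS_PER_BLOCK + control_overhead
    return chip // per_cell


def storage_bits(cells, blocks=BLOCKS_PER_CELL):
    """Total bits held by the basic blocks of the given number of cells."""
    return cells * blocks * BITS_PER_BLOCK


cells = cells_on_chip()
print(cells, storage_bits(cells))  # prints "367647 35294112"
```

Ignoring control logic, that is at most about 367 thousand cells holding
about 35 million bits in total; any real control overhead lowers both
figures.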

More next week,
-- Jecel


