FW: FW: [Newbies] A squeak machine?

Ron Teitelbaum Ron at USMedRec.com
Fri May 5 15:27:08 UTC 2006

-----Original Message-----
From: Jecel Assumpcao Jr [mailto:jecel at merlintec.com] 
Sent: Friday, May 05, 2006 10:55 AM
To: Ron at USMedRec.com
Subject: Re: FW: [Newbies] A squeak machine?

Ron Teitelbaum wrote on Fri, 5 May 2006 09:16:02 -0400
> I got your name from Hans-Martin. I was wondering if you could help Jim (see
> message below)?  This question came up on our Squeak beginners list and I
> thought it would be nice if we could fix Jim up with someone that could
> help.

Ok, though I am not subscribed to the beginners list and the message you
forwarded does not include Jim's email address so this ended up being a
private reply to you. Please forward it where you think it should go.

> From: beginners-bounces at lists.squeakfoundation.org
> [mailto:beginners-bounces at lists.squeakfoundation.org] On Behalf Of Jim
> Sent: Friday, May 05, 2006 4:20 AM
> To: beginners at lists.squeakfoundation.org
> Subject: [Newbies] A squeak machine?
>
> Hi,
>
> I'm just a post silicon test engineer for a company that is developing an
> object oriented medium grain parallel processor on a chip. It's quite a bit
> like an FPGA, but the chip implements ALU state machines and table lookup
> logic. It has multiply / accumulate (dot product) units, datapath muxes and
> lots of memory. The kicker is it's a tagged architecture, with 5 bits riding
> shotgun on the 16 bits of data. That's 8 bits of tag and 2 ready bits for a
> real-world 32 bit object.
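
As a rough check of the bit counts Jim gives (5 sideband bits per 16 data bits, so 8 tag bits plus 2 ready bits per 32 bit object), here is a minimal Python sketch. The split of each 5 bit sideband into 4 tag bits and 1 ready bit, the bit positions, and all the function names are assumptions made for illustration; the message does not describe the real layout.

```python
# Hypothetical model of the tagged halfword described above.
# ASSUMPTION: each 16-bit halfword carries 4 tag bits + 1 ready bit,
# so two halfwords give 8 tag bits + 2 ready bits per 32-bit object.
DATA_BITS = 16
TAG_BITS = 4
READY_BITS = 1
HALFWORD_BITS = DATA_BITS + TAG_BITS + READY_BITS  # 21 bits in total


def pack_halfword(data, tag, ready):
    """Pack one halfword as [ready | tag | data] (assumed bit order)."""
    if not 0 <= data < (1 << DATA_BITS):
        raise ValueError("data must fit in 16 bits")
    if not 0 <= tag < (1 << TAG_BITS):
        raise ValueError("tag must fit in 4 bits")
    if ready not in (0, 1):
        raise ValueError("ready must be 0 or 1")
    return (ready << (DATA_BITS + TAG_BITS)) | (tag << DATA_BITS) | data


def unpack_halfword(word):
    """Return (data, tag, ready) from a packed 21-bit halfword."""
    data = word & ((1 << DATA_BITS) - 1)
    tag = (word >> DATA_BITS) & ((1 << TAG_BITS) - 1)
    ready = (word >> (DATA_BITS + TAG_BITS)) & 1
    return data, tag, ready


def pack_object(value, tag8, ready2):
    """Split a 32-bit value with an 8-bit tag and 2 ready bits into (hi, lo)."""
    lo = pack_halfword(value & 0xFFFF, tag8 & 0xF, ready2 & 1)
    hi = pack_halfword((value >> 16) & 0xFFFF, (tag8 >> 4) & 0xF, (ready2 >> 1) & 1)
    return hi, lo


def unpack_object(hi, lo):
    """Rebuild (value, tag8, ready2) from two packed halfwords."""
    hi_data, hi_tag, hi_ready = unpack_halfword(hi)
    lo_data, lo_tag, lo_ready = unpack_halfword(lo)
    value = (hi_data << 16) | lo_data
    return value, (hi_tag << 4) | lo_tag, (hi_ready << 1) | lo_ready
```

Under these assumptions a 32 bit object occupies 2 x 21 = 42 bits, of which 10 are sideband, which matches the 8 + 2 figure in the message.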

That sounds very interesting. Some previous Smalltalk computers
(including massively parallel ones like the J-Machine) might give you
more ideas:


> I've heard talk in the office that a
> harvard/princeton machine demo hack implementation would be nice, but it
> seems really 1950's. My first thought was to build a Forth machine, or a
> bunch of Forth machines, but Squeak seems really attractive and it goes
> with the object oriented mindset of the company and the chip.

I am very familiar with the work Chuck Moore has been doing in highly
parallel Forth chips and started my current 16 bit processor project
inspired directly by his MISC designs. I found that a very tiny change
(a SELF register and segmented memory addressing) made it very suitable
for running object oriented variations of Forth. A second small change
(to support blocks) made it into a good Smalltalk processor. Eventually
I replaced the 5 bit opcodes with a more Transputer-like bytecode, but
the basic features have remained the same:


The current bytecodes are very similar to a special version of Squeak
created by Anthony Hannan -


The idea was that version 4.0 of Squeak is going to be a radical change
where new images will not be compatible with older virtual machines and
this would allow everything to be rethought. Later, however, it was
decided that the new bytecodes would not be adopted in the future since
even greater speed improvements will come from compilation techniques
like Jitter or Exupery.

The main problem with the current bytecodes is that the only practical
implementation in hardware is a microcoded one. I have investigated this
and found that it would be possible to do a microcoded machine with just
one extra pipeline stage compared to a hardcoded design (unlike JOP,
which adds two stages - http://www.jopdesign.com/) and essentially the
same performance. But for now my impression is that a hardcoded design
with a small local memory would be a better choice.

> So? do you think I should burn a couple hundred hours of my own time
> to cook this up?

You should take a look at what has been done before and then decide for
yourself.

> BTW: this thing runs at 1 GHz and has 256 ALUs, 64 MACs, tons of
> registers and scratch and 2X266 MHz 40 bit DDR2 DRAM interfaces with 256
> Mwords of store.

That sounds great! My own FPGA designs run in the 50 to 100MHz range and
the plan for the commercial version (using larger FPGAs than the current
prototypes) would be for only about 9 processors.

> Just how do I implement a parallel or massively pipelined Squeak machine?

Squeak itself is very sequential, unfortunately. You might get away with
a three processor design where one runs the applications, another runs
BitBlt and the third runs the garbage collector and other system stuff.
But that is only if you are interested in running typical Squeak
applications. For special software, like StarSqueak or Kedama, there is
lots of potential parallelism.


I have given some thought to the problem of implementing a massively
parallel Smalltalk at the transistor level, but that is not a project I
am currently working on:


Good luck!
-- Jecel
