-----Original Message-----
From: Jecel Assumpcao Jr [mailto:jecel@merlintec.com]
Sent: Friday, May 05, 2006 10:55 AM
To: Ron@USMedRec.com
Subject: Re: FW: [Newbies] A squeak machine?
Ron Teitelbaum wrote on Fri, 5 May 2006 09:16:02 -0400
I got your name from Hans-Martin. I was wondering if you could help Jim (see message below). This question came up on our Squeak beginners list and I thought it would be nice if we could fix Jim up with someone that could help.
OK, though I am not subscribed to the beginners list and the message you forwarded does not include Jim's email address, so this ended up being a private reply to you. Please forward it wherever you think it should go.
From: beginners-bounces@lists.squeakfoundation.org [mailto:beginners-bounces@lists.squeakfoundation.org] On Behalf Of Jim Davis
Sent: Friday, May 05, 2006 4:20 AM
To: beginners@lists.squeakfoundation.org
Subject: [Newbies] A squeak machine?
Hi, I'm just a post-silicon test engineer for a company that is developing an object-oriented, medium-grain parallel processor on a chip. It's quite a bit like an FPGA, but the chip implements ALU state machines and table lookup logic. It has multiply/accumulate (dot product) units, datapath muxes and lots of memory. The kicker is it's a tagged architecture, with 5 bits riding shotgun on the 16 bits of data. That's 8 bits of tag and 2 ready bits for a real-world 32-bit object.
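To make the tagged layout concrete, here is a minimal Python sketch. The message only states that 5 side bits accompany each 16-bit word and that a 32-bit object carries 8 tag bits and 2 ready bits. The split into 4 tag bits plus 1 ready bit per half-word, the bit positions, and all function names below are assumptions for illustration, not the chip's actual format.

```python
# Hypothetical layout: each 21-bit half-word is [ready:1 | tag:4 | data:16].
# Only the totals (16 data + 5 side bits) come from the email; the split
# and the bit order are assumed.
DATA_BITS = 16
TAG_BITS = 4      # assumed share of the 5 side bits
READY_SHIFT = DATA_BITS + TAG_BITS

DATA_MASK = (1 << DATA_BITS) - 1
TAG_MASK = (1 << TAG_BITS) - 1


def pack_half(data, tag, ready):
    """Pack 16 data bits, 4 tag bits and 1 ready bit into a 21-bit integer."""
    if not 0 <= data <= DATA_MASK:
        raise ValueError("data must fit in 16 bits")
    if not 0 <= tag <= TAG_MASK:
        raise ValueError("tag must fit in 4 bits")
    return (int(bool(ready)) << READY_SHIFT) | (tag << DATA_BITS) | data


def unpack_half(word):
    """Split a 21-bit half-word back into (data, tag, ready)."""
    data = word & DATA_MASK
    tag = (word >> DATA_BITS) & TAG_MASK
    ready = bool((word >> READY_SHIFT) & 1)
    return data, tag, ready


def pack_object(value32, tag8, ready_lo, ready_hi):
    """Spread a 32-bit value and an 8-bit tag over two half-words (low, high)."""
    if not 0 <= value32 < (1 << 32):
        raise ValueError("value must fit in 32 bits")
    if not 0 <= tag8 < (1 << 8):
        raise ValueError("tag must fit in 8 bits")
    lo = pack_half(value32 & DATA_MASK, tag8 & TAG_MASK, ready_lo)
    hi = pack_half(value32 >> DATA_BITS, tag8 >> TAG_BITS, ready_hi)
    return lo, hi


def unpack_object(lo, hi):
    """Reassemble (value32, tag8, ready_lo, ready_hi) from two half-words."""
    d_lo, t_lo, r_lo = unpack_half(lo)
    d_hi, t_hi, r_hi = unpack_half(hi)
    return (d_hi << DATA_BITS) | d_lo, (t_hi << TAG_BITS) | t_lo, r_lo, r_hi
```

Under these assumptions, two 21-bit half-words together hold 32 data bits, 8 tag bits and 2 ready bits, which matches the totals given above.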
That sounds very interesting. Some previous Smalltalk computers (including massively parallel ones like the J-Machine) might give you more ideas:
http://www.merlintec.com:8080/hardware/26
I've heard talk in the office that a Harvard/Princeton machine demo hack implementation would be nice, but that seems really 1950s. My first thought was to build a Forth machine, or a bunch of Forth machines, but Squeak seems really attractive and it goes well with the object-oriented mindset of the company and the chip.
I am very familiar with the work Chuck Moore has been doing in highly parallel Forth chips and started my current 16-bit processor project inspired directly by his MISC designs. I found that a very tiny change (a SELF register and segmented memory addressing) made it very suitable for running object-oriented variations of Forth. A second small change (to support blocks) made it into a good Smalltalk processor. Eventually I replaced the 5-bit opcodes with a more Transputer-like bytecode, but the basic features have remained the same:
http://www.merlintec.com:8080/hardware/9
The current bytecodes are very similar to a special version of Squeak created by Anthony Hannan -
http://minnow.cc.gatech.edu/squeak/2119
The idea was that version 4.0 of Squeak was going to be a radical change in which new images would not be compatible with older virtual machines, and this would allow everything to be rethought. Later, however, it was decided that the new bytecodes would not be adopted, since even greater speed improvements would come from compilation techniques like Jitter or Exupery.
The main problem with the current bytecodes is that the only practical implementation in hardware is a microcoded one. I have investigated this and found that it would be possible to do a microcoded machine with just one extra pipeline stage compared to a hardcoded design (unlike JOP, which adds two stages - http://www.jopdesign.com/) and essentially the same performance. But for now my impression is that a hardcoded design with a small local memory would be a better choice.
So, do you think I should burn a couple hundred hours of my own time trying to cook this up?
You should take a look at what has been done before and then decide for yourself.
BTW: this thing runs at 1 GHz and has 256 ALUs, 64 MACs, tons of registers and scratch, and 2x266 MHz 40-bit DDR2 DRAM interfaces with 256 Mwords of store.
That sounds great! My own FPGA designs run in the 50 to 100 MHz range and the plan for the commercial version (using larger FPGAs than the current prototypes) would be for only about 9 processors.
Just how do I implement a parallel or massively pipelined Squeak machine?
Squeak itself is very sequential, unfortunately. You might get away with a three-processor design where one processor runs the applications, another runs BitBlt and the third runs the garbage collector and other system stuff. But that is only if you are interested in running typical Squeak applications. For special software, like StarSqueak or Kedama, there is lots of potential parallelism.
http://minnow.cc.gatech.edu/squeak/2292
http://www.squeakland.org/fun_projects/kedama/kedma_welcome.htm
http://www.is.titech.ac.jp/~ohshima/squeak/kedama/
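The three-processor split mentioned above can be pictured with a small Python sketch. It only models the division of roles (application, BitBlt, garbage collector) with threads and queues. The function names, the message format and the "collect every fourth step" policy are all hypothetical, and Python threads here stand in for separate processors rather than giving real parallelism.

```python
import queue
import threading


def bitblt_worker(requests, log):
    """Stand-in for the display processor: consumes drawing requests in order."""
    while True:
        req = requests.get()
        if req is None:          # sentinel: the application has finished
            break
        log.append(("blit", req))


def gc_worker(requests, log):
    """Stand-in for the system processor: consumes collection requests in order."""
    while True:
        req = requests.get()
        if req is None:
            break
        log.append(("collect", req))


def run_application(n_steps):
    """Sequential application loop that hands off drawing and collection work."""
    blit_q, gc_q = queue.Queue(), queue.Queue()
    blit_log, gc_log = [], []
    workers = [
        threading.Thread(target=bitblt_worker, args=(blit_q, blit_log)),
        threading.Thread(target=gc_worker, args=(gc_q, gc_log)),
    ]
    for w in workers:
        w.start()
    for step in range(n_steps):
        blit_q.put(step)         # every step produces one drawing request
        if step % 4 == 3:        # assumed policy: collect every fourth step
            gc_q.put(step)
    blit_q.put(None)
    gc_q.put(None)
    for w in workers:
        w.join()
    return blit_log, gc_log
```

The application loop itself stays sequential, which mirrors the point that only the display and system work are offloaded in such a design.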
I have given some thought to the problem of implementing a massively parallel Smalltalk at the transistor level, but that is not a project I am currently working on:
http://www.merlintec.com:8080/hardware/19
Good luck! -- Jecel