Jecel Assumpcao Jr wrote:
I am back from a nice vacation (my first since 1982...) and see that you have all been waiting patiently for me to return before starting any interesting discussions ;-)
Ok, now that you're back, I can finally ask the questions which have puzzled me ever since :-) When implementing Smalltalk hardware, what kind of object memory design would you prefer:
- object table vs. direct pointer
- ref counting vs. generation scavenging vs. -whatever-
- page-based virtual memory vs. object-based VM vs. none at all
- in-band tags vs. out-of-band tags
Let me explain the last one. Smalltalk implementations have typically had in-band tags, i.e. the tags only have meaning in a word which is known to be an oop. The Burroughs machines had out-of-band tags, that is, every memory word had an additional tag. That way, you could even know whether a memory word contained uninterpreted bytes, a float value, or an "oop" (descriptor).
One interesting side effect of this trip is that I now have a new FPGA development kit - the ML401 from Xilinx.
Just out of curiosity: how much does one pay for such a kit? I don't know whether my wife would approve of a major investment into something of little perceived value :-)
Cheers, Hans-Martin
"Just out of curiosity: how much does one pay for such a kit? I don't
know whether my wife would approve of a major investment into something of little perceived value."
Hans-Martin,
Marriage is based on "give and take" -- sometimes one must give excuses, while at other times it's necessary to take chances.
<G>
The Xilinx ML401 runs about US$500.
Gary
----- Original Message -----
From: "Hans-Martin Mosner" hmm@heeg.de
To: hardware@lists.squeakfoundation.org
Sent: Thursday, August 02, 2007 2:29 AM
Subject: Re: [Hardware] language neutral processors
Hans-Martin Mosner wrote on Thu, 02 Aug 2007 08:29:54 +0200
When implementing Smalltalk hardware, what kind of object memory design would you prefer:
- object table vs. direct pointer
The great thing about the Mushroom-style virtually addressed caches (which Mario Wolczko is still trying to get Sun to adopt today) is that you get the flexibility of the object table and the performance of direct pointers. It is quite simple: you present to the cache an address with two parts, the object ID and the offset, and if there is a hit you get back the data directly. Only when there is a cache miss do you have to look up in some tables to translate the object ID into a physical address to be able to load the data into the cache.
Normally virtually addressed caches are considered a bad thing even though they are faster than physically addressed caches. But that is due to aliasing problems (two different cache lines might have copies of the same data) which make life difficult for C. Since Smalltalk doesn't have pointer math we don't have to worry about that.
http://www.wolczko.com/mushroom/index.html http://research.sun.com/projects/dashboard.php?id=13
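To make the lookup described above concrete, here is a minimal sketch of a cache tagged by (object ID, offset). It is only an illustration, not the Mushroom design itself: the names ObjectCache and LINE_WORDS, the line size, and the dictionary-based object table are all assumptions, and writes are left out.

```python
LINE_WORDS = 4  # words per cache line (an assumed value)


class ObjectCache:
    """Toy cache whose tags are (object ID, line index), not physical addresses."""

    def __init__(self, object_table, memory):
        self.object_table = object_table  # object ID -> physical base address
        self.memory = memory              # flat list standing in for RAM
        self.lines = {}                   # (oid, line index) -> list of words
        self.misses = 0

    def read(self, oid, offset):
        key = (oid, offset // LINE_WORDS)
        line = self.lines.get(key)
        if line is None:
            # Miss: only now is the object table consulted to find the
            # physical address of the line to load.
            self.misses += 1
            base = self.object_table[oid] + key[1] * LINE_WORDS
            line = self.memory[base:base + LINE_WORDS]
            self.lines[key] = line
        # Hit path: the data comes straight from the cache.
        return line[offset % LINE_WORDS]


memory = list(range(100, 164))
table = {7: 8, 9: 32}          # hypothetical object IDs and base addresses
cache = ObjectCache(table, memory)
first = cache.read(7, 1)       # miss, goes through the object table
second = cache.read(7, 2)      # hit, no table lookup
```

In this sketch a hit never touches the object table, which is where the direct-pointer speed would come from, while the table is still there for relocating objects.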
- ref counting vs. generation scavenging vs. -whatever-
In general I like whatever works in a distributed context. For a single workstation generation scavenging still seems to be the state of the art. Though these surveys are a bit old, they are still my main reference for garbage collection:
http://www.cs.utexas.edu/users/oops/papers.html#bigsurv (but see paper 12 as well) http://citeseer.ist.psu.edu/228113.html
- page-based virtual memory vs. object-based VM vs. none at all
Object-based is nicer. I like not only what the Mushroom guys did but am also a fan of LOOM and even OOZE. That said, my current design swaps whole groups of objects (perhaps I should call them islands?) from/to disk instead of individual objects. This is based on the fact that the observations that led to the design of the Amoeba operating system (size of memory vs size of files and speed of networks) are even more true today.
- in-band tags vs. out-of-band tags
Let me explain the last one. Smalltalk implementations have typically had in-band tags, i.e. the tags only have meaning in a word which is known to be an oop. The Burroughs machines had out-of-band tags, that is, every memory word had an additional tag. That way, you could even know whether a memory word contained uninterpreted bytes, a float value, or an "oop" (descriptor).
My own preference would be what you called out-of-band tags. For example, it is easy to implement a 36 bit processor in a modern FPGA (all internal memories are a multiple of 9 bits, not 8 like earlier chips) using ECC memory cards. The math and addresses would be 32 bits so you would have 4 bits for tags. When dealing with Flash, disks, networks and so on you could always compress/decompress your data and so the mismatch between 36 and 8 bits would not be a problem.
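As a rough illustration of the 36 bit idea, the sketch below splits a word into a 4 bit tag and 32 bits of data, and repacks such words into 8 bit bytes for storage or transport. Placing the tag in the top four bits, the function names, and the plain bit-packing (rather than any real compression) are assumptions made for the example.

```python
TAG_BITS = 4
DATA_BITS = 32
WORD_BITS = TAG_BITS + DATA_BITS          # 36
DATA_MASK = (1 << DATA_BITS) - 1
WORD_MASK = (1 << WORD_BITS) - 1


def make_word(tag, data):
    """Combine a 4 bit tag and 32 bit data (tag assumed to sit in the top bits)."""
    if not 0 <= tag < (1 << TAG_BITS):
        raise ValueError("tag must fit in 4 bits")
    return (tag << DATA_BITS) | (data & DATA_MASK)


def split_word(word):
    """Return (tag, data) for a 36 bit word."""
    return word >> DATA_BITS, word & DATA_MASK


def pack_words(words):
    """Pack 36 bit words into bytes; two words fill exactly nine bytes."""
    bits = 0
    for word in words:
        bits = (bits << WORD_BITS) | (word & WORD_MASK)
    nbits = WORD_BITS * len(words)
    nbytes = (nbits + 7) // 8
    bits <<= nbytes * 8 - nbits           # zero padding at the end
    return bits.to_bytes(nbytes, "big")


def unpack_words(data, count):
    """Recover `count` 36 bit words from bytes produced by pack_words."""
    bits = int.from_bytes(data, "big")
    bits >>= len(data) * 8 - WORD_BITS * count
    return [(bits >> (WORD_BITS * (count - 1 - i))) & WORD_MASK
            for i in range(count)]
```

Since 2 x 36 = 72 = 9 x 8, pairs of words map onto whole bytes with no padding, so the mismatch costs nothing for even-length transfers.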
When using development boards it isn't so easy to do this. The ML401, for example, has 1MB of 36 bit ZBT SRAM but the 64MB of SDRAM is only 32 bits wide. And since my current focus is on making my project more Squeak (and C) friendly, RISC42 is a 32 bit processor. I am not sure that I would call Squeak's 1 bit tag and 31 bit data organization "in-band", however. It certainly is when seen from the viewpoint of the hardware or a few critical spots in the virtual machine, but for the rest of the system it looks as much out-of-band as the Burroughs computers (the first real computer I ever used was a B6700, by the way) ever were.
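For readers who have not looked at the 1 bit tag and 31 bit data layout, here is a small sketch of how a 32 bit word can carry a tagged small integer. It assumes the tag is the low bit and that a set bit marks a SmallInteger, as in the classic Squeak object memory as I understand it; the function names are made up for the example.

```python
MIN_SMALL = -(1 << 30)        # smallest value that fits in 31 signed bits
MAX_SMALL = (1 << 30) - 1     # largest value that fits in 31 signed bits


def tag_small_int(value):
    """Encode a small integer as a 32 bit word with the low bit set."""
    if not MIN_SMALL <= value <= MAX_SMALL:
        raise OverflowError("value does not fit in 31 bits")
    return ((value << 1) | 1) & 0xFFFFFFFF


def is_small_int(word):
    """The tag is only the low bit; everything else is payload."""
    return word & 1 == 1


def untag_small_int(word):
    """Decode a tagged word back to a signed integer."""
    if not is_small_int(word):
        raise ValueError("word is an object pointer, not a small integer")
    value = word >> 1                 # 31 bit unsigned payload
    if value & (1 << 30):             # sign bit of the 31 bit field
        value -= 1 << 31
    return value
```

Only these few functions would ever look at the tag bit; code above this level just sees integers and object references, which is the sense in which the scheme can feel out-of-band.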
Just out of curiosity: how much does one pay for such a kit? I don't know whether my wife would approve of a major investment into something of little perceived value :-)
to which Gary Fisher replied:
The Xilinx ML401 runs about US$500.
That is exactly why I got this one - there is a limit to how much electronics tourists can bring back to Brazil and that happens to be $500 :-) This allowed me to avoid the absurd import taxes (which add up to over 100%) that I paid on previous kits and the $100 Laptop.
A more modern kit like the ML501 costs twice as much but there are far cheaper options. The two leading FPGA companies have high end families (Virtex for Xilinx and Stratix for Altera) and low end ones (Spartan and Cyclone respectively) and there are several smaller players with many options. The Spartan 3E Starter Kit that Travis Kay mentioned in one of the first messages to this list, for example, only costs $149.
http://www.xilinx.com/xlnx/xebiz/designResources/ip_product_details.jsp?iLan...
You can get a good idea of what boards are available at:
http://www.fpga-faq.com/FPGA_Boards.shtml
I find the boards from Xess pretty interesting, though there are some other good companies out there. The problem with these companies is that they don't sell many boards each year, so their prices are rather high if you only look at what components are in each board. In contrast, some kits from the chip manufacturers are being sold below cost.
And David T. Lewis wrote:
Well, I don't have anything worthwhile to add, but I'm certainly interested in your discussions. Hope you don't mind folks just lurking and enjoying the read :)
That is the whole point of this list. Matthew and I were having a private discussion and realised that other people might be interested in it. Some might have something to add while others might just want to read this stuff. And which ones are which will probably change over time. Having a list that is archived will allow people who come in later (perhaps via a search) to read this.
-- Jecel
On Thu, Aug 02, 2007 at 08:29:54AM +0200, Hans-Martin Mosner wrote:
Jecel Assumpcao Jr wrote:
I am back from a nice vacation (my first since 1982...) and see that you have all been waiting patiently for me to return before starting any interesting discussions ;-)
Ok, now that you're back, I can finally ask the questions which have puzzled me ever since :-) When implementing Smalltalk hardware, what kind of object memory design would you prefer:
I will attempt to answer based on my current understanding:
- object table vs. direct pointer
An object table is more flexible, as it maintains a mapping of objects -> memory locations. However, on traditional machines this slows things down, as it doubles the number of memory accesses. It is possible to optimize this in hardware by changing the addressing scheme of the processor and the cache (this cannot be done in software at all). The idea was first tried in the Mushroom computer [1], and has been incorporated into Jecel's Plurion architecture.
What happens is that the processor *only* knows about object hashes/IDs/whatever, and knows nothing of the memory layout. When the processor needs to read from memory, it sends out the object ID and an internal offset on the address lines. If the cache knows the object, the data is returned immediately. If not, a software interrupt is raised, and its handler is expected to populate the cache with the requested data (exactly like how a TLB interrupt works in a virtual paging OS). In the simplest case, this would just mean dereferencing the object table and storing the object in the cache, but it could be more complex. After the interrupt returns, the fetch is retried and the data is found in the cache.
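The trap-and-retry flow can be sketched in a few lines. This is only a model of the simplest case described above, with a Python exception standing in for the software interrupt; the names CacheMiss, Cache, miss_handler and load, and the (base, size) layout of the object table entries, are assumptions for illustration and not taken from Plurion.

```python
class CacheMiss(Exception):
    """Stands in for the software interrupt raised on a cache miss."""

    def __init__(self, oid, offset):
        super().__init__(f"miss on object {oid} at offset {offset}")
        self.oid = oid
        self.offset = offset


class Cache:
    """Hardware side: knows only (object ID, offset) pairs, not memory layout."""

    def __init__(self):
        self.words = {}  # (oid, offset) -> word

    def fetch(self, oid, offset):
        try:
            return self.words[(oid, offset)]
        except KeyError:
            raise CacheMiss(oid, offset) from None


def miss_handler(cache, trap, object_table, memory):
    """Simplest handler: dereference the object table, load the whole object."""
    base, size = object_table[trap.oid]
    for i in range(size):
        cache.words[(trap.oid, i)] = memory[base + i]


def load(cache, oid, offset, object_table, memory):
    """Try the fetch, run the handler on a miss, then retry the fetch."""
    try:
        return cache.fetch(oid, offset)
    except CacheMiss as trap:
        miss_handler(cache, trap, object_table, memory)
        return cache.fetch(oid, offset)  # the retried fetch now hits
```

A more elaborate handler could fetch the object from disk or another machine before filling the cache; the processor-side load would not need to change.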
- ref counting vs. generation scavenging vs. -whatever-
Garbage collection strategies should probably be left up to the OS/language. Maybe you are talking about object data caches? [2]
- page-based virtual memory vs. object-based VM vs. none at all
Plurion uses an object memory, as I stated above. I think I don't understand your question.
- in-band tags vs. out-of-band tags
Let me explain the last one. Smalltalk implementations have typically had in-band tags, i.e. the tags only have meaning in a word which is known to be an oop. The Burroughs machines had out-of-band tags, that is, every memory word had an additional tag. That way, you could even know whether a memory word contained uninterpreted bytes, a float value, or an "oop" (descriptor).
RISC42 uses 32-bit memory with 4 tag bits per word [3]
[1]: Mushroom computer: - http://www.wolczko.com/mushroom/
[2]: An Object-aware memory architecture - http://research.sun.com/techrep/2005/abstract-143.html
[3]: Completely Out-of-date Plurion paper. Page 10 has the tag assignments: - http://www.merlintec.com/download/plurion.pdf