Jecel Assumpcao Jr jecel@merlintec.com wrote:
By running the benchmarks for the "green book" and doing a lot of rough extrapolations, my guess is that the Dorado would get between 200K and 400K bytecodes/sec.
That is pretty much what I remember as the claim for Dorados.
That is better than what I got running Squeak 1.16 on a 33MHz 486 machine which was some 13 years newer, but far below what I expected. My impression was that the old 20MHz ECL computer was able to reach a peak of one bytecode per clock which would indicate a number around three or four times better.
I was under the impression that the Dorado was a 70ns cycle machine, i.e. 14MHz or so. And many bytecodes would take more than one cycle, of course.
After looking at a bunch of numbers my conclusion is that Squeak is usable on machines capable of at least 20M bps. If my guess above is correct, that would be around 60 Dorados.
My Iyonix does about 35mbc/s on the dumb-bytecodes test and about 20 Dorado in the greenbook tests - with a tiny cache (don't imagine Dorado had much of one either!) the longer prims etc suffer relative to a monster watt-sucker like a pentium.
My first ARM system was 4 mips (no cache at all, not even instruction pre-fetch) with 4Mb of slow ram. It scored 27% Dorado but was thoroughly usable as a UI. It could scan and lay out nicely formatted text in pretty fonts faster than the contemporaneous PCs running Aldus (for example) could. The ARM 3 upgrade gave 10mips and 127% on the same motherboard/ram. The Iyonix is 600MHz with larger caches and fast memory but only scores 15 times faster, an indication of how much there is still to get out of a simple interpreter.
If both estimates are true, then I wonder if our definition of "acceptable" has changed or if Squeak has become less efficient. Certainly Morphic is always being blamed for slowing everything down, so the latter is probably the case.
I suspect it is largely the rather poor UI responsiveness that is the problem. In MVC on my machine the UI flies, menus are instant, browsers bang open etc. In morphic everything is
really
slowwwwwwwwww
Some of it is simply that morphic is sloshing large bitmaps around with high colour depths. Some of it is probably some dumb algorithmic error that nobody has spotted yet.
tim
--
Tim Rowledge, tim@sumeru.stanford.edu, http://sumeru.stanford.edu/tim
Useful random insult:- Useful as a hip pocket on a T-shirt.
I run Squeak on a 1.3 million bytecodes/sec machine, and in MVC it's usable.
Not fast, usable.
cheers
bruce
Tim Rowledge tim@rowledge.org wrote:
Date: Wed, 27 Apr 2005 19:51:39 -0700
From: Tim Rowledge tim@rowledge.org
Subject: Re: Dorado bytecodes per second
To: squeak-dev@lists.squeakfoundation.org
Reply-To: The general-purpose Squeak developers list squeak-dev@lists.squeakfoundation.org
Tim Rowledge wrote on Wed, 27 Apr 2005 19:51:39 -0700
I was under the impression that the Dorado was a 70ns cycle machine, i.e. 14MHz or so. And many bytecodes would take more than one cycle, of course.
Hmmm.... 70ns is what the green book and the SOAR paper say. I found the 50ns number (with the actual clock of 25ns) in
http://www.bitsavers.org/pdf/xerox/dorado/doradoHardwMan.pdf
My calculations have so many errors, of course, that this doesn't really make any difference.
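For anyone who wants to check the clock figures being thrown around, here is a rough Python sketch that turns a cycle time in nanoseconds into MHz. The function name is made up for this illustration; the 70ns, 50ns and 25ns inputs are the figures mentioned above.

```python
def cycle_ns_to_mhz(cycle_ns: float) -> float:
    """Convert a cycle time in nanoseconds to a clock rate in MHz."""
    if cycle_ns <= 0:
        raise ValueError("cycle time must be positive")
    # 1 MHz corresponds to a 1000 ns period, so MHz = 1000 / ns.
    return 1000.0 / cycle_ns


# Cycle times quoted in this thread: green book / SOAR paper (70 ns),
# hardware manual (50 ns), and the underlying clock (25 ns).
for ns in (70, 50, 25):
    print(f"{ns} ns cycle -> {cycle_ns_to_mhz(ns):.1f} MHz")
```

So 70ns is the "14MHz or so" figure and 50ns is the 20MHz one. Neither says how many cycles a bytecode actually takes.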
My Iyonix does about 35mbc/s on the dumb-bytecodes test and about 20 Dorado in the greenbook tests - with a tiny cache (don't imagine Dorado had much of one either!) the longer prims etc suffer relative to a monster watt-sucker like a pentium.
The Dorado's cache was 8KB, though the generous number of registers and stack as well as the microcode memory helped this go much further than it normally would.
35/20 = 1.75mbc/s (which is what I expected, not what I had calculated and you confirmed was the quoted number) and Bruce said in another message in this thread that something in this range makes Squeak MVC usable.
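The division above can be written out as a small Python sketch. The function name is hypothetical, and note the assumption baked in: the 35 mbc/s figure comes from the dumb-bytecodes test while the 20 Dorado score comes from the green book tests, so this mixes two different benchmarks and is only a rough indication.

```python
def bytecodes_per_dorado(machine_bc_per_sec: float, dorado_score: float) -> float:
    """Implied bytecodes/sec of one 'Dorado', given a machine's bytecode
    rate and its score expressed in multiples of a Dorado."""
    if dorado_score <= 0:
        raise ValueError("Dorado score must be positive")
    return machine_bc_per_sec / dorado_score


# Iyonix figures from the thread: about 35 mbc/s and about 20 Dorados.
implied = bytecodes_per_dorado(35e6, 20)
print(f"implied Dorado rate: {implied / 1e6:.2f} mbc/s")

# The earlier extrapolation from the green book was 200K to 400K bytecodes/sec.
low, high = 200e3, 400e3
print("within the 200K-400K estimate:", low <= implied <= high)
```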
I had a lot of trouble calculating speed relative to the Dorado from the green book numbers since they only include total time and many of the tests seem to use a different number of repetitions in the version available on Squeak Map (which crashes in 3.8 on the scan text test). Is there some easier way to get this number?
My first ARM system was 4 mips (no cache at all, not even instruction pre-fetch) with 4Mb of slow ram. It scored 27% Dorado but was thoroughly usable as a UI. It could scan and lay out nicely formatted text in pretty fonts faster than the contemporaneous PCs running Aldus (for example) could. The ARM 3 upgrade gave 10mips and 127% on the same motherboard/ram. The Iyonix is 600MHz with larger caches and fast memory but only scores 15 times faster, an indication of how much there is still to get out of a simple interpreter.
(600MHz / 8MHz) * 0.27 Dorados = 20.25 Dorados. That seems about right. Wasn't the ARM3 around 36MHz or something like that?
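The same back-of-the-envelope scaling as a Python sketch, assuming the score grows linearly with clock rate and nothing else (no cache or memory effects, which is of course not how real machines behave). The function name is invented for illustration, and the 8MHz clock for the first ARM system is the figure used in the line above.

```python
def scale_score_by_clock(base_score: float, base_mhz: float, target_mhz: float) -> float:
    """Scale a benchmark score linearly with clock frequency.

    Purely an assumption for rough estimates: caches, memory speed and
    architectural changes are ignored.
    """
    if base_mhz <= 0 or target_mhz <= 0:
        raise ValueError("clock rates must be positive")
    return base_score * (target_mhz / base_mhz)


# First ARM system: 0.27 Dorados at 8 MHz; Iyonix: 600 MHz.
estimate = scale_score_by_clock(0.27, 8, 600)
print(f"estimated Iyonix score: {estimate:.2f} Dorados")
```

That lands close to the "about 20 Dorado" measured on the Iyonix, which is all the check is meant to show.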
The last Smalltalk computer I built which is fully working (hardware and simple demo software, not Smalltalk) was an ARM2 one. The idea was to use the adaptive compilation stuff from Self to make it usable. http://www.merlintec.com/lsi/merlin4.html
I suspect it is largely the rather poor UI responsiveness that is the problem. In MVC on my machine the UI flies, menus are instant, browsers bang open etc. In morphic everything is
really
slowwwwwwwwww
Some of it is simply that morphic is sloshing large bitmaps around with high colour depths. Some of it is probably some dumb algorithmic error that nobody has spotted yet.
I normally use Squeak side by side on two machines: a 54 mbc/s Pentium III and an 18 mbc/s UltraSparc. They feel practically the same to me running Morphic applications. But the other day when I tried the HTMLTableMorph thing in Scamper the difference was unbelievable! One felt about 20 times faster than the other, instead of the three times faster that it actually is. So it is obvious that simple benchmarks are only loosely related to user experience. But since that is the only information that is practical for me to gather, I will have to make the most of it.
For the case of manipulating bitmaps we can always have a coprocessor help out to fix the problem. While it would be great if some hotspot could be found and eliminated from Morphic, it is far more likely that the overhead is distributed all over the place. On the other hand, I don't think an MVC-only Squeak machine is practical for general use.
Thanks for the feedback, -- Jecel