[Vm-dev] Re: [squeak-dev] RoarVM: The Manycore SqueakVM

Igor Stasenko siguctua at gmail.com
Thu Nov 4 19:07:31 UTC 2010


On 4 November 2010 20:07, Bert Freudenberg <bert at freudenbergs.de> wrote:
>
> On 03.11.2010, at 14:13, Stefan Marr wrote:
>
>> A small teaser:
>>  1 core   66286897 bytecodes/sec;  2910474 sends/sec
>>  8 cores 470588235 bytecodes/sec; 19825677 sends/sec
>
> I tried your precompiled OS X VM and the Sly3 image.
>
> 1 core:  93,910,491 bytecodes/sec; 4,056,440 sends/sec
> 2 cores: 91,559,370 bytecodes/sec; 4,007,927 sends/sec
> 3 cores: can't start
> 4 cores: 90,844,570 bytecodes/sec; 3,935,516 sends/sec
> 5 cores: can't start
> 6 cores: can't start
> 7 cores: can't start
> 8 cores: 89,698,668 bytecodes/sec; 3,910,787 sends/sec
>
> So it looks like you have to use a power-of-two cores?
>
> And the benchmark invocation should be different if you want to actually use multiple cores. What's the magic incantation?
>
> I tried something myself:
>
> n := 16.
> q := SharedQueue new.
> time := Time millisecondsToRun:
>        [n timesRepeat: [[q nextPut: [30 benchFib] timeToRun] fork].
>        n timesRepeat: [Transcript space; show: q next]].
> Transcript space; show: time; cr
>
> 1 core:  664 664 665 666 667 662 664 664 668 665 667 665 666 669 666 10700
> 2 cores: 675 674 672 669 677 669 669 672 678 670 668 669 674 668 668 5425
> 4 cores: 721 726 729 740 713 728 740 734 731 737 721 737 734 756 788 749 3030
> 8 cores: 786 807 837 847 865 872 916 840 800 873 792 880 846 865 829 1820
>
> Now that scales pretty nicely :) The overhead is about 25% at 8 cores, 12% for 4 cores.
>
I don't like this tendency. At this rate the overhead would be 50% for 16 cores, and 100% for 32 :)
That doesn't sound like 'designed for manycore systems'.
But I suspect it's because the code you are running doesn't take the new VM
capabilities into account.
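
To put numbers on the trend, here is a rough sketch in Python (not Smalltalk, just for the arithmetic) that takes the total run times from Bert's table above and works out speedup and parallel efficiency relative to the 1-core run. The function name `scaling` and the dictionary layout are only illustrative; the millisecond totals are copied from the quoted results.

```python
# Total wall-clock times in ms for the 16 forked benchFib jobs,
# copied from the last column of Bert's results quoted above.
total_ms = {1: 10700, 2: 5425, 4: 3030, 8: 1820}

def scaling(total_ms):
    """Return {cores: (speedup, efficiency)} relative to the 1-core run.

    speedup    = T(1) / T(n)
    efficiency = speedup / n   (1.0 would be perfect linear scaling)
    """
    base = total_ms[1]
    return {n: (base / t, base / (t * n)) for n, t in total_ms.items()}

for cores, (speedup, eff) in sorted(scaling(total_ms).items()):
    print(f"{cores} cores: speedup {speedup:.2f}x, efficiency {eff:.0%}")
```

On these figures the 8-core run comes out at about 5.9x speedup, i.e. roughly 73% efficiency, and the 4-core run at about 3.5x, roughly 88%. Whether the loss keeps growing at the same rate beyond 8 cores is a guess, since there are no measurements for 16 or 32 cores.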

> For our regular interpreter (*) I get:
> 1 core: 162 159 157 158 158 160 159 159 159 159 159 158 160 158 159 2585
>
> So RoarVM is about 4 times slower in sends, even more so for bytecodes. It needs 8 cores to be faster than the regular interpreter on a single core. So the good news is that it can beat the old interpreter :)  But why is it so much slower than the normal interpreter?
>

I would not care much about single-core performance for now. Once
you have hundreds of cores at your disposal,
you can even run things at a lower clock rate, because it doesn't really
matter anymore.

> Btw, user interrupt didn't work on the Mac.
>
> And in the Squeak-4.1 image, when running on 2 or more cores Morphic gets incredibly sluggish, pretty much unusably so.
>
Not a surprise. An image and code that are not aware of the new VM
capabilities usually gain nothing, and even lose compared to the
'standard' VM.

Hydra VM was able to run multiple interpreters in a single process
space, and the overhead of this was a 5-10% performance degradation.
But given that you can run N interpreters in parallel, such a slowdown
can be neglected.
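
As a back-of-the-envelope illustration (Python, and a deliberately idealized model): if each interpreter loses a fixed fraction of its speed and the N interpreters do not interfere with each other at all, the combined throughput relative to one unmodified VM is N * (1 - degradation). The function name `aggregate_throughput` is made up for this sketch, and the no-contention assumption is mine, not a measured property of Hydra.

```python
def aggregate_throughput(n_interpreters, degradation):
    """Combined throughput relative to a single unmodified VM.

    Assumes (idealized) that every interpreter runs fully independently,
    each slowed down by the same fraction `degradation` (e.g. 0.10 for 10%).
    """
    return n_interpreters * (1.0 - degradation)

# Worst case from the 5-10% range above, with only two interpreters:
print(aggregate_throughput(2, 0.10))
```

Under that assumption, even two interpreters at the 10% end of the range already give about 1.8 times the work of a single VM, so the per-interpreter slowdown is paid back as soon as N is 2.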


> - Bert -
>
> (*) For comparison, a regular interpreter (not Cog) on this machine gets
>    789,514,263 bytecodes/sec; 17,199,374 sends/sec
> and Cog does
>    880,481,513 bytecodes/sec; 70,113,306 sends/sec
>
>



-- 
Best regards,
Igor Stasenko AKA sig.

