Hi,
I just compiled a Cog VM and a Stack VM with the latest sources. While the Cog VM works well, the Stack VM has a serious performance problem: tinyBenchmarks gives me 5M bytecodes/s while it should be around 500M... I also see the CPU load double.
Did you try to compile a Stack VM lately? Any idea where to start looking for bugs?
thanks, Esteban
On 15 February 2013 14:49, Esteban Lorenzano estebanlm@gmail.com wrote:
Just tried on my machine... the result is discouraging:
1 tinyBenchmarks '4648460 bytecodes/sec; 337199 sends/sec'
This is not confirmed in the regular svn Cog branch:
1 tinyBenchmarks
'380 669 144 bytecodes/sec; 10 473 620 sends/sec'  Interpreter VM
'371 014 492 bytecodes/sec; 18 512 525 sends/sec'  Stack VM
'656 410 256 bytecodes/sec; 67 802 547 sends/sec'  Cog VM
Nicolas
Hi All,
kudos to Nicolas for posting some useful numbers in that they provide some context, in this case the other VMs running on the same machine. But wrist slaps to all of you for not specifying:
1. which OS
2. what hardware
3. what C compiler was used to compile the VM
Further kudos for indicating what kind of load the machine is under (one has to run benchmarks on a relatively unstressed machine, even if multicore), and, *really usefully*, what a previous version's benchmark score is on the same machine.
Nicolas' results, Cog ~= 6.5x Interpreter and Stack ~= 1.75x Interpreter, are exactly what one should expect for nfib (the sends/sec part of tinyBenchmarks) with the current Cog architecture.
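The context Eliot asks for can be captured with a short shell sketch. None of these commands come from the thread; this is a minimal illustration assuming a macOS or Linux shell, with the usual command names for each platform:

```shell
# Collect benchmark context: OS, hardware, C compiler, and machine load.
echo "== OS =="
uname -srm                               # kernel name, release, architecture

echo "== hardware =="
if [ "$(uname -s)" = "Darwin" ]; then
    sysctl -n machdep.cpu.brand_string   # CPU model on Mac
else
    grep -m1 'model name' /proc/cpuinfo  # CPU model on Linux
fi

echo "== C compiler =="
cc --version 2>/dev/null | head -n 1     # whatever 'cc' resolves to

echo "== load =="
uptime                                   # rough background-load indicator
```

Pasting that output alongside a tinyBenchmarks score makes the numbers comparable across machines.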
2013/2/19 Eliot Miranda eliot.miranda@gmail.com:
Hi All,
kudos to Nicolas for posting some useful numbers in that they provide some context, in this case the other VMs running on the same machine. But wrist slaps to all of you for not specifying:
- which OS
- what hardware
- what C compiler was used to compile the VM
Sure, the above numbers were not really meaningful, apart from the ratios... Mac OS X (Mac OS 10.6.8, Intel), Mac Mini 2.26 GHz Intel Core 2 Duo, compiler: gcc 4.2.1 (Apple Inc. build 5666) (dot 3)
As for the load, I don't know how to provide a synthetic measurement, but it's low... The most annoying piece is Time Machine and its disk access; I sometimes forget to suspend it, but it was off during the tinyBenchmarks run.
Nicolas
On Tue, Feb 19, 2013 at 2:02 PM, Nicolas Cellier nicolas.cellier.aka.nice@gmail.com wrote:
As for the load, I don't know how to provide a synthetic measurement, but it's low...
uptime is fine on Mac & Linux. Don't know about Windows.
The most annoying piece is Time Machine and its disk access; I sometimes forget to suspend it, but it was off during the tinyBenchmarks run.
One simple approach is to run the benchmark three times and to discard the best and the worst results.
-- best, Eliot
One simple approach is to run the benchmark three times and to discard the best and the worst results.
That is as good as taking just the first one... if you want decent results, measure >30 times and do the only scientifically correct thing: report avg + std deviation.
Too much work? use http://www.squeaksource.com/SMark.html
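The avg + std deviation Camillo recommends is easy to compute with awk. A sketch over a handful of made-up scores (in practice you would feed in the >30 real tinyBenchmarks numbers, one per line):

```shell
# Aggregate repeated benchmark scores into mean and standard deviation.
# The five scores below are hypothetical stand-ins for real runs.
printf '%s\n' 4648460 4650120 4643900 4652210 4647015 | awk '
    { sum += $1; sumsq += $1 * $1; n++ }
    END {
        mean = sum / n
        sd = sqrt(sumsq / n - mean * mean)   # population std deviation
        printf "n=%d mean=%.0f sd=%.0f\n", n, mean, sd
    }'
# → n=5 mean=4648341 sd=2815
```

A small sd relative to the mean tells you the runs are stable; a large one points at background noise or a systematic error source.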
On Tue, Feb 19, 2013 at 2:16 PM, Camillo Bruni camillobruni@gmail.com wrote:
That is as good as taking just the first one... if you want decent results, measure >30 times and do the only scientifically correct thing: report avg + std deviation.
If the benchmark takes very little time to run and you're trying to avoid background effects then your approach won't necessarily work either.
On 2013-02-20, at 01:25, Eliot Miranda eliot.miranda@gmail.com wrote:
If the benchmark takes very little time to run and you're trying to avoid background effects then your approach won't necessarily work either.
True, but the deviation will most probably give you exactly that feedback: if you increase the runs but the quality of the result doesn't improve, you know that you're dealing with some systematic error source.
This approach is simply more scientific and less home-brewed.
On Tue, Feb 19, 2013 at 11:10 PM, Camillo Bruni camillobruni@gmail.com wrote:
true, but the deviation will most probably give you exactly that feedback. if you increase the runs but the quality of the result doesn't improve you know that you're dealing with some systematic error source.
This approach is simply more scientific and less home-brewed.
Of course, no argument here. But what's being discussed is using tinyBenchmarks as a quick smoke test. A proper CI system can be set up for reliable results, but IMO for a quick smoke test doing three runs manually is fine. IME, what tends to happen is that the first run is slow (caches heating up etc.) and the second two runs are extremely close.
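Discarding the best and the worst of three runs, as suggested above, is just taking the median; a one-line sketch with illustrative scores standing in for three tinyBenchmarks runs:

```shell
# Median of three benchmark scores (i.e. discard best and worst).
printf '%s\n' 4648460 4650120 4643900 | sort -n | sed -n '2p'
# → 4648460
```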
On 20 February 2013 18:29, Eliot Miranda eliot.miranda@gmail.com wrote:
But not in the case where you have an order of magnitude (or more) of speed degradation. That is too significant to be considered measurement error or deviation. There must be something wrong with the VM (cache always failing?).
Ok, following with this. What I can add to the discussion:
On Linux, the latest VMs yield the following results (I added a space every three digits just to enhance readability):
"Pharo Cog" 1 tinyBenchmarks '887 348 353 bytecodes/sec; 141 150 557 sends/sec'
"Pharo Stack" 1 tinyBenchmarks '445 217 391 bytecodes/sec; 24 395 999 sends/sec'
While on Mac:
"Pharo Cog" 1 tinyBenchmarks '895 104 895 bytecodes/sec; 138 102 772 sends/sec'
"Pharo Stack" 1 tinyBenchmarks '3 319 502 bytecodes/sec; 217 939 sends/sec'
So, I'd say it's a problem in the CMake configuration, or just the compilation on Mac :). Though I didn't test on Windows.
Another thing I noticed is that when compiling my VM on Mac, since I updated Xcode, I was no longer using GNU gcc but the LLVM one. I tried to go back to using GNU gcc but couldn't make it work so far.
So, after digging a bit I've got some results and conclusions:
- The CMake configurations were using gcc to compile, which on Mac is llvm-gcc
- Xcode uses the clang compiler, not gcc (and some different compile flags also)
- I've played with changing the configuration to use the clang compiler
setGlobalOptions: maker
	super setGlobalOptions: maker.
	maker set: 'CMAKE_C_COMPILER' to: 'clang'.
	maker set: 'CMAKE_CXX_COMPILER' to: 'clang'.
And to make it work as it does with gcc, I also added the following (in some plugins, such as the mp3 plugin, there are functions with a return type but return statements with no value specified).
compilerFlagsRelease
	^super compilerFlagsRelease, #( '-Wno-return-type' )
And it compiled with the following results in the tinyBenchmarks:
'510723192 bytecodes/sec; -142407 sends/sec'
Which is, in the bytecode part, pretty close to what we expect, while the sends figure looks buggy :). But the overall performance using the image is far better.
Cheers, Guille
Hey!
(nice to meet at ESUG btw)
On 08/15/2013 08:06 PM, Guillermo Polito wrote:
So, after digging a bit I've got some results and conclusions:
- The CMake configurations were using gcc to compile, which in mac is llvm-gcc
- Xcode uses clang compiler, not gcc (and some different compiling flags also)
- I've played changing the configuration to use clang compiler
Mmmmm... this is all slightly confusing, so many combos of compilers here (I am a Mac n00b). This is what I have (or more?):
gcc --version
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 5.0 (clang-500.2.76) (based on LLVM 3.3svn)
Target: x86_64-apple-darwin12.5.0
Thread model: posix

gcc-4.2 --version
i686-apple-darwin11-gcc-4.2.1 (GCC) 4.2.1 (Apple Inc. build 5666) (dot 3)

llvm-gcc --version
i686-apple-darwin11-llvm-gcc-4.2 (GCC) 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2336.11.00)

llvm-gcc-4.2 --version
i686-apple-darwin11-llvm-gcc-4.2 (GCC) 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2336.11.00)

clang --version
Apple LLVM version 5.0 (clang-500.2.75) (based on LLVM 3.3svn)
Target: x86_64-apple-darwin12.5.0
Thread model: posix
(I removed some copyright notices from the above)
setGlobalOptions: maker
	super setGlobalOptions: maker.
	maker set: 'CMAKE_C_COMPILER' to: 'clang'.
	maker set: 'CMAKE_CXX_COMPILER' to: 'clang'.
AFAICT "clang" is the same as "gcc", no? See my printouts above. The only difference seems to be the added prefix/include-dir config.
And to make it work as in gcc I added the following also (in some plugins such as mp3plugin there are functions with return type and return statements with no values specified).
compilerFlagsRelease
	^super compilerFlagsRelease, #( '-Wno-return-type' )
Aaaah!! Perfect. I just went through this yesterday and also failed at the mpeg3plugin.
Ok, I will try to get the build to use gcc-4.2 (the non-LLVM gcc) and compare it to the clang (= gcc) VM.
regards, Göran
PS. I am on Mountain Lion and have Xcode 5 installed + CLI tools + brew apple-gcc4.2.
Hi Göran,
On Thu, Sep 26, 2013 at 12:21 AM, Göran Krampe goran@krampe.se wrote:
AFAICT "clang" is the same as "gcc", no? See my printouts above. The only difference seems to be the added prefix/include-dir config.
No; very different. I'm not an expert, but I think essentially clang is Apple's C compiler that uses the LLVM backend, and gcc is good old gcc. Right now Cog doesn't run if compiled with clang; only gcc will do. No time to debug this right now, and annoyingly clang compiles all static functions with a non-standard calling convention, which means one can't call these functions in gdb, hence lots of debugging functions aren't available without either a) turning off the optimization or b) changing the VM source so they're not static. I prefer a). If anyone knows of a flag to do this *please* let me know asap.
Hi Eliot
On 26.09.2013, at 09:26, Eliot Miranda eliot.miranda@gmail.com wrote:
No; very different. I'm not an expert but I think essentially clang is Apple's C compiler that uses the LLVM backend, and gcc is good old gcc.
Not on a default Mac Xcode installation. See my OSX 10.8 + Xcode 4.x installation:
$ ls -al $(which gcc)
lrwxr-xr-x  1 root  wheel  12 24 Apr 16:42 /usr/bin/gcc -> llvm-gcc-4.2
But with Xcode 5, no gcc ships (be it a pure gcc-4.2 or an llvm-backed gcc); gcc is linked to clang.
It is a problem of default naming. From Xcode 5 on, if you don't change anything, "gcc" will get you "clang".
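Given that naming game, a quick sanity check is to ask the `gcc` binary itself what it is. A hedged sketch (the version strings it greps for match the printouts earlier in the thread, but other vendor builds may phrase things differently):

```shell
# Is 'gcc' a real GNU gcc, an llvm-gcc, or clang in disguise?
if ! command -v gcc >/dev/null 2>&1; then
    echo "no gcc on PATH"
else
    ls -l "$(command -v gcc)"    # shows the symlink target, if any
    if gcc --version 2>/dev/null | grep -qi clang; then
        echo "gcc is actually clang"
    elif gcc --version 2>/dev/null | grep -qi llvm; then
        echo "gcc is llvm-gcc"
    else
        echo "gcc looks like a real GNU gcc"
    fi
fi
```

Running this before building the VM tells you up front which of the three compiler families you are really using.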
Right now Cog doesn't run if compiled with clang. Only gcc will do. No time to debug this right now, and annoyingly clang compiles all static functions with a non-standard calling convention, which means one can't call these functions in gdb, hence lots of debugging functions aren't available without either a) turning off the optimization or b) changing the VM source so they're not static.
You might want to try lldb, which ships with Xcode and is based on the llvm/clang tool chain. I am not implying it is better than gdb, but maybe it can help in your situation?
Best -Tobias
Hi Tobias,
On Thu, Sep 26, 2013 at 12:56 AM, Tobias Pape Das.Linux@gmx.de wrote:
I think you miss my point, which is that the clang compiler is very different from gcc (it uses LLVM for its code generator). That Apple calls clang "gcc" is neither here nor there. If you get a real gcc it will compile a functional VM. If you get a clang-based compiler it won't. Do you agree?
You might want to try lldb, which ships with Xcode and is based on the llvm/clang tool chain. I am not implying it is better than gdb, but maybe it can help in your situation?
Thanks, that sounds promising!
Hi Eliot
On 26.09.2013, at 10:06, Eliot Miranda eliot.miranda@gmail.com wrote:
I think you miss my point, which is that the clang compiler is very different (it uses LLVM for its code generator) than gcc.
I got that point, but I was under the impression Göran wanted to make a different one.
That Apple calls clang "gcc" is neither here nor there. If you get a real gcc it will compile a functional VM. If you get a clang-based compiler it won't. Do you agree?
Yes, I did not want to argue that point :).
But what about the two-headed hydra, llvm-gcc (gcc frontend with LLVM code-gen)? Since Xcode 4, Apple by default does _not_ ship a "normal" gcc but only an llvm-based one, and with Xcode 5, even that is gone. My point was not about code-gen but about compiler availability ;) However, yours seems more important ATM.
Keep in mind that lldb is to gdb what clang is to gcc, not only on the technical level but also on the "philosophical" one, as in "emulate the interface of gcc/gdb, but not quite…". Be prepared for surprises, good ones and bad ones.
Best -Tobias
Hey!
My mail filters were bogged up, so I missed this discussion, sorry. Let me clarify some things:
First, I wrote that "clang" is essentially the same as "gcc", but what I *MEANT* by that is that, given the output from those two commands on Mountain Lion, they BOTH invoke llvm-gcc.
On 09/26/2013 10:35 AM, Tobias Pape wrote:
No; very different. I'm not an expert but I think essentially clang is Apple's C compiler that uses the LLVM backend, and gcc is good old gcc.
Not on a default Mac Xcode installation. See my OSX 10.8 + Xcode 4.x installation:
$ ls -al $(which gcc)
lrwxr-xr-x 1 root wheel 12 24 Apr 16:42 /usr/bin/gcc -> llvm-gcc-4.2
But with Xcode 5, no gcc (be it a pure gcc-4.2 or a llvm backed gcc) ships. gcc is linked to clang.
It is a problem of default naming. From Xcode 5 on, if you don't change a thing, "gcc" will get you "clang".
And Tobias explained exactly what I meant - "gcc" and "clang" resolve to the SAME compiler under Mountain Lion.
I think you miss my point, which is that the clang compiler is very different (it uses LLVM for its code generator) than gcc.
I got that point, but I was under the impression, Göran wanted to make a different one.
Yes, thanks! :)
That apple calls clang gcc is neither here-nor-there. If you get a real gcc it will compile a functional VM. If you get a clang-based compiler it won't. Do you agree?
Yes, I did not want to argue that point :).
No, I don't agree! :) Because current PharoVM *DOES* compile and run using clang! Which is quite cool btw.
But what is with the two-headed hydra, llvm-gcc (gcc frontend with llvm code-gen)? Since Xcode 4, apple by default does _not_ ship a "normal" gcc but only a llvm-based one, and with Xcode 5, even that is gone. My point was not about code-gen but compiler-availability ;) However, yours seem more important ATM.
Right now Cog doesn't run if compiled with clang. Only gcc will do.
Nope, it compiles and runs :). Performance seems to be the same as the Pharo VM that the Pharo guys build with GCC (not sure, but I think they use 4.2).
I am now working on a build using GCC 4.9 - not through it yet, but almost.
For some interesting silly benchmarks:
Stock binary pharo-vm (presumably built with GCC 4.2?): 133 million sends, 800 million bytecodes.
Built with clang from Xcode 5: 121 million sends, 790 million bytecodes.
Binary 2776 from Eliot: 114 million sends, but 980 million bytecodes.
3-year-old OpenQwaq Cog VM compiled with the Intel compiler: 138 million sends and 1000 million bytecodes.
Hehe. So... will be interesting to see how GCC 4.9 fares in all this - but this is nice - we can compile using clang!
regards, Göran
Hey!
I am struggling on with building VMs on Mac.
On 10/01/2013 03:11 PM, Göran Krampe wrote:
I am now working on a build using GCC 4.9 - not through it yet, but almost.
Done, but unfortunately it gives "Bus error: 10" when I start it. Will see if I can get more info from that, I will also test older GCCs.
For some interesting silly benchmarks:
Stock binary pharo-vm (presume built with GCC 4.2???): 133 million sends, 800 mill bytecodes.
Built with clang from Xcode 5: 121 million sends, 790 mill bytecodes.
We all know how trustworthy tinyBenchmarks is of course, but still, this is interesting. Bytecodes for the above clang compiled VM seem to land on around 820 mill no matter what I do.
Now I tried this VM some more, starting it up fresh 3 times and running tinyBenchmark a bunch of times giving me these runs:
1: 131, 137, 138, 138
2: 126, 128, 128, 128, 128, 127
3: 123, 138, 136, 138, 138
First run seems often a bit slow as Eliot explained. I have no idea why it "got stuck" on 128 on run #2. :)
BUT... hey! Most often around 138 million sends, and this is clang!
Binary 2776 from Eliot: 114 mill sends, but 980 mill bytecodes.
Now I ran this VM several times more and can conclude that it seems to be very similar in performance as the clang VM BUT... with slightly more bytecodes performance, around 980-1000.
And what I find more interesting: It matters what image you use!
I ran using the Pharo 2.0 image that got sucked down in the VM building process, and I also ran using an older Squeak 4.1 based OpenQwaq image, and although they both have the same bytecodes in the benchFib method, the Pharo *image* is consistently slower on sends and also more variable.
I am not delving into that, but a little "heads-up" here. :)
My personal conclusion: clang is a true option now
regards, Göran
ok, I forgot that
On Feb 19, 2013, at 7:50 PM, Eliot Miranda eliot.miranda@gmail.com wrote:
Hi All,
kudos to Nicolas for posting some useful numbers in that they provide some context, in this case the other VMs running on the same machine. But wrist slaps to all of you for not specifying:
- which OS
osx 10.8
- what hardware
i7 8gb
- what C compiler was used to compile the VM
gcc 4.6.3
Further kudos for indicating what kind of load the machine is under (one has to run benchmarks on a relatively unstressed machine, even if multicore), and, *really usefully*, what a previous version's benchmark score is on the same machine.
No matter the load. It is a comparative analysis: stack vm pre-merge with latest: 500m sends; after the merge: 5m sends... since I did not change my mac for an iPad, the sends cannot be right. Also... 10% passive cpu usage is wrong, no matter the machine load.
Nicolas' results, Cog ~= 6.5x Interpreter, Stack ~= 1.75x Interpreter are exactly what one should expect for nfib (the sends/sec part of tinyBenchmarks) with the current Cog architecture.
yep... so where is the stack vm made with latest sources? (it is not in http://www.mirandabanda.org/files/Cog/VM/VM.r2678/)
On Sun, Feb 17, 2013 at 9:21 AM, Nicolas Cellier nicolas.cellier.aka.nice@gmail.com wrote:
This is not confirmed in regular svn cog branch
1 tinyBenchmarks '380 669 144 bytecodes/sec; 10 473 620 sends/sec' Interpreter VM '371 014 492 bytecodes/sec; 18 512 525 sends/sec' Stack VM '656 410 256 bytecodes/sec; 67 802 547 sends/sec' Cog VM
Nicolas
2013/2/15 Igor Stasenko siguctua@gmail.com:
On 15 February 2013 14:49, Esteban Lorenzano estebanlm@gmail.com wrote:
Hi,
I just compiled a cog vm and a stack vm with latest sources. While the cog works well, the stack vm has a serious performance drawback: tinyBenchmarks is giving me 5m bytecodes/s, while it should be around 500m... also I see the cpu load doubles.
Did you try to compile a stack vm lately? Any idea where to start looking for bugs?
just tried on my machine.. the results is discouraging:
1 tinyBenchmarks '4648460 bytecodes/sec; 337199 sends/sec'
thanks, Esteban
-- Best regards, Igor Stasenko.
-- best, Eliot
vm-dev@lists.squeakfoundation.org