Hi,
I just compiled a Cog VM and a Stack VM with the latest sources. While the Cog VM works well, the Stack VM has a serious performance problem: tinyBenchmarks gives me 5M bytecodes/s while it should be around 500M... I also see the CPU load double.
Did you try to compile a Stack VM lately? Any idea where to start looking for bugs?
thanks, Esteban
On 15 February 2013 14:49, Esteban Lorenzano estebanlm@gmail.com wrote:
Just tried on my machine... the result is discouraging:
1 tinyBenchmarks '4648460 bytecodes/sec; 337199 sends/sec'
This is not confirmed in the regular svn Cog branch:
1 tinyBenchmarks
'380 669 144 bytecodes/sec; 10 473 620 sends/sec'  Interpreter VM
'371 014 492 bytecodes/sec; 18 512 525 sends/sec'  Stack VM
'656 410 256 bytecodes/sec; 67 802 547 sends/sec'  Cog VM
Nicolas
Hi All,
kudos to Nicolas for posting some useful numbers in that they provide some context, in this case the other VMs running on the same machine. But wrist slaps to all of you for not specifying:
1. which OS
2. what hardware
3. what C compiler was used to compile the VM
Further kudos for indicating what kind of load the machine is under (one has to run benchmarks on a relatively unstressed machine, even if multicore), and, *really usefully*, what a previous version's benchmark score is on the same machine.
Nicolas' results, Cog ~= 6.5x Interpreter and Stack ~= 1.75x Interpreter, are exactly what one should expect for nfib (the sends/sec part of tinyBenchmarks) with the current Cog architecture.
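The context Eliot asks for can be captured with a short shell sketch. None of these commands come from the thread; this is a minimal illustration assuming a macOS or Linux shell, with the usual command names for each platform:

```shell
# Collect benchmark context: OS, hardware, C compiler, and machine load.
echo "== OS =="
uname -srm                               # kernel name, release, architecture

echo "== hardware =="
if [ "$(uname -s)" = "Darwin" ]; then
    sysctl -n machdep.cpu.brand_string   # CPU model on Mac
else
    grep -m1 'model name' /proc/cpuinfo  # CPU model on Linux
fi

echo "== C compiler =="
cc --version 2>/dev/null | head -n 1     # whatever 'cc' resolves to

echo "== load =="
uptime                                   # rough background-load indicator
```

Pasting that output alongside a tinyBenchmarks score makes the numbers comparable across machines.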
2013/2/19 Eliot Miranda eliot.miranda@gmail.com:
Hi All,
kudos to Nicolas for posting some useful numbers in that they provide some context, in this case the other VMs running on the same machine. But wrist slaps to all of you for not specifying:
- which OS
- what hardware
- what C compiler was used to compile the VM
Sure, the above numbers were not really meaningful, apart from the ratios... Mac OS X (Mac OS 10.6.8, Intel), Mac Mini 2.26 GHz Intel Core 2 Duo, compiler: gcc 4.2.1 (Apple Inc. build 5666) (dot 3)
As for the load, I don't know how to provide a synthetic measurement, but it's low... The most annoying piece is Time Machine and its disk access; I sometimes forget to suspend it, but it was off during the tinyBenchmarks run.
Nicolas
On Tue, Feb 19, 2013 at 2:02 PM, Nicolas Cellier nicolas.cellier.aka.nice@gmail.com wrote:
As for the load, I don't know how to provide a synthetic measurement, but it's low...
uptime is fine on Mac & Linux. Don't know about Windows.
The most annoying piece is Time Machine and its disk access; I sometimes forget to suspend it, but it was off during the tinyBenchmarks run.
One simple approach is to run the benchmark three times and to discard the best and the worst results.
-- best, Eliot
One simple approach is to run the benchmark three times and to discard the best and the worst results.
That is as good as taking just the first one... if you want decent results, measure >30 times and do the only scientifically correct thing: report avg + std deviation.
Too much work? use http://www.squeaksource.com/SMark.html
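The avg + std deviation Camillo recommends is easy to compute with awk. A sketch over a handful of made-up scores (in practice you would feed in the >30 real tinyBenchmarks numbers, one per line):

```shell
# Aggregate repeated benchmark scores into mean and standard deviation.
# The five scores below are hypothetical stand-ins for real runs.
printf '%s\n' 4648460 4650120 4643900 4652210 4647015 | awk '
    { sum += $1; sumsq += $1 * $1; n++ }
    END {
        mean = sum / n
        sd = sqrt(sumsq / n - mean * mean)   # population std deviation
        printf "n=%d mean=%.0f sd=%.0f\n", n, mean, sd
    }'
# → n=5 mean=4648341 sd=2815
```

A small sd relative to the mean tells you the runs are stable; a large one points at background noise or a systematic error source.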
On Tue, Feb 19, 2013 at 2:16 PM, Camillo Bruni camillobruni@gmail.com wrote:
That is as good as taking just the first one... if you want decent results, measure >30 times and do the only scientifically correct thing: report avg + std deviation.
If the benchmark takes very little time to run and you're trying to avoid background effects then your approach won't necessarily work either.
On 2013-02-20, at 01:25, Eliot Miranda eliot.miranda@gmail.com wrote:
If the benchmark takes very little time to run and you're trying to avoid background effects then your approach won't necessarily work either.
True, but the deviation will most probably give you exactly that feedback: if you increase the runs but the quality of the result doesn't improve, you know that you're dealing with some systematic error source.
This approach is simply more scientific and less home-brewed.
On Tue, Feb 19, 2013 at 11:10 PM, Camillo Bruni camillobruni@gmail.com wrote:
true, but the deviation will most probably give you exactly that feedback. if you increase the runs but the quality of the result doesn't improve you know that you're dealing with some systematic error source.
This approach is simply more scientific and less home-brewed.
Of course, no argument here. But what's being discussed is using tinyBenchmarks as a quick smoke test. A proper CI system can be set up for reliable results, but IMO for a quick smoke test doing three runs manually is fine. IME, what tends to happen is that the first run is slow (caches heating up etc.) and the second two runs are extremely close.
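Discarding the best and the worst of three runs, as suggested above, is just taking the median; a one-line sketch with illustrative scores standing in for three tinyBenchmarks runs:

```shell
# Median of three benchmark scores (i.e. discard best and worst).
printf '%s\n' 4648460 4650120 4643900 | sort -n | sed -n '2p'
# → 4648460
```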
On 20 February 2013 18:29, Eliot Miranda eliot.miranda@gmail.com wrote:
But not in the case where you have an order of magnitude (or more) of speed degradation. That is too significant to be considered measurement error or deviation. There must be something wrong with the VM (cache always failing?).
Ok, following with this. What I can add to the discussion:
On Linux, the latest VMs yield the following results (I added a space every three digits just to enhance readability):
"Pharo Cog" 1 tinyBenchmarks '887 348 353 bytecodes/sec; 141 150 557 sends/sec'
"Pharo Stack" 1 tinyBenchmarks '445 217 391 bytecodes/sec; 24 395 999 sends/sec'
While on Mac:
"Pharo Cog" 1 tinyBenchmarks '895 104 895 bytecodes/sec; 138 102 772 sends/sec'
"Pharo Stack" 1 tinyBenchmarks '3 319 502 bytecodes/sec; 217 939 sends/sec'
So, I'd say it's a problem in the CMake configuration, or just the compilation on Mac :). Though I didn't test on Windows.
Another thing I noticed is that when compiling my VM on Mac, since I updated Xcode, I was no longer using GNU gcc but the LLVM one. I tried to go back to using GNU gcc but couldn't make it work so far.
So, after digging a bit I've got some results and conclusions:
- The CMake configurations were using gcc to compile, which on Mac is llvm-gcc
- Xcode uses the clang compiler, not gcc (and some different compile flags also)
- I've played with changing the configuration to use the clang compiler
setGlobalOptions: maker
	super setGlobalOptions: maker.
	maker set: 'CMAKE_C_COMPILER' to: 'clang'.
	maker set: 'CMAKE_CXX_COMPILER' to: 'clang'.
And to make it work as it does with gcc, I also added the following (in some plugins, such as the mp3 plugin, there are functions with a return type but return statements with no value specified).
compilerFlagsRelease
	^super compilerFlagsRelease, #( '-Wno-return-type' )
And it compiled with the following results in the tinyBenchmarks:
'510723192 bytecodes/sec; -142407 sends/sec'
Which is, in the bytecode part, pretty close to what we expect, while the sends figure looks buggy :). But the overall performance using the image is far better.
Cheers, Guille
Hey!
(nice to meet at ESUG btw)
On 08/15/2013 08:06 PM, Guillermo Polito wrote:
So, after digging a bit I've got some results and conclusions:
- The CMake configurations were using gcc to compile, which in mac is llvm-gcc
- Xcode uses clang compiler, not gcc (and some different compiling flags also)
- I've played changing the configuration to use clang compiler
Mmmmm... this is all slightly confusing, so many combos of compilers here (I am a Mac n00b). This is what I have (or more?):
gcc --version
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 5.0 (clang-500.2.76) (based on LLVM 3.3svn)
Target: x86_64-apple-darwin12.5.0
Thread model: posix

gcc-4.2 --version
i686-apple-darwin11-gcc-4.2.1 (GCC) 4.2.1 (Apple Inc. build 5666) (dot 3)

llvm-gcc --version
i686-apple-darwin11-llvm-gcc-4.2 (GCC) 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2336.11.00)

llvm-gcc-4.2 --version
i686-apple-darwin11-llvm-gcc-4.2 (GCC) 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2336.11.00)

clang --version
Apple LLVM version 5.0 (clang-500.2.75) (based on LLVM 3.3svn)
Target: x86_64-apple-darwin12.5.0
Thread model: posix
(I removed some copyright notices from the above)
setGlobalOptions: maker
	super setGlobalOptions: maker.
	maker set: 'CMAKE_C_COMPILER' to: 'clang'.
	maker set: 'CMAKE_CXX_COMPILER' to: 'clang'.
AFAICT "clang" is the same as "gcc", no? See my printouts above. The only difference seems to be the added prefix/include-dir config.
And to make it work as in gcc I added the following also (in some plugins such as mp3plugin there are functions with return type and return statements with no values specified).
compilerFlagsRelease
	^super compilerFlagsRelease, #( '-Wno-return-type' )
Aaaah!! Perfect. I just went through this yesterday and also failed at the mpeg3plugin.
Ok, I will try to get the build to use gcc-4.2 (the non-LLVM gcc) and compare it to the clang (= gcc) VM.
regards, Göran
PS. I am on Mountain Lion and have Xcode 5 installed + CLI tools + brew apple-gcc4.2.
Hi Göran,
On Thu, Sep 26, 2013 at 12:21 AM, Göran Krampe goran@krampe.se wrote:
AFAICT "clang" is the same as "gcc", no? See my printouts above. The only difference seems to be the added prefix/include-dir config.
No; very different. I'm not an expert, but I think essentially clang is Apple's C compiler that uses the LLVM backend, and gcc is good old gcc. Right now Cog doesn't run if compiled with clang; only gcc will do. No time to debug this right now, and annoyingly clang compiles all static functions with a non-standard calling convention, which means one can't call these functions in gdb, hence lots of debugging functions aren't available without either a) turning off the optimization or b) changing the VM source so they're not static. I prefer a). If anyone knows of a flag to do this *please* let me know asap.
Hi Eliot
On 26.09.2013, at 09:26, Eliot Miranda eliot.miranda@gmail.com wrote:
No; very different. I'm not an expert but I think essentially clang is Apple's C compiler that uses the LLVM backend, and gcc is good old gcc.
Not on a default Mac Xcode installation. See my OSX 10.8 + Xcode 4.x installation:
$ ls -al $(which gcc)
lrwxr-xr-x  1 root  wheel  12 24 Apr 16:42 /usr/bin/gcc -> llvm-gcc-4.2
But with Xcode 5, no gcc ships (be it a pure gcc-4.2 or an llvm-backed gcc); gcc is linked to clang.
It is a problem of default naming. From Xcode 5 on, if you don't change anything, "gcc" will get you "clang".
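Given that naming game, a quick sanity check is to ask the `gcc` binary itself what it is. A hedged sketch (the version strings it greps for match the printouts earlier in the thread, but other vendor builds may phrase things differently):

```shell
# Is 'gcc' a real GNU gcc, an llvm-gcc, or clang in disguise?
if ! command -v gcc >/dev/null 2>&1; then
    echo "no gcc on PATH"
else
    ls -l "$(command -v gcc)"    # shows the symlink target, if any
    if gcc --version 2>/dev/null | grep -qi clang; then
        echo "gcc is actually clang"
    elif gcc --version 2>/dev/null | grep -qi llvm; then
        echo "gcc is llvm-gcc"
    else
        echo "gcc looks like a real GNU gcc"
    fi
fi
```

Running this before building the VM tells you up front which of the three compiler families you are really using.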
Right now Cog doesn't run if compiled with clang. Only gcc will do. No time to debug this right now, and annoyingly clang compiles all static functions with a non-standard calling convention, which means one can't call these functions in gdb, hence lots of debugging functions aren't available without either a) turning off the optimization or b) changing the VM source so they're not static.
You might want to try lldb, which ships with Xcode and is based on the llvm/clang tool chain. I am not implying it is better than gdb, but maybe it can help in your situation?
Best -Tobias
Hi Tobias,
On Thu, Sep 26, 2013 at 12:56 AM, Tobias Pape Das.Linux@gmx.de wrote:
I think you miss my point, which is that the clang compiler is very different from gcc (it uses LLVM for its code generator). That Apple calls clang "gcc" is neither here nor there. If you get a real gcc it will compile a functional VM. If you get a clang-based compiler it won't. Do you agree?
You might want to try lldb, which ships with Xcode and is based on the llvm/clang tool chain. I am not implying it is better than gdb, but maybe it can help in your situation?
Thanks, that sounds promising!
Hi Eliot
On 26.09.2013, at 10:06, Eliot Miranda eliot.miranda@gmail.com wrote:
I think you miss my point, which is that the clang compiler is very different (it uses LLVM for its code generator) than gcc.
I got that point, but I was under the impression Göran wanted to make a different one.
That Apple calls clang "gcc" is neither here nor there. If you get a real gcc it will compile a functional VM. If you get a clang-based compiler it won't. Do you agree?
Yes, I did not want to argue that point :).
But what about the two-headed hydra, llvm-gcc (gcc frontend with LLVM code-gen)? Since Xcode 4, Apple by default does _not_ ship a "normal" gcc but only an llvm-based one, and with Xcode 5, even that is gone. My point was not about code-gen but about compiler availability ;) However, yours seems more important ATM.
Keep in mind that lldb is to gdb what clang is to gcc, not only on the technical level but also on the "philosophical" one, as in "emulate the interface of gcc/gdb, but not quite…". Be prepared for surprises, good ones and bad ones.
Best -Tobias
Hey!
My mail filters were bogged up, so I missed this discussion, sorry. Let me clarify some things:
First, I wrote that "clang" is essentially the same as "gcc", but what I *MEANT* by that is that, given the output from those two commands on Mountain Lion, they BOTH invoke llvm-gcc.
On 09/26/2013 10:35 AM, Tobias Pape wrote:
No; very different. I'm not an expert but I think essentially clang is Apple's C compiler that uses the LLVM backend, and gcc is good old gcc.
Not on a default Mac Xcode installation. See my OSX 10.8 + Xcode 4.x installation:
$ ls -al $(which gcc)
lrwxr-xr-x 1 root wheel 12 24 Apr 16:42 /usr/bin/gcc -> llvm-gcc-4.2
But with Xcode 5, no gcc (be it a pure gcc-4.2 or a llvm backed gcc) ships. gcc is linked to clang.
It is a problem of default naming. From Xcode 5 on, if you don't change a thing, "gcc" will get you "clang".
And Tobias explained exactly what I meant - "gcc" and "clang" resolve to the SAME compiler under Mountain Lion.
I think you miss my point, which is that the clang compiler is very different (it uses LLVM for its code generator) than gcc.
I got that point, but I was under the impression, Göran wanted to make a different one.
Yes, thanks! :)
That apple calls clang gcc is neither here-nor-there. If you get a real gcc it will compile a functional VM. If you get a clang-based compiler it won't. Do you agree?
Yes, I did not want to argue that point :).
No, I don't agree! :) Because current PharoVM *DOES* compile and run using clang! Which is quite cool btw.
But what is with the two-headed hydra, llvm-gcc (gcc frontend with llvm code-gen)? Since Xcode 4, apple by default does _not_ ship a "normal" gcc but only a llvm-based one, and with Xcode 5, even that is gone. My point was not about code-gen but compiler-availability ;) However, yours seem more important ATM.
Right now Cog doesn't run if compiled with clang. Only gcc will do.
Nope, it compiles and runs :). Performance seems to be the same as the Pharo VM that the Pharo guys build with GCC (not sure, but I think they use 4.2).
I am now working on a build using GCC 4.9 - not through it yet, but almost.
For some interesting silly benchmarks:
Stock binary pharo-vm (presumably built with GCC 4.2?): 133 million sends, 800 million bytecodes.
Built with clang from Xcode 5: 121 million sends, 790 million bytecodes.
Binary 2776 from Eliot: 114 million sends, but 980 million bytecodes.
3-year-old OpenQwaq Cog VM compiled with the Intel compiler: 138 million sends and 1000 million bytecodes.
Hehe. So... will be interesting to see how GCC 4.9 fares in all this - but this is nice - we can compile using clang!
regards, Göran
Hey!
I am struggling on with building VMs on Mac.
On 10/01/2013 03:11 PM, Göran Krampe wrote:
I am now working on a build using GCC 4.9 - not through it yet, but almost.
Done, but unfortunately it gives "Bus error: 10" when I start it. Will see if I can get more info from that, I will also test older GCCs.
For some interesting silly benchmarks:
Stock binary pharo-vm (presume built with GCC 4.2???): 133 million sends, 800 mill bytecodes.
Built with clang from Xcode 5: 121 million sends, 790 mill bytecodes.
We all know how trustworthy tinyBenchmarks is of course, but still, this is interesting. Bytecodes for the above clang compiled VM seem to land on around 820 mill no matter what I do.
Now I tried this VM some more, starting it up fresh 3 times and running tinyBenchmark a bunch of times giving me these runs:
1: 131, 137, 138, 138
2: 126, 128, 128, 128, 128, 127
3: 123, 138, 136, 138, 138
First run seems often a bit slow as Eliot explained. I have no idea why it "got stuck" on 128 on run #2. :)
BUT... hey! Most often around 138 million sends, and this is clang!
Binary 2776 from Eliot: 114 mill sends, but 980 mill bytecodes.
Now I ran this VM several times more and can conclude that it seems to be very similar in performance as the clang VM BUT... with slightly more bytecodes performance, around 980-1000.
And what I find more interesting: It matters what image you use!
I ran using the Pharo 2.0 image that got sucked down in the VM building process, and I also ran using an older Squeak 4.1 based OpenQwaq image, and although they both have the same bytecodes in the benchFib method, the Pharo *image* is consistently slower on sends and also more variable.
I am not delving into that, but a little "heads-up" here. :)
My personal conclusion: clang is a true option now
regards, Göran
ok, I forgot that
On Feb 19, 2013, at 7:50 PM, Eliot Miranda eliot.miranda@gmail.com wrote:
Hi All,
kudos to Nicolas for posting some useful numbers in that they provide some context, in this case the other VMs running on the same machine. But wrist slaps to all of you for not specifying:
- which OS
osx 10.8
- what hardware
i7 8gb
- what C compiler was used to compile the VM
gcc 4.6.3
Further kudos for indicating what kind of load the machine is under (one has to run benchmarks on a relatively unstressed machine, even if multicore), and, *really usefully*, what a previous version's benchmark score is on the same machine.
No matter the load. It is a comparative analysis: stack vm pre-merge with latest: 500m sends; after the merge: 5m sends... since I did not change my mac for an iPad, the sends cannot be right. Also... 10% passive cpu usage is wrong, no matter the machine load.
Nicolas' results, Cog ~= 6.5x Interpreter, Stack ~= 1.75x Interpreter are exactly what one should expect for nfib (the sends/sec part of tinyBenchmarks) with the current Cog architecture.
yep... so where is the stack vm made with latest sources? (it is not in http://www.mirandabanda.org/files/Cog/VM/VM.r2678/)
On Sun, Feb 17, 2013 at 9:21 AM, Nicolas Cellier nicolas.cellier.aka.nice@gmail.com wrote:
This is not confirmed in regular svn cog branch
1 tinyBenchmarks '380 669 144 bytecodes/sec; 10 473 620 sends/sec' Interpreter VM '371 014 492 bytecodes/sec; 18 512 525 sends/sec' Stack VM '656 410 256 bytecodes/sec; 67 802 547 sends/sec' Cog VM
Nicolas
2013/2/15 Igor Stasenko siguctua@gmail.com:
On 15 February 2013 14:49, Esteban Lorenzano estebanlm@gmail.com wrote:
Hi,
I just compiled a cog vm and a stack vm with latest sources. While the cog works well, the stack vm has a serious performance drawback: tinyBenchmarks is giving me 5m bytecodes/s, while it should be around 500m... also I see the cpu load doubles.
Did you try to compile a stack vm lately? Any idea where to start looking for bugs?
just tried on my machine.. the results is discouraging:
1 tinyBenchmarks '4648460 bytecodes/sec; 337199 sends/sec'
thanks, Esteban
-- Best regards, Igor Stasenko.
-- best, Eliot
vm-dev@lists.squeakfoundation.org