[Vm-dev] Re: [squeak-dev] 64bit VMs some thoughts.
David T. Lewis
lewis at mail.msen.com
Thu Dec 3 22:31:37 UTC 2009
On Wed, Dec 02, 2009 at 05:28:33PM -0800, John M McIntosh wrote:
> So you sit there smug about the fact you built a 64bit VM, likely for hosting on your 64bit Linux OS.
> {Or the Unix one for Darwin, or that newfangled Cocoa one}
>
> However it's possible that it's running 1/3 the performance of the 32bit VM.
> Did you check? Thought not...
>
> So let's talk.
>
> Are you using the gnuified version of interp.c? If you don't know, well, go check.
> Are you using GCC 4.1 or higher?
>
> The interpreter loop is a highly tuned monster that suffers from compiler optimization issues. With
> carefully tuned parameters, as found in the Macintosh Xcode build project for the Carbon VM using GCC 4.0,
> you'll get the best performance.
>
> GCC 4.2+ ?
>
> Michael Rueger and I spent a few days attempting to get good performance out of GCC 4.2
> WITHOUT success. I think that can account for at least a 33% slowdown.
>
> So where does the other 33% slowdown come from?
>
> Well, when we compile the VM as 64-bit to use a 32-bit image, each reference to an oop requires us
> to add a 64-bit memory start address to the 32-bit oop value to resolve it to a 64-bit memory address.
> Unfortunately GCC 4.2 growls, and produces the lousiest code possible to do this.
> Maybe higher versions of GCC are better? Anyone care to test?
>
> So some solutions.
>
> (a) Ensure the Squeak oop memory block loads within the 0-4GB address space.
> See pagezero size for Darwin. Then alter the logic a bit so that sqMemoryBase is zero
> and the Squeak memory accessors don't add sqMemoryBase=0 to the oop address.
> Although you might have to use GCC 4.2, you'll run 100% faster.
>
> (b) Use the (non-free) Intel compiler
>
Hi John,

I get very different results, but they certainly support your observation
that newer GCC compilers are a problem.

If I compare a 64-bit VM built on my computer to a 32-bit VM downloaded from
Ian's site, running both on the same hardware and OS (AMD Turion, 64 bit Linux),
the 64-bit VM is running about twice as fast as the 32-bit VM.

In the past (over several years), I have never measured this carefully, but
I have the general impression that 64-bit and 32-bit VMs run at similar speeds
on my hardware and OS (I guess I should figure out how to compile in 32-bit
mode so I can really find out).

I would guess that the difference I am seeing now is due to compiler version.
Ian's VM was compiled with gcc 4.3.3 and I am using an older gcc 4.1.2 compiler.

For the record, here are the results I got (copied from CommandShell windows
in a Squeak trunk image).

For a 64-bit VM that I compiled locally, installed in /usr/local:
$ cat /proc/cpuinfo
processor : 0
vendor_id : AuthenticAMD
cpu family : 15
model : 36
model name : AMD Turion(tm) 64 Mobile Technology ML-34
stepping : 2
cpu MHz : 1600.000
cache size : 1024 KB
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext fxsr_opt lm 3dnowext 3dnow up pni lahf_lm
bogomips : 3203.59
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp tm stc
$ cat /proc/version
Linux version 2.6.18.2-34-default (geeko at buildhost) (gcc version 4.1.2 20061115 (prerelease) (SUSE Linux)) #1 SMP Mon Nov 27 11:46:27 UTC 2006
$ /usr/local/bin/squeak -version
SQUEAK_ENCODING=UTF-8
SQUEAK_PATHENC=UTF-8
SQUEAK_PLUGINS=/usr/local/lib/squeak/3.11.9-2145
+ exec /usr/local/lib/squeak/3.11.9-2145/squeakvm -version
3.11.9-2145 #1 XShm Thu Dec 3 10:54:44 EST 2009 gcc 4.1.2
Linux linux-6xfc 2.6.18.2-34-default #1 SMP Mon Nov 27 11:46:27 UTC 2006 x86_64 x86_64 x86_64 GNU/Linux
plugin path: /usr/local/lib/squeak/3.11.9-2145 [default: /usr/local/lib/squeak/3.11.9-2145/]
$ strings /usr/local/lib/squeak/3.11.9-2145/squeakvm | grep gcc
gcc 4.1.2
$ 0 tinyBenchmarks
154031287 bytecodes/sec; 5145368 sends/sec
$ 0 tinyBenchmarks
153201675 bytecodes/sec; 5183202 sends/sec
$ 0 tinyBenchmarks
151658767 bytecodes/sec; 5268426 sends/sec
$

For a 32-bit VM from Ian's site, running the same image from a local directory:
$ cat /proc/cpuinfo
processor : 0
vendor_id : AuthenticAMD
cpu family : 15
model : 36
model name : AMD Turion(tm) 64 Mobile Technology ML-34
stepping : 2
cpu MHz : 1600.000
cache size : 1024 KB
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext fxsr_opt lm 3dnowext 3dnow up pni lahf_lm
bogomips : 3203.59
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp tm stc
$ cat /proc/version
Linux version 2.6.18.2-34-default (geeko at buildhost) (gcc version 4.1.2 20061115 (prerelease) (SUSE Linux)) #1 SMP Mon Nov 27 11:46:27 UTC 2006
$ pwd
/home/lewis/squeak/VMM-Ian/Squeak-3.11.3.2135-linux_i386/lib/squeak/3.11.3-2135
$ ls -l squeakvm
-rwxr-xr-x 1 lewis users 2376017 2009-09-16 17:46 squeakvm
$ strings squeakvm | grep gcc
gcc 4.3.3
$ ./squeakvm -version
3.11.3-2135 #1 XShm Wed Sep 16 14:25:10 PDT 2009 gcc 4.3.3
Linux ubuntu 2.6.28-15-generic #49-Ubuntu SMP Tue Aug 18 18:40:08 UTC 2009 i686 GNU/Linux
plugin path: /usr/local/lib/squeak/3.11.9-2145 [default: /home/lewis/squeak/VMM-Ian/Squeak-3.11.3.2135-linux_i386/lib/squeak/3.11.3-2135/]
$ 0 tinyBenchmarks
62135922 bytecodes/sec; 3330746 sends/sec
$ 0 tinyBenchmarks
62256809 bytecodes/sec; 3425013 sends/sec
$ 0 tinyBenchmarks
62317429 bytecodes/sec; 3346096 sends/sec
$
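
As a rough check on the "about twice as fast" figure, the three tinyBenchmarks runs for each VM can be averaged and compared. The following is only a minimal sketch in Python: the numbers are copied from the two transcripts above, while the names `vm64`, `vm32`, `mean`, and `speedup` are illustrative and not part of Squeak or its tools.

```python
# Results copied from the transcripts above as (bytecodes/sec, sends/sec).
vm64 = [(154031287, 5145368), (153201675, 5183202), (151658767, 5268426)]
vm32 = [(62135922, 3330746), (62256809, 3425013), (62317429, 3346096)]


def mean(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


def speedup(fast, slow, column):
    """Ratio of the mean of one result column between two sets of runs."""
    return mean(run[column] for run in fast) / mean(run[column] for run in slow)


# Column 0 is bytecodes/sec, column 1 is sends/sec.
bytecode_ratio = speedup(vm64, vm32, 0)
send_ratio = speedup(vm64, vm32, 1)

print(f"bytecodes/sec, 64-bit vs 32-bit: {bytecode_ratio:.2f}x")
print(f"sends/sec,     64-bit vs 32-bit: {send_ratio:.2f}x")
```

On these six runs the bytecode rate differs by a larger factor than the send rate, so "about twice as fast" sits between the two ratios. Three runs per VM is a small sample, and the two VMs are also different versions (3.11.9-2145 and 3.11.3-2135), so this does not isolate the compiler as the cause.
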
Dave