Pending 3.2.7b1 Mac VM

John M McIntosh johnmci at
Thu Apr 18 07:30:06 UTC 2002

I'm about to build/ship and CVS update the mac VM for 3.2.7b1.

For performance reasons it will contain the following changes, which
might not get into the 3.2 update stream at this late date. However,
I'll look at posting them on SourceForge if need be.

To build a Mac VM you need:


Anthony Hannan's change set, which makes the code generator produce 
unique variables in each case block in interpret() instead of shared 
temporary variables. This makes a marked difference in how GCC views 
the code and does its optimizations. It might affect all 
architectures. It has a negative impact on CodeWarrior.
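
To illustrate the difference, here is a minimal sketch of the two 
styles in a toy bytecode loop. This is not the generated interp.c 
code: the opcodes, function names and stack layout are all 
hypothetical, chosen only to show function-scope temporaries next to 
per-case variables.

```c
/* Hypothetical toy interpreter; opcodes and names are illustrative only. */
enum { OP_PUSH_CONST = 0, OP_ADD = 1, OP_HALT = 2 };

/* Style A: temporaries declared once and shared by every case. */
int runSharedTemps(const int *code) {
    int stack[16];
    int sp = 0;
    int t1, t2;                       /* visible across the whole switch */
    for (;;) {
        switch (*code++) {
        case OP_PUSH_CONST:
            t1 = *code++;
            stack[sp++] = t1;
            break;
        case OP_ADD:
            t1 = stack[--sp];
            t2 = stack[--sp];
            stack[sp++] = t1 + t2;
            break;
        case OP_HALT:
            return stack[sp - 1];
        default:
            return -1;                /* unknown opcode */
        }
    }
}

/* Style B: each case block declares its own uniquely named variables,
   so their scope ends with the case block. */
int runPerCaseLocals(const int *code) {
    int stack[16];
    int sp = 0;
    for (;;) {
        switch (*code++) {
        case OP_PUSH_CONST: {
            int constant = *code++;
            stack[sp++] = constant;
            break;
        }
        case OP_ADD: {
            int rcvr = stack[--sp];
            int arg = stack[--sp];
            stack[sp++] = rcvr + arg;
            break;
        }
        case OP_HALT:
            return stack[sp - 1];
        default:
            return -1;                /* unknown opcode */
        }
    }
}
```

Both functions compute the same result. The only difference is the 
scope of the variables the compiler has to reason about when it 
allocates registers.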


A change set to convert interp.c global variables to a structure. 
This results in better code for PPC GCC-compiled software and perhaps 
other RISC-based platforms. This change set also adds two globals to 
Interpreter to fix a problem with Anthony's generator changes, and 
modifies how interp.c is built (global structure or not, as indicated 
by platform dependency information via the VMMaker build process, 
based on changes to the platform-specific VMMaker subclass). So, for 
example, to turn it on for Z80 CPUs you need to alter the z80vmmaker 
subclass.

This feature could be used for plugins, but I've not tried that yet 
or even considered if there is a need.

Note that variables that require initialization or other 
specialization, such as those currently hard-coded in a ccode: 
invocation, aren't included in the global structure (lazy on my 
part). Also, I've found that GCC produces some oddball code for 
arrays within a structure, so those are now excluded. However, with a 
little more work it could be feasible to embed all globals in a 
structure, either allocated at runtime or static, to solve unique 
platform issues.
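
As a rough sketch of the idea, the fragment below switches between 
plain globals and a single structure reached through a pointer. It 
rests on several assumptions: the USE_GLOBAL_STRUCTURE switch, the 
GIV() macro, and the variable names other than fullScreenFlag are 
hypothetical and are not taken from the change set.

```c
/* Hypothetical build switch; the real selection is made by VMMaker. */
#define USE_GLOBAL_STRUCTURE 1

#if USE_GLOBAL_STRUCTURE
/* Plain scalar globals are gathered into one structure. */
struct foo {
    int stackPointer;
    int fullScreenFlag;
};
static struct foo theGlobals;
static struct foo *foo = &theGlobals;
#define GIV(name) (foo->name)         /* hypothetical access macro */
#else
static int stackPointer;
static int fullScreenFlag;
#define GIV(name) (name)
#endif

/* Arrays and variables needing an initializer stay ordinary globals. */
static int methodCache[64];
static int interruptCheckCounter = 1000;

/* Interpreter code is written once against GIV(). */
int bumpStackPointer(int delta) {
    GIV(stackPointer) += delta;
    return GIV(stackPointer);
}

/* Uses the variables that were left outside the structure. */
int countDownInterruptCheck(void) {
    if (--interruptCheckCounter <= 0) {
        interruptCheckCounter = 1000;
        methodCache[0] = 0;
        return 1;
    }
    return 0;
}
```

With the structure enabled, the compiler can address every member as 
an offset from one base pointer rather than loading a separate 
address for each global.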


A change set that adds some accessors/mutators to interp.c that are 
used by platform-specific code. Right now some of the internal 
platform-specific code refers to global variables by reference! That 
breaks when you convert to a structure and a pointer (i.e. 
foo->fullScreenFlag). Thus we provide some accessors, like 
getFullScreenFlag and setFullScreenFlag. This only affects you if you 
choose to build a VM with global structure support. But really, 
should you be directly referring to a VM global variable?
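
A minimal sketch of such accessors follows. The names 
getFullScreenFlag and setFullScreenFlag come from the text above, but 
their signatures, the structure layout and the toggleFullScreen 
caller are assumptions made for illustration.

```c
/* Assumed layout; only fullScreenFlag is named in the text above. */
struct foo {
    int fullScreenFlag;
};
static struct foo theGlobals;
static struct foo *foo = &theGlobals;

/* Accessor and mutator living alongside the interpreter code.
   Signatures are assumed, not copied from interp.c. */
int getFullScreenFlag(void) {
    return foo->fullScreenFlag;
}

int setFullScreenFlag(int value) {
    foo->fullScreenFlag = value;
    return value;
}

/* Hypothetical platform code: it never names the variable itself,
   so it compiles the same way with or without the structure. */
int toggleFullScreen(void) {
    setFullScreenFlag(!getFullScreenFlag());
    return getFullScreenFlag();
}
```

Platform code written this way does not need to know whether the flag 
is a plain global or a structure member.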

Hmm, I seem to remember a note about making globals static in PPC 
files versus non-static if you could. (Oh, way too late to consider 
that, maybe after Smalltalk Solutions 2002.)


A pending change to sqGnu.h that improves jump table performance for 
PPC and perhaps other RISC-based architectures.
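
For readers unfamiliar with jump table dispatch, the sketch below 
shows the general technique using GCC's labels-as-values extension. 
It is not the pending sqGnu.h change, whose details are not given 
here. The opcodes and the runThreaded name are hypothetical, and the 
code needs GCC or a compatible compiler.

```c
/* Hypothetical opcodes for a toy accumulator machine. */
enum { OP_INCR = 0, OP_DOUBLE = 1, OP_HALT = 2 };

long runThreaded(const unsigned char *code) {
    /* GNU C extension: && yields the address of a label. */
    static void *jumpTable[] = { &&doIncr, &&doDouble, &&doHalt };
    long acc = 0;

    /* Each handler ends by jumping straight to the next handler,
       with no return to a central switch statement. */
    goto *jumpTable[*code++];

doIncr:
    acc += 1;
    goto *jumpTable[*code++];

doDouble:
    acc *= 2;
    goto *jumpTable[*code++];

doHalt:
    return acc;
}
```

The cost of each dispatch is one table load and one indirect branch, 
which is why the way the compiler addresses the table can matter.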

These changes result in a Mach-O OS X VM that gives these types of numbers:

'51364365 bytecodes/sec; 1762131 sends/sec'
'50793650 bytecodes/sec; 1762131 sends/sec'
'50914876 bytecodes/sec; 1765598 sends/sec'
'51446945 bytecodes/sec; 1780778 sends/sec'
'51405622 bytecodes/sec; 1779601 sends/sec'
'51488334 bytecodes/sec; 1776079 sends/sec'
'51323175 bytecodes/sec; 1762131 sends/sec'
'50753370 bytecodes/sec; 1757530 sends/sec'
'50914876 bytecodes/sec; 1765598 sends/sec'
'51446945 bytecodes/sec; 1778426 sends/sec'

The base 3.2.6B8 VM typically gave me:

'44849334 bytecodes/sec; 1464858 sends/sec'
'44755244 bytecodes/sec; 1435788 sends/sec'
'44755244 bytecodes/sec; 1432081 sends/sec'
'44352044 bytecodes/sec; 1433315 sends/sec'
'44352044 bytecodes/sec; 1433315 sends/sec'
'44661549 bytecodes/sec; 1429621 sends/sec'
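
To put the two runs side by side, this small sketch averages the 
bytecodes/sec figures quoted above and takes their ratio. The sample 
values are copied from this message. The function names are made up 
for the example.

```c
#include <stddef.h>

/* Arithmetic mean of a set of benchmark samples. */
double meanOf(const long *samples, size_t count) {
    double sum = 0.0;
    for (size_t i = 0; i < count; i++) {
        sum += (double)samples[i];
    }
    return sum / (double)count;
}

/* bytecodes/sec from the Mach-O OS X VM with the changes applied. */
static const long newBytecodes[] = {
    51364365, 50793650, 50914876, 51446945, 51405622,
    51488334, 51323175, 50753370, 50914876, 51446945
};

/* bytecodes/sec from the base 3.2.6B8 VM. */
static const long baseBytecodes[] = {
    44849334, 44755244, 44755244, 44352044, 44352044, 44661549
};

/* Ratio of the mean new rate to the mean base rate. */
double bytecodeSpeedup(void) {
    double newMean = meanOf(newBytecodes,
                            sizeof newBytecodes / sizeof newBytecodes[0]);
    double baseMean = meanOf(baseBytecodes,
                             sizeof baseBytecodes / sizeof baseBytecodes[0]);
    return newMean / baseMean;
}
```

On these samples the ratio comes out at roughly 1.15, that is, about 
15% more bytecodes per second than the base VM.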

The classic Mac OS folks should know that we get
(under OS X!):

'49155145 bytecodes/sec; 1691292 sends/sec'
'48892284 bytecodes/sec; 1676548 sends/sec'
'49079754 bytecodes/sec; 1673422 sends/sec'
'49306625 bytecodes/sec; 1662059 sends/sec'
'49306625 bytecodes/sec; 1661034 sends/sec'
'49382716 bytecodes/sec; 1662416 sends/sec'
'49268668 bytecodes/sec; 1657449 sends/sec'
'49004594 bytecodes/sec; 1661034 sends/sec'
'49192928 bytecodes/sec; 1663086 sends/sec'
'49192928 bytecodes/sec; 1639486 sends/sec'

Also compare these to the baseline. Note, though, that I think these 
might be higher under native OS 9, and they compare quite well to the 
highest numbers I've seen under OS 9 with a 2.8 VM: 50,274,941 
bytecodes/sec; 1,610,918 sends/sec.
3.x on a good day would do 45,165,843 bytecodes/sec; 1,523,881 sends/sec.

The CW Pro 5 compiler optimizes a bit differently and some of the 
code changes just don't have the same impact, but they do affect

John M. McIntosh <johnmci at> 1-800-477-2659
Corporate Smalltalk Consulting Ltd.
