Status Report: 64-bit Squeak
Dan at SqueakLand.org
Fri Jun 25 14:53:31 UTC 2004
A while ago I promised a status report on the 64-bit project. Well, here it is.
The first thing I learned is that this is actually much harder than I
thought. The good news is that it's definitely doable, and the VM
will be much better for the changes. Moreover, we've already made
good progress. I say "we" because Ian Piumarta has been of great
help, and has actually done half the work so far.
What Ian has done
Ian is strongly rooted in the world of C and C compilers (I am not),
and he has performed a wonderful service that I could not. If you are
not familiar with the Slang code for the interpreter, I will tell you
that it has always been full of coercions that allow complete anarchy
to reign. This was never elegant, but it also never caused a lot of
trouble when almost everything was 32-bit pointers. Now, however, in
addition to bytes (and halfwords), there are things that are supposed
to be 32 bits, things that are either 32 or 64 bits depending on the
machine, and things that are supposed to be 64 bits.
Ian's great contribution was to re-cast the basic operations in the
Interpreter and ObjectMemory in terms of these different types, in a
way that is consistent and that isn't everywhere undermined by
coercions to 'char *'. This has produced two results of immediate
value. The first is that, with these changes and a few load and store
macros, it is now possible to build a VM that will run a 32-bit image
on a 64-bit machine. The second is that now, if you make a mistake in
usage, e.g., between 32-bit quantities and things that are the size of
a native machine word, the compiler will diagnose it before you crash
and have to debug it. Ian's work was done almost entirely by
eliminating the damaging coercions, and then using well-typed access
routines until the compiler errors went away. Ian's goal is that,
simply by changing load and store macros, it should be possible to
generate VMs that will run any of the 4 combinations of 32- or 64-bit
images on 32- or 64-bit platforms.
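
To make the load and store macro idea concrete, here is a minimal sketch in C. It is not Ian's actual code: every name in it (IMAGE_WORD_BITS, image_word, load_word, store_word, load_int32, store_int32) is invented for illustration. It assumes object memory is reached through a byte pointer plus a byte offset, and it shows only how the image word size can be chosen independently of the host word size. It does not attempt the stricter typing that lets the compiler catch mix-ups.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names throughout; the real VM's macros may differ. */
#ifndef IMAGE_WORD_BITS
#define IMAGE_WORD_BITS 64   /* word size of the image, not of the host */
#endif

#if IMAGE_WORD_BITS == 64
typedef uint64_t image_word;
#elif IMAGE_WORD_BITS == 32
typedef uint32_t image_word;
#else
#error "IMAGE_WORD_BITS must be 32 or 64"
#endif

static_assert(sizeof(image_word) * 8 == IMAGE_WORD_BITS,
              "image_word must match the configured image word size");

enum { BYTES_PER_IMAGE_WORD = sizeof(image_word) };

/* Load one image-sized word from object memory at a byte offset. */
static inline image_word load_word(const unsigned char *memory, size_t byteOffset)
{
    image_word value;
    memcpy(&value, memory + byteOffset, sizeof value);
    return value;
}

/* Store one image-sized word into object memory at a byte offset. */
static inline void store_word(unsigned char *memory, size_t byteOffset, image_word value)
{
    memcpy(memory + byteOffset, &value, sizeof value);
}

/* Fields that are 32 bits in every configuration get their own accessors. */
static inline uint32_t load_int32(const unsigned char *memory, size_t byteOffset)
{
    uint32_t value;
    memcpy(&value, memory + byteOffset, sizeof value);
    return value;
}

static inline void store_int32(unsigned char *memory, size_t byteOffset, uint32_t value)
{
    memcpy(memory + byteOffset, &value, sizeof value);
}
```

Compiling the same source with -DIMAGE_WORD_BITS=32 or -DIMAGE_WORD_BITS=64, on either kind of host, gives the four combinations in this sketch. Nothing in the accessors depends on the size of a native pointer.
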
In addition to the Interpreter and ObjectMemory changes, Ian has also
rewritten parts of his Unix support code to make it all 64-bit clean.
The end result of all of these changes is that Ian is now able to
generate and compile a 32-bit VM that runs on both 32- and 64-bit
What I have done
On my side, I have written a true 64-bit image in a format that makes
the minimum changes that can possibly work. I have also made
changes equivalent to Ian's throughout the Simulator and Interpreter,
but mine have to address all the changes related to different word
sizes in the image. While some of these details were anticipated in
symbolic constants, many are pragmatic. There are no automatic ways
to find the problems, other than, e.g., searching for the integer 4.
Or 32. Or 2 (the amount by which something may have been shifted to
get a word offset). Or 3 (a baaad folding of wordsize-1). And so on.
Then run the simulator and see where it breaks. It is a deep
immersion in the debugger. That said, today I can run over 12000
bytecodes, up from only 300 a couple of days ago, so I think I'm
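
As an illustration of the kind of literal being hunted, here is a small sketch in C. The constant and function names are hypothetical, not the ones used in the VM sources. Each bare number mentioned above (4, 32, 2, 3) is replaced by a symbol derived from a single word-size definition.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names; the point is that every constant derives from one definition. */
#ifndef BYTES_PER_WORD
#define BYTES_PER_WORD 8                              /* was the literal 4 */
#endif
#define BITS_PER_WORD  (BYTES_PER_WORD * 8)           /* was the literal 32 */
#define SHIFT_FOR_WORD (BYTES_PER_WORD == 8 ? 3 : 2)  /* was the literal 2 */
#define WORD_MASK      (BYTES_PER_WORD - 1)           /* was the literal 3 */

static_assert(BYTES_PER_WORD == 4 || BYTES_PER_WORD == 8,
              "only 32- and 64-bit words are handled in this sketch");

/* Convert a word index into a byte offset. */
static size_t word_index_to_byte_offset(size_t wordIndex)
{
    return wordIndex << SHIFT_FOR_WORD;
}

/* Round a byte count up to the next whole word. */
static size_t round_up_to_word(size_t byteCount)
{
    return (byteCount + WORD_MASK) & ~(size_t)WORD_MASK;
}

/* Number of unused bytes in the last word of a byte object. */
static size_t padding_bytes(size_t byteCount)
{
    return (BYTES_PER_WORD - (byteCount & WORD_MASK)) & WORD_MASK;
}
```

With the literal 3 left in place of WORD_MASK, round_up_to_word would still compile and would silently round to 4-byte boundaries in a 64-bit image, which is why these bugs only show up when the simulator is run.
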
What is yet to be done
I am now in the process of merging Ian's and my changes, after which I
will verify that the result works as well as before in both
environments (running 32 on 64 natively and running 64 on 64 in the
Simulator). After this, I need to finish finding the 64-bit offset
bugs, and produce a VM that will run 64 on 64 natively.
The VM that Ian has made to run on a 64-bit host, and the Simulations that
I have made of running a 64-bit image, are both "kernel" VMs. That
is, they include the kernel primitives and a couple of the heavily
used plugins, but there are a number of large and important plugins
that have not yet been touched by our conversions. We simply let them
fail, and they run the failure code as an emulation.
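
The fail-and-fall-back arrangement can be pictured with the following sketch in C. It is a loose model only. In Squeak the failure code is the Smalltalk body of the method, not a C function, and all of the names here (primitive_fn, unconverted_primitive, converted_add, fallback_add, send_add) are made up for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* A primitive reports success or failure and, on success, fills in the result. */
typedef bool (*primitive_fn)(long receiver, long argument, long *result);

/* Stand-in for a plugin primitive that has not been converted: it always fails. */
static bool unconverted_primitive(long receiver, long argument, long *result)
{
    (void)receiver; (void)argument; (void)result;
    return false;
}

/* Stand-in for a converted primitive: the fast path succeeds. */
static bool converted_add(long receiver, long argument, long *result)
{
    *result = receiver + argument;
    return true;
}

/* Stand-in for the method body that runs when the primitive fails. */
static long fallback_add(long receiver, long argument)
{
    return receiver + argument;
}

/* Try the primitive first; if it fails, run the fallback code instead. */
static long send_add(primitive_fn primitive, long receiver, long argument,
                     bool *usedFallback)
{
    long result;
    assert(usedFallback != NULL);
    if (primitive != NULL && primitive(receiver, argument, &result)) {
        *usedFallback = false;
        return result;
    }
    *usedFallback = true;
    return fallback_add(receiver, argument);
}
```

Either way the caller gets the same answer. The only difference is which path produced it, which is what lets the kernel VMs run with unconverted plugins, only more slowly.
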
Calling All Cars
We are coming to a time when the 64-bit image runs, and we will have
finished the kernel conversion, but there will still be much work to
do on the plugins. It is my hope to enlist some help from the larger
Squeak community to complete this task. Ian and I will document
precisely the new conventions for data access in the VM, along with an
example conversion of a couple of simple plugins. As a test-bed, we
will produce a 64-on-32 release whose kernel works, but whose plugins
need conversion. A 64-on-32 release is a Simulator and VM that can run a
64-bit image on a 32-bit machine. It is important because it can be
tested on any old 32-bit machine, and yet, because the image word size
is different from the machine word size, we believe that if code works
in that configuration, it is very likely to work in any of the other
So stay tuned. If you know any of the plugins well, or if you know
the VM generally and would like to help us finish the job, please let
us know (reply to me) and we'll plan to include you in the big plugin party,
probably in about a month.
Remember the Version 4 format changes? Well, I haven't even thought
about them since I wrote my first 64-bit image. But I can tell you
that they are mostly small compared to this task, and it should be
easy to fold them into this project toward the end. My plan is to
work on these during the period after the kernel works fully and
before all the plugins have been converted.
We talked about this all getting folded into the 3.8 changes so that
3.8 could essentially be the same as 4.0 except for the VM changes for
64 bits and associated image-side tweaks. I originally thought that
we might have to rush 3.8 to sync the two schedules, but it now looks
like a fairly consistent time frame. I think it will take until the
end of August, and maybe even into the fall for the last of the plugin
conversions to be done and the bug tail to have died down.
Onward and upward...