[Squeak Installer] The Compiler, The Final Frontier(?)
Mark van Gulik
ghoul6 at home.com
Sat Aug 25 08:30:36 UTC 2001
On Saturday, August 25, 2001, at 02:34 am, PhiHo Hoang wrote:
[...]
> Is it true that the Squeak compiler is implemented in Squeak only, there
> is no plugin ? Is it because there is no need for speed in compiling ?
> Or is it impossible to implement the compiler outside of the image ?
I haven't been up on Squeak lately, but unless I'm really mistaken, the
compiler within Squeak is the only one out there.
> I need a compiler outside of the Squeak image for bootstrapping purpose.
> Is there a mechanism to translate the 'System-Compiler' category into a
> plugin ? If not, how else can I get a standalone Squeak compiler from the
> available Squeak codes ?
My fourth year project for my B.C.S. was bootstrapping a Smalltalk
system. I wrote a *simple* Smalltalk compiler in C and used it to
directly grow an image from nothing more than a text file with a
parenthesized hierarchical list of classes (with instance variable
names), and all method source code for all the classes.
Seriously, writing the compiler in C shouldn't be hard at all (a day to
a week, depending on experience). The parser can be simple recursive
descent, and your tokens don't even have to be allocated as Squeak
objects. You don't have to produce "optimal" code, using all the latest
and greatest bytecodes. My Smalltalk-in-C compiler didn't even bother
optimizing conditionals (or maybe I added that to it later). That kind
of thing can be dealt with as a final "linking" stage, after all your
modules have been compiled.
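To make the "simple recursive descent" point concrete, here is a minimal sketch in C, not the compiler described above. It handles only identifiers, parentheses, and the three Smalltalk message precedences (unary, then binary, then keyword), and it emits a textual push/send listing instead of real bytecodes. Every name in it (Parser, next_token, compile_expression, the "push"/"send" output format) is an assumption made for illustration. Tokens live in a fixed C buffer inside the parser, so nothing is allocated as an object.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Illustrative only: these names and the textual "push/send" output are
   assumptions for this sketch, not Squeak's token kinds or bytecodes. */
typedef enum { T_END, T_IDENT, T_KEYWORD, T_BINARY, T_LPAREN, T_RPAREN, T_ERROR } TokKind;

typedef struct {
    const char *src;  /* unread source text */
    TokKind kind;     /* kind of the current token */
    char text[32];    /* its spelling, in a plain C buffer (no heap objects) */
    char *out;        /* NUL-terminated output buffer */
    size_t cap;       /* capacity of out */
    int ok;           /* cleared on the first syntax error */
} Parser;

static const char BINARY_CHARS[] = "+-*/<>=~,@%&|";

static void emit(Parser *p, const char *op, const char *arg) {
    size_t used = strlen(p->out);
    snprintf(p->out + used, p->cap - used, "%s %s; ", op, arg);
}

static void next_token(Parser *p) {
    size_t n = 0, max = sizeof p->text - 2;  /* leave room for ':' and NUL */
    while (isspace((unsigned char)*p->src)) p->src++;
    unsigned char c = (unsigned char)*p->src;
    if (c == '\0') {
        p->kind = T_END;
    } else if (isalpha(c)) {
        while (isalnum((unsigned char)*p->src) && n < max) p->text[n++] = *p->src++;
        p->kind = T_IDENT;
        if (*p->src == ':') { p->text[n++] = *p->src++; p->kind = T_KEYWORD; }
    } else if (c == '(' || c == ')') {
        p->text[n++] = *p->src++;
        p->kind = (c == '(') ? T_LPAREN : T_RPAREN;
    } else if (strchr(BINARY_CHARS, c)) {
        while (*p->src && strchr(BINARY_CHARS, *p->src) && n < max) p->text[n++] = *p->src++;
        p->kind = T_BINARY;
    } else {
        p->kind = T_ERROR;
    }
    p->text[n] = '\0';
}

static void parse_expression(Parser *p);

static void parse_primary(Parser *p) {           /* identifier | '(' expression ')' */
    if (p->kind == T_IDENT) {
        emit(p, "push", p->text);
        next_token(p);
    } else if (p->kind == T_LPAREN) {
        next_token(p);
        parse_expression(p);
        if (p->kind == T_RPAREN) next_token(p); else p->ok = 0;
    } else {
        p->ok = 0;
    }
}

static void parse_unary(Parser *p) {             /* primary unarySelector* */
    parse_primary(p);
    while (p->kind == T_IDENT) { emit(p, "send", p->text); next_token(p); }
}

static void parse_binary(Parser *p) {            /* unary (binaryOp unary)* */
    char op[sizeof p->text];
    parse_unary(p);
    while (p->kind == T_BINARY) {
        strcpy(op, p->text);
        next_token(p);
        parse_unary(p);
        emit(p, "send", op);
    }
}

static void parse_expression(Parser *p) {        /* binary (keyword binary)* */
    char selector[128] = "";
    parse_binary(p);
    if (p->kind != T_KEYWORD) return;
    while (p->kind == T_KEYWORD) {
        strncat(selector, p->text, sizeof selector - strlen(selector) - 1);
        next_token(p);
        parse_binary(p);
    }
    emit(p, "send", selector);
}

/* Returns 1 on success and leaves the instruction listing in out. */
int compile_expression(const char *src, char *out, size_t cap) {
    Parser p = { src, T_END, "", out, cap, 1 };
    if (cap == 0) return 0;
    out[0] = '\0';
    next_token(&p);
    parse_expression(&p);
    return p.ok && p.kind == T_END;
}
```

Calling compile_expression("a foo: b + c bar", buf, sizeof buf) fills buf with "push a; push b; push c; send bar; send +; send foo:; ". Literals, cascades, blocks, and assignment are left out to keep the sketch short; each would be another small function in the same style.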
It's been a while since I wrote that code ('88-'89), but I recall an
issue was how to survive a garbage collection during initial image
construction (the C code had to point into the image a lot while it was
being constructed). If I had it to do over today, I would simply use
smart pointers that add themselves to a global bi-directional ring in
their constructor, and remove themselves in their destructor (I use this
technique in my Avail primitives). I don't think that's a good idea for
Squeak, though, since C++ compilers aren't available on some platforms. In my
old Smalltalk system I simply banned garbage collection during image
construction -- it wasn't a serious problem, even in 1MB (Atari 1040ST).
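For platforms without C++, the ring idea can be approximated in plain C with explicit link and unlink calls standing in for the constructor and destructor. The following is only a sketch under that assumption; Root, root_link, root_unlink, roots_each, and shift_slot are hypothetical names, not part of Squeak or Avail. Each Root records the address of a C variable that points into the image, so a moving collector can walk the ring and update every such variable.

```c
#include <stddef.h>

/* Hypothetical sketch: a doubly linked ring of C variables that point into
   the image under construction. */
typedef struct Root {
    struct Root *prev, *next;
    void **slot;              /* address of the C variable holding an object pointer */
} Root;

/* Sentinel node; an empty ring points at itself. */
static Root ring = { &ring, &ring, NULL };

/* Plays the role of the smart pointer's constructor. */
void root_link(Root *r, void **slot) {
    r->slot = slot;
    r->prev = &ring;
    r->next = ring.next;
    ring.next->prev = r;
    ring.next = r;
}

/* Plays the role of the destructor; must run before *r goes out of scope. */
void root_unlink(Root *r) {
    r->prev->next = r->next;
    r->next->prev = r->prev;
    r->prev = r->next = r;
}

/* The collector calls this to visit (and possibly rewrite) every live root. */
void roots_each(void (*visit)(void **slot, void *ctx), void *ctx) {
    for (Root *r = ring.next; r != &ring; r = r->next)
        visit(r->slot, ctx);
}

/* Example visitor: pretend every object moved by *ctx bytes. */
void shift_slot(void **slot, void *ctx) {
    *slot = (char *)*slot + *(ptrdiff_t *)ctx;
}
```

The cost compared with the C++ version is that every early return has to remember to call root_unlink, which is exactly the bookkeeping a destructor would do automatically.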
Here's an idea: Extend Slang to be able to translate the Squeak
compiler. Most of it is fairly simple code, and the stuff that's more
complex can be made simple. Even if everything won't translate, you can
always fake the rest with a few C functions. Don't worry about memory
leaks initially. Eventually you can use your own malloc substitute that
allocates a "space" for the temporary structures, and then bulk
deallocates the whole space after each method compilation. The
advantage of translating the existing compiler is that as bytecodes
change, your code will continue to work.
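The malloc substitute mentioned above could look roughly like the sketch below: a "space" that hands out memory from large chunks and frees everything in one call after a method has been compiled. The names (Space, Chunk, space_alloc, space_release) and the 4096-byte chunk size are assumptions for illustration, and the code relies on C11 for max_align_t and _Alignof.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical names throughout; requires C11 for max_align_t and _Alignof. */
typedef struct Chunk {
    struct Chunk *next;   /* previously filled chunk, or NULL */
    size_t used, cap;     /* bytes handed out / bytes available in data */
    max_align_t data[];   /* payload, aligned for any object type */
} Chunk;

typedef struct { Chunk *head; } Space;

enum { SPACE_CHUNK_BYTES = 4096 };  /* arbitrary default chunk size */

/* Allocate n bytes from the space; there is no per-object free. */
void *space_alloc(Space *s, size_t n) {
    size_t align = _Alignof(max_align_t);
    n = (n + align - 1) / align * align;          /* round up to keep alignment */
    if (s->head == NULL || s->head->cap - s->head->used < n) {
        size_t cap = n > (size_t)SPACE_CHUNK_BYTES ? n : (size_t)SPACE_CHUNK_BYTES;
        Chunk *c = malloc(sizeof(Chunk) + cap);
        if (c == NULL) return NULL;
        c->next = s->head;
        c->used = 0;
        c->cap = cap;
        s->head = c;
    }
    void *p = (char *)s->head->data + s->head->used;
    s->head->used += n;
    return p;
}

/* Bulk-deallocate everything, e.g. after each method compilation. */
void space_release(Space *s) {
    while (s->head != NULL) {
        Chunk *next = s->head->next;
        free(s->head);
        s->head = next;
    }
}
```

A translated compiler would route its temporary parse nodes through space_alloc and call space_release once the CompiledMethod has been produced, so individual leaks inside a single compilation stop mattering.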
Here's an alternative: Use a Squeak image to grow your fetal Squeak
image. The compiler produces a CompiledMethod which you can then trace
through and copy into the new image. SystemTracer might help you with
that (and you might help SystemTracer with that, too). The Smalltalk
compiler is probably within an order of magnitude of the speed of a
compiler written hastily in C, and that should be fast enough.
You don't need to simulate image memory in an Array or anything so
severe. Just keep a few roots pointing to the key data structures of
your fetal "image", and be prepared to do a little extra work separating
your data from the running image when producing an image file. If you
want to do this live, create all your data structures (sharing
immutables like Symbols if you want), then invoke a new magic primitive
whose purpose is to do a big "context switch" of all the key Smalltalk
roots (Processor, etc.). Two images' worth of data can live in one actual
image without much trouble. Hm. On second thought, don't share Symbols
or you'll run into method lookup problems. Hm. Even SmallIntegers will
be a problem (and you can't really build a class like that). You'll
have to switch method dictionaries for all the "known to the VM" classes
atomically inside the context switch primitive.
-Mark