[squeak-dev] Re: Burn the Squeak Image! (Why I am running for board)

Sun Mar 1 19:55:48 UTC 2009

On Sun, Mar 1, 2009 at 3:19 AM, Stéphane Rollandin
<lecteur at zogotounga.net> wrote:

> Eliot Miranda wrote:
>
>> I need to point out that unless the various communities can start building
>> their disparate and diverging images from a micro-kernel image I don't see
>> how improved execution technology is going to be adopted by the community.
>> I'm working hard on a VM that will be potentially 10x faster than the
>> current Squeak VM for Smalltalk-intensive benchmarks.  This VM will be
>> source code compatible and bytecode compatible but likely it will not be
>> image compatible as it will use a streamlined object representation that
>> doesn't use compact classes.  The only way I can see this being adopted by
>> the community at large is if the community starts building images from
>> microkernels.
>>
>
> maybe a silly question (I have no idea of what is involved): would
> converting an image from the current format to the one your VM will require
> be an option ?

One can convert images from one format to another using the SystemTracer.
This is a program that, from within an image, traces all the objects it can
find and writes out a new image.  But it's tricky.  The system isn't usable
while the tracer is running, and the system inevitably includes the
SystemTracer itself, which one must later strip out if one wants e.g. a
minimal deployment image.  So the SystemTracer approach isn't great.  (It
also suffers from an issue explained below.)
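
To make the tracing idea concrete, here is a minimal Python sketch.  It is
not the Squeak SystemTracer; Obj and trace_image are made-up names and the
"image" is just a list of records.  It only illustrates that a tracer run
from inside the object graph ends up writing itself into the output.

```python
# Hypothetical sketch only: Obj and trace_image are illustrative names,
# not part of Squeak or any real SystemTracer.
from collections import deque


class Obj:
    """A toy heap object: a name plus references to other objects."""

    def __init__(self, name, *refs):
        self.name = name
        self.refs = list(refs)


def trace_image(root):
    """Breadth-first trace of every object reachable from root.

    Returns a list of (index, name, [indices of referenced objects]),
    which stands in for the new image that a tracer would write out.
    """
    index_of = {id(root): 0}
    order = [root]
    queue = deque([root])
    while queue:
        obj = queue.popleft()
        for ref in obj.refs:
            if id(ref) not in index_of:
                index_of[id(ref)] = len(order)
                order.append(ref)
                queue.append(ref)
    return [
        (index_of[id(o)], o.name, [index_of[id(r)] for r in o.refs])
        for o in order
    ]


# The tracer lives in the same image it traces, so it is reachable too.
tracer = Obj("SystemTracer")
string = Obj("String")
root = Obj("SmalltalkDictionary", string, tracer)
string.refs.append(root)  # a cycle, which the tracer must tolerate

image = trace_image(root)
names = [name for _, name, _ in image]
```

Here names contains "SystemTracer", which is the part one would have to
strip out afterwards to get a minimal deployment image.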

One can also convert images from one format to another using a program that
reads an image, transforms it, and writes it (I'll call this ImageRewriter),
but this is tricky too.  The image may contain user code that has constraints
the ImageRewriter isn't aware of.  VisualWorks uses the ImageRewriter approach
to convert 32-bit images to 64-bit images.  It does succeed in rewriting the
base development image but occasionally will fail to produce a working
rewrite of some complicated image.

Even then the image that ImageRewriter produces still needs to contain
special support and to be saved to be ready for production.  One thing the
image does for itself on startup is check if the size of the identity hash
field has changed (in 64-bit VW images there is a larger id hash than in
32-bit VW images).  If the image finds the id hash field has changed it
rehashes all hashed collections except MethodDictionary.  The ImageRewriter
(and SystemTracer) knows enough to be able to rehash MethodDictionary and
IdentityDictionary.  But because the default implementation of #hash in
Object is to answer identityHash, a change in id hash can potentially affect
equal-hashed collections, not just id-hashed ones.  But to be able to rehash
an equal-hashed collection one must be able to evaluate #hash and #=, and
these are arbitrarily complex, so it gets tricky to get either ImageRewriter
or SystemTracer to rehash.  Note that they have to compute what the hash
would be in the new image, not what it evaluates to in the current image.
Hence it is much easier to have the new image rehash its
non-MethodDictionary collections on start-up.
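
The effect of a changed id hash on a hashed collection can be shown with a
small Python sketch.  Everything here (Image, Thing, HashedSet, the 12-bit
and 20-bit widths, the raw_id value) is an assumption made up for
illustration, not VisualWorks or Squeak code.  The set stores each element
in a bucket chosen from the identity hash, so once the hash width changes
the element is looked for in the wrong bucket until the set is rehashed.

```python
# Hypothetical sketch only: the classes and hash widths are invented.


class Image:
    """Stand-in for an image format with a given identity-hash width."""

    def __init__(self, id_hash_bits):
        self.id_hash_bits = id_hash_bits

    def identity_hash(self, obj):
        # Keep only as many bits as the object header has room for.
        return obj.raw_id & ((1 << self.id_hash_bits) - 1)


class Thing:
    def __init__(self, raw_id):
        self.raw_id = raw_id


class HashedSet:
    """A set whose bucket index comes from the image's identity hash."""

    def __init__(self, image, capacity=61):
        self.image = image
        self.buckets = [[] for _ in range(capacity)]

    def _index(self, obj):
        return self.image.identity_hash(obj) % len(self.buckets)

    def add(self, obj):
        bucket = self.buckets[self._index(obj)]
        if obj not in bucket:
            bucket.append(obj)

    def includes(self, obj):
        return obj in self.buckets[self._index(obj)]

    def rehash(self):
        # Re-insert every element using the current image's hash.
        items = [obj for bucket in self.buckets for obj in bucket]
        self.buckets = [[] for _ in self.buckets]
        for obj in items:
            self.add(obj)


thing = Thing(raw_id=0x12345)
things = HashedSet(Image(id_hash_bits=12))
things.add(thing)
found_before = things.includes(thing)   # stored and found under 12 bits

things.image = Image(id_hash_bits=20)   # "start up" under the new format
found_stale = things.includes(thing)    # looks in a different bucket

things.rehash()                         # what the image does on start-up
found_after = things.includes(thing)
```

With these made-up numbers the element sits in bucket 44 under the 12-bit
hash but is looked up in bucket 23 under the 20-bit hash, so found_stale is
False until rehash runs.  A real converter has the harder job of predicting
the new hash without being able to run #hash in the target image.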

Clearly this is slow enough that one does it once when starting up the output
of ImageRewriter, then saves.  The saved image then starts up without needing
to rehash because the id-hash size won't have changed.

So both SystemTracer and ImageRewriter approaches have significant
difficulties when trying to produce images in which things like the id hash
have changed.  They also have difficulties if the instruction set, class
implementation, block implementation etc. of the target has changed
because it may be difficult to set up the necessary invariants.

With the micro-kernel approach the real image is produced by loading code
into the microkernel.  So the image transformers (be they SystemTracer,
ImageRewriter or MicroKernelGenerator) only have to function on the known
quantity which is the microkernel image, not on an arbitrarily complex
development or product image.  Great.  But if the microkernel image is
simple enough then why not generate it directly from a source specification
(as John Maloney's MicroSqueak does)?  Instead of stripping code from an
existing image, as one must do with both SystemTracer and ImageRewriter, one
produces just what the microkernel needs to contain, and it is exactly
reproducible.  In any image that tries to produce a microkernel exactly the
same microkernel will be produced, whereas with the SystemTracer and
ImageRewriter approaches what one gets depends on the image one starts with.
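
The reproducibility point can be sketched in Python as well.  This is not
how MicroSqueak works internally; KERNEL_SPEC, the host images and both
functions are hypothetical, and an "image" is reduced to a dictionary from
class name to selectors.  Generating from the specification ignores the
host entirely, while stripping keeps whatever the host happens to contain.

```python
# Hypothetical sketch only: names and data are invented for illustration.
import hashlib
import json

# The source specification of the microkernel (assumed, tiny on purpose).
KERNEL_SPEC = {
    "Object": ["=", "hash"],
    "Behavior": ["new"],
}


def digest(classes):
    """Fingerprint an image so two builds can be compared."""
    canonical = json.dumps(classes, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


def generate_from_spec(host_image, spec):
    """Build the kernel from the spec alone; host_image is unused."""
    return {name: sorted(selectors) for name, selectors in spec.items()}


def strip_host_image(host_image, spec):
    """Keep only the kernel classes, with whatever methods the host has."""
    return {
        name: sorted(host_image[name]) for name in spec if name in host_image
    }


# Two diverged development images that both want to produce a microkernel.
host_a = {"Object": ["=", "hash"], "Behavior": ["new"], "Morph": ["draw"]}
host_b = {
    "Object": ["=", "hash", "inspect"],  # a local extension to Object
    "Behavior": ["new"],
    "Seaside": ["render"],
}

generated_match = digest(generate_from_spec(host_a, KERNEL_SPEC)) == digest(
    generate_from_spec(host_b, KERNEL_SPEC)
)
stripped_match = digest(strip_host_image(host_a, KERNEL_SPEC)) == digest(
    strip_host_image(host_b, KERNEL_SPEC)
)
```

In this toy setup generated_match is True and stripped_match is False,
because the stripped kernel inherits host_b's extra method on Object.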

So I much prefer the microkernel approach (I didn't use to; enlightenment
comes slowly if at all).  It is not absolutely necessary but turns out to
have significant advantages.  The only downside (and I don't even think it
is a downside) is needing to build up a development or production image by
loading code into the microkernel (I actually think this is a feature :) ).

HTH

Best
Eliot

>
>
> Stef
>
>
>