[squeak-dev] Creating an image from first principles

Tue Jul 8 02:49:18 UTC 2008

Folks -

Eliot and I had a great lunch conversation today and it convinced me 
that I really should write up an idea that I had earlier and that is 
actually pretty simple: How to create your own image from scratch.
Here is how it goes.

Start with the interpreter simulator and a (literally) empty object 
memory. Read a series of class definitions (you can use either MC class 
defs or simply parse simple class definitions from sources) that are 
sufficient to define all of the kernel structures that are required by 
the running VM (incl. Object, Behavior, Class, Integer, Array, Process, 
CompiledMethod, ContextPart, Semaphore etc. etc. etc.). Create those by 
calling the allocators explicitly and set them up such that the 
structure is correct (format, superclasses, metaclasses etc). Create 
nil, true and false based on these definitions.
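
To make the allocation step concrete, here is a minimal sketch in Python.
It is a toy model, not the simulator's API: a list stands in for the
object memory, and ObjectMemory, allocate, build_skeleton and the
shortened class list are all made up for illustration. The class
hierarchy is also simplified (no ClassDescription, no Boolean). The point
is only the two passes: allocate everything first, then patch the
circular superclass and metaclass pointers.

```python
# Toy model only: a Python list stands in for the simulator's object memory.
# Every name here is hypothetical and the hierarchy is deliberately simplified.

class ObjectMemory:
    def __init__(self):
        self.objects = []  # literally empty to begin with

    def allocate(self, class_oop, slots):
        """Append a new object and return its index, used here as its oop."""
        self.objects.append({"class": class_oop, "slots": dict(slots)})
        return len(self.objects) - 1


# (name, superclass name or None, instance variable names)
DEFINITIONS = [
    ("Object", None, []),
    ("Behavior", "Object", ["superclass", "methodDict", "format"]),
    ("Class", "Behavior", ["name"]),
    ("Metaclass", "Behavior", ["thisClass"]),
    ("UndefinedObject", "Object", []),
    ("True", "Object", []),
    ("False", "Object", []),
]


def build_skeleton(definitions):
    mem = ObjectMemory()
    classes, metas = {}, {}

    # Pass 1: allocate each metaclass and class. The metaclass's own class
    # pointer is left empty because Metaclass may not exist yet.
    for name, _super_name, ivars in definitions:
        metas[name] = mem.allocate(None, {"name": name + " class"})
        classes[name] = mem.allocate(
            metas[name], {"name": name, "instVars": list(ivars)}
        )

    # Pass 2: every oop exists now, so the circular links can be patched.
    for name, super_name, _ivars in definitions:
        mem.objects[classes[name]]["slots"]["superclass"] = classes.get(super_name)
        meta = mem.objects[metas[name]]
        meta["class"] = classes["Metaclass"]
        # The metaclass chain mirrors the class chain, except at the root,
        # where "Object class" inherits from Class.
        meta["slots"]["superclass"] = (
            metas[super_name] if super_name else classes["Class"]
        )

    # nil, true and false are plain instances of the classes just built.
    specials = {
        "nil": mem.allocate(classes["UndefinedObject"], {}),
        "true": mem.allocate(classes["True"], {}),
        "false": mem.allocate(classes["False"], {}),
    }
    return mem, classes, metas, specials
```

Calling build_skeleton(DEFINITIONS) returns the memory plus three lookup
tables that the later steps can consult from the outside.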

At this point we have a skeleton of classes that we can use to 
instantiate all behaviors required by a running image.

Next, make a modification to the compiler that allows one to create a 
compiled method in the simulator from a MethodNode (which should be 
straightforward since the simulator exposes all of the good stuff for 
creating new objects and instances).

Now we can create new compiled methods in the new image as long as they 
don't refer to any globals.

Next, find a way of dealing with two issues: a) adding the compiled 
method "properly" (e.g., deal with symbol interning and modifying 
MethodDictionaries) and b) global name lookups performed by the compiler 
(since the image is prototypical we can't have it send actual messages; 
not even simulated ones ;-)

The latter issue is the only one that doesn't seem completely obvious, 
which is why I would advocate that a bootstrap kernel mustn't use class 
variables or shared pools (in which case the lookup is again trivial 
since you know all the possible names from compiling the original 
structure).
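
A rough sketch of what that host-side bookkeeping could look like, again
in Python and again with invented names (HostSideTables, intern,
resolve_global, install; the oops are just integers). It assumes the
only globals are the kernel classes, so a name that is not in the table
is simply an error, and symbols are interned in a table kept outside the
image because the image cannot yet run its own interning code.

```python
# Hypothetical bookkeeping kept outside the new image while it cannot yet
# execute any code of its own. Oops are modelled as plain integers.

class HostSideTables:
    def __init__(self, class_oops):
        # The only global names that can appear are the kernel classes.
        self.globals = dict(class_oops)
        self.symbol_table = {}  # text -> unique symbol oop
        self.next_oop = max(class_oops.values(), default=-1) + 1
        self.method_dicts = {oop: {} for oop in class_oops.values()}

    def intern(self, text):
        """Return the one symbol oop for text, creating it on first use."""
        if text not in self.symbol_table:
            self.symbol_table[text] = self.next_oop
            self.next_oop += 1
        return self.symbol_table[text]

    def resolve_global(self, name):
        """Look a global up without sending any message into the image."""
        try:
            return self.globals[name]
        except KeyError:
            raise NameError(
                f"{name} is not a kernel class; class variables and shared "
                "pools are not available during the bootstrap"
            ) from None

    def install(self, class_name, selector, method_oop):
        """Add a compiled method under its interned selector."""
        class_oop = self.resolve_global(class_name)
        self.method_dicts[class_oop][self.intern(selector)] = method_oop


tables = HostSideTables({"Object": 0, "Behavior": 1, "Class": 2})
tables.install("Object", "yourself", method_oop=100)
```

In a real bootstrap these tables would eventually have to be written
into the image as a proper symbol table and MethodDictionaries; the
sketch leaves that part out.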

Now we can load all the source we want to be in our bootstrap image.

Lastly, do the bootstrap: Instantiate the first process, its first 
context, the first message. Run it in the simulator to set up the 
remaining parts of the kernel image (Delay, ProcessorScheduler etc).
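
As a very rough illustration of this last step, the Python sketch below
builds one process with one context and runs it to completion. It is
not how the simulator represents any of this: Context, Process, run,
the startUp selector and the priority value are invented, and a list of
Python callables stands in for the bytecodes of the first method.

```python
from dataclasses import dataclass


@dataclass
class Context:
    receiver: str   # name of the receiving object
    selector: str   # the first message (hypothetical selector)
    steps: list     # stand-in for the bytecodes of the first method
    pc: int = 0     # index of the next step to run


@dataclass
class Process:
    suspended_context: Context
    priority: int = 40  # arbitrary value for this sketch


def run(process, image_state):
    """Run the process's context until its method is exhausted."""
    ctx = process.suspended_context
    while ctx.pc < len(ctx.steps):
        ctx.steps[ctx.pc](image_state)
        ctx.pc += 1
    return image_state


def start_up_steps():
    # Each step sets up one remaining part of the kernel image.
    return [
        lambda state: state.setdefault("ProcessorScheduler", "initialised"),
        lambda state: state.setdefault("Delay", "initialised"),
    ]


first_context = Context("nil", "startUp", start_up_steps())
first_process = Process(first_context)
kernel_state = run(first_process, {})
```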

Voila, at this point we have a fully functioning kernel image, created 
completely from first principles.

Once you have the kernel image there is no end to the fun: Since you can 
now start sending messages "into" the image (by way of the simulator) 
you can compile any code you want (incl. pools and class vars) and 
look up the names properly by sending a message to the interpreter 
simulator. And then you just save the image and are ready to go.

Anyone interested?

Cheers,
   - Andreas

PS. Oh, and I'd also be interested in defining a good interface to do 
this by means of Hydra, i.e., instead of having to run the simulator, 
run the compiled VM on an "empty image" and do all of this "for real".