[squeak-dev] Creating an image from first principles

Wed Jul 9 01:10:08 UTC 2008

2008/7/8 Eliot Miranda <eliot.miranda at gmail.com>:
>
>
> On Mon, Jul 7, 2008 at 7:49 PM, Andreas Raab <andreas.raab at gmx.de> wrote:
>>
>> Folks -
>>
>> Eliot and I had a great lunch conversation today and it convinced me that
>> I really should write up an idea that I had earlier and that is actually
>> pretty simple: How to create your own image from scratch.
>> Here is how it goes.
>>
>> Start with the interpreter simulator and a (literally) empty object
>> memory. Read a series of class definitions (you can use either MC class defs
>> or simply parse simple class definitions from sources) that are sufficient
>> to define all of the kernel structures that are required by the running VM
>> (incl. Object, Behavior, Class, Integer, Array, Process, CompiledMethod,
>> ContextPart, Semaphore etc. etc. etc.). Create those by calling the
>> allocators explicitly and set them up such that the structure is correct
>> (format, superclasses, metaclasses etc). Create nil, true and false based on
>> these definitions.
>>
>> At this point we have a skeleton of classes that we can use to instantiate
>> all behaviors required by a running image.
>>
>> Next, make a modification to the compiler that allows one to create a
>> compiled method in the simulator from a MethodNode (which should be
>> straightforward since the simulator exposes all of the good stuff for
>> creating new objects and instances).
>>
>> Now we can create new compiled methods in the new image as long as they
>> don't refer to any globals.
>>
>> Next, find a way of dealing with two issues: a) adding the compiled method
>> "properly" (e.g., deal with symbol interning and modifying
>> MethodDictionaries) and b) global name lookups performed by the compiler
>> (since the image is prototypical we can't have it send actual messages; not
>> even simulated ones ;-)
>>
>> The latter issue is the only one that doesn't seem completely obvious
>> which is why I would advocate that a bootstrap kernel mustn't use class
>> variables or shared pools (in which case the lookup is again trivial since
>> you know all the possible names from compiling the original structure).
>
> I don't understand why this is difficult.  Here's how I think it works.
> Every time the compiler to simulated objects creates an object that is a
> global it also creates an association for the global in the simulator's heap
> and adds the global to a suitable scope dictionary it maintains.  So it
> maintains shadow scopes for Smalltalk (or namespaces when we have them) and
> class pools etc.  Then the scope lookup mechanism uses these scopes when
> compiling methods.  Lookups for globals will find the right associations
> even though the dictionaries holding those associations don't yet exist in
> the simulator's heap.  Once enough of the bootstrap is complete the
> compiler can then create the globals (Smalltalk, non-empty class pools) and
> populate them using the associations.  The creation and hashing of the
> dictionaries is done by the simulator, but the compiler generates the
> invocations of the dictionary creation code using sequences of associations
> it extracts from its shadow scope dictionaries.
>

Right. This is exactly how I did things in CorruptVM, and it works just fine.
I start the bootstrap by sending a single message, which leads to
creating a first object, which then tries to fill all of its slots, like
vtable (class), which triggers creating a class, which triggers
creating a bunch of other objects, compiling and installing methods
in the new classes, interning symbols, etc.
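
To make the demand-driven idea concrete, here is a minimal sketch in
Python. It is not code from CorruptVM or from the Squeak simulator; the
names (Obj, Bootstrap, class_named, HIERARCHY) are hypothetical, and the
hierarchy is simplified (no ProtoObject, no methods, no symbols). It only
shows how asking for one object can pull in every class it depends on,
and why each class is registered before its slots are filled.

```python
class Obj:
    """Stand-in for an object in the new image (hypothetical)."""
    def __init__(self, name):
        self.name = name
        self.cls = None          # the class ("vtable") slot, filled on demand
        self.superclass = None   # only meaningful for classes


class Bootstrap:
    def __init__(self, superclass_of):
        self.superclass_of = superclass_of  # class name -> superclass name or None
        self.classes = {}                   # class name -> Obj
        self.created = []                   # creation order, for inspection

    def class_named(self, name):
        if name in self.classes:
            return self.classes[name]
        cls = Obj(name)
        # Register first, so the cycles in the metaclass structure terminate.
        self.classes[name] = cls
        self.created.append(name)
        if name.endswith(" class"):
            # A metaclass: its class is Metaclass, and its superclass mirrors
            # the base class's superclass (or Class at the root).
            base = name[:-len(" class")]
            parent = self.superclass_of[base]
            cls.superclass = self.class_named(parent + " class" if parent else "Class")
            cls.cls = self.class_named("Metaclass")
        else:
            parent = self.superclass_of[name]
            cls.superclass = self.class_named(parent) if parent else None
            cls.cls = self.class_named(name + " class")
        return cls

    def new_instance(self, class_name):
        obj = Obj("a " + class_name)
        obj.cls = self.class_named(class_name)  # filling this slot triggers the rest
        return obj


# Simplified kernel hierarchy (an assumption for illustration only).
HIERARCHY = {
    "Object": None,
    "Behavior": "Object",
    "ClassDescription": "Behavior",
    "Class": "ClassDescription",
    "Metaclass": "ClassDescription",
    "UndefinedObject": "Object",
}

boot = Bootstrap(HIERARCHY)
nil_obj = boot.new_instance("UndefinedObject")
print(boot.created)  # every class and metaclass that creating nil pulled in
```

Creating the single instance is the "single message" here: everything
else appears as a side effect of filling slots.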

I think the same could be done for Hydra easily. Even more easily, because
you don't have to deal with a different object format.
If you need to, you may use my code as a reference, or as a reference for
how not to do things :)
http://www.squeaksource.com/CorruptVM
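
For the shadow scopes Eliot describes in the quoted text above, a rough
sketch in Python might look like the following. All names here (SimHeap,
ShadowScope, define, lookup, materialize) are hypothetical and do not
come from the simulator or from CorruptVM. Keys are plain strings rather
than interned symbols, and the final dictionary is just a list of
associations, whereas in the real scheme the simulator would do the
hashing.

```python
class SimHeap:
    """Stand-in for the simulator's object memory (hypothetical)."""
    def __init__(self):
        self.objects = []

    def allocate(self, kind, **slots):
        oop = len(self.objects)          # use the index as the object pointer
        self.objects.append({"kind": kind, **slots})
        return oop


class ShadowScope:
    """Compiler-side scope that records bindings living in the new heap."""
    def __init__(self, heap):
        self.heap = heap
        self.bindings = {}               # global name -> oop of its Association

    def define(self, name, value_oop):
        # The association is created in the new heap right away, even though
        # the dictionary that will eventually hold it does not exist yet.
        assoc = self.heap.allocate("Association", key=name, value=value_oop)
        self.bindings[name] = assoc
        return assoc

    def lookup(self, name):
        # Used while compiling: methods refer to the association itself.
        return self.bindings[name]

    def materialize(self, kind):
        # Late in the bootstrap, build the real dictionary from the
        # associations recorded so far.
        return self.heap.allocate(kind, associations=list(self.bindings.values()))


heap = SimHeap()
smalltalk = ShadowScope(heap)
object_class = heap.allocate("Class", name="Object")
binding = smalltalk.define("Object", object_class)
literal = smalltalk.lookup("Object")     # what a compiled method would reference
dict_oop = smalltalk.materialize("SystemDictionary")
```

The point of the design is that methods compiled early and the
dictionary created late end up sharing the same association objects, so
nothing has to be patched afterwards.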

>
>> Now we can load all the source we want to be in our bootstrap image.
>>
>> Lastly, do the bootstrap: Instantiate the first process, its first
>> context, the first message. Run it in the simulator to set up the remaining
>> parts of the kernel image (Delay, ProcessorScheduler etc).
>>
>> Voila, at this point we have a fully functioning kernel image, created
>> completely from first principles.
>>
>> Once you have the kernel image there is no end to the fun: Since you can
>> now start sending messages "into" the image (by way of the simulator) you
>> can compile any code you want (incl. pools and class vars) and lookup the
>> names properly by sending a message to the interpreter simulator. And then
>> you just save the image and are ready to go.
>>
>> Anyone interested?
>
> Oh no.  No.  Not at all.  Not in the least.  No, really, no.  Um, ah, no.
>
>
>> Cheers,
>>  - Andreas
>>
>> PS. Oh, and I'd be also interested in defining a good interface to do this
>> by means of Hydra, i.e., instead of having to run the simulator run the
>> compiled VM on an "empty image" to do all of this "for real" instead of in
>> the simulator.
>>
>

-- 
Best regards,
Igor Stasenko AKA sig.