[squeak-dev] Forks, forks, forks

Thu Jul 2 23:29:35 UTC 2009

2009/7/3 Eliot Miranda <eliot.miranda at gmail.com>:
> Hi Igor,
>
> On Thu, Jul 2, 2009 at 1:26 PM, Igor Stasenko <siguctua at gmail.com> wrote:
>>
>> 2009/7/2 Eliot Miranda <eliot.miranda at gmail.com>:
>> >
>> >
>> > On Thu, Jul 2, 2009 at 1:43 AM, Igor Stasenko <siguctua at gmail.com>
>> > wrote:
>> >>
>> >> 2009/7/2 Eliot Miranda <eliot.miranda at gmail.com>:
>> >> >
>> >> >
>> >> > On Thu, Jul 2, 2009 at 12:06 AM, Ralph Johnson <johnson at cs.uiuc.edu>
>> >> > wrote:
>> >> >>
>> >> >> The tendency to fork is a product of all Smalltalks, not just
>> >> >> Squeak.
>> >> >
>> >> > as others have observed in the current spate of discussion the need
>> >> > to fork can be minimised or avoided altogether by providing a small
>> >> > kernel image that can load packages, and getting into the habit of
>> >> > writing one's application as a set of packages and regularly
>> >> > building them up into an image (nightly, weekly) starting from the
>> >> > kernel.
>> >> > This approach also allows the VM and image representation to evolve
>> >> > (and even fork!) because the kernel is amenable to transformation.
>> >> > So one can create different forms of the kernel image with exactly
>> >> > the same apparent image contents, but with object representations
>> >> > adapted for specific uses, such as an extremely compact
>> >> > representation for embedded applications and a simple one for
>> >> > performance and normal development, and one can have purely
>> >> > interpreted VMs for hostile or memory-limited machines (iPhone does
>> >> > not allow one to enable execution permission on mmap'ed memory) and
>> >> > JITs for performance and normal development.
>> >> > IMO it is also easier building up a kernel than carving it out of a
>> >> > monolithic image to architect the necessary modularity support to
>> >> > allow packages to be loaded that make major transformations such as
>> >> > adding a GUI and replacing a standard i/o stack dump with an
>> >> > interactive debugger.
>> >> > I hope that if such a beast becomes available that the community
>> >> > will make the effort to port to it, which involves packaging their
>> >> > code, and we can branch to our heart's content because images will
>> >> > be constructed not constricting.
>> >>
>> >> Yes, and all Squeak forks could use such a kernel as a base, except
>> >> those that make their own changes to the VM.
>> >
>> > I'm actually working on a kernel image _so_ I can make changes to the
>> > base VM.  If the kernel is expressed first as source from which an
>> > image is generated then one can create a different image format from
>> > that source.  So my route to a faster object format, and to a fast
>> > 64-bit format, is through the kernel.  So I disagree.  I think the
>> > kernel can also be used by those working on their own VMs, and it can
>> > be much easier than using e.g. a SystemTracer-based approach.
>> >
>>
>> Agreed.
>> Except for cases like changing the core primitives (don't ask me why one
>> would do that)
>> and other basic VM<->image interfaces.
>> This is actually what I had in mind when talking about 'own changes to VM'.
>
> OK, so let me expand a little and I think you'll see that there aren't many
> exceptional cases.
> This is all being derived from John Maloney's MicroSqueak, but my kernel
> image will be much more of a MilliSqueak, because it'll include the
> compiler, a command-line interface, etc.  i.e. it'll resemble Gnu Smalltalk.
> In the image is a "shadow" class hierarchy rooted in MObject or
> MProtoObject.  This code can be in the bytecode set of the host system or
> not.  If not, you can't run it directly but instead have to interpret it
> with a special version of ContextPart.  This interpreter can allow one to
> test the code before producing a kernel image from it, and the interpreter
> can allow one to test new system primitives etc.

Exactly! I know exactly what you are talking about, because I am doing
the same in Moebius. Keeping the core class hierarchy as source in the
image is a good approach:
- it enables you to use the common Smalltalk tools (browser & friends) to develop it.
And of course, it lets you maintain it as a package.

The one huge difference between us in this regard is that I'm using a
'CV' prefix, not 'M' :)

> An additional tool (the MilliSqueak image builder) reads this class
> hierarchy and constructs an image in whatever format one defines.  It
> renames all the MFoo classes on generating the image so that in the new
> image they're called Object, ProtoObject et al, and fixes up any special
> methods such as compilerClass that have to be hacked to work as one desires
> in the host system.

Right, and again, I have envisioned nearly the same things. :)
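
To make the renaming step concrete, here is a minimal sketch in Python
(not Smalltalk, and not the actual MilliSqueak builder). The function
names and the dictionary layout are hypothetical; only the 'M' prefix
and the class names MObject, MProtoObject and MBehavior come from the
description above.

```python
# Hypothetical sketch of the "rename MFoo to Foo" step of an image builder.
# The shadow hierarchy is modelled as: class name -> superclass name (or None).

def target_name(shadow_name, prefix="M"):
    """Strip the shadow prefix: MObject -> Object, MProtoObject -> ProtoObject.

    A name is treated as a shadow class only if the prefix is followed by
    an uppercase letter, so a host name such as 'Metaclass' is left alone.
    """
    rest = shadow_name[len(prefix):]
    if shadow_name.startswith(prefix) and rest[:1].isupper():
        return rest
    return shadow_name


def rename_shadow_classes(hierarchy, prefix="M"):
    """Return a new hierarchy with class and superclass names renamed."""
    renamed = {}
    for cls, superclass in hierarchy.items():
        new_super = None if superclass is None else target_name(superclass, prefix)
        renamed[target_name(cls, prefix)] = new_super
    return renamed


shadow = {
    "MProtoObject": None,
    "MObject": "MProtoObject",
    "MBehavior": "MObject",
}
image_classes = rename_shadow_classes(shadow)
print(image_classes)
```

With a 'CV' prefix the same call would be
rename_shadow_classes(shadow, prefix="CV"). A real builder would also
have to fix up special methods such as compilerClass, which this sketch
does not attempt.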

> If one wants e.g. to be able to run MCompiler to produce methods in a
> bytecode set then there are a couple of ways to approach it.  One way is for
> MBehavior to effectively provide two method dictionaries, one holding
> methods in the host system's bytecode set, one in the new system's.  One
> wouldn't need to store the extra methodDictionary in the MBehavior
> instances, instead it would be some global dictionary maintained by the
> MilliSqueak building/maintenance code, and the MilliSqueak image builder
> would substitute the new methodDictionary for the old when producing an
> image.  The compilerClass method would be special.  It would invoke a
> compiler that compiled two versions, one using MCompiler, and one using
> Compiler, to populate the two dictionaries.  The image builder would fix up
> the compilerClass method when producing the image.  Another way is simply to
> have two copies of MCompiler, one the MCompiler from which the kernel image
> will be generated, and one, say, HostMCompiler, and just keep the two in
> sync.  HostMCompiler's methods are in the host format but it generates
> methods in the new format.  MBehavior compilerClass answers HostMCompiler,
> but the image generator fixes this up.

That is nearly the same as what I did. Yes, there should be two
compilers: one serves the host platform, as usual, while the other
produces new code & methods in a different format. (Btw, I am working
towards a unified compiler model which could serve both environments:
depending on the 'environment' object you provide, it could produce
methods for one environment or the other without much hassle.)
I'd call it a compiler module :)
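
A rough illustration of that idea, in Python rather than Smalltalk: one
compile routine, parameterised by an environment object that decides the
encoding and owns its own method dictionary. Everything here is an
assumption made for the example: the Environment class, compile_method,
the abstract instruction names and the opcode byte values are arbitrary
placeholders, not any real Squeak bytecode set.

```python
from dataclasses import dataclass, field


@dataclass
class Environment:
    """Hypothetical compilation target: an opcode table plus a method dictionary."""
    name: str
    opcodes: dict                      # abstract instruction name -> byte value
    methods: dict = field(default_factory=dict)


def compile_method(selector, instructions, env):
    """Encode abstract instructions with env's opcode table and install the result."""
    code = bytes(env.opcodes[instruction] for instruction in instructions)
    env.methods[selector] = code
    return code


# Placeholder opcode values; they do not correspond to a real bytecode set.
host = Environment("host", {"pushSelf": 0x10, "returnTop": 0x11})
target = Environment("target", {"pushSelf": 0xA0, "returnTop": 0xA1})

# The same source is compiled once per environment, filling two dictionaries.
for env in (host, target):
    compile_method("yourself", ["pushSelf", "returnTop"], env)

print(host.methods["yourself"].hex(), target.methods["yourself"].hex())
```

The two method dictionaries stay separate, so an image builder could pick
the target one when writing the new image, while the host keeps running
its own versions.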

In order to minimize the glue code (for things like the ones
you describe: compilerClass, host format and so on),
the format of objects in newly generated memory is extracted from
the behavior of special classes, which I call Species.
So, by gathering these together:
- class hierarchy
- modular compiler
- environment (aka 'image builder')
- species

you know everything about the object formats and can work
transparently in both domains (host/target), because the code I put in
Species is invariant to the environment it runs in and
depends only on the arguments you pass to it.

In this way, I implemented things like #ivarAt:, #ivarAt:put:,
#header and the rest of such stuff in Species in a functional (stateless)
way, so all you have to do is compile (and in the case of VMMaker,
translate) this code into the VM, and voila, you have your own object
format.
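
As a toy illustration of what "stateless" means here (in Python, and not
the actual Moebius code): the accessors below keep no state of their own
and depend only on the arguments passed in, so the same code could run
against host memory or a target image buffer. The class names, the flat
list used as memory and the header sizes are all assumptions made for
the example.

```python
class CompactSpecies:
    """Hypothetical object format: one header word, then the instance variables."""
    HEADER_WORDS = 1

    @staticmethod
    def header(memory, oop):
        # The header is the first word of the object.
        return memory[oop]

    @classmethod
    def ivar_at(cls, memory, oop, index):
        # 1-based index, as with #ivarAt: in the text above.
        return memory[oop + cls.HEADER_WORDS + index - 1]

    @classmethod
    def ivar_at_put(cls, memory, oop, index, value):
        memory[oop + cls.HEADER_WORDS + index - 1] = value
        return value


class WideSpecies(CompactSpecies):
    """A second hypothetical format that differs only in header size."""
    HEADER_WORDS = 2


memory = [0] * 8                                # stand-in for an image buffer
memory[0] = 0xBEEF                              # header word of the object at oop 0
CompactSpecies.ivar_at_put(memory, 0, 1, 42)    # first ivar of the object at oop 0
WideSpecies.ivar_at_put(memory, 4, 1, 7)        # first ivar of the object at oop 4
print(memory)
```

Changing the object format then amounts to supplying a different
species; the callers of ivar_at and ivar_at_put stay the same.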

> An additional VMMaker package allows one to develop a new VM for that
> format, and at least check if it starts up.  The VM simulator could actually
> provide i/o if the image is headless.  It would be easy to write a little
> console for the VM simulator to do i/o with the simulated headless image.
>  Then the new VM would be the means by which one executed the new bytecode
> and primitive set.
> This is all half-baked but you get the idea.  There's lots of ways to pull
> this off.  And I'm sure there are some tricky bits here (e.g. making
> senders/implementors work in the host on the new format methods).  But the
> idea of having a source tree for the image one wants is a good one, and it
> means that the kernel image is produced from source, not by transformation
> of some existing image.  One can choose precisely the base class library,
> the bytecode set, etc and one can test it in image using either the VM
> simulator and/or by constructing a special Context interpreter.
> If one uses a special Context interpreter, which one needs to develop anyway
> if one is developing a new bytecode set because in the new kernel
> InstructionStream et al must interpret the new bytecode instruction set,
> then it also needs either the two dictionary or the duplication approach.  I
> haven't decided which is best yet.
>
> If this scheme can be made to work I think one will be able to experiment to
> one's heart's content and have considerable freedom experimenting with
> different kernel images, for example ones that implement a novel namespace
> scheme, or a new language, or simply a well-defined ANSI subset, or a format
> making good use of 64-bits, or...
>

Eliot, just say so if you would like any help with that. I see we have a
lot of synergy in our ideas about how things should be made.
So I'd like to offer my help to bring this bright future a bit closer :)

>> >>

[snip]

-- 
Best regards,
Igor Stasenko AKA sig.