Squid plan

Sat May 10 00:42:31 UTC 2003

On Thursday 08 May 2003 23:21, Anthony Hannan wrote:
> The plan is to implement the following modules (and related ones not
> yet realized).  If anyone wants to help, please let me know. 

This has a lot in common with some of the stuff I am doing, so we might 
be able to work together. In addition, I might be able to hire somebody 
to help with this in the second half of this year.

> Also,
> if anyone has any feedback, please respond.  All modules except the
> Boot modules shall be implemented in Smalltalk.

Why not Boot as well? You can bootstrap it from another Smalltalk as was 
done originally for Squeak.

> After we have a
> working Squid kernel then we can port Squeak code to it.

"Port Squeak code" can have several different meanings, all interesting. 
It might mean being able to file in .cs files created in Squeak. Or 
being able to read project files. Or even whole images.

> Boot
>
> Parses the command line, loads the named squid file, and executes its
> boot method with the rest of the command line string as its argument.

One interesting variation I had in Merlin was to check if the system was 
already running. If so, the command line is passed on to it and the 
boot module dies. That made the other module files look like 
double-clickable applications in Windows or Mac. Of course, for this illusion 
to work you need to be able to have multiple "native" windows.
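
A minimal Python sketch of that hand-off, not the Merlin code. The loopback TCP socket, the port handling, and the function names are all assumptions made for illustration; any local IPC mechanism would do.

```python
import socket

HOST = "127.0.0.1"  # assumption: instances only talk over the loopback interface


def start_listening(port=0):
    """Become the running instance: open the socket later boots will contact."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind((HOST, port))  # port 0 lets the OS pick a free port
    server.listen()
    return server


def forward_command_line(command_line, port):
    """Hand the command line to a running instance; False if none answers."""
    try:
        with socket.create_connection((HOST, port), timeout=1.0) as conn:
            conn.sendall(command_line.encode("utf-8"))
        return True
    except OSError:
        return False


def receive_command_line(server):
    """Running-instance side: read one forwarded command line."""
    conn, _ = server.accept()
    with conn:
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # the sender closed its end, so the message is complete
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8")
```

A boot module built this way would call forward_command_line first and exit if it returns True; otherwise it would call start_listening and go on to load the named file itself.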

> Method Execution
>
> Method bytecodes are translated to machine code, if not already, and
> then executed.  Each method maintains a pointer to its machine code.
> Translated code still manipulates its own Smalltalk object stacks.
> Bytecodes are low-level like machine code.

Would this be essentially the send cache?
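
To make the question concrete, here is a small Python sketch of "translate if not already, then execute" with a per-method pointer. The two-instruction bytecode set, the Translator class, and the use of a Python closure as a stand-in for machine code are assumptions for illustration only.

```python
class Translator:
    """Stand-in translator: turns a bytecode list into a Python closure."""

    def __init__(self):
        self.translations = 0  # counts how often real translation work happens

    def translate(self, bytecodes):
        self.translations += 1
        ops = tuple(bytecodes)

        def machine_code(stack):
            # The translated code still works on an explicit object stack.
            for op in ops:
                if op[0] == "push":
                    stack.append(op[1])
                elif op[0] == "add":
                    right, left = stack.pop(), stack.pop()
                    stack.append(left + right)
                else:
                    raise ValueError("unknown bytecode: %r" % (op,))
            return stack

        return machine_code


class Method:
    def __init__(self, bytecodes):
        self.bytecodes = bytecodes
        self.machine_code = None  # the per-method pointer, empty until first run


def execute(method, translator, stack):
    """Translate on first use, then always run the cached translation."""
    if method.machine_code is None:
        method.machine_code = translator.translate(method.bytecodes)
    return method.machine_code(stack)
```

Note that this only caches the result of translation. It says nothing about how a send finds the method in the first place, which is the part a send cache would normally cover.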

> Translator
>
> The translator translates bytecodes to machine code using a
> platform-specific machine code generator (conversion table).  The
> translator and generator are written in Smalltalk.

What kinds of optimizations are you considering? Since the translator 
translates itself (I suppose), this will have a great impact on 
system performance.

> Segments
>
> Each squid file contains a segment of objects.  Objects can point to
> objects in other segments, using cross pointers.  Following a cross
> pointer causes the target segment to be loaded, if not already. 
> Files are kept in sync with their loaded segments (mmap)
> maintaining persistence without explicit saving.  Cross pointers may
> cross machine boundaries.

In my design, segments are compressed. When a segment is loaded into 
memory, only those objects actually referenced get expanded into the 
heap. When they are changed, they get written into a log instead of 
going back into the compressed segment (probably wouldn't fit anyway).
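
A rough Python sketch of that scheme, under several assumptions that are not part of the actual design: objects are plain dictionaries, each object is compressed separately with zlib so it can be expanded on its own, and the log is a list of JSON lines.

```python
import json
import zlib


class Segment:
    def __init__(self, objects):
        # Assumption: one compressed blob per object, keyed by object id.
        self.compressed = {
            oid: zlib.compress(json.dumps(obj).encode("utf-8"))
            for oid, obj in objects.items()
        }
        self.heap = {}  # only objects that were actually referenced
        self.log = []   # changes, in order; the compressed data is never rewritten

    def fetch(self, oid):
        """Expand an object into the heap the first time it is referenced."""
        if oid not in self.heap:
            raw = zlib.decompress(self.compressed[oid])
            self.heap[oid] = json.loads(raw.decode("utf-8"))
        return self.heap[oid]

    def store(self, oid, field, value):
        """Change a field in the heap and record the change in the log."""
        self.fetch(oid)[field] = value
        self.log.append(json.dumps({"oid": oid, "field": field, "value": value}))

    def replay(self, log_lines):
        """Rebuild the current state from the compressed segment plus a log."""
        for line in log_lines:
            entry = json.loads(line)
            self.fetch(entry["oid"])[entry["field"]] = entry["value"]
```

Loading then means reading the compressed segment and replaying whatever log entries exist, so the original segment file stays untouched between compactions.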

Even mmap takes a finite time to update the disk. Please consider the 
effects of a crash.

> Remote Pointers
>
> Cross pointers across machines must be established, maintained, and
> robust against failure.  When accessing a remote field or invoking a
> remote method, execution can move to the remote machine or the
> object can move or be replicated to the local machine.

How do you decide which of the three options should be chosen? Are 
remote pointers handled explicitly in the applications or do all 
objects look alike?

> Garbage Collection
>
> Garbage collection is performed on each segment individually.  It is
> written in Smalltalk and runs in its own segment, thus allowing the
> algorithm to utilize full Smalltalk power, i.e. object creation.

What are the roots for this GC? What about inter segment cycles? There 
are some very good papers at INRIA about this kind of thing.
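
To illustrate why the cycle question matters, here is a small Python sketch of a per-segment mark phase. It assumes, purely for illustration, that every cross pointer coming in from another segment is treated as a root; the plan does not say this, and the heap layout and function names are hypothetical.

```python
def incoming_refs(heap, target):
    """Objects in `target` that are pointed to from any other segment."""
    found = set()
    for seg_name, objects in heap.items():
        if seg_name == target:
            continue
        for refs in objects.values():
            for ref_seg, ref_obj in refs:
                if ref_seg == target:
                    found.add(ref_obj)
    return found


def live_in_segment(heap, target, local_roots):
    """Mark phase for one segment; cross pointers into it count as roots."""
    pending = list(set(local_roots) | incoming_refs(heap, target))
    live = set()
    while pending:
        name = pending.pop()
        if name in live:
            continue
        live.add(name)
        for ref_seg, ref_obj in heap[target][name]:
            if ref_seg == target:  # pointers leaving the segment are not followed
                pending.append(ref_obj)
    return live


# heap maps segment name -> {object name: [(segment, object) references]}
heap = {
    "A": {"a1": [("B", "b1")], "a2": []},
    "B": {"b1": [("A", "a1")]},
}
```

With no local roots at all, live_in_segment(heap, "A", []) still keeps a1 and live_in_segment(heap, "B", []) still keeps b1, because each is held by a cross pointer from the other segment. Only a2 is reclaimed, so a garbage cycle that spans segments is never collected by per-segment passes alone.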

In my own project I simply don't ever collect at all (in theory, at 
least).

> C Interface
>
> There is no VM.  The native OS and machine are accessed via C library
> functions called directly from Smalltalk.  C calls transfer args from
> the Smalltalk stack to the C stack and vice versa.

How is that library linked to the rest of the runtime system?

> Modules
>
> A Smalltalk module defines a set of selector methods, classes, and
> class methods, and imports a set of other modules whose public
> methods and classes are accessible.  A selector method is a method
> that stands alone and is called directly.  A class method overrides a
> selector method and cannot be called directly.  A selector method may
> delegate to the receiver's class if desired.  Only class methods can
> access fields of the receiver.  Delegation looks up the class method
> in the sender's module (including visible imports) and the class's
> module.

"A class method...cannot be called directly" ever? From outside of the 
module? But it seems that instances of a class are outside of the 
module that contains it. I am confused.

> Compiler
>
> Translate Smalltalk source into Squid bytecodes.  The closure
> compiler is a start for this.

This is the easy part :-)

> Dynamic Optimizer
>
> Add profilers and inline heavily used code.  The inlined code is
> bytecode which is then translated to machine code.

Shouldn't this be part of the Translator?

-- Jecel