Squid plan

Mon May 12 23:31:11 UTC 2003

On Saturday 10 May 2003 15:01, Anthony Hannan wrote:
> By Smalltalk I mean real Smalltalk, not Slang.

Me too.

> But you're right, we can
> implement the Boot module in real Smalltalk, it just has to be
> translated and stored in the native executable format (eg. ELF).

The hard part is making sure it includes enough stuff so it can actually 
boot. Different versions might be interesting: booting from the network 
doesn't require being able to read local files, while booting from 
local files doesn't require being able to access the network, for 
example.

> > ["double clickable modules"]
>
> We probably still want to support launching multiple Squids.  Your
> idea above can be implemented using an alternative boot program.

Indeed, several variations could be useful as I wrote above. Multiple 
"VMs" are needed if you don't have enough security or if you can't take 
advantage of multiple processors. Otherwise I don't see the point.

> The send cache is the responsibility of the compiler not the VM.

I thought you said "there is no VM!".... oh wait... that was a movie ;-)

> Bytecodes are a universal machine code.  Its call instruction doesn't
> get translated to the native call instruction, however, because the
> stack is not in its own protected segment and it grows up, not down.
> Calls are implemented in their long form of pushing, jumping, then
> testing overflow.  The stack is a regular Smalltalk object.

So several stacks can live in a single segment? Are we talking about MMU 
segments or Squid segments here? Or is there no difference?

> The translator is like an assembler, it's a one-to-one translator from
> bytecodes to machine-code.  I'm hoping the dynamic optimizer will
> generate sufficiently fast bytecodes.  If not we can write some methods
> in straight bytecode (Smalltalk assembly).  Smalltalk assembly is how
> "primitives" will be implemented.

How about Eliot Miranda's bytecode-to-bytecode optimization project?

  http://www.stanford.edu/class/ee380/Abstracts/030312.html

> When you write your log, you have to flush to disk, don't you?  What
> about just flushing the mmap in my scheme instead?

If you have A which points to B, then make C point to B and A to nil, it 
is possible that you may flush the page where A lives and crash before 
getting to the page with C. When the system recovers, B will be garbage 
collected by mistake.
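
The scenario above can be sketched as a small simulation. This is only an 
illustration under simplifying assumptions that are mine, not part of 
either design: each object sits on its own page, an object holds a single 
pointer, and A and C are the only GC roots. The names disk, memory and 
reachable are hypothetical.

```python
def reachable(heap, roots):
    """Return the set of object names reachable from the given roots."""
    seen = set()
    stack = list(roots)
    while stack:
        name = stack.pop()
        if name is None or name in seen:
            continue
        seen.add(name)
        stack.append(heap[name])  # each object holds one pointer (or None)
    return seen

# On-disk state before the update: A -> B, C -> nil.
disk = {"A": "B", "B": None, "C": None}
memory = dict(disk)

# In-memory update: C takes over the pointer to B, then A is cleared.
memory["C"] = "B"
memory["A"] = None

# Only the page holding A reaches the disk before the crash.
disk["A"] = memory["A"]

# Recovery sees only the disk image; the running system saw memory.
live_in_memory = reachable(memory, roots=["A", "C"])
live_after_crash = reachable(disk, roots=["A", "C"])
```

In memory B stays reachable through C, but in the recovered disk image 
neither A nor C points to it, so a collector run after recovery would 
reclaim B.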

> If the application requires transaction support, this can be
> implemented at a higher level using objects. 

True, but the low level stuff should help instead of making this hard to 
implement.

> So all we care about at
> this low level is maintaining object pointer consistency.  Ie.
> suppose object A gets a pointer to object B, if object A is flushed
> to disk, then object B must be flushed to disk as well.  If a crash
> happens before the flush, the system will resume with object A
> pointing to its previous object (the state before the new pointer
> assignment).

Things are being flushed all the time, so a crash will almost certainly 
result in a "partial flush" as in my example above.

> > > Remote Pointers
> > >
> > > Cross pointers across machines must be established, maintained,
> > > and robust against failure.  When accessing a remote field or
> > > invoking a remote method, execution can move to the remote
> > > machine or the object can move or be replicated to the local
> > > machine.
> >
> > How do you decide which of the three options should be chosen?
>
> That is a hard decision and I believe it's an active area of
> research. In the meantime we can just choose one or come up with some
> simple heuristic.

In a previous design I had "segment owners", as suggested by Michael. My 
policy was:

 - if the owner of a read-only (for others) segment is not currently 
logged in to the local network, then the segment is replicated

 - if the owner is logged in and tries to access an object, the segment 
is moved to his node(s)

 - if the owner is logged in and another user tries to access an 
object, the message is forwarded to the current location of the segment

This was based on the observation that a "regular" user does not want to 
change most objects in an image. You probably won't be changing the 
bitmaps for a font you are using today, for example. So everyone in the 
network gets their own read-only copy. When the FontSmith user logs in, 
all these copies are flushed and any further access to the font is done 
remotely. So everything still works as before, though many times 
slower. And the text on your screen automatically changes as the font 
is being edited. When the FontSmith user logs out, copies are again 
handed out to everyone else and things go back to normal speed.
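
The three rules above boil down to a small decision function. This is a 
minimal sketch, not code from the original design: the function name, its 
parameters and the returned strings are all made up for illustration, and 
it assumes the segment is read-only for everyone except its owner.

```python
def segment_action(owner_logged_in, requester_is_owner):
    """Pick what happens when someone touches an object in a segment.

    Hypothetical helper; assumes the segment is read-only for non-owners.
    """
    if not owner_logged_in:
        # Nobody can change the segment right now: hand out a copy.
        return "replicate"
    if requester_is_owner:
        # The owner gets the real segment on his own node(s).
        return "move"
    # Owner is active elsewhere: send the message to the segment.
    return "forward"

# The font example: FontSmith user logged out, then logged in.
before_login = segment_action(owner_logged_in=False, requester_is_owner=False)
owner_access = segment_action(owner_logged_in=True, requester_is_owner=True)
other_access = segment_action(owner_logged_in=True, requester_is_owner=False)
```

What the sketch leaves out is the transition itself: flushing the 
existing copies when the owner logs in and handing them out again at 
logout.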

Too bad my papers about this kind of thing are in Portuguese....

> > What are the roots for this GC? What about inter segment cycles?
> > There are some very good papers at INRIA about this kind of thing.
>
> CrossPointers point to a roots array in the target segment (a la
> ImageSegments), so all roots will be in the target segment.  So the
> GC algorithm will look similar to today.

It is trivial to create new CrossPointers once you have at least one. 
Imagine A in one segment points to B in another. A sends a message to B 
which returns C, an object in B's segment. Though C wasn't part of the 
roots of its segment before, it has to be now.

The hard part is knowing when an object can be removed from the roots 
array (when the last CrossPointer to it has been eliminated).
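
One simple way to track this is to keep a count per root entry, as in the 
sketch below. This is my own illustration, not something either design 
specifies: the Segment class and its export/release methods are 
hypothetical, and it assumes that every creation and deletion of a 
CrossPointer is reliably reported to the target segment, which is exactly 
the hard part. Plain counting also does nothing about inter segment 
cycles.

```python
class Segment:
    """Hypothetical segment that counts CrossPointers per root entry."""

    def __init__(self, name):
        self.name = name
        self.roots = {}  # object name -> number of CrossPointers to it

    def export(self, obj):
        # A new CrossPointer to obj now exists in some other segment.
        self.roots[obj] = self.roots.get(obj, 0) + 1

    def release(self, obj):
        # One CrossPointer to obj was eliminated; drop the root at zero.
        self.roots[obj] -= 1
        if self.roots[obj] == 0:
            del self.roots[obj]

    def is_root(self, obj):
        return obj in self.roots

target = Segment("target")
target.export("B")   # A (elsewhere) already points to B
target.export("C")   # B returned C to A, so C must become a root
c_is_root_after_return = target.is_root("C")
target.release("C")  # A dropped its pointer to C
c_is_root_after_release = target.is_root("C")
```

After the release C leaves the roots array and the segment's local GC is 
free to collect it, while B stays rooted.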

> > In my own project I simply don't ever collect at all (in theory, at
> > least).
>
> Interesting.  I guess this is possible thanks to your selective
> reading and writing of segments.

I keep all objects around forever.

> The runtime system is written in Smalltalk and Smalltalk assembly,
> instead of Slang and C.  Calls it makes to native libraries are done
> through this C interface.

Ok, but my question was how are the native libraries linked to this? 
Dynamically (DLLs and friends)?

> > This has a lot in common with some of the stuff I am doing, so we
> > might be able to work together. In addition, I might be able to
> > hire somebody to help with this in the second half of this year.
>
> That would be great.  I think we should continue this discussion so
> we can agree on the overall design.  Do you have any links I should
> check out?

  http://www.merlintec.com:8080/software/8

This is a bit outdated, but better than nothing.

-- Jecel