[squeak-dev] The Primitive: I am not a number- I am a named prim!

Wed Jul 2 00:06:17 UTC 2008

On Tue, Jul 1, 2008 at 4:12 PM, Igor Stasenko <siguctua at gmail.com> wrote:

> +1 to moving named primitive into image
>
> +1 to make use of primitive returning value (this would affect the
> code generator, but hell, it's way better to have a notion why it
> failed than simply guessing in the dark room, like any current
> primitive failure code does)
>
> +1 to eliminating pushRemappableOop:/popRemappableOop . (A question
> however, how you would handle heap overflows, when evil primitive
> generates massive object or lots of tiny objects?)

It is much the same as it is now.  A large allocation attempt (e.g. in
primitiveNewWithArg) either succeeds, possibly causing a garbage collection
very soon after, or fails.  The garbage collector is allowed to run in
primitiveNewWithArg et al.  But allocations at any other time cannot cause
the garbage collector to run and are known to fit within the extra memory
reserved for the interpreter.  These allocations always succeed but may
exceed some threshold which will cause a garbage collection very soon
thereafter.

Memory will look something like this:
    ->  start of heap
           ... lots of objects ...
    -> young start
           ... not so many objects ...
    -> free start/eden
           ... small amount of free space (a few hundred k) ...
    -> scavenge threshold
           ... free memory (megabytes)...
    -> VM reserve start
           .... very few k (e.g. 8k) ...
    -> fwd block reserve
    -> end

The allocation pointer is free start/eden.  A new object is allocated here
and advances free start/eden.  We will dispense with freeBlock and its
header for speed.

primitiveNewWithArg (et al) can allocate up to VM reserve start.  An attempt
beyond that can trigger a garbage collection but the retry cannot exceed VM
reserve start.

Any other allocation can exceed VM reserve start but not fwd block reserve.
Exceeding fwd block reserve is an internal fatal error that should not
happen if VM reserve is large enough.

Any allocation that pushes free start/eden beyond scavenge threshold sets a
flag that will cause the VM to perform an incrementalGC on the next send or
backward branch.

The net result is that pointers only move during primitives that do
substantial allocations (primitiveNewWithArg, anything that allocates lots
of objects) and only these primitives need to use
pushRemappableOop:/popRemappableOop.

The details can vary but the essentials are to arrange that for allocations
in the course of execution (e.g. creating a block, creating the message
argument of a doesNotUnderstand:, creating a primitive failure value,
flushing a stack page to heap contexts, etc, etc) the garbage collector will
not run and so the VM does not need to manage pointers in local variables.
It can assume these values will not change and remain valid until the next
send or backward branch.
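
The allocation rules above can be modelled in a few lines of C.  This is
only a minimal sketch under stated assumptions, not the VM's code: the Heap
struct, the function names and the liveAfterGC field (a stand-in for
whatever a real collector would leave behind) are all hypothetical, and
addresses are plain offsets rather than real pointers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the heap boundaries described in the message. */
typedef struct {
    size_t freeStart;          /* allocation pointer (free start/eden) */
    size_t scavengeThreshold;  /* crossing this requests an incrementalGC */
    size_t vmReserveStart;     /* limit for primitive allocations */
    size_t fwdBlockReserve;    /* hard limit for every allocation */
    size_t liveAfterGC;        /* assumption: where freeStart ends up after a GC */
    bool   needsGC;            /* checked on the next send or backward branch */
} Heap;

/* True if n units fit between freeStart and limit (no overflow). */
static bool fits(const Heap *h, size_t n, size_t limit) {
    return h->freeStart <= limit && n <= limit - h->freeStart;
}

/* Bump allocation: hand out freeStart, advance it, flag a pending GC. */
static size_t bump(Heap *h, size_t n) {
    size_t addr = h->freeStart;
    h->freeStart += n;
    if (h->freeStart > h->scavengeThreshold)
        h->needsGC = true;
    return addr;
}

/* Stand-in for a real collector; this is the only place objects "move". */
static void collectGarbage(Heap *h) {
    h->freeStart = h->liveAfterGC;
    h->needsGC = false;
}

/* primitiveNewWithArg et al: may run the GC, may fail, and never
   allocate past vmReserveStart. */
static bool allocateInPrimitive(Heap *h, size_t n, size_t *addr) {
    if (!fits(h, n, h->vmReserveStart)) {
        collectGarbage(h);
        if (!fits(h, n, h->vmReserveStart))
            return false;
    }
    *addr = bump(h, n);
    return true;
}

/* Any other allocation: never runs the GC, may dip into the VM reserve,
   and treats running past fwdBlockReserve as an internal fatal error. */
static size_t allocateInVM(Heap *h, size_t n) {
    assert(fits(h, n, h->fwdBlockReserve));
    return bump(h, n);
}
```

In this model only allocateInPrimitive can call collectGarbage, so only its
callers would need pushRemappableOop:/popRemappableOop; callers of
allocateInVM can keep raw pointers in locals until needsGC is next checked.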

> (a little OT .. )
>
> Another question, is how would you sync there changes with Hydra (if
> you having plans of course)?

This *must* be done.  I expect we'll spend quite a lot of time communicating
to make this happen.  I am about to extend the translator for the stack
interpreter.  But basically it's my job to fit Cog into Hydra.

My own OT:

BTW, did you see my comment about renaming Hydra?  I think "String" is a
much better name. Strings are composed of threads.  Smalltalk comes from
Alan Kay's distaste for the kinds of names people were using for programming
languages in the 60's & 70's, names like Zeus and Thor.  Hence Smalltalk.
Hydra is another mythic name.  Cog is a small element in a larger whole
(and a cool advert by Honda).  Anyway, think it over...

> I have many changes in CodeGenerator and some in VMMaker, to simplify
> my day's job :)

Could you email me your current VMMaker package, or a link to it?  And could
you summarise the changes you've made and why?  I'll do the same.

> In fact, it would be good to make a clean refactoring of it:
> - make inst vars of Interpreter/ObjectMemory be the interpreter instance
> state
> - make class vars be the part of VM global state
> - make pool vars be the constants
> - same could be applied to methods (instance/class side)
>
> Currently things are little messy, since i made incremental changes to
> existing model.
>
> Another thing, which doesn't let me sleep well, is that by using
> thread specific storage, it would be possible to avoid passing
> interpreter instance to each function/primitive. Too bad, GCC (on
> windows) supports the __thread specifier only in the latest release, which
> is a 'technology preview', as they call it :)
> If we could use this, it would be possible to make Hydra nearly as
> fast as current Squeak VM.
> And there is no point to argue do we have right to use thread specific
> storage or not: consider that Hydra can run on platforms which can
> support threads. And since we already using threads, why we can't use
> thread-specific storage.

What you need is an abstraction layer, e.g. implemented with macros, that
insulates the Slang code from the platform thread system details.  You then
implement the abstraction layer as thinly as possible.  I've done this with
the threaded API extension of the VW FFI.  It's not hard.  You may be able to
find GPL versions out there.
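
As a rough illustration of such a layer, here is a minimal sketch in C.
Everything in it is an assumption made for the example: the VM_THREAD_LOCAL,
VM_SET_CURRENT_INTERPRETER and VM_CURRENT_INTERPRETER macro names, the
Interpreter struct and primitiveInterpreterId are hypothetical, and the
compiler checks reflect current toolchains rather than those of 2008.  A
platform without a thread-local keyword could implement the same two macros
on top of its thread library's per-thread storage calls instead.

```c
#include <assert.h>
#include <stddef.h>

/* Pick the platform's thread-local storage specifier in one place. */
#if defined(_MSC_VER)
#  define VM_THREAD_LOCAL __declspec(thread)
#elif defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L
#  define VM_THREAD_LOCAL _Thread_local
#else
#  define VM_THREAD_LOCAL __thread   /* GCC/Clang extension */
#endif

/* Hypothetical per-interpreter state. */
typedef struct Interpreter {
    int id;
} Interpreter;

/* One interpreter pointer per thread, hidden behind the macros below. */
static VM_THREAD_LOCAL Interpreter *vmCurrentInterpreter = NULL;

/* The only two names generated code would need to know about. */
#define VM_SET_CURRENT_INTERPRETER(interp) (vmCurrentInterpreter = (interp))
#define VM_CURRENT_INTERPRETER()           (vmCurrentInterpreter)

/* A primitive that finds its interpreter without an extra argument. */
static int primitiveInterpreterId(void) {
    Interpreter *interp = VM_CURRENT_INTERPRETER();
    assert(interp != NULL);
    return interp->id;
}
```

Each thread would call VM_SET_CURRENT_INTERPRETER once when it starts
running an interpreter; after that, primitives read the pointer through
VM_CURRENT_INTERPRETER() and never see the platform-specific keyword.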

best
Eliot

> ... (stopping OT)
>
> 2008/7/1 Eliot Miranda <eliot.miranda at gmail.com>:
> >
> >
> > On Tue, Jul 1, 2008 at 1:34 PM, tim Rowledge <tim at rowledge.org> wrote:
> >>
> >> On 1-Jul-08, at 1:20 PM, Eliot Miranda wrote:
> >>>
> >>> One doesn't have to *use* the FFI.   If the FFI isn't exposed via a
> >>> primitive then no FFI.  One can still have named primitives supported
> >>> by the image and not by the VM and not use the FFI.  To call a named
> >>> primitive in a primitive plugin the following sequence occurs:
> >>>
> >>> 1. the method containing a named primitive spec is activated and the
> >>>    primitive call fails because its function pointer is null
> >>> 2. the failure code extracts the plugin name and invokes a primitive
> >>>    to load the plugin library
> >>> 3. the failure code extracts the primitive name and uses the lookup
> >>>    primitive to find the function in the loaded plugin library
> >>> 4. the failure code uses a primitive to slam the function pointer into
> >>>    the method
> >>> 5. the failure code uses the executeMethodWithArgs primitive to retry
> >>>    the bound named primitive method
> >>>
> >>> So the FFI is an optional extra.  One needs four primitives: load
> >>> library, lookup name in library, insert primitive function pointer, and
> >>> executeMethodWithArgs (thanks Tim!).  Slamming the function into the
> >>> method could also be done using, say, objectAt:.
> >>>
> >>> So one can still have a nice small safe VM and have no direct support
> >>> for named primitives in the VM.
> >>>
> >>
> >>
> >> Leaving aside my like of all-named-prims, I *like* this enhancement to
> >> the support for named prims.
> >>
> >> It would slightly complicate the post-prim-call code in each method
> >> because you would need to handle the failed-to-find-prim case as well as
> >> all the prim-failed cases. It would be helped by an old idea that I'm
> >> pretty sure Eliot has plans for anyway (as indeed I have written about a
> >> few times) to add primitive error return values. For those that haven't
> >> heard of them, this is just a way of having the first temp in a context
> >> involved in a prim call be a slot for a return value from the prim if any
> >> error occurs. This means you can actually know wtf went wrong instead of
> >> guessing - "ooh, was the first arg a SmallInteger? Um, if not, was it a
> >> Float that might round nicely? Err, was the second arg a ByteArray with
> >> the first byte = 255?" etc. Instead you get "ah, errorValue was 1, so we
> >> have to explode the reactor core in order to stop the Dreen from eating
> >> the children". Much nicer.
> >
> > I did primitive error codes at Cadence and they'll very probably be
> > making it into Cog real soon now.  They're simpler than VisualWorks',
> > being only symbols.  So extracting more information, such as a related
> > error code, requires a subsequent call.  But I think the work I'm doing
> > right now on eliminating pushRemappableOop:/popRemappableOop will enable
> > me to have a structured object with a name and parameters, which is more
> > generally useful.
> >
> > A nice thing is that the code is forwards and backwards compatible.  One
> > can use the VM to run older images.  One can run images that contain the
> > primitive error code on older VMs, where one simply gets a nil error code
> > on primitive failure.
> >
> >
> >
> >
> >
>
>
>
> --
> Best regards,
> Igor Stasenko AKA sig.
>
>
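
The lazy binding sequence for named primitives quoted above (call fails on a
null function pointer, load the plugin, look up the function, store it in
the method, retry) can be sketched in C as follows.  This is a toy model
under stated assumptions, not the VM's or the image's code: the Method
struct, the PrimitiveFn signature, DemoPlugin, primitiveDouble and all
function names are hypothetical, and a static table stands in for the real
load-library and lookup primitives.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*PrimitiveFn)(int arg);   /* hypothetical primitive signature */

/* A method carrying a named primitive spec and a lazily bound pointer. */
typedef struct {
    const char *pluginName;
    const char *primitiveName;
    PrimitiveFn function;              /* NULL until bound */
} Method;

/* Stand-in plugin with one exported primitive. */
static int primitiveDouble(int arg) { return arg * 2; }

typedef struct { const char *name; PrimitiveFn fn; } Export;
typedef struct { const char *name; const Export *exports; size_t count; } Plugin;

static const Export demoExports[] = { { "primitiveDouble", primitiveDouble } };
static const Plugin plugins[] = { { "DemoPlugin", demoExports, 1 } };

/* Stand-in for the "load library" primitive. */
static const Plugin *loadPlugin(const char *name) {
    for (size_t i = 0; i < sizeof plugins / sizeof plugins[0]; i++)
        if (strcmp(plugins[i].name, name) == 0)
            return &plugins[i];
    return NULL;
}

/* Stand-in for the "lookup name in library" primitive. */
static PrimitiveFn lookupInPlugin(const Plugin *plugin, const char *name) {
    for (size_t i = 0; i < plugin->count; i++)
        if (strcmp(plugin->exports[i].name, name) == 0)
            return plugin->exports[i].fn;
    return NULL;
}

/* Activation: the primitive call fails if the pointer is still null. */
static int tryPrimitive(const Method *m, int arg, int *result) {
    if (m->function == NULL)
        return 0;
    *result = m->function(arg);
    return 1;
}

/* The failure code: bind on first use, then retry the method. */
static int callNamedPrimitive(Method *m, int arg, int *result) {
    assert(m != NULL && result != NULL);
    if (tryPrimitive(m, arg, result))
        return 1;                                   /* already bound */
    const Plugin *plugin = loadPlugin(m->pluginName);
    if (plugin == NULL)
        return 0;                                   /* plugin not found */
    PrimitiveFn fn = lookupInPlugin(plugin, m->primitiveName);
    if (fn == NULL)
        return 0;                                   /* primitive not found */
    m->function = fn;                               /* slam pointer into method */
    return tryPrimitive(m, arg, result);            /* retry bound method */
}
```

After the first successful call the pointer stays in the method, so later
activations take the fast path and never reach the binding code.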
