<html>

  <head>

    <meta content="text/html; charset=utf-8" http-equiv="Content-Type">

  </head>

  <body bgcolor="#FFFFFF" text="#000000">

    <br>

    <blockquote

cite="mid:CAC20JE1xD45HtH8G16Fv3c3rKcXiZ4MJ1mAf=c=LZX3B90u3GQ@mail.gmail.com"

      type="cite">


      <br>

      <div dir="ltr">Hi Stephane, Hi All,

        <div><br>

        </div>

        <div>    let me talk a little about the ParcPlace experience,

          which led to David Leibs' parcels, whose architecture Fuel

          uses.</div>

        <div><br>

        </div>

        <div>In the late 80's/early 90's Peter Deutsch wrote BOSS (Binary

          Object Storage System), a traditional interpretive pickling

          system defined by a little bytecoded language. Think of a

          bytecode as something like "What follows is an object

          definition, which is its id followed by size info followed by

          the definitions or ids of its sub-parts, including its class",

          or "What follows is the id of an already defined object".  So

          the loading interpreter looks at the next byte in the stream

          and that tells it what to do.  So the storage is a recursive

          definition of a graph, much like a recursive grammar for a

          programming language.</div>
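        <div><br>
        </div>
        <div>A minimal sketch of such an interpretive loader, in Python
          rather than Smalltalk. The opcode numbers, the Obj class and
          the token layout are invented for illustration; they are not
          BOSS's actual encoding. It shows the loader dispatching on the
          next item in the stream and recursing into sub-parts, so an
          object is visible to others while it is still half-built.</div>
        <pre>
```python
# Hypothetical opcodes; BOSS's real encoding is not given in this thread.
OP_DEFINE = 0   # followed by: id, class name, field count, then each field
OP_REF = 1      # followed by: id of an already defined object
OP_INT = 2      # followed by: a literal integer


class Obj:
    """Stand-in for a materialized object with indexed fields."""

    def __init__(self, cls_name, size):
        self.cls_name = cls_name
        self.fields = [None] * size


def load(stream, table):
    """Read one item from the token iterator, recursing into sub-parts."""
    op = next(stream)
    if op == OP_INT:
        return next(stream)
    if op == OP_REF:
        return table[next(stream)]
    if op == OP_DEFINE:
        oid, cls_name, size = next(stream), next(stream), next(stream)
        obj = Obj(cls_name, size)
        table[oid] = obj  # registered before its fields are filled in
        for i in range(size):
            # obj is only partly initialized while this loop runs
            obj.fields[i] = load(stream, table)
        return obj
    raise ValueError("unknown opcode %r" % op)


def boss_demo():
    # One two-field node whose second field points back to itself.
    tokens = [OP_DEFINE, 1, "Node", 2, OP_INT, 42, OP_REF, 1]
    return load(iter(tokens), {})
```
        </pre>
        <div>Calling boss_demo() returns a node whose first field is 42
          and whose second field is the node itself.</div>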

        <div><br>

        </div>

        <div>This approach is slow (it's a bytecode interpreter) and

          fragile (structures in the process of being built aren't valid

          yet, imagine trying to take the hash of a Set that is only

          half-way through being materialized).  But this architecture

          was very common at the time (I wrote something very similar). 

          The advantage BOSS had was a clumsy hack for versioning.  One

          could specify blocks that were supplied with the version and

          state of older objects, and these blocks could effect shape

          change etc to bring loaded instances up-to-date.</div>

        <div><br>

        </div>

        <div>David Leibs had an epiphany when, in the early 90's, ParcPlace

          was trying to decompose the VW image (chainsaw was the code

          name of the VW 2.5 release).  If one groups instances by

          class, one can instantiate in bulk, creating all the instances

          of a particular class in one go, followed by all the instances

          of a different class, etc.  Then the arc information (the

          pointers to objects to be stored in the loaded objects' inst

          vars) can follow the instance information.  So now the file

          looks like header, names of classes that are referenced (not

          defined), definitions of classes, definitions of instances

          (essentially class id, count pairs), arc information.  And

          materializing means finding the classes in the image, creating

          the classes in the file, creating the instances, stitching the

          graph together, and then performing any post-load actions

          (rehashing instances, etc).</div>
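        <div><br>
        </div>
        <div>The layout above can be sketched roughly as follows, in
          Python rather than Smalltalk. The dictionary keys, the
          serialize and materialize functions and the Point and Line
          classes are all assumptions made for illustration, not the
          real Parcel or Fuel format. The sketch only covers classes
          that already exist in the loading system, and it omits class
          definitions and post-load actions such as rehashing.</div>
        <pre>
```python
class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y


class Line:
    def __init__(self, start, end):
        self.start, self.end = start, end


def serialize(roots):
    """Return clusters (class name, count) plus separate arc records."""
    # Trace every object with instance variables reachable from the roots.
    seen, nodes, todo = set(), [], list(roots)
    while todo:
        obj = todo.pop()
        if id(obj) in seen or not hasattr(obj, "__dict__"):
            continue
        seen.add(id(obj))
        nodes.append(obj)
        todo.extend(vars(obj).values())
    # Group the instances by class, then number them in cluster order.
    groups = {}
    for obj in nodes:
        groups.setdefault(type(obj).__name__, []).append(obj)
    ordered = [obj for group in groups.values() for obj in group]
    index = {id(obj): i for i, obj in enumerate(ordered)}
    clusters = [(name, len(group)) for name, group in groups.items()]
    # The arcs follow the node information: one record per instance.
    arcs = []
    for obj in ordered:
        fields = []
        for name, value in vars(obj).items():
            if id(value) in index:
                fields.append((name, "ref", index[id(value)]))
            else:
                fields.append((name, "lit", value))
        arcs.append(fields)
    return {"clusters": clusters, "arcs": arcs}


def materialize(parcel, classes):
    """Create all instances in bulk per class, then stitch the graph."""
    objs = []
    for name, count in parcel["clusters"]:
        cls = classes[name]  # find the class in the loading system
        objs.extend(cls.__new__(cls) for _ in range(count))
    for obj, fields in zip(objs, parcel["arcs"]):
        for name, kind, value in fields:
            setattr(obj, name, objs[value] if kind == "ref" else value)
    return objs


def parcel_demo():
    line = Line(Point(1, 2), Point(3, 4))
    parcel = serialize([line])
    return parcel, materialize(parcel, {"Point": Point, "Line": Line})
```
        </pre>
        <div>Because every instance exists before any arc is filled in,
          no object ever sees a reference to something not yet
          created.</div>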

        <div><br>

        </div>

        <div>Within months we merged with Digitalk (to form

          DarcPlace-Dodgytalk) and were introduced to TeamV's loading

          model which was very much like ImageSegments, being based on

          the VM's object format.  Because an ImageSegment also has

          imports (references to classes and globals taken from the host

          system, not defined in the file) performance doesn't just

          depend on loading the segment into memory.  It also depends

          on how long it takes to search the system to find imports,

          etc.  In practice we found that a) Parcels were 4 times faster

          than BOSS, and b) they were no slower than Digitalk's image

          segments.  But being independent of the VM's heap format

          Parcels had BOSS's flexibility and could support shape change

          on load, something ImageSegments *cannot do*.  I went on to

          extend parcels with support for shape change, plus support for

          partial loading of code, but I won't describe that here.  Too

          detailed, even though it's very important.</div>

        <div><br>

        </div>

        <div>Mariano spent time talking with me and Fuel's basic

          architecture is that of parcels, but reimplemented to be

          nicer, more flexible etc.  But essentially Parcels and Fuel

          are at their core David Leibs' invention.  He came up with the

          ideas of a) grouping objects by class and b) separating the

          arcs from the nodes.</div>

      </div>

    </blockquote>

    <br>

    Indeed it was never our intention to say that it was our idea. I

    still remember the first time I loaded RB in VW30... 2 s, while

    normally loading <br>

    code took as long as cooking pasta. I remember that I was still

    waiting but the code was already loaded. It was a cool feeling. <br>

    So I always wanted to experiment with that, and one day Mariano came

    and needed a fast loader and Martin was working on ... a pickle

    format...<br>

    What a coincidence :)<br>

    <blockquote

cite="mid:CAC20JE1xD45HtH8G16Fv3c3rKcXiZ4MJ1mAf=c=LZX3B90u3GQ@mail.gmail.com"

      type="cite">

      <div dir="ltr">

        <div>Now, where ImageSegments are faster than Parcels is *not*

          loading.  Our experience with VW vs TeamV showed us that.  But

          they are faster in collecting the graph of objects to be

          included.  ImageSegments are dead simple.  So IMO the right

          architecture is to use Parcels' segregation, and Parcels'

          "abstract" format (independent of the heap object format) with

          ImageSegment's computation of the object graph.  Igor Stasenko

          has suggested providing the tracing part of ImageSegments (Dan

          Ingalls' cool invention of marking the segment root objects, then

          marking the heap, leaving the objects to be stored unmarked in

          the shadow of the marked segment roots) as a separate

          primitive.  Then this can be quickly partitioned by class and

          then written by Smalltalk code.</div>
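        <div><br>
        </div>
        <div>The marking trick can be sketched as below, in Python over
          a toy heap. The Node class and the mark and segment_members
          functions are hypothetical and stand in for what would be a VM
          primitive; nothing here is the actual ImageSegment code.
          Marking the segment roots first makes the ordinary heap mark
          stop at them, so whatever is reachable only through the roots
          stays unmarked.</div>
        <pre>
```python
class Node:
    """Toy heap object: a name plus outgoing references."""

    def __init__(self, name):
        self.name = name
        self.refs = []


def mark(start, marked):
    """Ordinary mark phase; it does not traverse past marked objects."""
    todo = list(start)
    while todo:
        obj = todo.pop()
        if id(obj) in marked:
            continue
        marked.add(id(obj))
        todo.extend(obj.refs)


def segment_members(system_roots, segment_roots):
    # Step 1: mark the segment roots so the heap mark stops at them.
    marked = {id(root) for root in segment_roots}
    # Step 2: mark the rest of the heap from the system roots.
    mark(system_roots, marked)
    # Step 3: objects still unmarked below the segment roots are the
    # ones that only the segment can reach.
    members = list(segment_roots)
    seen = set()
    todo = [child for root in segment_roots for child in root.refs]
    while todo:
        obj = todo.pop()
        if id(obj) in marked or id(obj) in seen:
            continue
        seen.add(id(obj))
        members.append(obj)
        todo.extend(obj.refs)
    return members


def segment_demo():
    system, root = Node("system"), Node("root")
    inner, shared = Node("inner"), Node("shared")
    system.refs = [root, shared]   # shared is also referenced from outside
    root.refs = [inner, shared]
    return [n.name for n in segment_members([system], [root])]
```
        </pre>
        <div>In segment_demo the object named "shared" is left out of
          the segment because something outside the roots points to it,
          while "inner" is included.</div>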

      </div>

    </blockquote>

    Maybe. For me, if the use of IS is structured (i.e. you guarantee

    that there will be no pointer into the graph from elements that are

    not in the roots)<br>

    then you may have a stable system on reload; otherwise you will have to

    decide what to do on reload, and this can be a real pain. <br>

    <br>

    <blockquote

cite="mid:CAC20JE1xD45HtH8G16Fv3c3rKcXiZ4MJ1mAf=c=LZX3B90u3GQ@mail.gmail.com"

      type="cite">

      <div dir="ltr">

        <div>The loader can then materialize objects using Smalltalk

          code, can deal with shape change, and not be significantly

          slower than image segments.  Crucially this means that one has

          a portable, long-lived object storage format, freeing the VM

          to evolve its object format without breaking image segments

          with every change to the object format.</div>

      </div>

    </blockquote>

    Oh yes! This was also what worried me. <br>

    <br>

    <blockquote

cite="mid:CAC20JE1xD45HtH8G16Fv3c3rKcXiZ4MJ1mAf=c=LZX3B90u3GQ@mail.gmail.com"

      type="cite">

      <div dir="ltr">

        <div>I'd be happy to help people working on Fuel by providing

          that primitive for anyone who wants to try and reimplement the

          ImageSegment functionality (project saving, class faulting,

          etc) above Fuel.</div>

      </div>

    </blockquote>

    <br>

    We do not have the resources for that now and will probably have fewer

    in the future because the cost of student internships doubled :(<br>

    <br>

    Stef<br>

    <blockquote

cite="mid:CAC20JE1xD45HtH8G16Fv3c3rKcXiZ4MJ1mAf=c=LZX3B90u3GQ@mail.gmail.com"

      type="cite">

      <div dir="ltr">

        <div><br>

          <div class="gmail_extra"><br>

            <div class="gmail_quote">On Wed, Oct 22, 2014 at 11:56 AM,

              Stéphane Ducasse <span dir="ltr">&lt;<a

                  moz-do-not-send="true"

                  href="mailto:stephane.ducasse@inria.fr"

                  target="_blank">stephane.ducasse@inria.fr</a>&gt;</span>

              wrote:<br>

              <blockquote class="gmail_quote" style="margin:0 0 0

                .8ex;border-left:1px #ccc solid;padding-left:1ex">What I

                can tell you is that the instability caused by just having

                one single pointer not in the root objects<br>

                pointing to an element in the segment and the

                implication of this pointer on reloaded segments, (yes I

                do not want to have two objects in memory after loading)

                ensures that we will not use the IS primitive in Pharo in

                the future. For us this is a non-feature.<br>

                <br>

                IS was a nice trick, but since having a pointer to an

                object is so cheap and the basis of our computational

                model,<br>

                it leads to unpredictable side effects. We saw

                that when Mariano worked during the first year of his

                PhD (which is a kind of LOOM revisit).<br>

                <br>

                Stef<br>

              </blockquote>

            </div>

            <br>

            <br clear="all">

            <div><br>

            </div>

            -- <br>

            best,

            <div>Eliot</div>

          </div>

        </div>

      </div>

    </blockquote>

    <br>

  </body>

</html>