<html>
  <head>
    <meta content="text/html; charset=ISO-8859-1"
      http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    On 22.09.2011 21:57, Eliot Miranda wrote:
    <blockquote
cite="mid:CAC20JE0CV86FKa3YPf06JBW18FPtcxaddfc1yPasZxOkMBgbUw@mail.gmail.com"
      type="cite">
      <div>On Thu, Sep 22, 2011 at 12:29 PM, Henrik Sperre Johansen <span>&lt;<a
            moz-do-not-send="true"
            href="mailto:henrik.s.johansen@veloxit.no">henrik.s.johansen@veloxit.no</a>&gt;</span>
        wrote:<br>
        <blockquote>
          &nbsp;<br>
          <div> On 22.09.2011 20:20, Eliot Miranda wrote:
            <blockquote type="cite">
              <div>On Thu, Sep 22, 2011 at 11:06 AM, Eliot Miranda <span>&lt;<a
                    moz-do-not-send="true"
                    href="mailto:eliot.miranda@gmail.com">eliot.miranda@gmail.com</a>&gt;</span>
                wrote:<br>
                <blockquote> Hi Igor,<br>
                  <br>
                  <div>
                    <div>
                      <div>On Thu, Sep 22, 2011 at 10:53 AM, Igor
                        Stasenko <span>&lt;<a moz-do-not-send="true"
                            href="mailto:siguctua@gmail.com">siguctua@gmail.com</a>&gt;</span>
                        wrote:<br>
                        <blockquote>
                          <div>
                            <div>On 22 September 2011 19:16, Eliot
                              Miranda &lt;<a moz-do-not-send="true"
                                href="mailto:eliot.miranda@gmail.com">eliot.miranda@gmail.com</a>&gt;

                              wrote:<br>
                              &gt; (apologies for the duplicate reply;
                              someone needs to sort out their<br>
                              &gt; threading for the benefit of the
                              community ;) )<br>
                              &gt;<br>
                              &gt; On Thu, Sep 22, 2011 at 2:36 AM,
                              Marcus Denker &lt;<a
                                moz-do-not-send="true"
                                href="mailto:marcus.denker@inria.fr">marcus.denker@inria.fr</a>&gt;<br>
                              &gt; wrote:<br>
                              &gt;&gt;<br>
                              &gt;&gt; Hi,<br>
                              &gt;&gt;<br>
                              &gt;&gt; There are two changesets waiting
                              for integration into 1.4 that have serious<br>
                              &gt;&gt; consequences:<br>
                              &gt;&gt;<br>
                              &gt;&gt; - Ephemerons. The VM level
                              changes are in the Cog VMs built on
                              Jenkins,<br>
                              &gt;&gt; but have not<br>
                              &gt;&gt; &nbsp;been integrated in the VMMaker
                              codebase.<br>
                              &gt;&gt;<br>
                              &gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp;<a moz-do-not-send="true"
href="http://code.google.com/p/pharo/issues/detail?id=4265">http://code.google.com/p/pharo/issues/detail?id=4265</a><br>
                              &gt;<br>
                              &gt; I would *really* like to back out
                              these changes. &nbsp;The Ephemeron<br>
                              &gt; implementation is very much a
                              prototype, requiring a hack to determine<br>
                              &gt; whether an object is an ephemeron
                              (the presence of a &nbsp;marker class in the<br>
                              &gt; first inst var) that I'm not at all
                              happy with. &nbsp;There is a neater<br>
                              &gt; implementation available using an
                              unused instSpec which IMO has<br>
                              &gt; significant advantages (much simpler
                              &amp; faster, instSpec is valid at all<br>
                              &gt; times, including during compaction,
                              less overhead, doesn't require a marker<br>
                              &gt; class), and is the route I'm taking
                              with the new GC/object-representation<br>
                              &gt; I'm working on now. &nbsp;Note that other
                              than determining whether an object is<br>
                              &gt; an ephemeron (instSpec/format vs inst
                              var test) the rest of Igor's code<br>
                              &gt; remains the same. &nbsp;I'd like to avoid
                              too much VM forking. &nbsp;Would you all<br>
                              &gt; consider putting these changes on
                              hold for now?<br>
                              &gt; If so, I'll make the effort to
                              produce prototype changes (in the area of<br>
                              &gt; ClassBuilder and class definition; no
                              VM code necessary as yet) to allow<br>
                              &gt; defining Ephemerons via the instSpec
                              route by next week at the latest.<br>
                              &gt;<br>
                              <br>
                            </div>
                          </div>
                          I agree that this is a weak point in my
                          implementation. But it's hard<br>
                          to do anything without<br>
                          changing the object format to identify
                          these special objects.<br>
                          <br>
                          The main question behind this is: can we afford to
                          change the internals of<br>
                          the VM without being beaten hard<br>
                          by the "backwards compatibility" party? :)<br>
                        </blockquote>
                        <div><br>
                        </div>
                      </div>
                    </div>
                    <div>I don't think we'd get stuck on this at all. &nbsp;The
                      instSpec/format field has an unused value (5, I
                      believe) and this can easily be used for
                      Ephemerons. All that is needed is a little image
                      work on these methods:</div>
                    <div><br>
                    </div>
                    <div>&nbsp; &nbsp; Behavior&gt;&gt;typeOfClass</div>
                    <div>&nbsp; &nbsp; &nbsp; &nbsp; needs to answer e.g. #ephemeron for
                      ephemeron classes</div>
                    <div><br>
                    </div>
                    <div>&nbsp; &nbsp;
                      ClassBuilder&gt;&gt;computeFormat:instSize:forSuper:ccIndex:</div>
                    <div> &nbsp; &nbsp; &nbsp; &nbsp; needs to accept e.g. #ephemeron for
                      type and pass variable: false and weak: true for
                      ephemerons
                      to&nbsp;format:variable:words:pointers:weak:.</div>
                    <div><br>
                    </div>
                    <div>&nbsp; &nbsp;
                      ClassBuilder&gt;&gt;format:variable:words:pointers:weak:</div>
                    <div>&nbsp; &nbsp; &nbsp; &nbsp; needs to respond to&nbsp;variable: false and
                      weak: true by computing the ephemeron instSpec.</div>
                    <div><br>
                    </div>
                    <div>&nbsp; &nbsp;
Class&gt;&gt;weakSubclass:instanceVariableNames:classVariableNames:poolDictionaries:category:</div>
                    <div>&nbsp; &nbsp;
ClassBuilder&gt;&gt;superclass:weakSubclass:instanceVariableNames:classVariableNames:poolDictionaries:category:</div>
                    <div>&nbsp; &nbsp; &nbsp; &nbsp; need siblings, e.g.</div>
                    <div>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;
ephemeronSubclass:instanceVariableNames:classVariableNames:poolDictionaries:category:</div>
                    <div>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;
superclass:ephemeronSubclass:instanceVariableNames:classVariableNames:poolDictionaries:category:</div>
                    <div><br>
                    </div>
                    <div>Right? &nbsp;This is easy. &nbsp;Then in the VM there are
                      a few places where the tests for pointer indexability
                      (formats 3 and 4) need to be firmed up to exclude 5, but
                      nothing difficult. &nbsp;We talked about this in email
                      last week.</div>
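                    <div>A minimal sketch in C of what "firming up" could look
                      like, assuming the pointer formats 0 to 4 discussed in
                      this thread plus the proposed value 5 for ephemerons.
                      The enum and function names are hypothetical and are
                      not taken from the VM source.</div>

```c
#include <assert.h>
#include <stdbool.h>

/* Pointer formats as described in this thread; 5 is the proposed value. */
enum {
    FormatEmpty          = 0, /* zero-sized objects */
    FormatFixed          = 1, /* non-indexable, inst vars */
    FormatIndexable      = 2, /* indexable, no inst vars */
    FormatIndexableFixed = 3, /* indexable, inst vars */
    FormatWeak           = 4, /* weak indexable, inst vars */
    FormatEphemeron      = 5  /* proposed: weak non-indexable, inst vars */
};

/* All formats up to and including 5 hold object pointers. */
static bool hasPointerFields(unsigned fmt)
{
    assert(fmt <= 15); /* the current format field is 4 bits wide */
    return fmt <= FormatEphemeron;
}

/* The thread singles out formats 3 and 4; format 2 is included here
   because it is also an indexable pointer format. The upper bound stops
   at 4 so that ephemerons (5) are not treated as indexable. */
static bool isIndexablePointerFormat(unsigned fmt)
{
    assert(fmt <= 15);
    return fmt >= FormatIndexable && fmt <= FormatWeak;
}

/* Both weak arrays and ephemerons need weak treatment by the GC. */
static bool isWeakFormat(unsigned fmt)
{
    assert(fmt <= 15);
    return fmt == FormatWeak || fmt == FormatEphemeron;
}
```

                    <div>The point of the sketch is only that a loose range
                      test such as "format &lt;= 5" would be right for
                      "has pointer fields" but wrong for "is indexable".</div>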
                  </div>
                </blockquote>
                <div><br>
                </div>
                <div>Here's the format field (Behavior&gt;&gt;instSpec at
                  the image level) as currently populated:</div>
                <div>&nbsp;</div>
                <div>
                  <div>&nbsp; 0 = 0 sized objects (UndefinedObject True False
                    et al)</div>
                  <div>&nbsp; 1 = non-indexable objects with inst vars (Point
                    et al)</div>
                  <div>&nbsp; 2 = indexable objects with no inst vars (Array
                    et al)</div>
                  <div>&nbsp; 3 = indexable objects with inst vars
                    (MethodContext AdditionalMethodState et al)</div>
                  <div>&nbsp; 4 = weak indexable objects with inst vars
                    (WeakArray et al)</div>
                </div>
                <div>&nbsp; 6 = 32-bit indexable objects (Float, Bitmap et
                  al)</div>
                <div>&nbsp; 8 = 8-bit indexable objects (ByteString,
                  ByteArray et al)</div>
                <div>12 = CompiledMethod</div>
                <div><br>
                </div>
                <div>N.B. in the VM the two least significant bits of the
                  format/instSpec for byte objects (formats 8 and 12) are
                  used to encode the number of odd bytes in the object,
                  so that a 1-character ByteString has a format of 11
                  (= 8 + 3): size = 1 word - 3 bytes.</div>
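                <div>The odd-bytes encoding can be sketched in C as follows.
                  This assumes a 32-bit image (4 bytes per word); the
                  function names are hypothetical, chosen for illustration
                  only.</div>

```c
#include <assert.h>

enum { BytesPerWord = 4 }; /* assumption: 32-bit image */

/* For byte formats (8..15) the low two bits count the unused ("odd")
   bytes in the last word of the object. */
static unsigned oddBytes(unsigned fmt)
{
    assert(fmt >= 8 && fmt <= 15);
    return fmt & 3;
}

/* Strip the odd-byte count: 8 = byte object, 12 = CompiledMethod. */
static unsigned baseFormat(unsigned fmt)
{
    assert(fmt >= 8 && fmt <= 15);
    return fmt & ~3u;
}

/* Size in bytes of an object whose body occupies slotWords words. */
static unsigned byteSize(unsigned fmt, unsigned slotWords)
{
    return slotWords * BytesPerWord - oddBytes(fmt);
}

/* Inverse direction: the format for a byte object holding nBytes bytes. */
static unsigned byteFormatFor(unsigned base, unsigned nBytes)
{
    assert(base == 8 || base == 12);
    return base + (BytesPerWord - nBytes % BytesPerWord) % BytesPerWord;
}
```

                <div>With these definitions, byteFormatFor(8, 1) yields 11 and
                  byteSize(11, 1) yields 1, matching the ByteString example
                  above.</div>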
                <div><br>
                </div>
                <div><br>
                </div>
                <div>For the future (i.e. the new GC/object
                  representation, /not/ for the first implementation of
                  ephemerons which we can do now, for Pharo 1.4 or 1.5)
                  we need to extend format/instSpec to support 64 bits.
                  &nbsp;I think format needs to be made a 5 bit field with
                  room for 4 bits of odd bytes for 64-bit images. &nbsp;(For
                  VMers, the Size4Bit is a horrible hack.) &nbsp;So then</div>
                <div><br>
                </div>
                <div>0 = 0 sized objects (UndefinedObject True False et
                  al)</div>
                <div>1 = non-indexable objects with inst vars (Point et
                  al)</div>
                <div>2 = indexable objects with no inst vars (Array et
                  al)</div>
                <div>3 = indexable objects with inst vars (MethodContext
                  AdditionalMethodState et al)</div>
                <div>4 = weak indexable objects with inst vars
                  (WeakArray et al)</div>
                <div>5 = weak non-indexable objects with inst vars
                  (ephemerons) (Ephemeron)</div>
                <div><br>
                </div>
                <div>and we need 8 CompiledMethod values, 8 byte values,
                  4 16-bit values, 2 32-bit values and a 64-bit value, =
                  23 values; with the 6 values above (0 - 5) that makes 29
                  of the 32 values in a 5-bit field, so there is room, e.g.</div>
                <div><br>
                </div>
                <div>9 (?) 64-bit indexable</div>
                <div>10 - 11 32-bit indexable</div>
                <div>12 - 15 16-bit indexable</div>
                <div>16 - 23 byte indexable</div>
                <div>24 - 31 compiled method</div>
                <div><br>
                </div>
                <div>In 32-bit images only the least significant 2 bits
                  would be used for formats 16 &amp; 24, and the least
                  significant bit for format 12.<br>
                </div>
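                <div>A minimal sketch in C of how the proposed 5-bit format
                  ranges could be decoded. The ranges are the ones listed
                  above (with 9 still tentative); the type and function
                  names are hypothetical, and the odd-element counts assume
                  a 64-bit image.</div>

```c
#include <assert.h>

typedef enum {
    KindPointers,       /* 0 - 5 */
    KindUnassigned,     /* 6 - 8, not allocated in the proposal */
    KindBits64,         /* 9 (tentative) */
    KindBits32,         /* 10 - 11 */
    KindBits16,         /* 12 - 15 */
    KindBytes,          /* 16 - 23 */
    KindCompiledMethod  /* 24 - 31 */
} FormatKind;

static FormatKind kindOfFormat(unsigned fmt)
{
    assert(fmt <= 31); /* 5-bit field */
    if (fmt <= 5)  return KindPointers;
    if (fmt <= 8)  return KindUnassigned;
    if (fmt == 9)  return KindBits64;
    if (fmt <= 11) return KindBits32;
    if (fmt <= 15) return KindBits16;
    if (fmt <= 23) return KindBytes;
    return KindCompiledMethod;
}

/* Number of unused trailing elements in the last 64-bit word,
   taken from the low bits of the format. */
static unsigned oddElements(unsigned fmt)
{
    switch (kindOfFormat(fmt)) {
    case KindBits32:         return fmt & 1; /* 0 or 1 odd 32-bit element */
    case KindBits16:         return fmt & 3; /* 0..3 odd 16-bit elements */
    case KindBytes:
    case KindCompiledMethod: return fmt & 7; /* 0..7 odd bytes */
    default:                 return 0;
    }
}
```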
              </div>
            </blockquote>
            If we are changing the format for 64-bit images anyway, why
            not simplify it and be more consistent by spending a full byte?<br>
            <br>
            Bit 8: 64bit | Bit 7: 32bit | Bit 6: 16bit | Bit 5: 8bit |
            Bit 4: compiled | Bit 3: weak | Bit 2: indexable | Bit 1: instVars<br>
            (Odd number encoded in remaining indexable bit fields)<br>
          </div>
        </blockquote>
        <div><br>
        </div>
        <div>I used to prefer this approach but I've realised that the
          format/instSpec approach (I think Dan came up with) makes
          better use of bits because so many of the bit combinations are
          mutually exclusive. &nbsp;For example, pointers excludes all the
          byte/short/32-bit/64-bit indexability combinations. &nbsp;Also, see
          below...</div>
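        <div>To illustrate the mutual-exclusion argument, here is a rough
          sketch in C that counts how many of the 256 values of a
          one-bit-per-property byte would be meaningful. The two rules it
          applies (at most one element width, and weak objects never have
          a bits width) are simplifying assumptions made for this
          illustration, not a specification, and all names are
          hypothetical.</div>

```c
#include <stdbool.h>

/* One flag per property, following the byte-wide layout proposed in the
   quoted message (bit 1 there is the least significant bit here). */
enum {
    HasInstVars = 1 << 0,
    IsIndexable = 1 << 1,
    IsWeak      = 1 << 2,
    IsCompiled  = 1 << 3,
    Is8Bit      = 1 << 4,
    Is16Bit     = 1 << 5,
    Is32Bit     = 1 << 6,
    Is64Bit     = 1 << 7
};

#define WIDTH_MASK (Is8Bit | Is16Bit | Is32Bit | Is64Bit)

/* Assumed rules: an object has at most one element width, and a weak
   object holds pointers, so it has no element width at all. */
static bool isMeaningful(unsigned flags)
{
    unsigned widths = flags & WIDTH_MASK;
    bool atMostOneWidth = (widths & (widths - 1)) == 0;
    bool weakWithBits = (flags & IsWeak) != 0 && widths != 0;
    return atMostOneWidth && !weakWithBits;
}

/* Count the byte values that survive the assumed rules. */
static unsigned countMeaningful(void)
{
    unsigned n = 0;
    for (unsigned f = 0; f < 256; f++)
        if (isMeaningful(f))
            n++;
    return n;
}
```

        <div>Under these two assumed rules alone, countMeaningful() reports
          48 of the 256 combinations, and further exclusions would lower
          that count again.</div>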
        <div>&nbsp;</div>
        <blockquote>
          <div> <br>
            You could get away with 7 bits if you used e.g. the unused
            indexable weak combination (6) for compiled method/8bit<br>
            <br>
            Or is the header space in your new 64bit format already
            quite filled, so this is a bad idea?<br>
          </div>
        </blockquote>
        <div><br>
        </div>
        <div>Yes, ish. &nbsp;But header bits are scarce, and very useful for
          experiments etc. &nbsp;Right now I have&nbsp;</div>
        <div><br>
        </div>
        <div>
          <div>typedef struct {</div>
          <div>&nbsp; &nbsp; unsigned short classIndex;</div>
          <div>&nbsp; &nbsp; unsigned unused0 : 6;</div>
          <div>&nbsp; &nbsp; unsigned isPinned : 1;</div>
          <div>&nbsp; &nbsp; unsigned isImmutable : 1;</div>
          <div>&nbsp; &nbsp; unsigned format : 5; &nbsp; /* on a byte boundary */</div>
          <div>&nbsp; &nbsp; unsigned isMarked : 1;</div>
          <div>&nbsp; &nbsp; unsigned isGrey : 1;</div>
          <div>&nbsp; &nbsp; unsigned isRemembered : 1;</div>
          <div>&nbsp; &nbsp; unsigned objHash : 24; &nbsp; /* on a 32-bit word boundary */</div>
          <div>&nbsp; &nbsp; unsigned char slotSize; &nbsp; /* on a byte boundary */</div>
          <div>} CogObjectHeader;</div>
        </div>
        <div><br>
        </div>
        <div>Where classIndex is 16 bits simply for efficiency and will
          grow to 20 or 22 bits as needed. &nbsp;So one could steal one or
          two bits from unused0 and two bits from objHash, and give
          these to format, but it would be a waste. &nbsp;Better to keep these
          back for other uses.</div>
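        <div>As a cross-check of the layout, here is a sketch in C that
          uses explicit shifts and masks instead of bit-fields. The field
          widths are copied from the struct above. Packing from the least
          significant bit in declaration order is an assumption (C leaves
          bit-field layout to the implementation), and the accessor names
          are hypothetical.</div>

```c
#include <assert.h>
#include <stdint.h>

/* Field widths copied from the CogObjectHeader struct. */
enum {
    ClassIndexBits = 16, Unused0Bits = 6, PinnedBits = 1, ImmutableBits = 1,
    FormatBits = 5, MarkedBits = 1, GreyBits = 1, RememberedBits = 1,
    ObjHashBits = 24, SlotSizeBits = 8
};

/* Bit offsets, assuming fields are packed from bit 0 in declaration order. */
enum {
    FormatShift   = ClassIndexBits + Unused0Bits + PinnedBits + ImmutableBits,
    ObjHashShift  = FormatShift + FormatBits + MarkedBits + GreyBits
                    + RememberedBits,
    SlotSizeShift = ObjHashShift + ObjHashBits
};

/* Sum of all field widths. */
static unsigned totalHeaderBits(void)
{
    return SlotSizeShift + SlotSizeBits;
}

/* Read the 5-bit format field. */
static unsigned formatOf(uint64_t header)
{
    return (unsigned)((header >> FormatShift) & ((1u << FormatBits) - 1));
}

/* Return a copy of header with the format field replaced by fmt. */
static uint64_t withFormat(uint64_t header, unsigned fmt)
{
    assert(fmt < (1u << FormatBits));
    uint64_t mask = (uint64_t)((1u << FormatBits) - 1) << FormatShift;
    return (header & ~mask) | ((uint64_t)fmt << FormatShift);
}

/* Read the 24-bit identity hash field. */
static uint32_t objHashOf(uint64_t header)
{
    return (uint32_t)((header >> ObjHashShift) & ((1u << ObjHashBits) - 1));
}
```

        <div>Under that packing assumption the widths add up to exactly 64
          bits, with format starting at bit 24 and objHash at bit 32, which
          agrees with the boundary comments in the struct.</div>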
        <div><br>
        </div>
        <div>Also, can I ask the assembled company exactly how many bits
          you'd spend on the objHash (identityHash)? &nbsp;Think forward to
          64-bits. &nbsp;Is 24 bits about all we can afford or still too
          generous? &nbsp;Anybody have any data to contribute?<br>
        </div>
      </div>
    </blockquote>
    This is probably a stupid question, but where is the variable size
    in words stored?<br>
    In a quadword preceding the header, as it is in the 32-bit format?<br>
    <br>
    The reason I'm asking is that to me, the main application of
    identityHash is for HashedCollections, and the max size of those
    thus affects what a reasonable answer would be...<br>
    <br>
    Cheers,<br>
    Henry<br>
  </body>
</html>