<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
On 22.09.2011 21:57, Eliot Miranda wrote:
<blockquote
cite="mid:CAC20JE0CV86FKa3YPf06JBW18FPtcxaddfc1yPasZxOkMBgbUw@mail.gmail.com"
type="cite">
<div>On Thu, Sep 22, 2011 at 12:29 PM, Henrik Sperre Johansen <span><<a
moz-do-not-send="true"
href="mailto:henrik.s.johansen@veloxit.no">henrik.s.johansen@veloxit.no</a>></span>
wrote:<br>
<blockquote>
<br>
<div> On 22.09.2011 20:20, Eliot Miranda wrote:
<blockquote type="cite">
<div>On Thu, Sep 22, 2011 at 11:06 AM, Eliot Miranda <span><<a
moz-do-not-send="true"
href="mailto:eliot.miranda@gmail.com">eliot.miranda@gmail.com</a>></span>
wrote:<br>
<blockquote> Hi Igor,<br>
<br>
<div>
<div>
<div>On Thu, Sep 22, 2011 at 10:53 AM, Igor
Stasenko <span><<a moz-do-not-send="true"
href="mailto:siguctua@gmail.com">siguctua@gmail.com</a>></span>
wrote:<br>
<blockquote>
<div>
<div>On 22 September 2011 19:16, Eliot
Miranda <<a moz-do-not-send="true"
href="mailto:eliot.miranda@gmail.com">eliot.miranda@gmail.com</a>>
wrote:<br>
> (apologies for the duplicate reply;
someone needs to sort out their<br>
> threading for the benefit of the
community ;) )<br>
><br>
> On Thu, Sep 22, 2011 at 2:36 AM,
Marcus Denker <<a
moz-do-not-send="true"
href="mailto:marcus.denker@inria.fr">marcus.denker@inria.fr</a>><br>
> wrote:<br>
>><br>
>> Hi,<br>
>><br>
>> There are two changesets waiting
for integrating in 1.4 that have serious<br>
>> consequences:<br>
>><br>
>> - Ephemerons. The VM level
changes are in the Cog VMs built on
Jenkins,<br>
>> but have not<br>
>> been integrated in the VMMaker
codebase.<br>
>><br>
>> <a moz-do-not-send="true"
href="http://code.google.com/p/pharo/issues/detail?id=4265">http://code.google.com/p/pharo/issues/detail?id=4265</a><br>
><br>
> I would *really* like to back out
these changes. The Ephemeron<br>
> implementation is very much a
prototype, requiring a hack to determine<br>
> whether an object is an ephemeron
(the presence of a marker class in the<br>
> first inst var) that I'm not at all
happy with. There is a neater<br>
> implementation available via using an
unused instSpec which IMO has<br>
> significant advantages (much simpler
& faster, instSpec is valid at all<br>
> times, including during compaction,
less overhead, doesn't require a marker<br>
> class), and is the route I'm taking
with the new GC/object-representation<br>
> I'm working on now. Note that other
than determining whether an object is<br>
> an ephemeron (instSpec/format vs inst
var test) the rest of Igor's code<br>
> remains the same. I'd like to avoid
too much VM forking. Would you all<br>
> consider putting these changes on
hold for now?<br>
> If so, I'll make the effort to
produce prototype changes (in the area of<br>
> ClassBuilder and class definition; no
VM code necessary as yet) to allow<br>
> defining Ephemerons via the instSpec
route by next week at the latest.<br>
><br>
<br>
</div>
</div>
I agree that in my implementation this is a
weak point. But it's hard<br>
to do anything without<br>
making changes to object format to identify
these special objects.<br>
<br>
The main question behind this is: can we afford to
change the internals of<br>
VM without being beaten hard<br>
by "backwards compatibility" party? :)<br>
</blockquote>
<div><br>
</div>
</div>
</div>
<div>I don't think we get stuck on this at all. The
instSpec/format field has an unused value (5, I
believe) and this can easily be used for
Ephemerons. All that is needed is a little image
work on these methods:</div>
<div><br>
</div>
<div> Behavior>>typeOfClass</div>
<div> needs to answer e.g. #ephemeron for
ephemeron classes</div>
<div><br>
</div>
<div>
ClassBuilder>>computeFormat:instSize:forSuper:ccIndex:</div>
<div> needs to accept e.g. #ephemeron for
type and pass variable: false and weak: true for
ephemerons
to format:variable:words:pointers:weak:.</div>
<div><br>
</div>
<div>
ClassBuilder>>format:variable:words:pointers:weak:</div>
<div> needs to respond to variable: false and
weak: true by computing the ephemeron instSpec.</div>
<div><br>
</div>
<div>
Class>>weakSubclass:instanceVariableNames:classVariableNames:poolDictionaries:category:</div>
<div>
ClassBuilder>>superclass:weakSubclass:instanceVariableNames:classVariableNames:poolDictionaries:category:</div>
<div> need siblings, e.g.</div>
<div>
ephemeronSubclass:instanceVariableNames:classVariableNames:poolDictionaries:category:
<div>
superclass:ephemeronSubclass:instanceVariableNames:classVariableNames:poolDictionaries:category:</div>
<div><br>
</div>
<div>Right? This is easy. Then in the VM there are
a few places where the tests for pointer indexability
(formats 3 and 4) need to be firmed up to exclude 5, but
nothing difficult. We talked about this in email
last week.</div>
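<div>A minimal sketch of what "firming up" such tests could look like, assuming the VM identifies ephemerons purely by format value 5. The enum and function names are hypothetical and are not taken from the VMMaker sources.</div>
<pre>
```c
#include <assert.h>
#include <stdbool.h>

/* Format (instSpec) values named in this thread. The enum names are
   hypothetical, chosen only for this sketch. */
enum {
    FORMAT_INDEXABLE_WITH_INST_VARS = 3,
    FORMAT_WEAK_INDEXABLE           = 4,
    FORMAT_EPHEMERON                = 5  /* proposed use of the unused value */
};

/* Pointer indexability as described above: formats 3 and 4 only.
   Explicit equality tests keep 5 from slipping through a range check. */
static bool is_pointer_indexable(unsigned format)
{
    assert(format < 16);  /* the current format field is 4 bits wide */
    return format == FORMAT_INDEXABLE_WITH_INST_VARS
        || format == FORMAT_WEAK_INDEXABLE;
}

/* An ephemeron is recognised by its format alone, with no marker class. */
static bool is_ephemeron(unsigned format)
{
    assert(format < 16);
    return format == FORMAT_EPHEMERON;
}
```
</pre>
<div>Because the test depends only on the header's format, it stays valid at all times, including during compaction.</div>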
</div>
</blockquote>
<div><br>
</div>
<div>Here's the format field (Behavior>>instSpec at
the image level) as currently populated:</div>
<div> </div>
<div>
<div> 0 = 0 sized objects (UndefinedObject True False
et al)</div>
<div> 1 = non-indexable objects with inst vars (Point
et al)</div>
<div> 2 = indexable objects with no inst vars (Array
et al)</div>
<div> 3 = indexable objects with inst vars
(MethodContext AdditionalMethodState et al)</div>
<div> 4 = weak indexable objects with inst vars
(WeakArray et al)</div>
</div>
<div> 6 = 32-bit indexable objects (Float, Bitmap et
al)</div>
<div> 8 = 8-bit indexable objects (ByteString,
ByteArray et al)</div>
<div>12 = CompiledMethod</div>
<div><br>
</div>
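<div>The table above, plus the proposed value 5, can be sketched as a C analogue of the image-side format computation. This is an illustration only: the function name and parameters are assumptions modelled on the keywords of format:variable:words:pointers:weak:, and CompiledMethod (12) is left out because it is not derived from these flags.</div>
<pre>
```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical analogue of the ClassBuilder format computation.
   Returns an instSpec value from the table above, or 5 for the
   proposed ephemeron case (weak but not variable). */
static int inst_spec(int n_inst_vars, bool is_variable, bool is_words,
                     bool is_pointers, bool is_weak)
{
    assert(n_inst_vars >= 0);
    if (is_weak)
        return is_variable ? 4 : 5;      /* WeakArray-like vs. ephemeron */
    if (is_pointers) {
        if (is_variable)
            return n_inst_vars > 0 ? 3 : 2;
        return n_inst_vars > 0 ? 1 : 0;
    }
    return is_words ? 6 : 8;             /* 32-bit vs. 8-bit indexable */
}
```
</pre>
<div>For example, inst_spec(2, false, true, true, true) selects the ephemeron value, while the same call with is_weak false describes a Point-like class.</div>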
<div>N.B. in the VM the least significant two bits of the
format/instSpec for byte objects (formats 8 and 12) are
used to encode the number of odd bytes in the object,
so that a 1-character ByteString has a format of 11 (=
8 + 3) and a size of 1 word - 3 bytes.</div>
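<div>The odd-bytes encoding can be sketched in C as follows, assuming a 32-bit image with 4-byte words. The helper names are made up for this illustration.</div>
<pre>
```c
#include <assert.h>
#include <stddef.h>

enum { BYTES_PER_WORD = 4 };  /* 32-bit image, as in the example above */

/* For byte formats (8..11 and 12..15) the low two bits of the format
   hold the number of unused ("odd") bytes in the last word. */
static unsigned odd_bytes(unsigned format)
{
    return format & 3u;
}

/* Byte length of a byte object, from its format and its size in words. */
static size_t byte_length(unsigned format, size_t size_in_words)
{
    assert(size_in_words * BYTES_PER_WORD >= odd_bytes(format));
    return size_in_words * BYTES_PER_WORD - odd_bytes(format);
}

/* Inverse direction: the format for a byte object holding n_bytes,
   given a base format of 8 (bytes) or 12 (CompiledMethod). */
static unsigned byte_format(unsigned base_format, size_t n_bytes)
{
    size_t used = n_bytes % BYTES_PER_WORD;          /* bytes used in last word */
    unsigned odd = (unsigned)((BYTES_PER_WORD - used) % BYTES_PER_WORD);
    return base_format + odd;
}
```
</pre>
<div>With these definitions, byte_format(8, 1) gives 11 and byte_length(11, 1) gives 1, matching the 1-character ByteString above.</div>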
<div><br>
</div>
<div><br>
</div>
<div>For the future (i.e. the new GC/object
representation, /not/ for the first implementation of
ephemerons which we can do now, for Pharo 1.4 or 1.5)
we need to extend format/instSpec to support 64 bits.
I think format needs to be made a 5 bit field with
room for 4 bits of odd bytes for 64-bit images. (For
VMers, the Size4Bit is a horrible hack.) So then</div>
<div><br>
</div>
<div>0 = 0 sized objects (UndefinedObject True False et
al)</div>
<div>1 = non-indexable objects with inst vars (Point et
al)</div>
<div>2 = indexable objects with no inst vars (Array et
al)</div>
<div>3 = indexable objects with inst vars (MethodContext
AdditionalMethodState et al)</div>
<div>4 = weak indexable objects with inst vars
(WeakArray et al)</div>
<div>5 = weak non-indexable objects with inst vars
(ephemerons) (Ephemeron)</div>
<div><br>
</div>
<div>and we need 8 CompiledMethod values, 8 byte values,
4 16-bit values, 2 32-bit values and a 64-bit value, =
23 values; 23 + 6 (formats 0 to 5 above) = 29 of the 32 available, so there is room, e.g.</div>
<div><br>
</div>
<div>9 (?) 64-bit indexable</div>
<div>10 - 11 32-bit indexable</div>
<div>12 - 15 16-bit indexable</div>
<div>16 - 23 byte indexable</div>
<div>24 - 31 compiled method</div>
<div><br>
</div>
<div>In 32-bit images only the least significant 2 bits
would be used for formats 16 & 24, and the least
significant bit for format 12.<br>
</div>
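<div>A rough sketch of how the proposed 5-bit ranges could be decoded. It takes the ranges exactly as listed (including the tentative "9 (?)" for 64-bit) and assumes that the low bits within each range count the unused trailing elements of a 64-bit word. All names are hypothetical.</div>
<pre>
```c
#include <assert.h>

typedef enum {
    KIND_POINTERS,         /* 0..5, including the ephemeron value 5 */
    KIND_UNASSIGNED,       /* 6..8, not allocated in the list above */
    KIND_64BIT,            /* 9 (tentative) */
    KIND_32BIT,            /* 10..11 */
    KIND_16BIT,            /* 12..15 */
    KIND_BYTES,            /* 16..23 */
    KIND_COMPILED_METHOD   /* 24..31 */
} FormatKind;

static FormatKind format_kind(unsigned format)
{
    assert(format < 32);   /* 5-bit field */
    if (format <= 5)  return KIND_POINTERS;
    if (format <= 8)  return KIND_UNASSIGNED;
    if (format == 9)  return KIND_64BIT;
    if (format <= 11) return KIND_32BIT;
    if (format <= 15) return KIND_16BIT;
    if (format <= 23) return KIND_BYTES;
    return KIND_COMPILED_METHOD;
}

/* Number of low format bits that carry the odd-element count. */
static unsigned odd_bits(unsigned format)
{
    switch (format_kind(format)) {
    case KIND_32BIT:           return 1;  /* 2 values */
    case KIND_16BIT:           return 2;  /* 4 values */
    case KIND_BYTES:
    case KIND_COMPILED_METHOD: return 3;  /* 8 values */
    default:                   return 0;
    }
}

/* Count of unused trailing elements encoded in the format. */
static unsigned odd_count(unsigned format)
{
    return format & ((1u << odd_bits(format)) - 1u);
}
```
</pre>
<div>Each range starts on a multiple of its own size, so the odd count can be read with a plain mask, e.g. odd_count(19) is 3 for a byte object.</div>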
</div>
</blockquote>
If we are changing the format for 64-bit images anyway, why
not simplify it and be more consistent by spending a full byte?<br>
<br>
<pre>Bit:  |   8   |   7   |   6   |  5   |    4     |  3   |     2     |    1     |
      | 64bit | 32bit | 16bit | 8bit | compiled | weak | indexable | instVars |</pre>
(Odd number encoded in remaining indexable bit fields)<br>
</div>
</blockquote>
<div><br>
</div>
<div>I used to prefer this approach but I've realised that the
format/instSpec approach (which I think Dan came up with) makes
better use of bits because so many of the bit combinations are
mutually exclusive. For example, pointers excludes all the
byte/short/32-bit/64-bit indexability combinations. Also, see
below...</div>
<div> </div>
<blockquote>
<div> <br>
You could get away with 7 bits if you used, e.g., the unused indexable
weak combination (6) for compiled method/8bit<br>
<br>
Or is the header space in your new 64bit format already
quite filled, so this is a bad idea?<br>
</div>
</blockquote>
<div><br>
</div>
<div>Yes, ish. But header bits are scarce, and spare ones are very useful for
experiments etc. Right now I have </div>
<div><br>
</div>
<div>
<div>typedef struct {</div>
<div><span> </span>unsigned short<span> </span>classIndex;</div>
<div><span> </span>unsigned<span> </span>unused0 : 6;</div>
<div><span> </span>unsigned<span> </span>isPinned : 1;</div>
<div><span> </span>unsigned<span> </span>isImmutable : 1;</div>
<div><span> </span>unsigned<span> </span>format : 5;
/* on a byte boundary */</div>
<div><span> </span>unsigned<span> </span>isMarked : 1;</div>
<div><span> </span>unsigned<span> </span>isGrey : 1;</div>
<div><span> </span>unsigned<span> </span>isRemembered : 1;</div>
<div><span> </span>unsigned<span> </span>objHash : 24;
/* on a 32-bit word boundary */</div>
<div><span> </span>unsigned char<span> </span>slotSize;
/* on a byte boundary */</div>
<div> } CogObjectHeader;</div>
</div>
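<div>Since bit-field layout is implementation-defined in C, here is a sketch of the same header expressed with explicit shifts on a 64-bit word. It assumes fields are allocated from the least significant bit upward in declaration order. The accessor names are hypothetical.</div>
<pre>
```c
#include <assert.h>
#include <stdint.h>

/* Field widths copied from the struct above. */
enum {
    CLASS_INDEX_BITS = 16, UNUSED0_BITS = 6, IS_PINNED_BITS = 1,
    IS_IMMUTABLE_BITS = 1, FORMAT_BITS = 5, IS_MARKED_BITS = 1,
    IS_GREY_BITS = 1, IS_REMEMBERED_BITS = 1, OBJ_HASH_BITS = 24,
    SLOT_SIZE_BITS = 8
};

/* Bit offsets under the ASSUMED low-to-high allocation order. */
enum {
    FORMAT_SHIFT    = CLASS_INDEX_BITS + UNUSED0_BITS
                    + IS_PINNED_BITS + IS_IMMUTABLE_BITS,
    OBJ_HASH_SHIFT  = FORMAT_SHIFT + FORMAT_BITS + IS_MARKED_BITS
                    + IS_GREY_BITS + IS_REMEMBERED_BITS,
    SLOT_SIZE_SHIFT = OBJ_HASH_SHIFT + OBJ_HASH_BITS,
    HEADER_BITS     = SLOT_SIZE_SHIFT + SLOT_SIZE_BITS
};

/* Read the 5-bit format field. */
static unsigned header_format(uint64_t header)
{
    return (unsigned)((header >> FORMAT_SHIFT) & ((1u << FORMAT_BITS) - 1u));
}

/* Return a copy of the header with the format field replaced. */
static uint64_t header_with_format(uint64_t header, unsigned format)
{
    uint64_t mask = (uint64_t)((1u << FORMAT_BITS) - 1u) << FORMAT_SHIFT;
    assert(format < (1u << FORMAT_BITS));
    return (header & ~mask) | ((uint64_t)format << FORMAT_SHIFT);
}

/* Read the 24-bit identity hash field. */
static uint32_t header_obj_hash(uint64_t header)
{
    return (uint32_t)((header >> OBJ_HASH_SHIFT) & ((1u << OBJ_HASH_BITS) - 1u));
}
```
</pre>
<div>Under that assumption the widths add up to exactly 64 bits, with format starting at bit 24 and objHash at bit 32, which agrees with the boundary comments in the struct.</div>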
<div><br>
</div>
<div>Where classIndex is 16 bits simply for efficiency and will
grow to 20 or 22 bits as needed. So one could steal one or
two bits from unused0 and two bits from objHash, and give
these to format, but it would be a waste. Better keep these
back for other uses.</div>
<div><br>
</div>
<div>Also, can I ask the assembled company exactly how many bits
you'd spend on the objHash (identityHash)? Think forward to
64-bits. Is 24 bits about all we can afford or still too
generous? Anybody have any data to contribute?<br>
</div>
</div>
</blockquote>
This is probably a stupid question, but where is the variable size
in words stored?<br>
In a quadword preceding the header like it is in 32bit format?<br>
<br>
The reason I'm asking is that, to me, the main application of
identityHash is in HashedCollections, and the max size of those
thus affects what a reasonable answer would be...<br>
<br>
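To put rough numbers on that trade-off, here is a small sketch of the arithmetic, assuming hash values are spread evenly. The function names are made up for illustration.<br>
<pre>
```c
#include <assert.h>
#include <stdint.h>

/* Number of distinct identityHash values an n-bit hash field can hold. */
static uint64_t distinct_hashes(unsigned bits)
{
    assert(bits < 64);
    return (uint64_t)1 << bits;
}

/* Pigeonhole bound: with n_objects in one collection, at least one
   hash value is shared by this many objects (ceiling division). */
static uint64_t min_objects_per_hash(uint64_t n_objects, unsigned bits)
{
    uint64_t values = distinct_hashes(bits);
    return (n_objects + values - 1) / values;
}
```
</pre>
So 24 bits gives 16777216 distinct values, and a collection of 100 million objects would have at least 6 objects sharing some hash value even in the best case.<br>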
Cheers,<br>
Henry<br>
</body>
</html>