New subject: [Pharo-project] Integrating Changes in 1.4 that require a new VM

22 Sep 2011

On Thu, Sep 22, 2011 at 11:06 AM, Eliot Miranda eliot.miranda@gmail.com wrote:
...
Hi Igor,
On Thu, Sep 22, 2011 at 10:53 AM, Igor Stasenko siguctua@gmail.com wrote:
...
On 22 September 2011 19:16, Eliot Miranda eliot.miranda@gmail.com wrote:
...
(apologies for the duplicate reply; someone needs to sort out their
threading for the benefit of the community ;) )
On Thu, Sep 22, 2011 at 2:36 AM, Marcus Denker marcus.denker@inria.fr wrote:
...
Hi,
There are two changesets waiting for integration in 1.4 that have
serious consequences:

Ephemerons. The VM-level changes are in the Cog VMs built on Jenkins,
but have not been integrated into the VMMaker codebase.
   http://code.google.com/p/pharo/issues/detail?id=4265

I would *really* like to back out these changes.  The Ephemeron
implementation is very much a prototype, requiring a hack to determine
whether an object is an ephemeron (the presence of a marker class in the
first inst var) that I'm not at all happy with.  There is a neater
implementation available via using an unused instSpec which IMO has
significant advantages (much simpler & faster, instSpec is valid at all
times, including during compaction, less overhead, doesn't require a
marker class), and is the route I'm taking with the new
GC/object-representation I'm working on now.  Note that other than
determining whether an object is an ephemeron (instSpec/format vs inst var
test) the rest of Igor's code remains the same.  I'd like to avoid too much
VM forking.  Would you all consider putting these changes on hold for now?
If so, I'll make the effort to produce prototype changes (in the area of
ClassBuilder and class definition; no VM code necessary as yet) to allow
defining Ephemerons via the instSpec route by next week at the latest.
I agree that in my implementation this is a weak point. But it's hard
to do anything without making changes to the object format to identify
these special objects.
The main question behind this is: can we afford to change the internals
of the VM without being beaten hard by the "backwards compatibility"
party? :)
I don't think we get stuck on this at all.  The instSpec/format field has
an unused value (5, I believe) and this can easily be used for Ephemerons.
All that is needed is a little image work on these methods:

Behavior>>typeOfClass
    needs to answer e.g. #ephemeron for ephemeron classes

ClassBuilder>>computeFormat:instSize:forSuper:ccIndex:
    needs to accept e.g. #ephemeron for type and pass variable: false
    and weak: true for ephemerons to format:variable:words:pointers:weak:.

ClassBuilder>>format:variable:words:pointers:weak:
    needs to respond to variable: false and weak: true by computing the
    ephemeron instSpec.

Class>>weakSubclass:instanceVariableNames:classVariableNames:poolDictionaries:category:
ClassBuilder>>superclass:weakSubclass:instanceVariableNames:classVariableNames:poolDictionaries:category:
    need siblings, e.g.
    ephemeronSubclass:instanceVariableNames:classVariableNames:poolDictionaries:category:
    superclass:ephemeronSubclass:instanceVariableNames:classVariableNames:poolDictionaries:category:

Right?  This is easy.  Then in the VM there are a few places where the
tests for pointer indexability (formats 3 and 4) need to be firmed up to
exclude 5, but nothing difficult.  We talked about this in email last week.
Here's the format field (Behavior>>instSpec at the image level) as currently
populated:
  0 = 0 sized objects (UndefinedObject True False et al)
  1 = non-indexable objects with inst vars (Point et al)
  2 = indexable objects with no inst vars (Array et al)
  3 = indexable objects with inst vars (MethodContext AdditionalMethodState et al)
  4 = weak indexable objects with inst vars (WeakArray et al)
  6 = 32-bit indexable objects (Float, Bitmap et al)
  8 = 8-bit indexable objects (ByteString, ByteArray et al)
 12 = CompiledMethod
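
A rough model of the table above, extended with the proposed value 5 for ephemerons, might look like the following Python sketch. It is not the ClassBuilder code: the function names and the flag-based signature are assumptions made for illustration, CompiledMethod (12) is left out, and the reading that "pointer indexable" covers formats 2 through 4 is an interpretation of the table.

```python
EPHEMERON_FORMAT = 5  # the unused value proposed in the thread; assumed here


def inst_spec(n_inst_vars, variable, words, pointers, weak):
    """Hypothetical model: pick a format value from class-shape flags."""
    if not pointers:
        # Raw data: 32-bit words (6) or bytes (8).
        return 6 if words else 8
    if weak:
        # Weak and indexable is the existing format 4;
        # weak and non-indexable is the proposed ephemeron format.
        return 4 if variable else EPHEMERON_FORMAT
    if variable:
        return 3 if n_inst_vars > 0 else 2
    return 1 if n_inst_vars > 0 else 0


def is_pointer_indexable(fmt):
    """Formats with indexable pointer slots; 5 must not be mistaken for one."""
    return 2 <= fmt <= 4
```

For example, `inst_spec(2, variable=False, words=False, pointers=True, weak=True)` gives 5, while the same call with `variable=True` gives the existing weak format 4.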
N.B. in the VM the least significant two bits of the format/instSpec for
byte objects (formats 8 and 12) are used to encode the number of odd bytes
in the object, so that a 1 character ByteString has a format of 11, = 8 + 3,
size = 1 word - 3 bytes.
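
The odd-bytes arithmetic can be checked with a small Python sketch. It assumes a 32-bit image (4 bytes per word); the helper names are made up for this example and are not VM functions.

```python
BYTES_PER_WORD = 4  # assumption: 32-bit image


def byte_format(base_format, n_bytes):
    """Return (format, size in words) for a byte object of n_bytes bytes."""
    words = (n_bytes + BYTES_PER_WORD - 1) // BYTES_PER_WORD
    odd_bytes = words * BYTES_PER_WORD - n_bytes  # unused bytes in last word
    return base_format + odd_bytes, words


def byte_size(fmt, words):
    """Recover the byte length from the format's low two bits."""
    return words * BYTES_PER_WORD - (fmt & 3)
```

With these definitions `byte_format(8, 1)` yields `(11, 1)`, matching the 1 character ByteString in the note above, and `byte_size(11, 1)` recovers the length 1.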
For the future (i.e. the new GC/object representation, /not/ for the first
implementation of ephemerons which we can do now, for Pharo 1.4 or 1.5) we
need to extend format/instSpec to support 64 bits.  I think format needs to
be made a 5 bit field with room for 4 bits of odd bytes for 64-bit images.
(For VMers, the Size4Bit is a horrible hack.)  So then
0 = 0 sized objects (UndefinedObject True False et al)
1 = non-indexable objects with inst vars (Point et al)
2 = indexable objects with no inst vars (Array et al)
3 = indexable objects with inst vars (MethodContext AdditionalMethodState et al)
4 = weak indexable objects with inst vars (WeakArray et al)
5 = weak non-indexable objects with inst vars (ephemerons) (Ephemeron)
and we need 8 CompiledMethod values, 8 byte values, 4 16-bit values, 2
32-bit values and a 64-bit value, = 23 values; together with formats 0
through 5 that is 23 + 6 = 29 of the 32 values a 5 bit field offers, so
there is room, e.g.
9 (?) 64-bit indexable
10 - 11 32-bit indexable
12 - 15 16-bit indexable
16 - 23 byte indexable
24 - 31 compiled method
In 32-bit images only the least significant 2 bits would be used for formats
16 & 24, and the least significant bit for format 12.
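
One way to read the proposed layout is sketched below in Python. It assumes the count of unused trailing elements is added to the base value, as with the existing byte formats; the names are hypothetical, and treating a compiled method as a plain byte body is a simplification.

```python
# Base values from the proposed table; element sizes in bytes.
BASE = {"bits64": 9, "bits32": 10, "bits16": 12, "bytes": 16,
        "compiled_method": 24}
ELEMENT_BYTES = {"bits64": 8, "bits32": 4, "bits16": 2, "bytes": 1,
                 "compiled_method": 1}


def proposed_format(kind, n_elements, word_bytes):
    """Return (format, size in words); word_bytes is 4 or 8."""
    elem = ELEMENT_BYTES[kind]
    if elem >= word_bytes:
        # Elements fill whole words, so there is nothing odd to record.
        return BASE[kind], n_elements * (elem // word_bytes)
    per_word = word_bytes // elem
    words = (n_elements + per_word - 1) // per_word
    unused = words * per_word - n_elements  # odd elements in the last word
    return BASE[kind] + unused, words
```

Under these assumptions a 1-byte object gets format 23 in a 64-bit image (`word_bytes=8`) and format 19 in a 32-bit image, so the 32-bit case only ever uses the low two bits of the byte range, and a single 16-bit element in a 32-bit image gets format 13, using one bit.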
...
...
Ephemerons are a versatile way to get notified about objects which are
about to die, and there are certain parts of the language which are hard
(or even impossible) to implement without ephemerons.
I got stuck on this earlier, when I realized that we cannot afford to
have weak subscriptions in the announcement framework for blocks (which
are the most convenient and easiest way to define subscriptions) without
having ephemerons.
And of course, by having ephemerons we can completely revise the weak
finalization scheme and make it much simpler and faster.
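
The problem with weak subscriptions for blocks can be reproduced in Python, which has weak references but no ephemerons. This sketch assumes CPython and uses made-up names; it only shows the leak that ephemerons would avoid, not an ephemeron implementation. A weak-keyed registry holds its values strongly, so a handler that closes over its own subscriber keeps that subscriber alive.

```python
import gc
import weakref


class Subscriber:
    """Stand-in for an object that subscribes to announcements."""


def make_handler(subscriber):
    # Like a block that refers to its subscriber.
    return lambda: subscriber


registry = weakref.WeakKeyDictionary()  # subscriber -> handler

# Case 1: the handler does not refer to the subscriber.
a = Subscriber()
registry[a] = lambda: "independent handler"
del a
gc.collect()
entries_after_plain_handler = len(registry)

# Case 2: the handler closes over the subscriber. The registry holds the
# handler strongly, the handler holds the subscriber, so the weak key
# never dies.
b = Subscriber()
registry[b] = make_handler(b)
del b
gc.collect()
entries_after_closure_handler = len(registry)
```

In the first case the entry disappears once the subscriber is dropped; in the second it stays. An ephemeron would treat the handler as reachable only while the subscriber is reachable from elsewhere, which is what makes block-based weak subscriptions workable.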
I think we should do something in this regard, even at the cost of
backward compatibility, because to me it blocks us from moving forward.
I want to remind people that it took me around a day to implement
ephemerons in the VM, and then a few more days to actually make a correct
implementation and write tests to cover it.
Unfortunately, we don't yet have a well-established process that keeps
VM and language-side changes in sync when required, so that we can go
much faster and not fear introducing or changing functionality.
One of the reasons for having a continuous integration setup for the VM
was exactly that: having new VMs every day (compared to having new VMs
every year).
...
...

Finalization code checks for #hasNewFinalization

This is true in the current VMs built on Jenkins, but in older VMs
this is not in.
   http://code.google.com/p/pharo/issues/detail?id=4483

There are two options:
   a) integrate it
   b) not integrate it

a) means that the image runs on older VMs, too.
b) means we accept that we can never improve anything for real.
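
Option a) relies on a capability check with a fallback. The Python sketch below shows only that pattern; the classes and the attribute name are hypothetical stand-ins and do not correspond to any real VM interface.

```python
class OlderVM:
    """Hypothetical VM without the new finalization support."""
    has_new_finalization = False


class JenkinsVM:
    """Hypothetical VM built with the new finalization support."""
    has_new_finalization = True


def choose_finalization(vm):
    # Ask the VM what it supports, then fall back to the old scheme.
    if getattr(vm, "has_new_finalization", False):
        return "new"
    return "legacy"
```

Because the check degrades to the old scheme, the same image code would run on both kinds of VM.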
There will be more changes coming... e.g. imagine we have a Vector
Graphics Canvas at some point next year... what will we do? Use it, or
not use it to stay compatible?
   Marcus

--
Marcus Denker -- http://marcusdenker.de
--
best,
Eliot
--
Best regards,
Igor Stasenko.
