Finalization (was: Re: [Seaside] WeakArray (again))

Sat Mar 25 12:40:29 UTC 2006

Andreas,
        It sounds to me like you are talking about Ephemerons, am I
right? VW supports them; what about Squeak? I have easily
implemented a Finalizer like the one you mentioned, relying on the
Ephemeron mechanism. I could give you more details if you want.

Best regards,
             Cani

----- Original Message -----
From: Andreas Raab <andreas.raab at gmx.de>
To: The general-purpose Squeak developers list <squeak-dev at lists.squeakfoundation.org>
Date: Saturday, March 25, 2006, 12:23:12 AM
Subject: Finalization (was: Re: [Seaside] WeakArray (again))

> David Shaffer wrote:
>> \begin{amateurHour}
>> 
>> It seems to me that the notification needs to be changed to actually
>> queueing information about the objects which the GC deems
>> un(strongly)reachable.  I spent some time staring at
>> ObjectMemory>>sweepPhase, #finalizeReference: and #signalFinalization:
>> which seem to be the cornerstones of this process.  All that
>> #signalFinalization: is currently doing is signaling a semaphore (well,
>> indicating that one should be "signaled" later).  Why not keep a list of
>> (oop,i) [i is the offset of the weak reference in the oop] pairs and
>> somehow communicate those back to a Smalltalk object?  As a total VM
>> novice it just seems too simple ;-)  What I think I would do is
>> associate a queue like thing with every weak reference container.  Then
>> when an object becomes GC-able I'd place the (oop,i) pair in that shared
>> queue.  What I need is someone to hold my hand through...
>> 
>> ...designing this "queue like thing".  How about a circular array which
>> can only be "read" (move the read index) by ST code and only be written
>> by the VM code?  This avoids a lot of concurrency issues.  Are there any
>> examples like this in the VM?
>> 
>> \end{amateurHour}
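
The "circular array" David asks about can be sketched as a
single-writer, single-reader ring. This is only a toy model in Python,
not VM code: the names FinalizationRing, vm_put and image_take are made
up for illustration, and the "VM" and "image" sides are just two
methods. It also shows the overflow case: once the ring is full, an
entry has to be dropped.

```python
class FinalizationRing:
    """Fixed-size ring: the 'VM' side only writes, the image side only reads."""

    def __init__(self, capacity):
        # One slot stays empty so that "full" and "empty" can be told apart.
        self.slots = [None] * (capacity + 1)
        self.read_index = 0    # advanced only by the reader (image side)
        self.write_index = 0   # advanced only by the writer (VM side)

    def vm_put(self, oop, i):
        """Record an (oop, i) pair; return False if the ring is full."""
        following = (self.write_index + 1) % len(self.slots)
        if following == self.read_index:
            return False       # overflow: this entry is lost
        self.slots[self.write_index] = (oop, i)
        self.write_index = following
        return True

    def image_take(self):
        """Return the oldest (oop, i) pair, or None if the ring is empty."""
        if self.read_index == self.write_index:
            return None
        entry = self.slots[self.read_index]
        self.read_index = (self.read_index + 1) % len(self.slots)
        return entry
```

Because each index is advanced by exactly one side, neither side ever
overwrites the other's index, which is the property David is after.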

> What you've described is not a bad idea in general (and it's probably
> what VW does) but there are things that I don't like about it. For 
> example, part of why the finalization process takes so much time is that
> there are so many weak references lost that we don't care about - the
> whole idea that just because you use a weak array you need to know when
> its contents go away is just bogus. Secondly, once you start relying
> on "accurate" finalization information you should really make sure it's
> accurate (e.g., one signal/entry per finalized object). And once you do
> that you need to deal with the ugly corner cases of an overflow of the
> finalization queue (and the effect that you probably can't allocate any
> larger one because the GC you're currently in was triggered by a low
> space condition to begin with ;-) Nasty, nasty issues.

> Having said that, let me propose a mechanism that (I think) is 
> fundamentally different and fundamentally simpler. Namely, to make the
> requirement that you only get notifications for the finalization of 
> objects that you explicitly register for by creating a "finalizer" 
> object, e.g., an observer which is allocated before it's ever needed.
> This simple change avoids both the problem of GC needing to allocate
> memory when there is none as well as sending notifications about 
> finalizations that nobody cares about, which are both very desirable
> properties. When the object becomes eligible for garbage collection, the
> finalizer is then put into a list of objects that have indeed been 
> finalized and the finalization process simply pulls them out of the 
> queue and sends #finalize to them.

> In its simplest form, this could mean a finalizer is a structure with
> (besides the prev and next links for putting it into a structure) two
> slots: a "weak" slot for the object being guarded and a "strong" slot for
> the object performing the finalization (its #finalizer). When the 
> garbage collector runs across a Finalizer and notices its observed value
> is being collected, it can simply put the finalizer into the 
> finalization list and is done. (btw, this scheme is *vastly* easier to
> implement than your proposed scheme since everything is pre-allocated
> and you only move the object from one list to another).
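
The structure described above can be modelled roughly as follows. This
is a toy simulation in Python, not Squeak VM code: reachability is
faked by passing in a set of ids, and the names FinalizerList, collect,
run_finalization and LoggingHandler are assumptions made for the
sketch. The point it illustrates is that the collector only relinks
pre-allocated finalizers from one list to another.

```python
class Finalizer:
    """Pre-allocated observer with a weak slot, a strong slot, and list links."""

    def __init__(self, guarded, handler):
        self.weak = guarded     # "weak" slot: ignored by the toy reachability check
        self.strong = handler   # "strong" slot: object that performs the finalization
        self.prev = None
        self.next = None


class FinalizerList:
    """Intrusive doubly linked list; add and remove only relink existing nodes."""

    def __init__(self):
        self.head = None
        self.tail = None

    def add_last(self, f):
        f.prev = self.tail
        f.next = None
        if self.tail is None:
            self.head = f
        else:
            self.tail.next = f
        self.tail = f

    def remove(self, f):
        if f.prev is None:
            self.head = f.next
        else:
            f.prev.next = f.next
        if f.next is None:
            self.tail = f.prev
        else:
            f.next.prev = f.prev
        f.prev = f.next = None

    def remove_first(self):
        f = self.head
        if f is not None:
            self.remove(f)
        return f


class LoggingHandler:
    """Hypothetical handler that just records that it was asked to finalize."""

    def __init__(self, log, name):
        self.log = log
        self.name = name

    def finalize(self):
        self.log.append(self.name)


def collect(registered, finalized, reachable_ids):
    """Toy GC step: move every finalizer whose guarded object is unreachable."""
    f = registered.head
    while f is not None:
        following = f.next          # remember the successor before unlinking
        if id(f.weak) not in reachable_ids:
            f.weak = None           # post-mortem: the guarded object is gone
            registered.remove(f)
            finalized.add_last(f)
        f = following


def run_finalization(finalized):
    """The finalization process: drain the list and send finalize."""
    f = finalized.remove_first()
    while f is not None:
        f.strong.finalize()
        f = finalized.remove_first()
```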

> But while we're at it, we could also shoot a little bit further and get
> away from post-mortem finalization (which I find a highly overrated 
> concept in practice). The only thing we'd change in the above is that
> the garbage collector would now also transfer the object from the "weak"
> into the "strong" slot[*1]. This makes the finalizer the sole last 
> reference to the object. If the finalizer drops it, it's gone. If the
> finalizer decides to store it, it will survive. Lots of interesting 
> possibilities and much cleaner since you gain access to the full context
> of the object and its state.
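
A toy Python model of this pre-mortem variant, under some loud
assumptions: the finalizer itself performs the finalization, its
"strong" slot starts out empty and only ever receives the guarded
object, plain Python lists stand in for the linked lists, and
reachability is faked with a set of ids. Resource, KeepingFinalizer,
collect and run_finalization are hypothetical names.

```python
class Resource:
    """Hypothetical guarded object; it still has its state at finalization time."""

    def __init__(self, name):
        self.name = name


class Finalizer:
    def __init__(self, guarded):
        self.weak = guarded   # "weak" slot: ignored by the toy reachability check
        self.strong = None    # "strong" slot: filled in by the collector

    def finalize(self):
        # Default behaviour: drop the last reference, so the object is gone.
        self.strong = None


class KeepingFinalizer(Finalizer):
    """A finalizer that decides to store the object, so it survives."""

    def __init__(self, guarded, kept):
        super().__init__(guarded)
        self.kept = kept

    def finalize(self):
        self.kept.append(self.strong)
        self.strong = None


def collect(registered, finalized, reachable_ids):
    """Toy GC step: hand each unreachable object to its finalizer."""
    for f in list(registered):
        if id(f.weak) not in reachable_ids:
            # Transfer weak -> strong: the finalizer now holds the sole reference.
            f.strong, f.weak = f.weak, None
            registered.remove(f)
            finalized.append(f)


def run_finalization(finalized):
    """Drain the finalized list; each finalizer sees the full object."""
    while finalized:
        finalized.pop(0).finalize()
```

After run_finalization, an object handed to a plain Finalizer has no
references left, while one handed to a KeepingFinalizer lives on in
the kept list.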

> [*1] The easiest way to do this would be to simply clone the object but
> unfortunately this also has the unbounded memory problem so something a
> bit more clever might be required. Basically we really want *all* 
> references to the object except from the finalizer to be cleaned up.

> Note that weak arrays or other weak classes wouldn't be affected at all
> by this since only Finalizers get the notifications - all other weak
> classes would simply drop the references when they get collected and
> never get notified about anything.

> Cheers,
>    - Andreas
