Finalization

Sun Mar 26 01:28:03 UTC 2006

Hi -

No, I'm not talking about Ephemerons. Having implemented them for fun
in the past, I'm quite aware of the differences ;-) What I'm
proposing here is simply a (precise) notification mechanism for object
finalization.

Cheers,
   - Andreas

Nicolás Cañibano wrote:
> Andreas,
>         It sounds to me like you are talking about Ephemerons, am I
> right? VW has support for them, what about Squeak? I have easily
> implemented a Finalizer like the one you mentioned, relying on the
> Ephemeron mechanism. I could give you more details if you want.
> 
> Best regards,
>              Cani
> 
> ----- Original Message -----
> From: Andreas Raab <andreas.raab at gmx.de>
> To: The general-purpose Squeak developers list <squeak-dev at lists.squeakfoundation.org>
> Date: Saturday, March 25, 2006, 12:23:12 AM
> Subject: Finalization (was: Re: [Seaside] WeakArray (again))
> 
>> David Shaffer wrote:
>>> \begin{amateurHour}
>>>
>>> It seems to me that the notification needs to be changed to actually
>>> queue information about the objects which the GC deems
>>> un(strongly)reachable.  I spent some time staring at
>>> ObjectMemory>>sweepPhase, #finalizeReference: and #signalFinalization:
>>> which seem to be the cornerstones of this process.  All that
>>> #signalFinalization: is currently doing is signaling a semaphore (well,
>>> indicating that one should be "signaled" later).  Why not keep a list of
>>> (oop,i) [i is the offset of the weak reference in the oop] pairs and
>>> somehow communicate those back to a Smalltalk object?  As a total VM
>>> novice it just seems too simple ;-)  What I think I would do is
>>> associate a queue like thing with every weak reference container.  Then
>>> when an object becomes GC-able I'd place the (oop,i) pair in that shared
>>> queue.  What I need is someone to hold my hand through...
>>>
>>> ...designing this "queue like thing".  How about a circular array which
>>> can only be "read" (move the read index) by ST code and only be written
>>> by the VM code?  This avoids a lot of concurrency issues.  Are there any
>>> examples like this in the VM?
>>>
>>> \end{amateurHour}
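
The circular array described above can be sketched as a single-writer, single-reader ring buffer. This is only an illustration in Python, not VM code: the class and method names (FinalizationRing, vm_write, image_read) are hypothetical, and the (oop, i) pairs are modeled as plain tuples. One slot is deliberately left empty so that "full" and "empty" can be told apart using the two indices alone.

```python
class FinalizationRing:
    """Fixed-size ring: the 'VM' side only writes, the 'image' side only reads.

    Hypothetical sketch; none of these names exist in the Squeak VM.
    """

    def __init__(self, capacity):
        self.slots = [None] * capacity   # allocated once, never grown
        self.read_index = 0              # advanced only by image_read
        self.write_index = 0             # advanced only by vm_write

    def vm_write(self, oop, i):
        """Record that weak slot `i` of container `oop` lost its referent.

        Returns False when the ring is full; the entry is then lost.
        """
        next_write = (self.write_index + 1) % len(self.slots)
        if next_write == self.read_index:
            return False                 # overflow: nowhere to put the entry
        self.slots[self.write_index] = (oop, i)
        self.write_index = next_write
        return True

    def image_read(self):
        """Return the oldest pending (oop, i) pair, or None if there is none."""
        if self.read_index == self.write_index:
            return None
        entry = self.slots[self.read_index]
        self.read_index = (self.read_index + 1) % len(self.slots)
        return entry
```

Because each index has exactly one writer, neither side needs a lock in this model. The False return from vm_write is the overflow case: the ring cannot grow during a collection, so the entry is simply dropped.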
> 
>> What you've described is not a bad idea in general (and it's probably
>> what VW does) but there are things that I don't like about it. For 
>> example, part of why the finalization process takes so much time is that
>> there are so many weak references lost that we don't care about - the
>> whole idea that just because you use a weak array you need to know when
>> its contents go away is just bogus. Secondly, once you start relying
>> on "accurate" finalization information you should really make sure it's
>> accurate (e.g., one signal/entry per finalized object). And once you do
>> that you need to deal with the ugly corner cases of an overflow of the
>> finalization queue (and the fact that you probably can't allocate a
>> larger one because the GC you're currently in was triggered by a low
>> space condition to begin with ;-) Nasty, nasty issues.
> 
>> Having said that, let me propose a mechanism that (I think) is 
>> fundamentally different and fundamentally simpler. Namely, to make the
>> requirement that you only get notifications for the finalization of 
>> objects that you explicitly register for by creating a "finalizer" 
>> object, i.e., an observer which is allocated before it's ever needed.
>> This simple change avoids both the problem of GC needing to allocate
>> memory when there is none as well as sending notifications about 
>> finalizations that nobody cares about, which are both very desirable
>> properties. When the object becomes eligible for garbage collection, the
>> finalizer is then put into a list of objects that have indeed been 
>> finalized and the finalization process simply pulls them out of the 
>> queue and sends #finalize to them.
> 
>> In its simplest form, this could mean a finalizer is a structure with
>> (besides the prev and next links for putting it into a structure) two
>> slots: a "weak" slot for the object being guarded and a "strong" slot for
>> the object performing the finalization (its #finalizer). When the 
>> garbage collector runs across a Finalizer and notices its observed value
>> is being collected, it can simply put the finalizer into the 
>> finalization list and is done. (btw, this scheme is *vastly* easier to
>> implement than your proposed scheme since everything is pre-allocated
>> and you only move the object from one list to another).
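
A minimal sketch of the pre-allocated finalizer scheme described above, written as a Python simulation rather than VM code. Everything here is an assumption made for illustration: the names Finalizer, FinalizerList, collect and run_finalization are hypothetical, the weak slot is modeled as a plain identifier, and "the garbage collector" is a function that is told which identifiers are still live.

```python
class Finalizer:
    """Pre-allocated observer with prev/next links, a weak and a strong slot."""

    def __init__(self, guarded_id, executor):
        self.prev = None
        self.next = None
        self.weak = guarded_id    # stand-in for a weak reference to the guarded object
        self.strong = executor    # object that will receive finalize()


class FinalizerList:
    """Intrusive doubly linked list; linking and unlinking allocate nothing."""

    def __init__(self):
        self.head = Finalizer(None, None)          # sentinel node
        self.head.prev = self.head.next = self.head

    def append(self, f):
        last = self.head.prev
        last.next = f
        f.prev = last
        f.next = self.head
        self.head.prev = f

    def remove(self, f):
        f.prev.next = f.next
        f.next.prev = f.prev
        f.prev = f.next = None

    def pop_first(self):
        f = self.head.next
        if f is self.head:
            return None
        self.remove(f)
        return f

    def __iter__(self):
        f = self.head.next
        while f is not self.head:
            following = f.next     # saved first, so the caller may unlink f
            yield f
            f = following


def collect(live_ids, registered, pending):
    """Simulated GC pass: move finalizers whose guarded object died."""
    for f in registered:
        if f.weak not in live_ids:
            registered.remove(f)
            f.weak = None
            pending.append(f)


def run_finalization(pending):
    """Image-side finalization process: drain the list, send finalize."""
    f = pending.pop_first()
    while f is not None:
        f.strong.finalize()
        f = pending.pop_first()


class LoggingExecutor:
    """Hypothetical executor that records that it was finalized."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def finalize(self):
        self.log.append(self.name)


log = []
registered, pending = FinalizerList(), FinalizerList()
registered.append(Finalizer("a", LoggingExecutor("a", log)))
registered.append(Finalizer("b", LoggingExecutor("b", log)))
collect({"b"}, registered, pending)   # only "b" is still strongly reachable
run_finalization(pending)
```

The collector only relinks nodes that already exist, so in this model nothing is allocated during the pass, and objects without a registered Finalizer never generate a notification.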
> 
>> But while we're at it, we could also shoot a little bit further and get
>> away from post-mortem finalization (which I find a highly overrated 
>> concept in practice). The only thing we'd change in the above is that
>> the garbage collector would now also transfer the object from the "weak"
>> into the "strong" slot[*1]. This makes the finalizer the sole last 
>> reference to the object. If the finalizer drops it, it's gone. If the
>> finalizer decides to store it, it will survive. Lots of interesting 
>> possibilities and much cleaner since you gain access to the full context
>> of the object and its state.
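
The variant above, where the collector hands the dying object to its finalizer, can be sketched as follows. This is a Python simulation under stated assumptions: all names are hypothetical, strong reachability is supplied by the caller as a list, the finalization action is modeled as a callback that returns True to keep the object, and the strong slot is used only for the handed-over object. Plain Python lists stand in for the pre-allocated links, and the sketch does not address footnote [*1] (clearing all other references to the object).

```python
class Finalizer:
    """Holds a guarded object weakly until the collector hands it over."""

    def __init__(self, guarded, on_finalize):
        self.weak = guarded             # plays the role of the weak slot
        self.strong = None              # filled in by collect()
        self.on_finalize = on_finalize  # hypothetical callback: True = keep


def collect(strongly_reachable, registered, pending):
    """Simulated GC pass: transfer dying objects from weak to strong slot."""
    for f in list(registered):
        if not any(f.weak is obj for obj in strongly_reachable):
            f.strong, f.weak = f.weak, None   # finalizer now holds the last reference
            registered.remove(f)
            pending.append(f)


def run_finalization(pending):
    """Run each pending finalizer with full access to the object's state."""
    kept = []
    while pending:
        f = pending.pop(0)
        obj, f.strong = f.strong, None        # dropped unless the callback keeps it
        if f.on_finalize(obj):
            kept.append(obj)
    return kept


class Socket:
    """Hypothetical resource with state the finalizer wants to inspect."""

    def __init__(self, name):
        self.name = name
        self.open = True


def close_and_drop(sock):
    sock.open = False
    return False


def close_and_keep(sock):
    sock.open = False
    return True


a, b, c = Socket("a"), Socket("b"), Socket("c")
registered = [Finalizer(a, close_and_drop),
              Finalizer(b, close_and_keep),
              Finalizer(c, close_and_drop)]
pending = []
collect([c], registered, pending)     # only c is still strongly reachable
kept = run_finalization(pending)
```

Unlike post-mortem finalization, each callback here sees the object itself, so it can close the socket using the object's own state and decide whether the object survives.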
> 
>> [*1] The easiest way to do this would be to simply clone the object but
>> unfortunately this also has the unbounded memory problem so something a
>> bit more clever might be required. Basically we really want *all* 
>> references to the object except from the finalizer to be cleaned up.
> 
>> Note that weak arrays or other weak classes wouldn't be affected at all
>> by this since only Finalizers get the notifications - all other weak
>> classes would simply drop the references when they get collected and
>> never get notified about anything.
> 
>> Cheers,
>>    - Andreas
> 