[squeak-dev] Ephemerons and Dictionaries

Levente Uzonyi leves at caesar.elte.hu
Thu Oct 1 19:55:00 UTC 2020


Hi Eliot,

On Thu, 1 Oct 2020, Eliot Miranda wrote:

> 
> 
> On Thu, Oct 1, 2020 at 11:41 AM Levente Uzonyi <leves at caesar.elte.hu> wrote:
>       Hi Eliot,
>
>       On Thu, 1 Oct 2020, Eliot Miranda wrote:
>
>       > Hi All,
>       >
>       >     to ease EphemeronDictionary into the system I'd like to clean up adding associations to Dictionary.  It seems to me there's a partial implementation of choosing an association class appropriate for a
>       > dictionary in the implementors of associationClass: Dictionary>>#associationClass, WeakKeyDictionary>>#associationClass, WeakValueDictionary>>#associationClass, (& in my image STON class>>#associationClass).  This seems
>       > workable; an EphemeronDictionary would simply add associationClass ^ Ephemeron and we're done, except not quite...
>
>       What's the definition of Ephemeron?
> 
> 
> An Ephemeron is an association known to the garbage collection system, allowing it to function as a pre-mortem finalizer.
> 
> An Ephemeron is intended for uses such as associating an object's dependents with an object without preventing garbage collection.
> 
> Consider a traditional implementation of dependents in non-Model classes.  There is a Dictionary in Object, DependentsFields.  Objects wishing to have dependents are entered as keys in DependentsFields and the value is a
> sequence of their dependents.  Since their dependents (if they are like views/morphs, etc in MVC) will refer directly to the key object (in their model inst var etc), there is no way to use weak collections in
> DependentsFields to allow the cycle of an object and its dependents to be collected.  If DependentsFields uses a WeakArray to hold the associations from objects to their dependents then those associations, and the
> dependencies with them, will simply be lost since the only reference to the associations is in DependentsFields.
> 
> Ephemeron differs from a normal association in that it is known to the garbage collector and it is involved in tracing.  First, note that an Ephemeron is a *strong* referrer.  The objects it refers to cannot be garbage
> collected.  It is not weak.  But it is able to discover when it is the *only* reference to an object.  To be accurate, an Ephemeron is notified by the collector when its key is only referenced from the transitive closure of
> references from ephemerons.  i.e. when an ephemeron is notified we know that there are no reference paths to the ephemeron's key other than through ephemerons.
> 
> Ephemerons are notified by the garbage collector placing them in a queue and signalling a semaphore for each element in the queue.  An image level process (the extended finalization process) extracts them from the queue and
> sends mourn to each ephemeron (since its key is effectively dead).  What an Ephemeron does in response to the notification is programmable (one can add subclasses of Ephemeron).  But the default behaviour is to send finalize
> to the key, and then to remove itself from the dictionary it is in, allowing it and the transitive closure of objects reachable from it, to be collected in a subsequent garbage collection.
> 
> Implementation: both in scavenging, and in scan-mark, if an ephemeron is encountered its key is examined.  If the key is reachable from the roots (has already been scavenged, or is already marked), then the ephemeron is marked
> and treated as an ordinary object. If the key is not yet known to be reachable the ephemeron is held in an internal queue of maybe-triggerable ephemerons, and its objects are not traced.
> 
> At the end of the initial scavenge or scan-mark phase, this queue of triggerable ephemerons is examined.  All ephemerons in the list whose key is reachable are traced, and removed from the list.  This then leaves the list
> populated only with ephemerons whose keys are as yet untraced, and hence only referenced from the ephemerons in the triggerable ephemeron queue, which now becomes the triggered ephemeron queue.  All these ephemerons are
> placed in the finalization queue for processing in the image above, and all objects reachable from the ephemerons are traced (scavenged, marked).  This phase may encounter new triggerable ephemerons which will be added to the
> triggerable ephemeron queue (not likely in practice, but essential for sound semantics).  So the triggering phase continues until the system ends at a fixed point with an empty triggerable ephemeron queue.
> 
> Implications and advantages:
> Because ephemerons do not allow their object to be collected, they can be, and are, used to implement pre-mortem finalization.  So e.g. a file can flush its buffers and then close its file descriptor before being collected
> (which may also imply that the system runs the garbage collector *before* snapshotting, not as part of the snapshot primitive).  Ephemerons are conceptually simpler than WeakKeyDictionary et al, since they are about
> reference paths, not merely the existence of strong references.  The back reference from a dependent to an object renders a weak key dictionary useless in enabling an isolated cycle to be collected since the back reference is
> strong, and keeps the reference from the weak key alive.
> 
> History: Ephemerons are like guardians.  They were invented by George Bosworth in the early '90s, to provide pre-mortem finalization and to solve the problem of DependentsFields retaining garbage.
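
As a reading aid, the scan-mark variant of the tracing and triggering phases quoted above can be modelled in a few lines of Python. This is only a sketch under stated assumptions, not the VM's code: the names Obj, Ephemeron, mark and build are hypothetical, the heap is a toy object graph, and a deferred ephemeron is itself marked while only its fields wait.

```python
class Obj:
    """Toy heap object holding strong references (hypothetical model)."""
    def __init__(self, name, refs=()):
        self.name = name
        self.refs = list(refs)


class Ephemeron(Obj):
    """Key/value pair whose fields are traced only once the key is known live."""
    def __init__(self, name, key, value=None):
        super().__init__(name)
        self.key = key
        self.value = value


def mark(roots):
    """Return (marked, triggered) for a toy heap reachable from roots."""
    marked = set()      # objects known to be reachable
    pending = []        # maybe-triggerable ephemerons, fields not yet traced
    triggered = []      # ephemerons handed to the finalization queue

    def trace(start):
        stack = list(start)
        while stack:
            obj = stack.pop()
            if obj is None or obj in marked:
                continue
            marked.add(obj)
            if not isinstance(obj, Ephemeron):
                stack.extend(obj.refs)
            elif obj.key in marked:
                stack.append(obj.value)     # key is live: ordinary object
            else:
                pending.append(obj)         # defer: key not yet known live

    trace(roots)
    while pending:
        live = [e for e in pending if e.key in marked]
        if live:
            # Keys marked in the meantime: trace these and re-examine the rest.
            pending[:] = [e for e in pending if e.key not in marked]
            for e in live:
                trace([e.value])
            continue
        # No progress: remaining keys are reachable only through ephemerons.
        fired = list(pending)
        pending.clear()
        triggered.extend(fired)
        for e in fired:
            trace([e.key, e.value])         # keep everything alive for mourn
    return marked, triggered


def build():
    """DependentsFields-like cycle: the view refers back to its model."""
    model = Obj("model")
    view = Obj("view", [model])
    deps = Ephemeron("deps", key=model, value=Obj("dependents", [view]))
    table = Obj("table", [deps])
    return table, model, view, deps
```

With only the table as a root, mark reports deps as triggered while the model and its view stay marked, so finalize can still run on the key. Adding the model itself as a root leaves the triggered list empty.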

Sorry for not being clear. I was asking about the class definition to see 
what fields it would have. I presume the first line is something like:

Object ephemeronSubclass: #Ephemeron
...

> 
>
>       >
>       > First, HashedCollection does not use associationClass, but it implements atNewIndex:put: and it strikes me that atNewIndex:put: for Dictionary really should check for the thing being added at least includingBehavior: self
>
>       HashedCollection does not use associationClass because HashedCollections
>       in general (e.g. Sets) may store any object in their internal array not
>       just Associations.
>       Dictionary introduces #associationClass because it only stores
>       associations (except for MethodDictionary of course).
>
>       #atNewIndex:put: is a private method. Its senders must ensure that the
>       arguments will not corrupt the receiver's internal state.
>
>       > associationClass.  So that means Dictionary should override atNewIndex:put:.
>
>       Can you give a bit more information about how EphemeronDictionary should
>       work?
> 
> 
> If one wants an object to be sent finalize before it is collected one simply stores it in an EphemeronDictionary, which uses instances of Ephemeron as its associations.  So e.g.
> 
> StandardFileStream>>initialize
>      ...
>      self registerForFinalization.
>      ...
> 
> Object>>registerForFinalization
>     FinalizationDictionary at: self put: nil.
> 
> and DependentsFields becomes a variant that uses a subclass of Ephemeron that does not send finalize (or vice versa).
> 
> Or we could keep it simple and use DependentsFields, have finalize sent to objects in DependentsFields when no longer reachable, but have a null finalize method in Object.

Again, sorry for not being clear. I would like to know how the current 
implementation of #atNewIndex:put: could prevent EphemeronDictionary from 
working as intended.
Does EphemeronDictionary do something special that other dictionaries don't, 
and that is not compatible with how #atNewIndex:put: currently works?


Levente

>
>       Levente
>
>       >
>       > But what should happen in atNewIndex:put: if the object being added isn't appropriate?  Do we
>       > - raise an error? (that's my preference, but I've got limited use cases in my head)
>       > - replace the association with one of associationClass? (seems dangerous to me but maybe someone needs this or the existing system does this anyway)
>       > - ignore it and hope the user knows what they're doing?
> 
> 
> _,,,^..^,,,_
> best, Eliot
> 
>

