[squeak-dev] Ephemerons and Dictionaries

Eliot Miranda eliot.miranda at gmail.com
Thu Oct 1 19:17:37 UTC 2020


On Thu, Oct 1, 2020 at 11:41 AM Levente Uzonyi <leves at caesar.elte.hu> wrote:

> Hi Eliot,
>
> On Thu, 1 Oct 2020, Eliot Miranda wrote:
>
> > Hi All,
> >
> >     to be able to ease EphemeronDictionary into the system I'd like to
> > clean up adding associations to Dictionary.  It seems to me there's a
> > partial implementation of choosing an association class appropriate for a
> > dictionary in the implementors of associationClass:
> > Dictionary>>#associationClass, WeakKeyDictionary>>#associationClass,
> > WeakValueDictionary>>#associationClass (& in my image STON
> > class>>#associationClass).  This seems workable; an EphemeronDictionary
> > would simply add associationClass ^ Ephemeron and we're done, except not
> > quite...
>
> What's the definition of Ephemeron?
>

An Ephemeron is an association known to the garbage collection system,
allowing it to function as a pre-mortem finalizer.

An Ephemeron is intended for uses such as associating an object's dependents
with an object without preventing garbage collection.

Consider a traditional implementation of dependents in non-Model classes.
There is a Dictionary in Object, DependentsFields.  Objects wishing to have
dependents are entered as keys in DependentsFields and the value is a
sequence of their dependents.  Since their dependents (if they are like
views/morphs, etc in MVC) will refer directly to the key object (in their
model inst var etc), there is no way to use weak collections in
DependentsFields to allow the cycle of an object and its dependents to be
collected.  If DependentsFields uses a WeakArray to hold the associations
from objects to their dependents then those associations, and the
dependencies with them, will simply be lost since the only reference to the
associations is in DependentsFields.
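
The two failure modes above can be reproduced in miniature. The following is
only a sketch in Python, not Squeak code: Model, View and Association are
hypothetical stand-ins, weakref.WeakKeyDictionary plays the part of a weak-key
DependentsFields, and weakref.WeakSet plays the part of a WeakArray of
associations. It assumes CPython, where an unreferenced object is reclaimed
as soon as its last reference goes away.

```python
import gc
import weakref


class Model:
    """Stands in for an object that has dependents."""


class View:
    """Stands in for a dependent; it refers back to its model strongly."""

    def __init__(self, model):
        self.model = model


class Association:
    """Stands in for a key -> value association."""

    def __init__(self, key, value):
        self.key = key
        self.value = value


def weak_key_table_leaks():
    # Weak keys, strong values: the value (the list of dependents) refers
    # back to the key, so the table itself keeps the key alive.
    dependents_fields = weakref.WeakKeyDictionary()
    model = Model()
    dependents_fields[model] = [View(model)]
    del model
    gc.collect()
    return len(dependents_fields)


def weak_associations_vanish():
    # Associations held only weakly: the association is reclaimed at once,
    # even though the model and its view are both still in use.
    associations = weakref.WeakSet()
    model = Model()
    view = View(model)
    associations.add(Association(model, [view]))
    gc.collect()
    return len(associations), view.model is model
```

In the first function the entry survives although nothing outside the table
refers to the model; in the second the entry is gone although the model is
still live.  Neither gives the behaviour wanted for dependents.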

Ephemeron differs from a normal association in that it is known to the
garbage collector and it is involved in tracing.  First, note that an
Ephemeron is a *strong* referrer.  The objects it refers to cannot be
garbage collected.  It is not weak.  But it is able to discover when it is
the *only* reference to an object.  To be accurate, an Ephemeron is
notified by the collector when its key is only referenced from the
transitive closure of references from ephemerons.  i.e. when an ephemeron
is notified we know that there are no reference paths to the ephemeron's
key other than through ephemerons.

Ephemerons are notified by the garbage collector placing them in a queue and
signalling a semaphore for each element in the queue.  An image level
process (the extended finalization process) extracts them from the queue
and sends mourn to each ephemeron (since its key is effectively dead).
What an Ephemeron does in response to the notification is programmable (one
can add subclasses of Ephemeron).  But the default behaviour is to
send finalize to the key, and then to remove itself from the dictionary it
is in, allowing it and the transitive closure of objects reachable from it,
to be collected in a subsequent garbage collection.
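
As a rough illustration of the notification path just described, here is a
sketch in Python rather than Squeak.  FinalizationQueue, its trigger and
drain methods, SilentEphemeron, LoggedFile and the plain dict used as the
containing dictionary are all hypothetical names chosen for the example;
only mourn and finalize come from the description above.

```python
import threading
from collections import deque


class Ephemeron:
    """Toy ephemeron: a key/value pair that knows its containing dictionary."""

    def __init__(self, key, value, container):
        self.key = key
        self.value = value
        self.container = container  # hypothetical: a dict keyed by id(key)

    def mourn(self):
        # Default response: finalize the key, then drop out of the
        # containing dictionary so everything can be collected later.
        self.key.finalize()
        self.container.pop(id(self.key), None)


class SilentEphemeron(Ephemeron):
    """Subclass with a different response: no finalize is sent."""

    def mourn(self):
        self.container.pop(id(self.key), None)


class FinalizationQueue:
    def __init__(self):
        self._items = deque()
        self._ready = threading.Semaphore(0)

    def trigger(self, ephemeron):
        # Collector side: enqueue, then signal once per element.
        self._items.append(ephemeron)
        self._ready.release()

    def drain(self):
        # Image side: consume one signal per ephemeron and send mourn.
        count = 0
        while self._ready.acquire(blocking=False):
            self._items.popleft().mourn()
            count += 1
        return count


class LoggedFile:
    """Example key whose finalize records that it was closed."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def finalize(self):
        self.log.append(f"closed {self.name}")
```

A real finalization process would block on the semaphore instead of polling
it; drain polls only so that the sketch can be run in a single thread.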

Implementation: both in scavenging, and in scan-mark, if an ephemeron is
encountered its key is examined.  If the key is reachable from the roots
(has already been scavenged, or is already marked), then the ephemeron is
marked and treated as an ordinary object. If the key is not yet known to be
reachable the ephemeron is held in an internal queue of maybe triggerable
ephemerons, and its objects are not traced.

At the end of the initial scavenge or scan-mark phase, this queue of
triggerable ephemerons is examined.  All ephemerons in the list whose key
is reachable are traced, and removed from the list.  This then leaves the
list populated only with ephemerons whose keys are as yet untraced, and
hence only referenced from the ephemerons in the triggerable ephemeron
queue, which now becomes the triggered ephemeron queue.  All these
ephemerons are placed in the finalization queue for processing in the image
above, and all objects reachable from the ephemerons are traced (scavenged,
marked).  This phase may encounter new triggerable ephemerons which will be
added to the triggerable ephemeron queue (not likely in practice, but
essential for sound semantics).  So the triggering phase continues until
the system ends at a fixed point with an empty triggerable ephemeron queue.
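
The marking scheme above can be modelled in a few lines.  This is a toy
mark phase in Python, not the VM's code: Obj, Ephemeron, collect and the
refs field are hypothetical names, objects are marked by identity, and
nothing is actually freed.  It is meant only to show the triggerable queue
and the fixed-point loop.

```python
class Obj:
    """Ordinary object: every reference in `refs` is strong."""

    def __init__(self, name, *refs):
        self.name = name
        self.refs = list(refs)


class Ephemeron(Obj):
    """Ephemeron: a `key` plus the other objects it refers to (`refs`)."""

    def __init__(self, name, key, *refs):
        super().__init__(name, *refs)
        self.key = key


def collect(roots):
    marked = set()    # ids of objects known to be reachable
    pending = []      # triggerable queue: ephemerons whose key is unmarked
    triggered = []    # ephemerons handed over for finalization

    def trace(obj):
        stack = [obj]
        while stack:
            o = stack.pop()
            if id(o) in marked:
                continue
            marked.add(id(o))
            if isinstance(o, Ephemeron) and id(o.key) not in marked:
                # Key not yet known to be reachable: hold the ephemeron
                # and do not trace through it for now.
                pending.append(o)
                continue
            stack.extend(o.refs)

    for root in roots:
        trace(root)

    while pending:
        ready = [e for e in pending if id(e.key) in marked]
        if ready:
            # Keys that turned out to be reachable: trace as ordinary objects.
            pending[:] = [e for e in pending if id(e.key) not in marked]
            for e in ready:
                for ref in e.refs:
                    trace(ref)
            continue
        # Every remaining key is reachable only through ephemerons: trigger.
        fired = list(pending)
        pending.clear()
        triggered.extend(fired)
        for e in fired:
            trace(e.key)  # the key stays alive for pre-mortem finalization
            for ref in e.refs:
                trace(ref)  # may queue newly found ephemerons

    return marked, triggered
```

For example, with a table holding an ephemeron whose key is a model and
whose value is a view that refers back to the model, collect([table])
triggers the ephemeron, while adding any other root that reaches the model
leaves it untriggered.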

Implications and advantages:
Because ephemerons do not allow their object to be collected, they can be,
and are, used to implement pre-mortem finalization.  So e.g. a file can
flush its buffers and then close its file descriptor before being collected
(which may also imply that the system runs the garbage collector *before*
snapshotting, not as part of the snapshot primitive).  Ephemerons are
conceptually simpler than WeakKeyDictionary et al, since they are about
reference paths, not merely the existence of strong references.  The back
reference from a dependent to an object renders a weak key
dictionary useless in enabling an isolated cycle to be collected since the
back reference is strong, and keeps the reference from the weak key alive.

History: Ephemerons are like guardians.  They were invented by George
Bosworth in the early '90s, to provide pre-mortem finalization and to
solve the problem of DependentsFields retaining garbage.


> >
> > First, HashedCollection does not use associationClass, but it implements
> atNewIndex:put: and it strikes me that atNewIndex:put: for Dictionary
> really should check for the thing being added at least includingBehavior:
> self
>
> HashedCollection does not use associationClass because HashedCollections
> in general (e.g. Sets) may store any object in their internal array not
> just Associations.
> Dictionary introduces #associationClass because it only stores
> associations (except for MethodDictionary of course).
>
> #atNewIndex:put: is a private method. Its senders must ensure that the
> arguments will not corrupt the receiver's internal state.
>
> > associationClass.  So that means Dictionary should override
> atNewIndex:put:.
>
> Can you give a bit more information about how EphemeronDictionary should
> work?
>

If one wants an object to be sent finalize before it is collected one
simply stores it in an EphemeronDictionary, which uses instances of
Ephemeron as its associations.  So e.g.

StandardFileStream>>initialize
     ...
     self registerForFinalization.
     ...

Object>>registerForFinalization
    FinalizationDictionary at: self put: nil.

and DependentsFields becomes a variant that uses a subclass of Ephemeron
that does not send finalize (or vice versa).

Or we could keep it simple and use DependentsFields, have finalize sent to
objects in DependentsFields when no longer reachable, but have a null
finalize method in Object.

Levente
>
> >
> > But what should happen in atNewIndex:put: if the object being added
> > isn't appropriate?  Do we
> > - raise an error? (that's my preference, but I've got limited use cases
> > in my head)
> > - replace the association with one of associationClass? (seems dangerous
> > to me but maybe someone needs this or the existing system does this anyway)
> > - ignore it and hope the user knows what they're doing?
>

_,,,^..^,,,_
best, Eliot