Celeste improvements!

Lex Spoon lex at cc.gatech.edu
Fri Oct 5 14:54:46 UTC 2001


> > > > 2) Why not use ImageSegments to save the index on disk?


If anyone wants to play with this further, here's what I tried (as
best as I can remember).

	1. Begin with an "ImageSegment new".
	2. Specify a root or roots.  For starters, just the
	   dictionary inside the IndexFile would be enough.
	   Be wary of using the whole IndexFile, because it
	   contains a SortedCollection with a custom sortBlock,
	   and last time I tried you couldn't save a block
	   in an ImageSegment.
	3. Send the magic "extract" command.  I don't remember the
	   exact name, and there are a few variations.  They do have
	   nice comments, IIRC.  The right method will ask you
	   for a size hint, but you can safely specify 0 to get
	   started.


Afterwards, you can query the outPointers from the segment.  There
shouldn't be any, right?  Well, in practice there will be, and you'll
have to work at it.  Typically there are extraneous inpointers from
elsewhere in the image, and these manifest as outpointers that don't
really seem to be pointing "out" of the mess of objects.

Finding the problem in-pointer requires detective work.  It just turns
out that our intuitive notion of where an object's boundaries are
isn't always literally followed in the image itself.

Let me try a trimmed down example, to show the problem in detail.

Suppose we have the following objects, with object names on the left
and instance variables on the right in parentheses.  I don't
give the instance variable names, because they don't matter for
this game.  I hope it's clear enough.  Note I've explicitly listed
the array and associations that a dictionary uses, and the array that
a sorted collection uses -- this is to give you an idea of how
tough slogging through these objects can really be!

	idxfile     (sortedlist, dictionary)
 	sortedlist  (array1, sortblock)
	array1      (entry1, entry2, entry3)
	dictionary  (array2)
	array2      (assoc1, assoc2, assoc3)
	assoc1      (entry1)      "I'll leave out the keys..."
	assoc2      (entry2)
	assoc3      (entry3)
	entry1      (from1, to1, cc1, subject1)
	entry2      (from2, to2, cc2, subject2)
	entry3      (from3, to3, cc3, subject3)

	celeste     (from2, to2, cc2, subject2)

If you specify "dictionary" as a root then the ImageSegment machinery
will find all objects reachable only from "dictionary" and not from
elsewhere.  So that would be: array2, assoc1, assoc2, and assoc3.
entry1, entry2, and entry3 will be listed as outpointers!  The
ImageSegment machinery will find you all objects reachable *only*
via the roots, but entry1, entry2, and entry3 are also reachable via
the sorted collection.
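
For anyone who wants to poke at the rule without firing up an image,
here is a rough sketch of it in Python.  This is a toy model of the
reachability rule described above, not the real ImageSegment code: the
function name "extract", the "image_roots" list (standing in for
whatever in the image holds on to idxfile and celeste), and the dict
representation of the object graph are all made up for illustration.

```python
# Toy object graph from the example: name -> names it points to.
graph = {
    "idxfile": ["sortedlist", "dictionary"],
    "sortedlist": ["array1", "sortblock"],
    "array1": ["entry1", "entry2", "entry3"],
    "dictionary": ["array2"],
    "array2": ["assoc1", "assoc2", "assoc3"],
    "celeste": ["from2", "to2", "cc2", "subject2"],
}
for i in (1, 2, 3):
    graph[f"assoc{i}"] = [f"entry{i}"]
    graph[f"entry{i}"] = [f"from{i}", f"to{i}", f"cc{i}", f"subject{i}"]


def extract(graph, image_roots, segment_roots):
    """Return (segment, out_pointers) for the chosen segment roots."""
    roots = set(segment_roots)

    # Pass 1: everything the rest of the image can reach *without*
    # walking through one of the segment roots.
    outside = set()
    stack = [r for r in image_roots if r not in roots]
    while stack:
        obj = stack.pop()
        if obj in outside or obj in roots:
            continue
        outside.add(obj)
        stack.extend(graph.get(obj, ()))

    # Pass 2: walk from the roots.  Objects also reachable from outside
    # stay in the image and show up as outpointers; the rest go out.
    segment, out_pointers = set(), set()
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if obj in segment:
            continue
        if obj in outside:
            out_pointers.add(obj)
            continue
        segment.add(obj)
        stack.extend(graph.get(obj, ()))
    return segment, out_pointers


segment, out_pointers = extract(graph, ["idxfile", "celeste"], ["dictionary"])
print("segment:     ", sorted(segment))
print("out pointers:", sorted(out_pointers))
```

With "dictionary" as the only root, the model puts the dictionary,
array2, and the three associations in the segment and reports the
three entries as outpointers, which is the situation described above.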

Two general strategies from here are to add more roots, or to
get rid of the inpointers somehow.  Let's try adding more roots
for this problem, and getting rid of inpointers on the next.  We
can mark *all* the entries as roots, in which case the sorted collection
becomes irrelevant.

Now, we manage to extract most of the index file entries plus their
associated from, to, cc, and subject strings, but *not* the ones for
entry2.  It seems that Celeste (I'm making this up) has a reference to
from2, to2, cc2, and subject2 !

Well, let's try the other strategy this time.  We could rewrite
IndexFile>>from, IndexFile>>to, etc., to return *copies* of the
underlying strings.  Then, the Celeste object wouldn't have a direct
reference to these strings, but instead would reference copies of them.
Thus, we've gotten rid of those in-pointers.
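
The two strategies can be compared in a small Python model of the
example.  Again, this is only a sketch under assumptions: "extract",
"build_graph", "image_roots", and the "_copy" names are invented for
illustration, and the real ImageSegment machinery is more involved
than this reachability check.

```python
def build_graph(celeste_holds):
    """Toy object graph from the example; celeste's references vary."""
    graph = {
        "idxfile": ["sortedlist", "dictionary"],
        "sortedlist": ["array1", "sortblock"],
        "array1": ["entry1", "entry2", "entry3"],
        "dictionary": ["array2"],
        "array2": ["assoc1", "assoc2", "assoc3"],
        "celeste": list(celeste_holds),
    }
    for i in (1, 2, 3):
        graph[f"assoc{i}"] = [f"entry{i}"]
        graph[f"entry{i}"] = [f"from{i}", f"to{i}", f"cc{i}", f"subject{i}"]
    return graph


def extract(graph, image_roots, segment_roots):
    """Return (segment, out_pointers) for the chosen segment roots."""
    roots = set(segment_roots)
    # Objects the rest of the image reaches without passing through a root.
    outside, stack = set(), [r for r in image_roots if r not in roots]
    while stack:
        obj = stack.pop()
        if obj not in outside and obj not in roots:
            outside.add(obj)
            stack.extend(graph.get(obj, ()))
    # Walk from the roots; anything also reachable from outside stays behind.
    segment, out_pointers, stack = set(), set(), list(roots)
    while stack:
        obj = stack.pop()
        if obj in segment:
            continue
        if obj in outside:
            out_pointers.add(obj)
            continue
        segment.add(obj)
        stack.extend(graph.get(obj, ()))
    return segment, out_pointers


image_roots = ["idxfile", "celeste"]
entries = ["entry1", "entry2", "entry3"]
fields2 = ["from2", "to2", "cc2", "subject2"]

# Strategy 1 only: all entries are roots, but celeste still shares strings.
seg_shared, outs_shared = extract(build_graph(fields2), image_roots, entries)

# Strategy 1 plus strategy 2: celeste holds copies instead of the originals.
copies = [name + "_copy" for name in fields2]
seg_copied, outs_copied = extract(build_graph(copies), image_roots, entries)

print("shared strings, out pointers:", sorted(outs_shared))
print("copied strings, out pointers:", sorted(outs_copied))
```

In this model, making the entries roots still leaves from2, to2, cc2,
and subject2 behind as outpointers; once celeste holds copies, the
outpointer set comes back empty and all three entries go out with
their strings.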

And now everything should go out.  I wish everyone who tries this
the same level of success with the *real* Celeste.  :)

By the way, don't worry if outpointers contains classes or other
things that don't need to be saved.  Just worry when index file
entries aren't going out, for example.

-Lex



