Automatic index maintenance

Sebastian Sastre ssastre at seaswork.com
Mon Mar 3 16:53:35 UTC 2008


> This sounds very interesting, would you mind sharing more details
> about how you accomplish your solution?
> 
Not only will I give details, I'm also willing to share it as a contribution to
Magma.
The concept: it uses the point of deserialization to install proxies on every
object that comes from the odb. The proxies send #noteOldKeys: to the
magmaSession every time they detect a change in a hashed attribute. The proxies
are installed using introspection, so they reach deep into the graph [*].
That way they can detect hash changes even when the indexed object answers the
attribute by deriving it from an arbitrarily deep (sub)component. When a commit
happens, at some point Magma will start to serialize the graph again, and by
then all the proxies will have been (deeply) uninstalled. Installing proxies
needs one more hook besides #materializeObject:; right now I'm seeing
#slowlyDo:pageBoundariesDo:. It also uninstalls proxies in #refresh:, besides
#serializeGraph:do:.

The good: this will be 100% transparent for the developer.
The bad: we'll pay a price in overhead (memory and CPU) for using these proxies,
but I have used this for OmniBase and it worked nicely, so my guess is that it
will also work for Magma.
I've made two Monticello packages: Vigilance and MagmaIndexMaintenance. If you
or anyone else is willing to collaborate, I can put them on SqueakSource.
About the proxies themselves: they use DNU to intercept every message sent to
the "vigilated" persistent object. Before forwarding the message, the proxy asks
for the current attribute value. It then forwards the message and asks for the
attribute value again. If the two values are equal, nothing more is done. If
they are different, it sends
#modifiedValueFor: anObject oldValue: oldValue newValue: newValue to the magma
session (I know there is no need to send that degree of detail, but that is how
they were made for the original solution, and I've just ported it).
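
To make the before/after comparison concrete, here is a rough sketch of the
idea in Python rather than Smalltalk. It is not the Vigilance code: the names
VigilantProxy, FakeSession and modified_value are invented for illustration,
and Python's __getattr__ hook only stands in for DNU.

```python
class FakeSession:
    """Hypothetical stand-in for the magma session; it only records calls."""

    def __init__(self):
        self.notes = []

    def modified_value(self, obj, old_value, new_value):
        self.notes.append((obj, old_value, new_value))


class VigilantProxy:
    """Forwards calls to a target and compares one watched value around each."""

    def __init__(self, target, read_attribute, session):
        self._target = target
        self._read = read_attribute  # callable answering the watched value
        self._session = session

    def __getattr__(self, name):
        # Invoked only for names the proxy itself lacks, much like DNU.
        member = getattr(self._target, name)
        if not callable(member):
            return member

        def forward(*args, **kwargs):
            old_value = self._read(self._target)   # ask before forwarding
            result = member(*args, **kwargs)       # forward the message
            new_value = self._read(self._target)   # ask again afterwards
            if old_value != new_value:
                self._session.modified_value(self._target, old_value, new_value)
            return result

        return forward


class Person:
    def __init__(self, name):
        self.name = name

    def rename(self, new_name):
        self.name = new_name

    def greet(self):
        return "hi " + self.name


session = FakeSession()
person = VigilantProxy(Person("Ann"), lambda p: p.name, session)
person.greet()        # watched value unchanged, nothing is recorded
person.rename("Bea")  # watched value changed, the session is notified
```

The watched value is read through a callable on purpose, so it can be derived
from a deep subcomponent just as well as from a direct slot.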

[*] The introspection could be improved by deciding where to install proxies
based on the class kind (fixed, variable, words, etc). Lots of objects don't
need to be vigilated (chars, most numbers, strings, etc), so skipping them
improves performance and saves memory *a lot*. Right now the introspection is
done in a more basic, less general fashion (derived from a class method).
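
As a sketch of that improvement, again in Python and under my own assumptions:
the leaf types listed and the function names below are illustrative, not what
Vigilance does today. The walk visits each object once, skips leaves, and
ignores dictionary keys for simplicity.

```python
# Kinds of objects assumed never to need a proxy (illustrative list).
LEAF_TYPES = (str, bytes, int, float, complex, bool, type(None))


def needs_vigilance(obj):
    return not isinstance(obj, LEAF_TYPES)


def vigilance_candidates(root):
    """Walk the graph from root and return each non-leaf object once."""
    seen = set()
    found = []
    stack = [root]
    while stack:
        obj = stack.pop()
        if not needs_vigilance(obj) or id(obj) in seen:
            continue
        seen.add(id(obj))  # also protects against cycles in the graph
        found.append(obj)
        if isinstance(obj, dict):
            children = list(obj.values())
        elif isinstance(obj, (list, tuple, set, frozenset)):
            children = list(obj)
        elif hasattr(obj, "__dict__"):
            children = list(vars(obj).values())
        else:
            children = []
        stack.extend(children)
    return found


class Address:
    def __init__(self, city):
        self.city = city


class Customer:
    def __init__(self, name, address):
        self.name = name
        self.address = address
        self.tags = ["gold", "retail"]


address = Address("Lima")
customer = Customer("Ann", address)
address.owner = customer  # a cycle, to show the walk terminates
candidates = vigilance_candidates(customer)
```

Here only the customer, its address and the tags list end up as candidates; the
strings hanging off them are never visited further.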

...
> proxies so they can
> >  detect key changes and notify the session to note that.
> 
> It's interesting how this was just *removed* from the
> RepositoryDefinition just a few versions ago.  It was a cache of all
> MagmaCollections in the repository, just what you needed, but it was a
> design improvement to remove them, since the number of
> MagmaCollections you could have would be constrained by memory.
> 
In fact, no. For this I *only need the symbols* of all the attributes of all the
magma collections. I don't want to spend memory caching whole magma collections
and their indexes. As an example, my odb model will scale perhaps to thousands
of magma collections but only to a couple of dozen attributes (symbols to keep
cached).
When proxies are installed you have to tell them which selectors to vigilate,
so we'll need to give them the complete set (without repetitions).
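
To show why this stays small, a minimal Python sketch, assuming each collection
can simply list the attribute names it indexes. The dictionary below is made-up
data, not how Magma stores its indexes.

```python
def watched_selectors(collections):
    """Union of the indexed attribute names across all collections."""
    selectors = set()
    for indexed_attributes in collections.values():
        selectors.update(indexed_attributes)
    return selectors


# Hypothetical repository: collection name -> attributes it indexes.
collections = {
    "customers": ["name", "email"],
    "suppliers": ["name", "taxId"],
    "orders": ["number", "email"],
}
selectors = watched_selectors(collections)
```

Three collections with six index entries yield only four distinct selectors,
and adding more collections that index the same attributes does not grow the
set at all.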

...
> >         Any better suggestion?
> 
> You may wish to attach your Set of Symbol attributes to the
> MagmaRepositoryDefinition.  This is the root of the "domain" model that
> describes a Magma repository.  Whenever an index is added to any
> collection, this Set could be updated.  There shouldn't be scalability
> problem since the number of different attribute selectors could not
> exceed the number of Symbols in an image, which would only number in
> the thousands at most.
> 
>  - Chris
> 
This is a very interesting way to accomplish the goal. I'll take a look at
MagmaRepositoryDefinition to see if I can extract them (and keep up with added
or removed ones). I'll surely ask something about this soon.

Cheers,

Sebastian


