Re: Fwd: Undirect index update

20 May 2009


      Chris,
This email exchange about indexing is really worthwhile for all of us
using Magma and unsure about the various implications with indexes, not
to mention the discussion is archived and could be of good use for any
new future user of Magma.
To be sure we talk about the same things (excuse the misconceptions from
a non native English reader), I would like to reword to be sure I
understand correctly the issues.
Le mardi 19 mai 2009 à 17:32 -0500, Chris Muller a écrit :
...
I only meant to caution about the limited scale associated with
duplicate keys in a MaHashIndexes.  As duplicate keys accumulate it's
a worst-case, O(n) linear search for adds, removes AND reads.
By two duplicate keys, do you mean two different index objects (in my
example lexical term objects) producing the same hash value from
indexHashForIndexObject: ?
Or do you mean two different objects indexed with an identical object?
If so, why is it a problem? (I see it as a wished feature)
And then I don't see how to avoid it.
...
It may be tempting to add index on the #color of millions of objects.
Which might be worthwhile there are 24-million possible colors, but if
only 8 of those colors are used then, as one-million objects are added
to that collection, the minimum number of collisions at each key is 1M
/ 8..  Access to this index is now slow and will continue to worsen as
more duplicates are added..
So if I understand correctly, it is a problem when the index range is
limited in regard to the indexed object population extend.
Frankly I think I do have this exact problem with indexes defined over a
limit set of word.
The object I am indexing are LOM metadata of pedagogical resources.
The LOM norm comes with attributes defined from a fixed set of
vocabulary. For example 'cost' is a two values attribute ('yes' or
'no'). In your scenario it is a worst case, but from the user point of
view it is interesting to query only commercial (value 'yes') resources.
Is it a dead end?
Best regards,
Hilaire

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

2008

2007

2006

2005

Re: Fwd: Undirect index update