conjuncting MaHashIndexes
Chris Muller
chris at funkyobjects.org
Thu Apr 6 04:01:18 UTC 2006
Idea for finding intersections between MaHashIndexes based on multiply
anded conditions:
If, for every key->oid pair we insert into any MaHashIndex, we also
insert into a separate, additional MaHashIndex, oid->key. I will refer
to this reverse-mapped index as the "census" index because it answers
where is each object.
So, with these census indexes we can quickly know, for any oid, all
keys it resides at. We have a tree of conjuncted conditions being
requested from the user. To (and) conjunct the ranges specified in the
tree of a number of MaHashIndexes we could follow these steps:
1) First use #numberOfEntriesFrom:to: to discover which index will
have the most-restricted range to enumerate based on the anded
conditions of the tree. Call the size of this range "sr".
2) enumerate all the oids from lowKey to highKey of that index.
3) enumerate each of the other census indexes specified by the
conditions.
4) determine whether this eachOid should be included in the result.
If any censusIndex indicates no, the oid is rejected (because we're
anding), move to the next oid. If all censusIndexes have that oid in
the key-range, the oid is included.
The implementation should be sure to enumerate the Census indexes a
max of once.
For eachCensusEntry:
get the condition for this index#
if the key is between: condition lowKey and: condition highKey
mark condition satisfied
return true if all conditions satisfied
return false
For example, for a particular oid with these census values and this
tree of conditions being anded together
Census ANDed Conditions
index1 { 100. 319 } index1 -> keys 300 to: 500.
index2 { 700 } index2 -> keys 700 to: 900.
this oid would be accepted because all conditions were satisfied.
Note it existed in two places in index1 (which would happen in a
keyword index) but only one of them needs to be satisfied.
So, in the worst case, the most number of steps we have to process is
sr*number of conditions
This does not seem too bad!
- Chris
More information about the Magma
mailing list