Idea for finding intersections between MaHashIndexes based on multiply anded conditions:
If, for every key->oid pair we insert into any MaHashIndex, we also insert into a separate, additional MaHashIndex, oid->key. I will refer to this reverse-mapped index as the "census" index because it answers where is each object.
So, with these census indexes we can quickly know, for any oid, all keys it resides at. We have a tree of conjuncted conditions being requested from the user. To (and) conjunct the ranges specified in the tree of a number of MaHashIndexes we could follow these steps:
1) First use #numberOfEntriesFrom:to: to discover which index will have the most-restricted range to enumerate based on the anded conditions of the tree. Call the size of this range "sr". 2) enumerate all the oids from lowKey to highKey of that index. 3) enumerate each of the other census indexes specified by the conditions. 4) determine whether this eachOid should be included in the result. If any censusIndex indicates no, the oid is rejected (because we're anding), move to the next oid. If all censusIndexes have that oid in the key-range, the oid is included. The implementation should be sure to enumerate the Census indexes a max of once. For eachCensusEntry: get the condition for this index# if the key is between: condition lowKey and: condition highKey mark condition satisfied return true if all conditions satisfied return false
For example, for a particular oid with these census values and this tree of conditions being anded together Census ANDed Conditions index1 { 100. 319 } index1 -> keys 300 to: 500. index2 { 700 } index2 -> keys 700 to: 900. this oid would be accepted because all conditions were satisfied. Note it existed in two places in index1 (which would happen in a keyword index) but only one of them needs to be satisfied.
So, in the worst case, the most number of steps we have to process is
sr*number of conditions
This does not seem too bad!
- Chris
magma@lists.squeakfoundation.org