Reading multiple indices on MagmaCollections - first thoughts
Brent Pinkney
brent.pinkney at aircom.co.za
Thu Apr 6 08:49:36 UTC 2006
Hi All,
Let's start the analysis with a MagmaCollection of Person objects. This
collection has indices on #familyName and #age. Let the variable 'people'
refer to this collection in all Smalltalk code that follows.
This can be exposed in Lava as the 'people' table with columns familyName and
age.
This is the example used in the Lava SUnit test suite.
--------------------------------------------------------------------------------------------------------
The Problem
Consider now the following query, parentheses added for clarity:
select * from people
where (familyName = 'Man')
or ( (familyName = 'Pinkney') and (age < 60) )
A decent RDBMS would be able to execute this query very efficiently by
exploiting the indices on familyName and age.
Magma can only read one index at a time. This is the problem.
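
To make the goal concrete, here is a minimal Python sketch of how two indices can answer the query above by combining sets of oids. It is not Magma code: the sample data, the dictionary-based indices and the age_below helper are all assumptions made purely for illustration.

```python
from collections import defaultdict

# Hypothetical sample data: oid -> (familyName, age).
people = {
    1: ("Man", 70),
    2: ("Pinkney", 35),
    3: ("Pinkney", 64),
    4: ("Smith", 20),
}

# One index per attribute, each mapping a key to the set of matching oids.
family_index = defaultdict(set)
age_index = defaultdict(set)
for oid, (family_name, age) in people.items():
    family_index[family_name].add(oid)
    age_index[age].add(oid)


def age_below(limit):
    """Union of the oid sets for every indexed age strictly below limit."""
    result = set()
    for age, oids in age_index.items():
        if age < limit:
            result |= oids
    return result


# (familyName = 'Man') or ((familyName = 'Pinkney') and (age < 60))
matches = family_index["Man"] | (family_index["Pinkney"] & age_below(60))
print(sorted(matches))
```

Each condition is answered from an index alone, and only the final set of oids needs to be turned into objects.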
--------------------------------------------------------------------------------------------------------
Magma Client API
We can logically convert the SQL query into Smalltalk via the following
pseudocode as an intermediate step:
people reader
(#familyName at: 'Man')
or: ( (#familyName at: 'Pinkney') and: (#age to: 60) )
This can then be refined into syntactically correct, but less terse,
Smalltalk:
people where: [ :r |
(r read: #familyName at: 'Man')
or: ( (r read: #familyName at: 'Pinkney') and: (r read: #age to: 60) )
].
The result of this Smalltalk code is a Magma reader with the same semantics as
the existing single index reader.
--------------------------------------------------------------------------------------------------------
Implementation
The result of the aforementioned #where: method is a MagmaExpressionReader.
Executing the conjunctions (#and:) and disjunctions (#or:) constructs a tree
of MaExpression and MaClause instances.
At this point the leaves of the expression tree are all MaClause instances, we
are sure all the indices exist, and the values have been converted into their
hash values.
i.e.
                           MagmaExpressionReader
                                     |
                         MaExpression( _, or:, _ )
                                     |
                 ------------------------------------------
                 |                                        |
MaClause( #familyName, at:, 'Man' )          MaExpression( _, and:, _ )
                                                          |
                                       --------------------------------------
                                       |                                    |
                    MaClause( #familyName, at:, 'Pinkney' )     MaClause( #age, to:, 60 )
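
The construction of such a tree can be modelled outside Smalltalk as well. The following Python sketch is only an analogy for the proposed #where: block: the names Clause, Expression, Reader, where, and_ and or_ are hypothetical stand-ins, not Magma classes, and the sketch skips the index checks and hashing described above.

```python
from dataclasses import dataclass


class Node:
    """Shared combinators, loosely mirroring #and: and #or: in the post."""

    def and_(self, other):
        return Expression(self, "and:", other)

    def or_(self, other):
        return Expression(self, "or:", other)


@dataclass(frozen=True)
class Clause(Node):
    """A leaf: one lookup on one index."""
    attribute: str
    selector: str
    value: object


@dataclass(frozen=True)
class Expression(Node):
    """An inner node joining two subtrees with and: or or:."""
    left: Node
    operator: str
    right: Node


class Reader:
    """Stand-in for the block argument r; it only produces clauses."""

    def read(self, attribute, selector, value):
        return Clause(attribute, selector, value)


def where(block):
    """Run the block against a fresh reader and return the tree it builds."""
    return block(Reader())


def leaves(node):
    """Collect the clauses on the fringe of the tree, left to right."""
    if isinstance(node, Clause):
        return [node]
    return leaves(node.left) + leaves(node.right)


tree = where(lambda r: r.read("familyName", "at:", "Man").or_(
    r.read("familyName", "at:", "Pinkney").and_(r.read("age", "to:", 60))))

for clause in leaves(tree):
    print(clause)
```

Nothing is read from any index while the block runs; the block only records what should be read, which is what makes it possible to ship the whole tree to the server later.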
When the first page of objects is requested, the entire MagmaExpressionReader
is serialised to the Magma server. The Magma server instantiates a
MaBitmapIndex for the query and walks the tree of MaExpressions in the correct
order.
The resulting bitmap index is zero except for the oids which satisfy the
query.
The MaBitmapIndex and the first page of objects are returned to the client;
all subsequent reads serialise the MaBitmapIndex.
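
The server-side walk and the paging can be illustrated with a small Python model. This is a sketch under loud assumptions: a Python integer stands in for the MaBitmapIndex (bit n set means oid n matches), the INDEX table of oids per clause is invented sample data, and the tree is written as nested tuples rather than real MaExpression and MaClause objects.

```python
def bitmap_of(oids):
    """Build a bitmap with one bit set per oid."""
    bits = 0
    for oid in oids:
        bits |= 1 << oid
    return bits


# Hypothetical result of each single-index read, as lists of oids.
INDEX = {
    ("familyName", "at:", "Man"): [3, 9],
    ("familyName", "at:", "Pinkney"): [2, 5, 7],
    ("age", "to:", 60): [2, 3, 7, 8],
}


def evaluate(node):
    """Walk the tree depth first, combining child bitmaps bitwise."""
    left, operator, right = node
    if operator == "and:":
        return evaluate(left) & evaluate(right)
    if operator == "or:":
        return evaluate(left) | evaluate(right)
    return bitmap_of(INDEX[node])  # a leaf clause


def next_page(bitmap, start_oid, page_size):
    """Return up to page_size matching oids >= start_oid, ascending."""
    page = []
    remaining = bitmap >> start_oid
    oid = start_oid
    while remaining and len(page) < page_size:
        if remaining & 1:
            page.append(oid)
        remaining >>= 1
        oid += 1
    return page


query = (("familyName", "at:", "Man"), "or:",
         (("familyName", "at:", "Pinkney"), "and:", ("age", "to:", 60)))

result = evaluate(query)
first_page = next_page(result, 0, 2)
second_page = next_page(result, first_page[-1] + 1, 2)
print(first_page, second_page)
```

Once the bitmap exists, every later page needs only the bitmap and the last oid already delivered; the expression tree does not have to be evaluated again.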
--------------------------------------------------------------------------------------------------------
Issues
1. Will the Magma server be able to build up a 2^48-bit bitmap quickly enough?
2. How will this bitmap be compressed so that both the client and server can
still determine the first oid for the next page?
3. Is it possible to order the oids (e.g. results in descending dateOfBirth)?
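
To put issues 1 and 2 in numbers, the Python sketch below computes the size of an uncompressed bitmap covering a 48-bit oid space, and then shows one possible sparse alternative: a sorted list of matching oids searched with bisect. The sorted list is just an illustrative assumption, not a proposal for how Magma should store the result, and the sample oids are made up.

```python
from bisect import bisect_right

OID_BITS = 48

# An uncompressed bitmap needs one bit per possible oid.
dense_bytes = 2 ** OID_BITS // 8
print(dense_bytes // 2 ** 40, "TiB")

# Hypothetical query result, kept as a sorted list of matching oids.
matching_oids = [17, 42, 1000, 2 ** 40 + 5, 2 ** 47]


def page_after(sorted_oids, last_oid, page_size):
    """Return the next page_size oids strictly greater than last_oid."""
    start = bisect_right(sorted_oids, last_oid)
    return sorted_oids[start:start + page_size]


print(page_after(matching_oids, -1, 2))
print(page_after(matching_oids, 42, 2))
```

A dense bitmap is clearly out of the question at this size, so whatever compression is chosen has to keep "find the first match after oid n" cheap, which the sorted list does with a binary search.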
Regards
Brent