Reading multiple indices on MagmaCollections - first thoughts

Brent Pinkney brent.pinkney at aircom.co.za
Thu Apr 6 08:49:36 UTC 2006


Hi All,

Let's start the analysis with a MagmaCollection of Person objects. This 
collection has indices on #familyName and #age. Let the variable 'people' 
refer to this collection in all Smalltalk code that follows.

This can be exposed in Lava as the 'people' table with columns familyName and 
age. 

This is the example used in the Lava SUnit test suite.

--------------------------------------------------------------------------------------------------------
The Problem

Consider now the following query, parentheses added for clarity:

	select * from people 
	where (familyName = 'Man') 
	or ( (familyName = 'Pinkney') and (age < 60) )

A decent RDBMS would be able to execute this query very efficiently by 
exploiting the indices on familyName and age.

Magma can only read one index at a time. This is the problem.
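
To make the goal concrete, here is a minimal workspace sketch of what 
combining two indices means for the query above. It assumes a Squeak or Pharo 
image and uses plain Dictionaries as stand-ins for the indices; the oids 
(1 to 5) and the sample data are invented for illustration and say nothing 
about how Magma actually stores an index.

```smalltalk
"Toy stand-ins for the two indices: each maps a key to the oids filed under it.
 The oids and the sample data are invented; this is not Magma's index format."
| byFamilyName byAge man pinkney under60 result |
byFamilyName := Dictionary new.
byFamilyName at: 'Man' put: #(1 2) asSet.
byFamilyName at: 'Pinkney' put: #(3 4 5) asSet.

byAge := Dictionary new.
byAge at: 35 put: #(1 3) asSet.
byAge at: 59 put: #(4) asSet.
byAge at: 72 put: #(2 5) asSet.

"One index lookup per clause of the where condition."
man := byFamilyName at: 'Man'.
pinkney := byFamilyName at: 'Pinkney'.
under60 := Set new.
byAge keysAndValuesDo: [:age :oids |
	age < 60 ifTrue: [under60 addAll: oids]].

"The conjunction is a set intersection, the disjunction is a set union."
result := Set new.
result addAll: man.
result addAll: (pinkney select: [:oid | under60 includes: oid]).
result asSortedCollection asArray
```

With this sample data the last expression answers #(1 2 3 4): both 'Man' 
entries, plus the two 'Pinkney' entries younger than 60.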

--------------------------------------------------------------------------------------------------------
Magma Client API

We can logically convert the SQL query into Smalltalk via the following 
pseudo-code as an intermediate step:

	people reader
		(#familyName at: 'Man')
		or: ( (#familyName at: 'Pinkney') and:  (#age to: 60) )

This can then be refined into syntactically correct, but less terse, 
Smalltalk:

	people where: [ :r |
		(r read: #familyName at: 'Man')
		or: ( (r read: #familyName at: 'Pinkney') and:  (r read: #age to: 60) )
	].

The result of this Smalltalk code is a Magma reader with the same semantics as 
the existing single index reader.

--------------------------------------------------------------------------------------------------------
Implementation

The result of the aforementioned #where: method is a MagmaExpressionReader.
Executing the conjunctions (#and:) and disjunctions (#or:) constructs a tree 
of MaExpression and MaClause instances.

At this point the fringe of the expression tree consists entirely of MaClause 
instances, we are sure all the indices exist, and the values have been 
converted into their hash values.

i.e.

MagmaExpressionReader
    |
    MaExpression( _, or:, _ )
        |
        +-- MaClause( #familyName, at:, 'Man' )
        |
        +-- MaExpression( _, and:, _ )
                |
                +-- MaClause( #familyName, at:, 'Pinkney' )
                |
                +-- MaClause( #age, to:, 60 )

When the first page of objects is requested, the entire MagmaExpressionReader 
is serialised to the Magma server. The Magma server instantiates a 
MaBitmapIndex for the query and walks the tree of MaExpressions in the correct 
order.

The resulting bitmap index is zero except at the oids which satisfy the 
query.
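
The tree walk can be sketched in a workspace as follows. This is only an 
illustration under several assumptions: the tree is written as nested literal 
arrays rather than MaExpression and MaClause instances, an arbitrary-precision 
Integer stands in for the MaBitmapIndex (bit n set means oid n matches), the 
toy indices and oids are invented, and #to: is taken to mean "strictly below" 
so that it matches the age < 60 of the SQL query. It needs a Squeak or Pharo 
image with re-entrant block closures, because the block calls itself.

```smalltalk
| bitFor indices tree walk bitmap oids |
"bitFor answers an Integer with one bit set per oid in the given collection."
bitFor := [:oidList |
	oidList inject: 0 into: [:bits :oid | bits bitOr: (1 bitShift: oid)]].

"Hypothetical indices: attribute -> (key -> bitmap of matching oids)."
indices := Dictionary new.
indices at: #familyName put: (Dictionary new
	at: 'Man' put: (bitFor value: #(1 2));
	at: 'Pinkney' put: (bitFor value: #(3 4 5));
	yourself).
indices at: #age put: (Dictionary new
	at: 35 put: (bitFor value: #(1 3));
	at: 59 put: (bitFor value: #(4));
	at: 72 put: (bitFor value: #(2 5));
	yourself).

"The expression tree: leaves are clauses, inner nodes are and: / or:."
tree := #(#or:
	#(#at: #familyName 'Man')
	#(#and: #(#at: #familyName 'Pinkney') #(#to: #age 60))).

"Leaves read one index each; inner nodes combine the bitmaps of their children."
walk := [:node |
	node first = #at:
		ifTrue: [(indices at: (node at: 2)) at: (node at: 3) ifAbsent: [0]]
		ifFalse: [node first = #to:
			ifTrue: [((indices at: (node at: 2)) associations
					select: [:each | each key < (node at: 3)])
					inject: 0 into: [:bits :each | bits bitOr: each value]]
			ifFalse: [node first = #and:
				ifTrue: [(walk value: (node at: 2)) bitAnd: (walk value: (node at: 3))]
				ifFalse: [(walk value: (node at: 2)) bitOr: (walk value: (node at: 3))]]]].

bitmap := walk value: tree.

"List the oids whose bit is set."
oids := ((0 to: bitmap highBit)
	select: [:oid | (bitmap bitAnd: (1 bitShift: oid)) ~= 0]) asArray.
```

With this sample data bitmap is 2r11110 and oids is #(1 2 3 4). Each clause 
still reads only one index; the combination happens purely on the bitmaps.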

The MaBitmapIndex and the first page of objects are returned to the client; 
all subsequent reads serialise the MaBitmapIndex.
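
Paging from such a bitmap could look roughly like this. Again an Integer 
stands in for the MaBitmapIndex, and the names nextPage and pageSize are made 
up for the sketch; the real reader protocol may well differ.

```smalltalk
| bitmap pageSize nextPage found rest page1 page2 |
bitmap := 2r11110.	"bits 1 to 4 set: oids 1, 2, 3 and 4 satisfy the query"
pageSize := 3.

"Answer up to pageSize matching oids at or after startOid, lowest first."
nextPage := [:startOid |
	found := OrderedCollection new.
	"Clear every bit below startOid, then peel off the lowest set bit each time."
	rest := (bitmap bitShift: startOid negated) bitShift: startOid.
	[found size < pageSize and: [rest ~= 0]] whileTrue: [
		found add: rest lowBit - 1.
		rest := rest bitAnd: rest - 1].
	found asArray].

page1 := nextPage value: 0.
page2 := nextPage value: page1 last + 1.
```

Here page1 is #(1 2 3) and page2 is #(4). The only state needed between 
requests is the bitmap and the last oid of the previous page.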

--------------------------------------------------------------------------------------------------------
Issues
1. Will the Magma server be able to build up a 2^48 bit bitmap quickly enough?
2. How will this bitmap be compressed so that both the client and server can
    still determine the first oid for the next page?
3. Is it possible to order the oids (e.g. results in descending dateOfBirth)?
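
To put numbers on issues 1 and 2: an uncompressed bitmap with 2^48 bits is far 
too large to build or ship, so some sparse form is needed. The sketch below 
does the arithmetic and then shows one possible sparse form, a sorted 
collection of the matching oids, which still lets either side find the first 
oid of the next page. This is just one option under the assumption that 
result sets are small compared to the oid space; it is not a proposal for the 
actual MaBitmapIndex encoding.

```smalltalk
| bits tebibytes sortedOids lastOidOfPage firstOidOfNextPage |
"Size of a dense bitmap covering the whole oid space."
bits := 2 raisedTo: 48.
tebibytes := bits / 8 / (1024 raisedTo: 4).

"Sparse alternative: keep only the matching oids, in ascending order."
sortedOids := #(1 2 3 4) asSortedCollection.
lastOidOfPage := 3.
firstOidOfNextPage := sortedOids
	detect: [:oid | oid > lastOidOfPage]
	ifNone: [nil].
```

tebibytes comes out as 32, i.e. 32 TiB for the dense bitmap, while 
firstOidOfNextPage is 4 for the sample oids.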

Regards

Brent



