get well (Actually how to query and sort!)

Brent Pinkney brent.pinkney at aircom.co.za
Thu May 18 23:25:49 UTC 2006


Hi Chris,

> Hey man, I'm sorry to hear of your injury, I sincerely hope you are ok!  Given the brilliance of your post it doesn't sound like there's any permanent damage thank goodness.

I am fine, a bit tender but quite fine, thanks.

>  I do not yet understand your solution but my cursory read it sounds real good..   I need to read it again and dedicate some cycles to understand it better..

In essense I want to allow the Magma server to estimate whether it can create the bitmap and sort order within a given time and signal an exception if it cannot.
In this way the client can determine how long is it preperared to wait. I have spent many a while waiting for Oracle to finish a sorted query, so an upperbound is
a good enough compromise for me.

The sorting solution is also quite simple. Concider the people collection with indices on familyName and age.

Currently we will have five indices on this collection: one for the elements, 2 normal indices and 2 census indices.

The main index will hold the oids:

	0001	-> Brent Pinkney				(age = 33)
	0002	-> Tom Pinkney				(age = 63)
	0003	-> Claire Pinkney				(age = 31)
	0004	-> Man Kanam 					(age = 29) NB. (familyName = 'Man')
	0005	-> Nelson Madiba Mandela		(age = 86)

Now concider the same objects in the familyName index: they will be stored in sorted order of familyName, viz:

	0004, 0005, 0001, 0002, 0003	(ie, Man, Mandela, Brent Pinkney, Tom Pinkney, Claire Pinkney)

The age index will also be stored in sorted order, viz

	0004, 0003, 0001, 0002, 0005	(ie, Man, Claire Pinkney, Brent Tom Pinkney, Tom Pinkney, Mandela)


So if we maintain a new sorting-index where each entry maps and oid to (in this exampl) two integers, being the position of that oid in the index then we have

	0001	-> 3, 3
	0002	-> 4, 4
	0003	-> 5, 2
	0004	-> 1, 1
	0005	-> 2, 5


So if we have the following query:

	query := people
			where: [ :p | (p familyName from: 'Adam' to: 'Zorro') & (p age between: 32 and: 90) ]
			orderBy: [ p age ].
	query pageSize: 2.

	[ q prepareWithin: 10 seconds ] on: MaTimeout do: [....].
	q do: [ :p | .... ]

To satisfy this query we have to a) build bitmap and b) walk the bitmap in the correct order. But from #pageSize we know we only have to find the smallest and 
second smallest age values. We can use the sort-index above to help.

Now, (p familyName from: 'Oscar' to: 'Zorro') matches oids:	0001, 0002, 0003, 0004, 0005. 
But (p age between: 32 and: 80) only matches oids 0001, 0002, 0005.

We can estimate the time it will take to built the bitmap as it bound by the number of oids in the tightest cluase (age) and the number of sort cluases (one).

So we decide to build up the bimap index using the age clause as it is 'tighter'. For each oid we check the census index for the #familyName index to see whether the oid
survives and set each oid's bit in the index accordingly. If the oid survives, we find its sort order in the age index and maintain the smallest two. 
At the end of this process the bitmap is set and the sort collection is 

	#( 0001->3   0002->4 )     

We store this array or sort assacations in the bitmap. Obviously if we fail to build this bitmap and sort order withing the 10 seconds mandated by the client we 
return an exception. However if all goes well, we return immediately with a segment containing 0001 and 0002 being 'Brent Pinkney' and 'Tom Pinkney'. We also shuttle our 
posh bitmap back to the client to keep the Magma server simple.

Now, if the client does need the second (or subsequent pages) we know that the next oid must occur after entry 4 in the #age index. So we read each oid from the 5th
and check whether it is in the bitmap. We continue reading ad checking oids until #pageSize oids have been found.

This is the best solution I can think of. It is also not too scary to implement (I think).


Please let me know.

Brent




More information about the Magma mailing list