Hi Goran,
This ws Chris's message to me (I hope Chris does not mind) on how to try out his first pass at a query enginer which uses the census indices to narrow the list of possible oids.
This solution is prone to the limitations I described elsewhere in this thread.
---------- Forwarded Message ----------
Subject: multiple indices - first optimized querying !! Date: Saturday 15 April 2006 00:06 From: Chris Muller chris@funkyobjects.org To: Brent Pinkney brent.pinkney@aircom.co.za
I've finally caught up to you Brent. I am full into this iteration, I have therefore merged all of the Magma-research package into appropriate real Magma packages.
Please load the following from SqueakSource on top of a Magma 1.0 image:
Ma base additions-cmm.63 Ma special collections-cmm.63 Magma client-cmm.180 Magma server-cmm.156
Then you (should) be able to execute the same where: expressions except it will now use optimized algorithm that calculates the minimum "QueryTrunk" that must be enumerated.
Note I have only performed very minimal testing; two or three queries that seem to work.
To support optimized queries, we now must maintain TWO MaHashIndexes for each index instead of just one. Regrettably, this will slow down the insert rate, one of the slowest things Magma does anyway (despite our hard work to optimize it last year). Many of the changes in MagmaCollectionManager are to account for the new oid->key index mapping. I nevertheless, think the reward of supporting complex queries will make it worth it.
Please note that, so far, this implementation dynamically calculates a query result, it does NOT require ANY bitmap as yet, but at the expense of some side-effects:
- sorting is not provided (yet) - the paging is cannot be perfect, there will be some overlap between the pages (note this could happen anyway because the result set can change while the user is paging). This happens because we only start at a specified page/startIndex of the *trunk*, not the actual query result.
If you happen to see how it works in the debugger, you will doubless understand new possibilities for our "bitmap" solution. I think we could possibly get away with simply recording results dispersion for the pages the user has visited.
One interesting thing to look it is 'partial ordering' as used by Google. They find the 'best' results and just order the first two pages. Just a thought though.
Well, I think this is a great thought, because that is exactly what we can do right now; we cannot provide sorted results like a relational database where page one contains the lowest key in the entire result set. If we can sort just the page that we have the *client* could do that work. If we can figure out how to do it relational style I'm all for it, but do you think we can maintain this dynamism we have now?
Enjoy! Chris
-------------------------------------------------------
Brent