Dear all,
I've been using MagmaCollectionReader to develop a paged view over a collection, which I'm using in a Seaside app. The view shows whether there is a previous or next page and the range of items currently shown, e.g. 1 - 21 of 2867.
If I run a query on a MagmaCollection with a sortBy: clause, things sometimes go a bit wrong.
With a sortBy: query I can check sortComplete. With a query without sortBy:, I can get the total number of items by calling lastKnownSize. With a sortBy: query, however, lastKnownSize returns the number of items sorted so far, which is not what my GUI needs; I need the total for that query. I noticed, though, that you can also use fractionSorted to see how many items have already been sorted, and derive from that how many items there are in total.
For instance:
Last known size: 2249 Fraction Sorted:(2249/6065)
So I wrote the code below to know and show the total number of results even while the search is still running on the server. That way I can already show the first page of results while the server is still preparing further pages:

    totalNumberOfElements
        ^ magmaCollectionReader sortComplete
            ifTrue: [magmaCollectionReader lastKnownSize]
            ifFalse: [
                | fractionSorted |
                fractionSorted := magmaCollectionReader fractionSorted.
                "Sort not started yet. The answer of this retry is discarded, so execution continues below."
                fractionSorted = 0 ifTrue: [self totalNumberOfElements].
                "While sorting, the denominator of the fraction is taken as the total size."
                fractionSorted isFraction
                    ifTrue: [fractionSorted denominator]
                    ifFalse: [magmaCollectionReader lastKnownSize]]
This mostly works. If the fraction gets reduced, however, the denominator is of course too low. Is there any other way to know the total number of items in the search while it is still running on the server?
If I follow the code into MaQueryExecutor>>fractionComplete:

    fractionComplete
        ^ trunkPosition
            ifNil: [ 0 ]
            ifNotNil: [
                | trunkSize |
                trunkSize _ self trunk trunkSize.
                trunkSize = 0
                    ifTrue: [ 1 ]
                    ifFalse: [ trunkPosition / self trunk trunkSize ] ]
I understand where the 0 can come from and where the reduction comes from: dividing one Integer by another automatically reduces the resulting Fraction.
If I change the method on the server image to:

    fractionComplete
        ^ trunkPosition
            ifNil: [ 0 ]
            ifNotNil: [
                | trunkSize |
                trunkSize _ self trunk trunkSize.
                trunkSize = 0
                    ifTrue: [ 1 ]
                    ifFalse: [ Fraction numerator: trunkPosition denominator: self trunk trunkSize ] ]

it works, because the fraction is then not reduced. This is a bit of a hack, of course.
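The difference between the two variants can be checked in a workspace. This is a minimal sketch, assuming a standard Squeak image; the numbers are made up for illustration and are not taken from a real query:

```smalltalk
"Integer>>/ answers a reduced Fraction, so the original denominator is lost."
(2000 / 6000) printString.      "'(1/3)'"
(2000 / 6000) denominator.      "3"

"Fraction class>>numerator:denominator: keeps the values as given."
(Fraction numerator: 2000 denominator: 6000) denominator.      "6000"
```

With the example from above, (2249/6065), nothing is lost only because 2249 and 6065 happen to share no common factor.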
Is there any other way of doing this?
Thanks for any help.
Kind Regards
Bart
Hi Bart,
I had to re-read my own docs on the swiki to remember how this works.. I might be confused, but sorting should not affect the size. So you should be able to do a #where:, then check the #lastKnownSize, then sort. The (#where: aBlock distinct: makeDistinct sortBy: attributeSymbol descending: shouldDescend) message is just a convenience method that does them both at once. If you need to know the size before sorting, I think they can be done separately.
Otherwise, I rather like your hack.. I don't see any harm in it, and it does convey "more information" for checking the overall progress of the background load on the server..
- Chris
On Sun, Dec 6, 2009 at 3:26 AM, Bart Gauquie bart.gauquie@gmail.com wrote:
Magma mailing list Magma@lists.squeakfoundation.org http://lists.squeakfoundation.org/mailman/listinfo/magma
Hi Chris,
now I see how it works. And why I needed to develop that extra part of code to make it work in my case. Let me try to explain a bit further why.
The PagedReader class I'm developing to visualize data in the GUI requires a subclass with the following protocol:

    PRPagedReaderSource>>readFrom: elementStartIndex to: elementEndIndex
    PRPagedReaderSource>>refresh
    PRPagedReaderSource>>totalNumberOfElements
With an implementation of this, you can split your data into pages, calculate how many pages there are, navigate to the next / previous page, and so forth.
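The page arithmetic such a source supports can be sketched in a workspace. The page size of 21 and the variable names are assumptions based on the "1 - 21 of 2867" example; they are not part of PRPagedReaderSource:

```smalltalk
| total pageSize pageCount page first last |
total := 2867.      "what totalNumberOfElements would answer"
pageSize := 21.     "assumed page size"

"Round up so that a partially filled last page is counted."
pageCount := (total + pageSize - 1) // pageSize.      "137"

"Element range of the second page, suitable for readFrom:to:."
page := 2.
first := (page - 1) * pageSize + 1.      "22"
last := (page * pageSize) min: total.    "42"
```

This also shows why an underestimated total is visible in the GUI: both the page count and the last page's range depend on it.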
I've created an implementation for MagmaCollectionReader. But since this can be any MagmaCollectionReader, I had to take into account that the MagmaCollectionReader might not be fully sorted yet.
The method I was using to test this, for instance, was:

    BMTMagmaTodoRepository>>findTodosWithTitle: aString
        ^ PRPagedReader on:
            (items
                where: [:eachTodo | eachTodo title match: aString]
                distinct: false
                sortBy: #dueDate
                descending: false) asPagedReaderSource
So I have a where clause on one attribute but am sorting on another, so Magma needs to do a background sort, which led to the implementation above to make the total number of items work. I first used sortedBy:, but that took around 15 seconds to sort a test set of 7000 items, which was of course too long. I only needed to show the first 20 of them quickly while the server sorted the rest in the background.
Since I want my PagedReader abstraction to work on any MagmaCollectionReader (and let the repository, BMTMagmaTodoRepository, hold the business logic that does the sort), and so keep both concerns separate, I had to implement it this way.
The remark you make about doing the where: clause first, taking the size, and then doing the sort is true; but only if you're not requiring distinct results.
I still have one further question about this. If #fractionSorted replies with, for instance, (2249/6065), does this mean for sure that the GUI visualizing the query can show the first 2249 items correctly sorted? I mean, if I show items 1-20, will no other items turn up as items 1-20 even though the sort is not done yet? Is the sort up to item 2249 already correct? (That would be a very good thing.) Or does it just mean that a fraction 2249/6065 of the result has been sorted? (That would be a rather bad thing.)
I also noticed that if the server is doing this background sort and I am already asking for the next page, there is something wrong with the priorities of the two. What I would find reasonable is that the background sort gets a lower priority than my request for the next page, so that the next page is returned quickly and the background sort, due to its lower priority, is more or less 'paused' for that short time. But I'm not sure if this makes any sense.
Thanks again for your support.
Kind Regards,
Bart
Hi Bart,
On Wed, Dec 9, 2009 at 1:39 AM, Bart Gauquie bart.gauquie@gmail.com wrote:
| If #fractionSorted replies with, for instance, (2249/6065), does this mean for sure that the GUI visualizing the query can show the first 2249 items correctly sorted? I mean, if I show items 1-20, will no other items turn up as items 1-20 even though the sort is not done yet? Is the sort up to item 2249 already correct? (That would be a very good thing.) Or does it just mean that a fraction 2249/6065 of the result has been sorted? (That would be a rather bad thing.)
Hmm, I'm pretty sure it can't know which of the 6065 elements matching a complex query expression has the lowest value of a particular key.. Sorry.
| I also noticed that if the server is doing this background sort and I am already asking for the next page, there is something wrong with the priorities of the two. What I would find reasonable is that the background sort gets a lower priority than my request for the next page, so that the next page is returned quickly and the background sort, due to its lower priority, is more or less 'paused' for that short time. But I'm not sure if this makes any sense.
I checked #registerAndLoad: aMaCommitPackage using: aMaTerm from: oidInteger forSession: sessionId distinct: aBoolean. It does indeed fork the background load process at #userBackgroundPriority, which will be interrupted by the normal request-processor Process. How did you "notice" otherwise?
I would dare to caution that making Magma process 7K elements just so a user can select one from the first 20 (or not!) and move on, is "not a bargain", and may ultimately not live up to your performance expectations. Smalltalk, whether using Magma or not, is too slow to be doing anything but useful work and should avoid wasting many cycles..
Especially with a Magma application, success is achieved by leveraging the flexibility afforded by the transparency; choosing to cache certain things, choosing to put some in a standard Smalltalk collection with appropriate ReadStrategy instead of into a MagmaCollection, etc..
For example, most people probably don't have 7K things "to do" in the future. Perhaps the "to do" lists are personal, so the reduced-conflict nature of a MC is not needed anyway.. Items on a ToDo list past their #dueDate could be moved to an "archivedToDo" collection to keep the future "active" list small and fast.. 7K items would easily cache in memory and provide fast search and sorting if that was an object that really needs frequent, fast access..
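The archiving idea can be sketched with plain in-memory collections. This is only an illustration under stated assumptions: the Associations (title -> due date) stand in for hypothetical ToDo objects, and the variable names are made up:

```smalltalk
| active archived today overdue |
today := Date year: 2009 month: 12 day: 9.

active := OrderedCollection new.
active
    add: 'file report' -> (Date year: 2009 month: 12 day: 1);
    add: 'book flight' -> (Date year: 2009 month: 12 day: 20).
archived := OrderedCollection new.

"Move everything past its due date out of the active list."
overdue := active select: [:each | each value < today].
archived addAll: overdue.
active removeAll: overdue.

active size.      "1"
archived size.    "1"
```

The active list stays small, so it can be searched and sorted in memory without a background sort on the server.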
Regards, Chris
Hi Chris,
Thanks for your clarifications.
You are absolutely right that it is madness to launch a query that loads 7000 items; I was just testing the limits of the system. Furthermore, I come from the Java world, where we use a relational database every day; there I'm used to filtering the objects I need out of millions of entries, and a relational database is actually very fast at that. The point you make about linking the TODOs to a specific user is of course how you would do it in an object database with a rich domain. The idea of moving TODOs from one collection to another (the history) is also sound: instead of changing a status field on a TODO (and filtering on that status), just move it to the history. That way you never end up with a collection of many thousands of items. I just need to think more OO when modeling the domain, and not care how it gets persisted :-).
The 'toy' projects I've created with Magma don't have a rich domain. But at my job we're investigating Smalltalk/Seaside, and we've decided to use Magma as the database. We are building a somewhat bigger POC (http://www.squeaksource.com/SunnysidePlanning2).
| Hmm, I'm pretty sure it can't know which of the 6065 elements matching a
| complex query expression has the lowest value of a particular key.. Sorry.
If I understand you correctly, this means that even if the sort progress says, for instance, 200/2999, the first 20 items I read might still change afterwards, so I should warn the user that these are preliminary results. Or of course use #sortedBy:.
Kind Regards,
Bart
| If I understand you correctly, this means that even if the sort progress says, for instance, 200/2999, the first 20 items I read might still change afterwards, so I should warn the user that these are preliminary results. Or of course use #sortedBy:.
Oops, yes, that is what I meant.