MagmaCollectionReader sortBy: total number of elements in list?

Thu Dec 10 19:23:24 UTC 2009

Hi Bart,

On Wed, Dec 9, 2009 at 1:39 AM, Bart Gauquie <bart.gauquie at gmail.com> wrote:
> Hi Chris,
> now I see how it works. And why I needed to develop that extra part of code
> to make it work in my case. Let me try to explain a bit further why.
> The PagedReader class I'm developing to visualize data in the GUI requires
> some subclass with following protocol:
> PRPagedReaderSource>>readFrom: elementStartIndex to: elementEndIndex
> PRPagedReaderSource>>refresh
> PRPagedReaderSource>>totalNumberOfElements
> Using an implementation of this, you can page through your data, calculate how
> many pages there are, navigate to next / prev pages, and so forth.
> I've created an implementation for MagmaCollectionReader. But since this can
> be any MagmaCollectionReader, I had to take into account that the
> MagmaCollectionReader might not be fully sorted yet.
> The method I was using for testing this, for instance, was:
> BMTMagmaTodoRepository>>findTodosWithTitle: aString
>    ^PRPagedReader on:
>      ((items
>        where: [:eachTodo| eachTodo title match: aString]
>        distinct: false
>        sortBy: #dueDate
>        descending: false)
>       asPagedReaderSource)
> So I have a where clause on one attribute, but am sorting on another one,
> so Magma needs to do a background sort, which led to the implementation
> above to make the total number of items work. I first used sortedBy: but
> that took around 15 seconds to sort a test set of 7000 items, which
> of course was too long. I only needed to show the first 20 of them quickly,
> while the server sorted the rest for me in the background.
> Since I want to keep my abstraction of PagedReader to work on any
> MagmaCollectionReader (and let the repository - BMTMagmaTodoRepository -
> business logic do the sort) - so in fact separate both concerns; I had to
> implement it this way.
> The remark you make about doing the where: clause first, taking the size,
> and then doing the sort is true; but only if you're not requiring distinct
> results.
> I still have one further question about this. If #fractionSorted, for
> instance, replies with (2249/6065), does this then mean for sure that the
> GUI visualizing the query can show the first 2249 items correctly sorted? I
> mean, if I for instance show items 1-20, will no other items turn up in
> positions 1-20 even if the sort is not done yet? Is the sort up to item 2249
> already correct? (This would be a very good thing.) Or does this mean that
> just a 2249/6065 fraction of the result is sorted? (This would make it a
> rather bad thing.)

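Hmm, I'm pretty sure it can't know which of the 6065 elements matching a
complex query expression has the lowest value for a particular key until
it has looked at all of them, so I don't think the first 2249 are
guaranteed to be final.  Sorry.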

> What I also noticed that if the server is doing this background sort, and I
> am already asking the next page; that there is something wrong with the
> priorities of both. What I would find reasonable is that the background sort
> should get a lower priority than me asking the next page. So that this next
> page is returned quickly, and the background sort due to its lower priority
> is kinda 'paused' for this short time. But I'm not sure if this does make
> any sense.

I checked #registerAndLoad: aMaCommitPackage using: aMaTerm from:
oidInteger forSession: sessionId distinct: aBoolean.  It does indeed
fork the background load process at #userBackgroundPriority, so it
will be interrupted by the normal request-processor Process.  How did
you "notice" otherwise?
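
To illustrate the priority relationship in plain Squeak terms, here is a
small workspace sketch.  It is not Magma's server code; the two blocks
and the log collection are made up for the example, and it assumes it is
run from a process at the normal user priority (e.g. a workspace do-it).

```smalltalk
"Sketch only: standard Squeak Process scheduling, no Magma classes."
| log |
log := OrderedCollection new.

"Stand-in for the background sort: forked below the normal user priority,
 so it is scheduled but does not preempt the process that forked it."
[ log add: 'background sort step' ]
	forkAt: Processor userBackgroundPriority.

"Stand-in for serving the next page request at the normal priority."
log add: 'page request served'.

"Waiting gives the lower-priority process a chance to run."
(Delay forMilliseconds: 50) wait.
log
```

Inspecting log afterwards should normally show the page request entry
ahead of the background entry, since the background block only gets the
CPU once the higher-priority process waits.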

I would dare to caution that making Magma process 7K elements just so
a user can select one from the first 20 (or not!) and move on, is "not
a bargain", and may ultimately not live up to your performance
expectations.  Smalltalk, whether using Magma or not, is too slow to
be doing anything but useful work, so an application should avoid
wasting cycles.

Especially with a Magma application, success is achieved by leveraging
the flexibility afforded by the transparency: choosing to cache
certain things, choosing to put some in a standard Smalltalk
collection with an appropriate ReadStrategy instead of into a
MagmaCollection, etc.

For example, most people probably don't have 7K things "to do" in the
future.  Perhaps the "to do" lists are personal, so the
reduced-conflict nature of a MC is not needed anyway.  Items on a
ToDo list past their #dueDate could be moved to an "archivedToDo"
collection to keep the future "active" list small and fast.  7K items
would easily cache in memory and provide fast search and sorting if that
is an object that really needs frequent, fast access.
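
As a rough sketch of that in-memory approach: assume activeTodos is an
ordinary OrderedCollection already loaded into the image, whose elements
answer #title and #dueDate.  The variable name and the '*report*'
pattern are only placeholders for the example.

```smalltalk
"Sketch only: activeTodos is assumed to be a plain OrderedCollection
 of to-do objects answering #title and #dueDate."
| matching sorted firstPage |

"Filter in memory; with String>>match: the receiver is the wildcard pattern."
matching := activeTodos select: [:each | '*report*' match: each title].

"Sort the matches by due date, earliest first."
sorted := (matching asSortedCollection: [:a :b | a dueDate <= b dueDate]) asArray.

"Take at most the first 20 for display; the total is simply sorted size."
firstPage := sorted copyFrom: 1 to: (sorted size min: 20).
```

With everything in one in-memory collection, the total number of
elements and each page are available as soon as the sort finishes, and
there is no background sort to wait on.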

Regards,
  Chris