Howdy all, and Chris of course!
Ok, let me show you a method in Gjallar:
findObjectById: id
	"Find any object by id. Return nil if not found."
	| reader |
	reader _ cases where: [:each | each read: #id at: id].
	reader size > 1 ifTrue: [self error: 'Multiple objects with the same UUID!'].
	^reader size = 1 ifTrue: [reader at: 1] ifFalse: [nil]
So "cases" is a MagmaCollection with an index #id. Now... what is the preferred simplest way of doing the #read:at: ? Nowadays this method is not on MagmaCollection but on the reader. So.... the above looks needlessly complex IMHO - or is there a point I am missing?
regards, Göran
PS. I am now experimenting with a shared magma session in combination with one per user - I drowned in too much materialization otherwise.
Before, #read:at: that was on MagmaCollection and returned a reader;
| reader |
reader _ cases read: #id at: id.
reader size > 1 ifTrue: [self error: 'Multiple objects with the same UUID!'].
^reader size = 1 ifTrue: [reader at: 1] ifFalse: [nil]
so you still had to do the size checking, nothing has changed w.r.t. that. The only difference is now we have just one #where: method for all querying rather than multiple methods. Everyone is concerned with code bloat, so I think #where: handling all cases was addition by subtraction.
The other thing to watch out for with the above code is that, because the reader is based on a collection that is always changing, it's really best to use #at:ifOutOfBounds: since, theoretically, another user could remove objects, causing the size to change.
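Applying that, a sketch of the lookup (I'm assuming #at:ifOutOfBounds: takes a default-value block the way #at:ifAbsent: does; if it takes a plain value, drop the brackets):

	findObjectById: id
		"Find any object by id. Return nil if not found, or if
		 the collection shrank between the size check and the read."
		| reader |
		reader _ cases where: [:each | each read: #id at: id].
		reader size > 1 ifTrue: [self error: 'Multiple objects with the same UUID!'].
		^reader at: 1 ifOutOfBounds: [nil]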
> PS. I am now experimenting with a shared magma session in combination with one per user - I drowned in too much materialization otherwise.

By "otherwise", do you mean the strict 1:1? And even after employing correct read-strategies?
thanks..
Hi Chris!
Chris Muller chris@funkyobjects.org wrote:
> Before, #read:at: that was on MagmaCollection and returned a reader;
>
> | reader |
> reader _ cases read: #id at: id.
> reader size > 1 ifTrue: [self error: 'Multiple objects with the same UUID!'].
> ^reader size = 1 ifTrue: [reader at: 1] ifFalse: [nil]
> so you still had to do the size checking, nothing has changed w.r.t.
Yes.
> that. The only difference is now we have just one #where: method for all querying rather than multiple methods. Everyone is concerned with code bloat, so I think #where: handling all cases was addition by subtraction.
Right, I just wanted to be sure that I haven't missed something regarding the "proper" way of doing it. :)
> The other thing to watch out for with the above code is, because the reader is based on a collection that is always changing, its really best to use #at:ifOutOfBounds: since, theoretically, another user could remove objects causing the size to change.
Mmmm, ok.
> PS. I am now experimenting with a shared magma session in combination with one per user - I drowned in too much materialization otherwise.
>
> By "otherwise", do you mean the strict 1:1? And even after employing correct read-strategies?
Thing is I probably need "in memory" speed when viewing and filtering the case collection. Some of my filtering can be done using indexing, but some probably will need "iteration". So I will most probably end up materializing all cases on a pretty regular basis. Doing that for 100 cases is probably no big deal, but 15000? Nope. Also - what I have seen so far it takes a bit too long to materialize 20 cases in order to show them in a table. Sure, a better readstrategy that only fetches exactly what I need would help - but it is hard for me to predict what columns (attributes) will be shown and how much state I need to fetch in order to satisfy that - the columns are not hard coded.
So I simply need for Gjallar to "cache" the cases in RAM. I can't do that for each and every user - so a shared session is the answer. The "problem" with this approach is to know when I can abort it to get it refreshed since the objects read using that session will be accessed (readonly) by multiple Squeak Processes. I think I will just "ignore" the issue for now and consider the model "thread safe" when used readonly. Of course, Magma probably puts the objects in theoretically unsound states for short periods of time when doing the refresh - but whatever. :) If this turns out to be a problem I will just have to protect the model with a Monitor while doing the abort.
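If it does become a problem, a minimal sketch of that Monitor idea (Monitor and #critical: are standard Squeak; #renderCasesFrom: is a made-up placeholder for whatever the reading Processes actually do):

	| guard |
	guard := Monitor new.
	"the refreshing Process:"
	guard critical: [sharedSession abort].
	"each reading Process:"
	guard critical: [self renderCasesFrom: sharedSession]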
Also, I have been toying a bit with readstrategies - only a tiny bit - but I actually have a few questions regarding those:
0. You need to update the page http://minnow.cc.gatech.edu/squeak/2638 so that it says MaReadStrategy. :)
1. Am I meant to create a MaReadStrategy (say "MaReadStrategy minimumDepth: 1"), configure it using #forVariableNamed:onAny:readToDepth: (several of those messages of course) and then put it in the session? If so - for how long is this strategy valid? For the life of the session? Or just for the next query?
2. Let's say I use one strategy to read "just enough" of the Q2Case instances when making queries over the collection. Then when I need to view a *selected* case - I typically need to get "most of the rest" of the case - a new strategy. But... this creates two more questions:
2.1 If the case is already materialized - how can I control the reading of the rest - can I somehow make sure it reads "the rest" in a single materialization? Like, change the strategy in the session and then perhaps run "session rematerialize: myCase" and then it would fetch the stuff that is still missing using the new strategy?
2.2 If I am running locally (no network roundtrips for each materialization) do I still benefit from turning this into a single materialization or does it not matter? This I can of course test - *if* I knew how to do rematerialization using a different strategy.
Regarding #1 above, it puzzled me because I configured it when I created the session but it appeared to "not work" - I was profiling and clearly saw materializations where I would not expect them given the strategy.
In fact - it would be great if I could easily log materializations - I will try to hook into that method you mentioned earlier.
regards, Göran
Hi Göran,
I have successfully used Magma to demonstrate how a transparent OODB can be used to manage a very large connected graph of objects (millions).
The read strategies were instrumental in getting the performance to an acceptable level (i.e. very fast ;) ).
For some of the scenarios we knew how deep and/or which attributes to materialise, and could configure this when the session was initialized.
In other scenarios, where the user was defining an ad-hoc search, I was able to construct the read strategy just before executing the magma collection query. (It worked quite well, but it was just a proof of concept.)
Finally, our deliberate combination of Seaside + Magma gave us great performance because we could combine a MagmaCollectionReader and a WABatchedList. This combination meant we need only materialise the first 'page' of objects when returning the results. This remains true with complex queries, unless sorting or unique results are required.
Cheers
Brent
Thanks for the great questions Göran!
> Thing is I probably need "in memory" speed when viewing and filtering the case collection. Some of my filtering can be done using indexing, but some probably will need "iteration". So I will most probably end up materializing all cases on a pretty regular basis. Doing that for 100 cases is probably no big deal, but 15000? Nope. Also - what I have seen so far it takes a bit too long to materialize 20 cases in order to show them in a table. Sure, a better readstrategy that only fetches exactly what I need would help - but it is hard for me to predict what columns (attributes) will be shown and how much state I need to fetch in order to satisfy that - the columns are not hard coded.
Don't worry about trying to get "exactly" what you need. Start by being liberal with the ReadStrategy (i.e., read to depth 9999 on as many instVars of the Case as you are reasonably sure will be needed) and then back it off if necessary. You might be surprised (I hope!) at how fast it retrieves the case with the ReadStrategy in place. To put it into perspective, I have witnessed 10X improvement when faulting down a hundred Transactions simply by putting a ReadStrategy on their 'date' variable to read 9999 levels deep (since then, I've put that into Magma's base code, 9999 levels deep on any Date).
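Concretely, such a liberal strategy might be set up like this (the selectors are the ones from Göran's question above; treating Q2Case as the onAny: argument and 'date' as an instVar name are assumptions on my part):

	| strategy |
	strategy := MaReadStrategy minimumDepth: 1.
	strategy forVariableNamed: 'date' onAny: Q2Case readToDepth: 9999.
	mySession readStrategy: strategy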
> So I simply need for Gjallar to "cache" the cases in RAM.
Magma was designed with the hope that any type of end-user GUI application, where the user searches for some chunky object (i.e., a Case), opens it up (faulted all at once, via ReadStrategy) and then commits some changes to it, can be accomplished with reasonable performance via the 1:1 session design. Each of these steps (search, open, commit change) should take no more than a second or two.
And then, when the user "closes" that case the program should keep the readSet small by stubbing it out.
Now, if the nature of the model is large enough and with enough activity by lots of users to where it becomes too much, then the 1:1 can scale by spreading multiple sessions across multiple OS threads (images), cpu's, or computers. But the code remains simple and unchanged.
I probably sound like a broken record, sorry..
> I can't do that for each and every user - so a shared session is the answer. The "problem" with this approach is to know when I can abort it to get it refreshed since the objects read using that session will be accessed (readonly) by multiple Squeak Processes. I think I will just "ignore" the issue for now and consider the model "thread safe" when used readonly. Of course, Magma probably puts the objects in theoretically unsound states for short periods of time when doing the refresh - but whatever. :) If this turns out to be a problem I will just have to protect the model with a Monitor while doing the abort.
Yea, this makes me nervous; I think it could experience intermittent problems due to the two-step materialization process (remember the Dictionary full of Integers?).
A large cache in RAM has the detriment of a large readSet, which works against performance; maybe not too much for read-only, but some, because the underlying dictionaries are large.
> Also, I have been toying a bit with readstrategies - only a tiny bit
Good deal, I hope you find they *really* help out!
> but I actually have a few questions regarding those:
> - You need to update the page http://minnow.cc.gatech.edu/squeak/2638 so that it says MaReadStrategy. :)
Ok thanks, I fixed that. Please feel free to update if you see typos.
> - Am I meant to create a MaReadStrategy (say "MaReadStrategy minimumDepth: 1"), configure it using #forVariableNamed:onAny:readToDepth: (several of those messages of course) and then put it in the session? If so - for how long is this strategy valid? For the life of the session? Or just for the next query?
For the life of the session, or until you replace it by putting another one in the session. Setting your session's #readStrategy: completely replaces the previous one.
> - Let's say I use one strategy to read "just enough" of the Q2Case instances when making queries over the collection. Then when I need to view a *selected* case - I typically need to get "most of the rest" of the case - a new strategy. But... this creates two more questions:
>
> 2.1 If the case is already materialized - how can I control the reading of the rest - can I somehow make sure it reads "the rest" in a single materialization? Like, change the strategy in the session and then perhaps run "session rematerialize: myCase" and then it would fetch the stuff that is still missing using the new strategy?
You are talking about maintaining two ReadStrategies. One conservative for the "searching for a case", another liberal one for "opening a case". Your program will have to constantly swap them back and forth.
But that's a great question I never thought of! Since the conservative strategy has already materialized the actual case, you'll only hit proxies on its *sub*-objects when you go to open and view the rest of it, not on the case itself, so the liberal ReadStrategy that says to read 99999 deep on the Case will not be used!
The answer is, for the liberal "opening a case" read-strategy, whatever kind of object contains "most of the rest of" the case, be sure to include that in the read-strategy as well with a depth of 99999.
Don't forget once the case is rendered, put back the conservative read-strategy (so future searches don't fault down too much).
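A sketch of that swap, with #liberalStrategy / #conservativeStrategy as hypothetical accessors holding the two pre-built MaReadStrategy instances:

	openCase: aCase
		"install the liberal strategy for the deep fault,
		 then restore the conservative one even if rendering fails"
		session readStrategy: self liberalStrategy.
		[self renderCase: aCase]
			ensure: [session readStrategy: self conservativeStrategy]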
> 2.2 If I am running locally (no network roundtrips for each materialization) do I still benefit from turning this into a single materialization or does it not matter? This I can of course test - *if* I knew how to do rematerialization using a different strategy.
Yes, absolutely. It severely cuts down on the number of trips to the server; even though it's local, it still matters a lot.
> In fact - it would be great if I could easily log materializations - I will try to hook into that method you mentioned earlier.
You may also want to try:
	mySession preferences signalProxyMaterializations: true
and change your code to
	[ "..do something, open a case.." ]
		on: MagmaProxyMaterialization
		do: [:noti |
			MyMaterializedObjectBag add: noti materializedObject.
			noti resume "don't forget this!! :)"]
Whew, cheers! :) Chris