My first Magma experience (Now including GOODS)

Daniel Salama dsalama at user.net
Mon Apr 4 03:08:44 UTC 2005


On Apr 3, 2005, at 4:52 PM, Chris Muller wrote:

>
> Daniel, you have repeated a classic performance comparison between
> relational and object databases. Using an ultra-simple model and a
> "crunch-through-a-big-table" style test, all the benefits granted by
> an ODBMS are minimized, while the performance benefits of slamming
> data, not objects, into a table are emphasized.

I agree with you 100%, but this simple test is only a small part of a 
large application. The real model is much more complex, yet several 
features of the application will perform a very similar task. Take 
something as simple as giving the user a way to list all customer 
records within 5 miles of a given zip code: in my amateur mind, this 
implies that I will pretty much have to iterate through a large 
collection of objects and select the ones that match the criteria, as 
in the sketch below.
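
Something like this, I imagine (just a sketch; allCustomers, zipCode, 
milesFrom:, and targetZip are stand-ins from my own model, not 
anything in Magma):

    "Naive scan: materialize every customer and test it against the
     criteria. All the names here are hypothetical model code."
    nearby := allCustomers select: [:each |
        (each zipCode milesFrom: targetZip) <= 5].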

> With ODBMSs, "transparency" is granted in terms of not having to
> think in or map to tables, but there is the need to account for the
> database when it comes to high-volume processing. One thing is traded
> for another; it's not a free lunch.

I know, and like I said before, the model is more complex than this. 
This is merely the tip of the iceberg, but at a minimum I need to 
allow the user to select an object before expanding all of its 
relations.

>
>> What I tried to do here was simply to iterate through all the
>> FHKC834Entry objects already stored in the database, updating and
>> committing one attribute on every 100th object. It took 29 minutes
>> on the same computer, which is even worse than the other timing. I
>> guess I could live with a slow bulk-load process, but random updates
>> to a large collection shouldn't take this long.
>
> This particular test, and the way it was run, exemplifies what I
> would think to be very close to worst-possible-case performance with
> Magma. You're materializing nearly the entire dataset,
> object-by-object, surely building up a huge readSet, and then making
> periodic commits without stubbing (or without using WriteBarrier in
> the GOODS test).

I know. The test might have been a bit contrived, but nevertheless, 
say we are talking about an order-entry application for a very busy 
company that processes 75,000 orders per month, where an order may be 
modified at any point in its life cycle. To some degree, I will be 
traversing a list of several thousand orders so the user can select 
one and edit something about it, or edit at least one of possibly 
several dozen order line items. If you think about it, my test is a 
simplification of this process. Or not?
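
Rereading your advice, I take it my loop should have looked more like 
the following sketch: batch the updates between commits and stub each 
entry back out so the readSet stays small. I'm assuming stubOut: is 
the stubbing selector described on the tuning page; entries and 
status: belong to my own model.

    "Update every 100th entry, committing as I go, then stub each
     visited entry back out of the session so the readSet stays small.
     Assumes stubOut: is Magma's stubbing selector; entries and
     status: are my own model code."
    entries doWithIndex: [:entry :index |
        index \\ 100 = 0
            ifTrue: [session commit: [entry status: #updated]].
        session stubOut: entry].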

> ODBMSs give you more transparency than relational databases, so they
> have to do a little more. High-volume processing requires profiling,
> and I didn't see that in your post, nor any consideration of the
> various performance-tuning tools documented at
> http://minnow.cc.gatech.edu/squeak/2985.

You're right. I had glanced at that page but never read it carefully. 
I spent some time today reading it more thoroughly, and I now 
understand some things to keep in mind when testing a larger piece of 
the application. However, I didn't find any "tips" that would directly 
improve my test case.

> Still, while I'm sure what you've done can be improved, I don't know
> whether your performance requirements can be met given the apparent
> volume you are dealing with. Magma is definitely in an experimental
> state. Its performance will improve with time, but more conventional
> technologies might serve you better for your high-volume project, as
> Avi suggested.
>
> One way to help determine this may be to measure the "best-case"
> performance and see if even that is acceptable. If not, you can stop
> right there; there's no point in going any further. For Magma, check
> out MagmaBenchmarker (part of the Magma tester package).

Again, I agree with you, and that's one of the main reasons I come to 
you guys on this list. I'm new to Smalltalk, Squeak, Magma, GOODS, and 
everything that surrounds them. In my experience, even the 
experimental libraries and frameworks I've used in the past have been 
stable enough for production, which is why I still gave Magma a shot. 
Other than GOODS, the remaining option to "play" with is OmniBase, but 
if it can't run on Linux, I can't use it :(

Who knows, maybe the target clients I'm looking for will always be 
high-transaction-volume ones, and I may be better off with an RDBMS, 
even if I have to use tools like GLORP (which, BTW, looks very 
promising despite the missing features) and live with the additional 
overhead of maintaining the relational model of a standard, say, 
PostgreSQL installation.

> For your convenience, I post my results from a recent run on my
> 1.3GHz laptop.

These results look good. However, when I benchmark something, I always 
look at the numbers as well as the user experience. If, say, my tests 
completed in 30 seconds, I would say, wow, the OODB is really fast; 
but if running those processes prevents the user from doing anything 
else, because Seaside (or whatever, for that matter) is "blocked" 
during those 30 seconds, then those 30 seconds are unacceptable. I 
don't care if the bulk loading of data takes 8 hours to run, as long 
as it does not affect the user's perception of performance. When it 
comes to the orders example I presented above, though, it's all about 
the user's perception of performance, so it must be "fast".
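
For example, I imagine the bulk load could simply be forked at a low 
priority so the image keeps serving requests (forkAt: and 
userBackgroundPriority are standard Squeak; loadAllEntries is just a 
stand-in for my loader):

    "Run the bulk load in a background process so Seaside stays
     responsive. loadAllEntries stands in for my own loading code."
    [self loadAllEntries] forkAt: Processor userBackgroundPriority.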

Thanks again for your wise words. I always look up to you guys: the 
more I read the posts on this list every day, the more I appreciate 
your efforts and your advice.

Thanks,
Daniel Salama



