My first Magma experience (Now including GOODS)
Daniel Salama
dsalama at user.net
Mon Apr 4 03:08:44 UTC 2005
On Apr 3, 2005, at 4:52 PM, Chris Muller wrote:
>
> Daniel, you have repeated a classic performance-comparison between
> relational and object databases. Using an ultra-simple model and a
> "crunch-through-a-big-table" style test, all the benefits granted by
> an ODBMS are minimized, while the performance benefits of slamming
> data, not objects, into a table are emphasized.
I agree with you 100% on this, but this simple test is only a small
part of a larger application. The model is much more complex. However,
to a certain degree, several features of the application will perform a
very similar task: something as simple as providing the application
user with a way to list all customer records within 5 miles of a given
zip code. To my amateur mind, this implies that I will pretty much have
to iterate through a large collection of objects to select the ones
that match the criteria.
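In Smalltalk that filter really is just an enumeration; here is a
minimal sketch of what I mean (ZipTable, #latLongForZip:, and
#distanceInMilesTo: are made-up helpers, and customers is assumed to be
the already-materialized collection):

```smalltalk
"Select all customer records within 5 miles of a given zip code,
 pretending we already have a zip-code-to-coordinates lookup."
| center nearby |
center := ZipTable latLongForZip: '33176'.
nearby := customers select: [:each |
	((ZipTable latLongForZip: each zipCode)
		distanceInMilesTo: center) <= 5].
```

Of course, run against the database that select: touches every single
object, which is exactly the crunch-through pattern you describe.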
> With ODBMS's, "transparency" is granted in terms of not having to
> think-in or map-to tables, but there is the need to account for the
> database when it comes to high-volume processing. One thing is traded
> for another; it's not a free lunch.
I know, and like I said before, the model is more complex than this.
This is merely the tip of the iceberg, but I need to be able to let the
user select an object, at the very least, before expanding on all the
object relations.
>
>> What I tried to do here was simply to iterate through all the
>> FHKC834Entry objects already stored in the database, updating and
>> committing one attribute on every 100th object. It took 29 minutes
>> on the same computer. This is even worse timing than the other one.
>> I guess I could live with a slow bulk load process, but random
>> updates to a large collection shouldn't take this long.
>
> This particular test and the way it was run exemplifies what I would
> think to be very close to worst-possible-case performance with Magma.
> You're materializing nearly the entire dataset, object-by-object,
> surely building up a huge readSet, and then making periodic commits
> without stubbing (or using WriteBarrier on the GOODS test).
I know. The test might have been a bit inappropriate, but nevertheless,
say we are talking about an order entry application for a very busy
company. Say they process 75,000 orders per month, and throughout the
life cycle of an order, it may be modified. This means that, to some
degree, I will be traversing a list of several thousand orders to allow
the user to select one, so he/she can edit something about the order,
or perhaps edit at least one of possibly several dozen order line
items. If you think about it, my test is a simplification of this
process. Or not?
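For what it's worth, here is roughly how I understand the stubbing
advice would apply to my traversal; this is only a sketch, assuming the
session's #commit: and #stubOut: messages behave as the swiki page
describes (#status: is a made-up accessor, and orders and session are
assumed to be already bound):

```smalltalk
"Walk a large collection of orders, modifying every 100th one.
 Commit periodically and stub out orders we are done with, so the
 readSet stays small instead of holding the whole dataset."
orders doWithIndex: [:order :i |
	(i \\ 100) = 0
		ifTrue: [session commit: [order status: #modified]].
	session stubOut: order].
```

If that is the right usage, then my 29-minute test was indeed paying
for a readSet that kept growing with every object I touched.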
> ODBMS's give you more transparency than relationals, so they have to
> do a little more. High-volume processing requires profiling. I didn't
> see that, nor consideration of the various performance-tuning tools
> documented at http://minnow.cc.gatech.edu/squeak/2985 in your post.
You're right. I glanced at that page, but never read it carefully. I
spent some time today reading it more carefully and I understand some
things to keep in mind when testing a larger piece of the application.
However, I didn't quite find any "tips" on directly improving my test
case.
> Still, while I'm sure what you've done can be improved, I don't know
> whether your performance requirements can be met given the apparent
> volume you are dealing with. Magma is definitely in an experimental
> state. Its performance will improve with time, but more conventional
> technologies might serve you better for your high-volume project, as
> Avi suggested.
>
> One way to help determine this may be to measure the "best-case"
> performance and see if even that is acceptable. If not, you can stop
> right there; no point in going any further. For Magma, check out
> MagmaBenchmarker (part of the Magma tester package).
Again, I agree with you, and that's one of the main reasons I come to
you guys on this list. I'm new to Smalltalk, Squeak, Magma, GOODS, and
everything that surrounds them. In my experience, even the experimental
libraries or frameworks I've used in the past were stable enough to be
used in production. That's why I still gave Magma a shot. Other than
GOODS, the remaining option to "play" with is OmniBase, but if it can't
run on Linux, I can't use it :(
Who knows, maybe the target clients I look for are always
high-transaction-volume ones, and I may be better off with an RDBMS,
even if I have to use tools like GLORP (which, BTW, looks very
promising despite the missing features) and live with the additional
overhead of maintaining the relational models of a standard, say,
PostgreSQL implementation.
> For your convenience, I post my results from a recent run on my
> 1.3GHz laptop.
These results look good. However, when I benchmark something, I like to
look at the user experience as well as the numbers. Say my tests
finished in 30 seconds; I would say, wow, the OODB is really fast. But
if running these processes, fast as they may seem, prevents the user
from doing anything else because Seaside (or whatever, for that matter)
is "blocked" during those 30 seconds, then those 30 seconds are
unacceptable. I don't care if the bulk loading of data takes 8 hours to
run, as long as it does not affect the user's perception of
performance. Now, when it comes to the orders example I presented
above, it's all about the user's perception of performance, so it must
be "fast".
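That's also why I'd be inclined to run the bulk load in a background
Process so the Seaside image stays responsive; something like the
following, using the standard Squeak process API (#bulkLoad is a
made-up selector standing in for the real load routine):

```smalltalk
"Run the 8-hour bulk load at a low priority so user interactions
 keep getting scheduled ahead of it."
[self bulkLoad] forkAt: Processor userBackgroundPriority.
```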
Thanks again for your wise words. I always look up to you guys, for the
more I read the posts on this list every day, the more I appreciate
your efforts and your advice.
Thanks,
Daniel Salama
More information about the Squeak-dev
mailing list