Database options was (Re: My first Magma experience ...)
Cees de Groot
cg at cdegroot.com
Mon Apr 4 12:19:35 UTC 2005
On Sun, 03 Apr 2005 15:37:29 -0500, Jimmie Houchin <jhouchin at cableone.net>
wrote:
> Okay. Let me see if I understand you correctly.
> For a simple model, no joins or complex relational modeling, use of a
> RDBMS is fine, maybe preferable.
>
No joins? Use DBM or some ISAM storage.
Seriously, that 'R' in RDBMS is there for something :)
Joins will pop up all the time, even in the simplest schemas. The only
pattern I know where a table stands apart is when that table has been
carefully cut loose from the schema by replicating data into it. That
happens often with log files (CDR logging in telco applications) and
invoicing. See the reference to Time Travel below for a solution - I never
end up with such standalone tables/objects myself.
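As an aside, that join-free, cut-loose style is exactly what DBM storage
gives you. A tiny sketch with Python's stdlib dbm module (the CDR record
layout here is made up for illustration):

```python
import dbm

# A join-free "table": denormalized CDR records keyed by call id.
# Everything a record needs is copied into the value, so reading one
# record never requires joining against another table.
with dbm.open("cdr_log", "c") as db:
    db["call-0001"] = "2005-04-04T12:19:35|+3120123456|+4930987654|182s"
    db["call-0002"] = "2005-04-04T12:21:02|+3120123456|+1555123456|37s"

with dbm.open("cdr_log", "r") as db:
    print(db[b"call-0001"].decode())
```

The price is that any shared data (here the calling number) is duplicated
per record - fine for logs, painful for anything you ever need to update.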
> For something in which in a RDBMS you have multiple joins and complex
> modeling in which to represent or store your data, go OODBMS.
>
Complex modeling is what you want to look at. Subclassing of important
domain objects is a key indicator - the usual case is where you have a
'relation', which can be a person or a company, which can have various
roles (customer, supplier, employee, employee of customer, maitresse of
employee of customer, ...). Or take the case where you have a good
argument for implementing time travel patterns
(http://c2.com/cgi/wiki?TimeTravel - I will always try to use that in
anything that has to do with logging/ordering/invoicing): implementing
this in Smalltalk+ODBMS is essentially 'implement and forget'. I'm not
sure I would like to do it on top of an O/R layer, and probably not
without either major surgery to the O/R layer or - horrors - adding yet
another abstraction layer on top.
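For readers who don't know the pattern: the core of time travel is that
writes append a new version instead of overwriting, and reads ask "as of
when?". A minimal sketch in Python (the VersionedValue class is
hypothetical, not code from any of the products discussed here):

```python
from bisect import bisect_right

class VersionedValue:
    """Append-only history of (timestamp, value) pairs; reads can time-travel."""
    def __init__(self):
        self._times = []    # sorted write timestamps
        self._values = []

    def set(self, timestamp, value):
        # This simple sketch assumes writes arrive in time order.
        assert not self._times or timestamp >= self._times[-1]
        self._times.append(timestamp)
        self._values.append(value)

    def at(self, timestamp):
        """Return the value as it was at `timestamp` (None before first write)."""
        i = bisect_right(self._times, timestamp)
        return self._values[i - 1] if i else None

price = VersionedValue()
price.set(10, "EUR 5.00")
price.set(20, "EUR 6.50")
print(price.at(15))  # an invoice dated t=15 still sees "EUR 5.00"
```

The 'implement and forget' part is that old invoices keep referring to the
prices as they were, with no extra bookkeeping at the call site.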
> The price is fine. Is the commercial version on Linux current, working
> and complete? Or does it have the file locking problems discussed? I use
> Linux and like Daniel have nothing to do with Windows.
>
The file locking problem is a problem ONLY if you try to access the
database concurrently from multiple images. There are arguments to be made
that this is a good idea, but personally I've had a use case for it only
once, and even there it wasn't a very good one (we split the image into an
application server backend for interactive use and a batch processor for
all the background work, simply because the batch processing ate lots of
CPU time at times and it was easier to let nice + the Linux kernel deal
with that than to find out how to run the batch processing at a lower
priority :)).
I'm not sure whether locking has already been implemented under
Squeak/Linux - but I know what the issues are, I know how to solve them,
and I know how much time they'll take, so if anyone wants this feature for
a commercial project, I can implement and test it in under a day's work
(and so can anyone else who takes half a look at the code - but I'm just
shamelessly advertising myself as an OmniBase consultant, ok?). So that
shouldn't be a showstopper for a commercial project ;)
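For the curious, the usual Unix building block for this kind of
multi-image coordination is an advisory file lock. A sketch in plain
Python (this is not OmniBase or Squeak VM code, just the underlying OS
mechanism):

```python
import fcntl
import os

# One image takes an exclusive advisory lock on a lock file.
fd1 = os.open("omnibase.lock", os.O_RDWR | os.O_CREAT, 0o644)
fcntl.flock(fd1, fcntl.LOCK_EX | fcntl.LOCK_NB)

# A second open (simulating another image) is refused instead of
# silently sharing the file - flock conflicts across separate opens.
fd2 = os.open("omnibase.lock", os.O_RDWR)
try:
    fcntl.flock(fd2, fcntl.LOCK_EX | fcntl.LOCK_NB)
    second_image_got_lock = True
except BlockingIOError:
    second_image_got_lock = False
print("second image got lock:", second_image_got_lock)  # prints False

fcntl.flock(fd1, fcntl.LOCK_UN)
os.close(fd1)
os.close(fd2)
```

Being advisory, this only works if every image cooperates by taking the
lock - which is exactly the property a database layer can enforce.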
> I really don't know quite to where SQLite scales. But it seems it would
> scale anywhere Goods or Magma would. Yes? No? Just a thought.
>
Can't comment on that, not having used SQLite. Goods and Magma have
separate database servers, which should help scalability. OmniBase can
have multiple concurrent images accessing the database, and I imagine that
with a decent network (Gbit ethernet, a separate storage area network,
etcetera) putting the database on a fast fileserver and having a cluster
of images access it would work quite nicely. If SQLite is indeed just a
DLL, that limits your scalability to whatever you can build yourself (make
an SQLite database server image, etcetera). Depending on how much you can
influence, scalability is limited only by your imagination (with SQLite
and a typical 10% write/90% read environment, you could share the database
filesystem with Linux NBD to other machines and mount it read-only there;
write your clients to load-balance read requests, and you have suddenly
gained a lot along the scalability axis).
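The client-side half of that idea is simple enough to sketch. A naive
Python illustration (the Client class and replica names are made up;
a real client would also handle failover and replica lag):

```python
import itertools

class Client:
    """Send writes to the single master; round-robin reads over replicas."""
    def __init__(self, master, replicas):
        self.master = master
        self._reads = itertools.cycle(replicas)  # rotate over read-only mounts

    def read(self, query):
        # Any read-only replica can serve this; pick the next one in turn.
        return (next(self._reads), query)

    def write(self, stmt):
        # Writes must go to the one node that has the filesystem read-write.
        return (self.master, stmt)

c = Client("db-master", ["ro-1", "ro-2", "ro-3"])
print(c.read("SELECT ..."))   # served by ro-1
print(c.read("SELECT ..."))   # served by ro-2
print(c.write("INSERT ..."))  # always db-master
```

With a 90% read load this spreads almost all the traffic over the
read-only mounts, which is where the scalability win comes from.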
With scalability being limited mostly by your imagination, robustness and
usability become the most important factors in choosing a persistence
engine. Personally, I never put scalability on the checklist, because
there are so many tools to solve these kinds of issues - it smells a lot
like premature optimization.
My current project just assumes a whole lot of peers, every one of them
keeping bits of information in plain BerkeleyDB tables. The database is
the network, in essence. If we don't mess up, it'll scale to awful
proportions without the need for a single RAID drive :)