OODB

Wed Apr 9 20:35:55 UTC 2003

Hi Brian!

Took the liberty of CCing squeak-dev, perhaps someone else would like to
read this - hope that was ok!

Brian Brown <rbb at techgame.net> wrote:
> Hi Goran,
>   I didn't want to put this on the Squeak list, but do you know of any decent 
> OODB resources for learning? I'm a Squeak/Smalltalk newbie, and new in the OO 
> arena especially when it comes to data... I've done waaayyyyy to much 
> relational work ;-)

Well, not really. You could check
http://www.cetus-links.org/oo_db_systems_1.html for a bunch of links.
And here are a few keywords you can stuff into Google:

ODMG - Object Database Management Group. A rather toothless organization
but they did at least produce a standard, also called ODMG. It describes,
at several levels, standard logical APIs for object databases.

OQL - Object Query Language. An SQL-like language but only for queries
(no modifying mechanisms) and very clearly defined. I think the
definition is 40 pages compared to several hundred pages for SQL.
Unfortunately OQL didn't get much use in the products, and a query
language is not the primary way to interact with an OODB anyway.

A few OODBs: GemStone, Poet, Objectivity, Versant, GOODS (an open source
variant from Russia), Ozone (a Java variant, also open source) - to
mention a few. Not many are open source, so the fact that Squeak has
Magma is *really* cool.

Finally, a short description of how to work with an OODB:

Essentially you just pretend that you have an infinite amount of RAM and
that it can survive a power-down! :-) Really. The OODB maintains that
illusion for you. When you connect to an OODB and start a session you
typically access a "root" object. That root object is just a regular
Smalltalk object and it is typically the top domain object for your
application.

It - and all its contents (the complete object "tree" reachable from
that top object - it could be comprised of literally millions of
objects) is in the database. It isn't in RAM yet, but as you access it
Magma/anyOODB will automatically load parts of it into RAM, so the
illusion is that it is in fact already in RAM - it simply reads (or
"faults") objects in on demand.

Let's say you want to change something in it - like adding a new Person
object or something. It could look like this (let's assume we have
connected and have a Session instance already - it's like 2 lines of
code for Magma):

mySession begin. "This line starts a transaction"
p := Person new. "Just regular Smalltalk"
p name: 'Goran'; age: 12. "Still just regular Smalltalk, objects,
objects - no tables whatsoever"
rootObject addPerson: p. "...you got it now - just Smalltalk."
mySession commit "And this line commits all changes made since begin."

Voilà! That's it. As you can see, the code between begin and commit is
just plain old ordinary Smalltalk code, just as if rootObject were
completely in RAM. There are no extra languages to learn to use OODBs,
nor any explicit read (like SELECT) or write operations (like
INSERT/DELETE). Reads are done automatically behind the scenes as we
traverse the persistent object graph, and writes are done automatically
for the objects that have been modified when the transaction is
committed.
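
To sketch what that means for reads: assuming rootObject has an accessor
called persons that answers the collection addPerson: adds to, and that
Person answers its name via name (I am making both of those up for the
example), just walking the collection is enough to fault the Person
objects in:

rootObject persons do: [:each |
    "Each Person is read in from the database as we touch it.
     #persons and #name are hypothetical accessors, not part of Magma."
    Transcript show: each name; cr].

No query and no explicit load - just an ordinary do: loop.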

So, all modifications to rootObject (we added an instance of Person to
one of its internal Collection instances) are automatically noted by
Magma and committed to disk when the last line is reached - the commit
line. You could also at that point do "mySession abort" and that would
roll back all modifications made in rootObject since begin. You get the
picture.
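
A rough sketch of how abort could be used, assuming the same mySession
and rootObject as in the example above. Wrapping the work in an error
handler is just my own suggestion here, not something Magma requires:

mySession begin.
[p := Person new.
 p name: 'Test person'; age: 30.
 rootObject addPerson: p.
 mySession commit]
    on: Error
    do: [:e |
        mySession abort. "Roll back everything done since begin"
        e pass] "Then let the error continue on to whoever handles it"

If anything between begin and commit signals an Error, rootObject ends
up looking as if the block had never run.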

Then of course you have locks and yadda yadda - but essentially working
with Magma or any OODB is just like ordinary Smalltalk code - you just
sprinkle a couple of begin and commit in appropriate places and you are
done!

In fact, on my latest large project we used GemStone/J (an OODB for Java)
with over 1300 domain classes. We stored literally millions of instances
in the database and the domain model was *very* complex and totally OO
all the way (these guys are Smalltalkers inside). Guess how many lines of
code we used for storing all that? The system was about 500 kloc and a
total of about 3000 classes.

Answer: about 1 kloc. And that was just because we built a little
miniframework for advanced notifications between client applications.
Otherwise it would have been less.

Hope this whetted your appetite. :-)

regards, Göran