On Thu, 31 Jan 2002, Cees de Groot wrote:
The structure (now I'm going to tread on dangerous territory here - this is from the back of my head, I really should refresh my knowledge from the docs) is basically: magic, transaction header, object, object, object, ..., transaction header, object, object, object, ... etcetera. Objects have backpointers to parent versions, transactions too - this means you can 'timetravel' to older transactions or older object versions (guess how trivial it is to build a Wiki on top of that...).
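To make that layout concrete, here is a toy sketch in Python. It is not ZODB's real file format: the magic string, the record layout and the ToyLog class are all made up for illustration. It only shows how an append-only file with backpointers gives you the 'timetravel' property.

```python
import io
import pickle
import struct

MAGIC = b"TOYLOG01"  # hypothetical magic string, not ZODB's


class ToyLog:
    """Append-only log: magic, then (transaction header, object records...) repeated."""

    def __init__(self):
        self.buf = io.BytesIO()
        self.buf.write(MAGIC)
        self.last_txn = 0   # offset of the previous transaction header (0 = none)
        self.latest = {}    # oid -> offset of the newest record for that object

    def commit(self, changes):
        """Append one transaction; `changes` maps oid -> new state."""
        self.buf.seek(0, io.SEEK_END)
        txn_pos = self.buf.tell()
        # Transaction header: backpointer to the previous transaction, object count.
        self.buf.write(struct.pack(">QI", self.last_txn, len(changes)))
        for oid, state in changes.items():
            pos = self.buf.tell()
            data = pickle.dumps(state)
            # Object record: oid, backpointer to the previous version, payload size.
            prev = self.latest.get(oid, 0)
            self.buf.write(struct.pack(">QQI", oid, prev, len(data)))
            self.buf.write(data)
            self.latest[oid] = pos
        self.last_txn = txn_pos
        return txn_pos

    def _read(self, pos):
        """Return (offset of previous version, state) for the record at `pos`."""
        self.buf.seek(pos)
        _oid, prev, size = struct.unpack(">QQI", self.buf.read(20))
        return prev, pickle.loads(self.buf.read(size))

    def history(self, oid):
        """Yield the states of `oid`, newest first, by following backpointers."""
        pos = self.latest.get(oid, 0)
        while pos:
            pos, state = self._read(pos)
            yield state
```

Nothing is ever overwritten, so after two commits that touch the same oid, list(log.history(oid)) walks from the current version back to the first one. A Wiki page history would be exactly that walk.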
A few questions -
Does anything without an oid get serialized in full to the transaction log, as part of its parent object? How does it recover the oids from the objects when writing to disk - does it keep a reverse map object->oid? Or does every business object have to have an oid field?
Which brings me to - what can be used in business objects? In Squeak, will standard collections be usable, or will you have to use an ODBDictionary, ODBOrderedCollection, etc, that have oids, and can associate themselves with transactions? And how do most objects (that don't have an easy at:put: protocol) know they've been modified? Compiler hacks like kats uses? VM hacks? Or do your business objects have to send "self changed" all the time?
I ask all of this because I've been thinking about similar designs myself without coming up with satisfactory answers. What I've been looking at instead, recently, is generalizing my hacked-together little O/R mapping framework (sort of a GLORP-lite) so that it will work with simple in-memory tables as well as RDBs. Once you map an object structure down to simple relational tables (or arrays of dictionaries of strings, which is what a relational table looks like to my framework), transactions become very easy to serialize. Of course, the disadvantage is that you're constrained by the O/R model and have to have all of this metadata about relationships, etc. One upside is that you can freely mix your "object" (in-memory) DB and any relational DBs you need to use, in terms of having relationships between them, since they all use the same model. The other is that all of the DB-related stuff is totally external to your business objects, rather than forcing them to inherit from anything in particular (although you still have the "self changed" problem to some degree).
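A rough Python sketch of why flat tables make transactions easy to serialize (this is not the framework described above; the operation format and the apply_txn helper are invented for illustration): if every table is a list of string-to-string dictionaries, a transaction is just a list of row operations that can be written out as one line of JSON and replayed later.

```python
import json


def apply_txn(tables, txn):
    """Apply a list of row operations to `tables` (table name -> list of dict rows)."""
    for op in txn:
        rows = tables.setdefault(op["table"], [])
        if op["op"] == "insert":
            rows.append(dict(op["row"]))
        elif op["op"] == "update":
            rows[op["index"]].update(op["row"])
        else:
            raise ValueError("unknown operation: %r" % (op["op"],))


def serialize_txn(txn):
    """One transaction becomes one line of text in a log file."""
    return json.dumps(txn)


def replay(log_lines):
    """Rebuild all tables from scratch by replaying the serialized transactions."""
    tables = {}
    for line in log_lines:
        apply_txn(tables, json.loads(line))
    return tables
```

The relationship metadata mentioned above is deliberately left out; the sketch covers only the logging side.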
I don't particularly like this approach, so if you manage to get a ZODB lookalike working, I'll be very pleased. I'm also happy to help out if you need it.
Cheers, Avi
Avi Bryant <avi@beta4.com> said:
Does anything without an oid get serialized in full to the transaction log, as part of its parent object? How does it recover the oids from the objects when writing to disk - does it keep a reverse map object->oid? Or does every business object have to have an oid field?
AFAIK, anything without an OID gets included by value. ZODB persistence is based on the Python pickle library (http://www.python.org/doc/current/lib/module-pickle.html); the documentation on that is fairly extensive.
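The by-value versus by-reference split can be sketched with pickle's own persistent_id / persistent_load hooks, which are part of the standard pickle module. Everything else here (the Persistent marker class, OidPickler, OidUnpickler, dump_state, load_state) is a hypothetical name for illustration, and ZODB's real machinery is more involved than this.

```python
import io
import pickle


class Persistent:
    """Hypothetical marker class: instances carry an oid and are stored by reference."""

    def __init__(self, oid, **fields):
        self.oid = oid
        self.__dict__.update(fields)


class OidPickler(pickle.Pickler):
    def persistent_id(self, obj):
        # Anything with an oid is written as a reference (just the oid).
        # Returning None tells pickle to include the object by value.
        if isinstance(obj, Persistent):
            return obj.oid
        return None


class OidUnpickler(pickle.Unpickler):
    def __init__(self, file, load_by_oid):
        super().__init__(file)
        self.load_by_oid = load_by_oid  # callable: oid -> object

    def persistent_load(self, pid):
        # Resolve a stored reference back into a live object.
        return self.load_by_oid(pid)


def dump_state(obj):
    """Pickle the instance variables of `obj`; nested Persistent objects become oids."""
    buf = io.BytesIO()
    OidPickler(buf).dump(vars(obj))
    return buf.getvalue()


def load_state(data, load_by_oid):
    """Unpickle a state dict, looking up referenced objects through `load_by_oid`."""
    return OidUnpickler(io.BytesIO(data), load_by_oid).load()
```

With this, a list held in an instance variable is copied into the owner's record, while a Persistent object held in an instance variable is written only as its oid and looked up again on load. No reverse object->oid map is needed in this sketch because each object carries its own oid field.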
Which brings me to - what can be used in business objects? In Squeak, will standard collections be usable, or will you have to use an ODBDictionary, ODBOrderedCollection, etc, that have oids, and can associate themselves with transactions?
Given the fact that there is not much difference in Smalltalk between a 'persistent' object and a non-persistent object (except for the indication that you want to persist an object by reference or by value), there are two options:
- Automatically make persistent anything that's registered with a transaction;
- Throw an exception when a non-persistent object (however it's marked persistent) is registered with a transaction.
I'm in favour of the second solution: it's bound to catch programmer errors more quickly than the first solution, which risks that an object you meant to store by reference is stored twice.
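The second option is small enough to sketch. This Python version is only an illustration of the idea, with hypothetical names (Persistent, Transaction, NotPersistentError), and says nothing about how a Smalltalk implementation would mark objects as persistent.

```python
class NotPersistentError(TypeError):
    """Raised when an object that is not marked persistent is registered."""


class Persistent:
    """Hypothetical marker: subclasses are stored by reference, with their own oid."""


class Transaction:
    def __init__(self):
        self.registered = []

    def register(self, obj):
        # Fail loudly instead of silently promoting the object to a persistent one.
        if not isinstance(obj, Persistent):
            raise NotPersistentError(
                "%s is not marked persistent" % type(obj).__name__
            )
        # Register each object once, compared by identity.
        if not any(existing is obj for existing in self.registered):
            self.registered.append(obj)
```

Registering a plain list or dictionary then fails at the call site, rather than ending up as a second by-value copy somewhere in the store.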
And how do most objects (that don't have an easy at:put: protocol) know they've been modified? Compiler hacks like kats uses? VM hacks? Or do your business objects have to send "self changed" all the time?
That's a matter of taste and philosophy. ZODB intercepts instance variable access through the ExtensionClass mechanism. I work with OmniBase under VisualWorks, which needs 'self markDirty' all over the place. Both have the notion of a thread-local transaction where the object is automatically registered when it's marked dirty, and both require the programmer to explicitly close the transaction - I think both items make a lot of sense. As for dirtying an object, I like the ZODB approach better, and I think that it should be very doable in Smalltalk. However, I wouldn't object to the 'self changed' solution, because it doesn't really get in the way.
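A minimal Python sketch of the interception idea follows. It uses a plain __setattr__ override rather than ExtensionClass, and the Transaction, current_transaction, Persistent and Account names are hypothetical. An instance variable write registers the object with a thread-local transaction, which the programmer still has to commit explicitly.

```python
import threading

_local = threading.local()  # holds one current transaction per thread


class Transaction:
    def __init__(self):
        self.dirty = []

    def register(self, obj):
        # Remember each modified object once, compared by identity.
        if not any(existing is obj for existing in self.dirty):
            self.dirty.append(obj)

    def commit(self):
        """Return the objects that would be written, and start afresh."""
        written = list(self.dirty)
        self.dirty.clear()
        return written


def current_transaction():
    """Return this thread's transaction, creating it on first use."""
    txn = getattr(_local, "txn", None)
    if txn is None:
        txn = _local.txn = Transaction()
    return txn


class Persistent:
    def __setattr__(self, name, value):
        # Perform the write, then mark the object dirty automatically.
        object.__setattr__(self, name, value)
        current_transaction().register(self)


class Account(Persistent):
    def __init__(self, balance):
        self.balance = balance  # this write goes through __setattr__ too
```

A hook like this only sees instance variable assignment. Mutating a plain list held in an instance variable (for example appending to it) goes unnoticed, which is where an explicit 'self changed' style call is still needed.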
I ask all of this because I've been thinking about similar designs myself without coming up with satisfactory answers.
That's because orthogonal persistence is a very hard thing. Having worked with O/R mapping variations and OO databases, I must say that I greatly prefer the latter. Basically, I don't see any problems solved by an O/R mapping layer (now talking from the 'blank sheet of paper' perspective - you don't have an existing RDBMS that you need to interface with), only problems introduced.
squeak-dev@lists.squeakfoundation.org