I was just explaining the ZODB design to a colleague, and it struck me that this might be a very useful persistence engine design for Squeak. It's simple, straightforward to implement, and we have sample code in Python to, err, get inspiration from ;-).
The design of the ZODB is straightforward: on disk there is just a transaction log, and the index of 'current' objects is kept in memory. It is single-user: if you want multiple processes to access the same database, write a small server (ZEO for Zope). If you want multiple threads to access the same database, use a semaphore.
When Zope opens the ZODB, it scans the objects and rebuilds the index (that's simple: just run through the transaction log, build a dictionary of (oid -> offset), and at the end of the transaction log you have the most recent versions in the dictionary). If you cleanly exit Zope, the dictionary is dumped to disk so you don't need to scan next time. The in-core index is not very large: our ZEO server serves a 500MB database and uses around 15MB of memory, which includes roughly 5MB for basic Python.
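To make the index rebuild concrete, here is a minimal sketch in Python. The record layout (8-byte oid, 4-byte length, payload) and the names append_record, rebuild_index and load are made up for illustration; this is not the real ZODB file format or API.

```python
import io
import struct

# Hypothetical record layout (NOT ZODB's real format):
# 8-byte oid, 4-byte payload length, then the payload bytes.
HEADER = struct.Struct(">QI")

def append_record(log, oid, payload):
    """Append one object record to the end of the log; return its offset."""
    log.seek(0, io.SEEK_END)
    offset = log.tell()
    log.write(HEADER.pack(oid, len(payload)))
    log.write(payload)
    return offset

def rebuild_index(log):
    """Scan the log from the start and map each oid to its latest offset."""
    index = {}
    log.seek(0)
    while True:
        offset = log.tell()
        header = log.read(HEADER.size)
        if len(header) < HEADER.size:
            break  # end of log (or a truncated trailing record)
        oid, length = HEADER.unpack(header)
        log.seek(length, io.SEEK_CUR)  # skip the payload, we only index
        index[oid] = offset  # later records replace earlier ones
    return index

def load(log, index, oid):
    """Read the current version of an object via the in-memory index."""
    log.seek(index[oid])
    _, length = HEADER.unpack(log.read(HEADER.size))
    return log.read(length)
```

Writing a new version is just another append, so nothing in the file is ever overwritten; only the dictionary entry moves.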
The structure (now I'm going to tread on dangerous territory here - this is from the back of my head, I really should refresh my knowledge from the docs) is basically: magic, transaction header, object, object, object, ..., transaction header, object, object, object, ... etcetera. Objects have backpointers to parent versions, transactions too - this means you can 'timetravel' to older transactions or older object versions (guess how trivial it is to build a Wiki on top of that...).
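The backpointer idea can be sketched like this, assuming a Python list stands in for the append-only file and a list position stands in for a file offset. Record, Log, store and history are hypothetical names of mine, not anything from ZODB.

```python
from typing import NamedTuple, Optional

class Record(NamedTuple):
    oid: int
    data: str
    prev: Optional[int]  # position of the previous version, or None

class Log:
    def __init__(self):
        self.records = []  # stands in for the append-only file
        self.index = {}    # oid -> position of the current version

    def store(self, oid, data):
        """Append a new version that points back at the one it replaces."""
        prev = self.index.get(oid)
        self.records.append(Record(oid, data, prev))
        self.index[oid] = len(self.records) - 1

    def history(self, oid):
        """Yield versions of oid, newest first, by following backpointers."""
        pos = self.index.get(oid)
        while pos is not None:
            rec = self.records[pos]
            yield rec.data
            pos = rec.prev
```

A Wiki page history would then be little more than list(log.history(page_oid)).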
If the accumulated cruft becomes too large, you can close the database, move it to database.old, and copy the freshest objects back to a new file (or the freshest objects plus a week's worth of history, etcetera).
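A rough sketch of that copy step, assuming each record is a (timestamp, oid, data) tuple in log order; the pack function and the cutoff parameter are my own illustration, not ZODB's actual pack implementation.

```python
def pack(records, cutoff):
    """Return the records to copy into the new file.

    Keeps the newest version of every oid, plus any older version
    whose timestamp is at or after `cutoff` (the history to retain).
    """
    latest = {}
    for pos, (timestamp, oid, data) in enumerate(records):
        latest[oid] = pos  # the last occurrence wins
    current = set(latest.values())
    return [
        rec for pos, rec in enumerate(records)
        if pos in current or rec[0] >= cutoff
    ]
```

Passing a cutoff later than every timestamp keeps only the freshest objects; passing "now minus a week" keeps a week's worth of history as well.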
The fact that you don't need to overwrite existing parts of the db file (tricky), that you don't need on-disk indexing (even trickier), and that it all performs reasonably well makes it an ideal candidate for a built-in persistence mechanism for Squeak, methinks.
The ZODB interface is basically a dictionary of oid->object mappings. IIRC, it employs a root object, so compacting the database will not only get rid of old versions, but also old cruft that is no longer reachable. For most data structures, you simply persist Python collections (lists, dictionaries), but they have also layered a B*tree index thingy on top of Zope so you have better scalability for large collections.
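How a root object lets compaction drop cruft can be sketched as a plain reachability walk. Here I assume the database is reduced to a dictionary mapping each oid to the oids it references; reachable is a hypothetical helper, not a ZODB function.

```python
def reachable(references, root_oid):
    """Return the set of oids reachable from the root object.

    `references` maps each oid to the list of oids it refers to.
    """
    seen = set()
    stack = [root_oid]
    while stack:
        oid = stack.pop()
        if oid in seen:
            continue  # already visited; also guards against cycles
        seen.add(oid)
        stack.extend(references.get(oid, ()))
    return seen
```

Anything not in the returned set cannot be reached from the root any more, so a compaction pass could skip it when copying objects to the new file.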
Inside Python, Zope employs some metaobject hackery to make persistence transparent: every thread has a current transaction, and through some sneaky tricks wrapped in a Python C module called 'ExtensionClass', every object that inherits from or mixes in Persistent registers itself with the transaction when it is modified. It's up to the code to commit the transaction; Zope does this naturally when the HTTP request has been dealt with.
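The registration trick can be approximated in pure Python without any C module. The sketch below is my own simplification, not how ExtensionClass or the real Persistent class work: I assume that registration happens on attribute assignment, that the "database" is a plain dictionary, and that objects are keyed by id() instead of a real oid. Transaction, current_transaction and this Persistent are all hypothetical stand-ins.

```python
import threading

_local = threading.local()  # one current transaction per thread

class Transaction:
    def __init__(self):
        self.changed = []

    def register(self, obj):
        """Remember obj as modified (once) for the next commit."""
        if not any(o is obj for o in self.changed):
            self.changed.append(obj)

    def commit(self, store):
        """Write a snapshot of every registered object into `store`."""
        for obj in self.changed:
            store[id(obj)] = dict(vars(obj))
        self.changed.clear()

def current_transaction():
    """Return this thread's transaction, creating it on first use."""
    txn = getattr(_local, "txn", None)
    if txn is None:
        txn = _local.txn = Transaction()
    return txn

class Persistent:
    def __setattr__(self, name, value):
        object.__setattr__(self, name, value)
        # Registration is a side effect of the assignment, so
        # application code never has to mention the transaction.
        current_transaction().register(self)
```

Application code just subclasses Persistent and assigns attributes as usual; whoever owns the request (Zope, in the original) calls commit at the end.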
Would that sound like something useful for Squeak? It has its limitations, for sure, but it seems like the simplest way to get a solid persistence engine into Squeak, which is a Good Thing I think. It's also a project I'd like to tackle...
Regards,
Cees
squeak-dev@lists.squeakfoundation.org