cg@cdegroot.com (Cees de Groot) wrote:
I was just explaining the ZODB design to a colleague,
Thank you for explaining it to me, also. ZODB has sounded interesting for a long time but my occasional quick glances at it never led to understanding it. (If you do get a chance to review its documentation, please feel free to repeat and expand your explanation of how it works.)
and it struck me that this might be a very useful persistence engine design for Squeak. It's simple,
I have been using a combination of Dolphin Smalltalk for the clients and Python/Metakit as a database server. I've been very pleased with Metakit and my Python/Metakit server "just works" for perhaps months at a time before I need to add or change my Python code. However, when I do, there is a certain stress on me in "shifting gears" from Smalltalk back to Python/Metakit as it takes me a long time to get re-oriented. (In hindsight, I wish I had done many things differently, but one thing would have been to pass Python code from the client directly to the Python server to be interpreted, so even when writing Python code I would stay in Smalltalk.)
That (the desire to stay in Smalltalk all the time), plus the recent persistence discussions along the lines of "RAM is cheap; let's use it", plus the fact that this application's data size is fairly small (typically 150,000 to 250,000 business objects occupying perhaps 8 to 16 MB when represented in text files -- I don't quite know what that would translate to in terms of Smalltalk memory usage), have led me to consider switching the server over to Squeak and keeping all the objects in RAM all the time. I thought I'd want to write a transaction log and then save out the objects to disk at the end of the day (or whenever). If all went well, the next time the server started up, it would reload the objects from the saved out version, otherwise, it would reload the previous good version and replay the log.
If I followed your ZODB explanation, its approach would be similar to that in my previous paragraph except it is only the dictionary of oid-to-starting-address-in-log-file that is kept in RAM all the time and rewritten out and reloaded.
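A minimal sketch of that reading, in Python rather than Squeak: objects are appended to a single file and only an oid-to-offset dictionary stays in RAM. The record layout, the `AppendOnlyStore` class and its method names are assumptions made for illustration; this is not ZODB's actual file format or API.

```python
import os
import pickle
import struct

# Hypothetical record layout (NOT ZODB's real format):
# 8-byte oid, 4-byte payload length, then the pickled object state.
HEADER = struct.Struct(">QI")


class AppendOnlyStore:
    """Keeps only an oid -> file offset dictionary in RAM."""

    def __init__(self, path):
        self.path = path
        self.index = {}
        # Create the file if it is missing, then rebuild the index from it.
        open(path, "ab").close()
        self._rebuild_index()

    def _rebuild_index(self):
        # Scan the whole file once; a later record for an oid replaces
        # the earlier offset, so the newest version wins.
        with open(self.path, "rb") as f:
            while True:
                offset = f.tell()
                header = f.read(HEADER.size)
                if len(header) < HEADER.size:
                    break  # end of file, or an incomplete final record
                oid, length = HEADER.unpack(header)
                payload = f.read(length)
                if len(payload) < length:
                    break  # incomplete final record; ignored, not repaired
                self.index[oid] = offset

    def store(self, oid, state):
        payload = pickle.dumps(state)
        with open(self.path, "ab") as f:
            offset = f.seek(0, os.SEEK_END)
            f.write(HEADER.pack(oid, len(payload)))
            f.write(payload)
            f.flush()
            os.fsync(f.fileno())
        self.index[oid] = offset

    def load(self, oid):
        # Seek straight to the newest record for this oid.
        with open(self.path, "rb") as f:
            f.seek(self.index[oid])
            _, length = HEADER.unpack(f.read(HEADER.size))
            return pickle.loads(f.read(length))
```

In this sketch an object that points to another one would store the other object's oid, not the object itself, which is where some form of proxying or lazy lookup comes back in. The index is rebuilt by scanning on start-up; writing it out and reloading it, as described above, would only be a start-up optimization.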
The structure (now I'm going to tread dangerous territory here - this is from the back of my head, I really should refresh my knowledge from the docs) is basically: magic, transaction header, object, object, object, ..., transaction header, object, object, object, ... etcetera. Objects have backpointers to
What would we write in the log file if we did it in Squeak? Since some of my business objects point to other business objects, etc. etc., how complicated must the proxying of objects be? I am doing that now in the client and one of the appeals of keeping all the objects in RAM all the time was the possibility of eliminating the need to deal with proxies altogether, or at least restrict it to server start-up time. Any thoughts or suggestions in this area? (Of course, at first, even though either all the objects or just their oid dictionary would be in RAM on the new Squeak server, the clients would continue as if it were Python/Metakit serving them, but later I would hope to use lighter clients with all the work done on the Squeak server.)
all performs reasonably well makes it an ideal candidate for a built-in persistence mechanism for Squeak, methinks.
Would that sound like something useful for Squeak?
I certainly think so.
It has its limitations, for sure, but it seems like the simplest way to get a solid persistence engine into Squeak, which is a Good Thing I think. It's also a project I'd like to tackle...
I'll look forward to it.
-- Frank
If we had realized how difficult it would be, the Romans would not have begun building the pyramids nor would I have begun studying history.
On Thu, 31 Jan 2002, Frank Sergeant wrote: [snip]
That (the desire to stay in Smalltalk all the time), plus the recent persistence discussions along the lines of "RAM is cheap; let's use it", plus the fact that this application's data size is fairly small (typically 150,000 to 250,000 business objects occupying perhaps 8 to 16 MB when represented in text files -- I don't quite know what that would translate to in terms of Smalltalk memory usage),
I'd be a little surprised if that gained 50% in memory, though it certainly depends on the encoding.
have led me to consider switching the server over to Squeak and keeping all the objects in RAM all the time.
Certainly feasible. The big thing to watch for is not losing info...but look at the Changes/Image setup. That's a *lot* of data (zillions of objects...classes and methods and versions of methods, etc.) with a fairly fine-grained level of persistency (i.e., if you don't accept a method you can lose it, but accepts are reasonably cheap).
I thought I'd want to write a transaction log and then save out the objects to disk at the end of the day (or whenever).
Snapshots are fast and easy *though* they can disrupt stuff. There are a bunch of ways around that on the server, though.
Plus, you could use ImageSegments, which is a touch trickier at the moment.
Or a Celeste-like strategy, where "real" objects are added to a file, your index file only writes out occasionally, but you can always recover from the main file.
If all went well, the next time the server started up, it would reload the objects from the saved out version, otherwise, it would reload the previous good version and replay the log.
That sounds exactly like the changes/image strategy. Works well, IMHO.
If I followed your ZODB explanation, its approach would be similar to that in my previous paragraph except it is only the dictionary of oid-to-starting-address-in-log-file that is kept in RAM all the time and rewritten out and reloaded.
I'll point again to FileDictionary, which was made in pre-ImageSegment days. I'm not sure how easy it is to recover from it.
[snip]
What would we write in the log file if we did it in Squeak? Since some of my business objects point to other business objects, etc. etc., how complicated must the proxying of objects be? I am doing that now in the client and one of the appeals of keeping all the objects in RAM all the time was the possibility of eliminating the need to deal with proxies altogether, or at least restrict it to server start-up time. Any thoughts or suggestions in this area?
Avoiding proxies sounds right to me. But ImageSegs do a bit of this, if I understand things correctly. A lot depends on your transaction rate. If it's relatively slow and non-concurrent, then simple serialization appending to a file (or writing a separate file for each entry) might do the trick, plus snapshotting. In that case, I'd optimize for fast writing to the transaction log at the expense of fast reading back...i.e., you'd only want to reconstruct from the log when the server went bad, and even in the case of crashes before snapshotting, you'd only have to recover relatively few objects.
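The "everything in RAM, append to a log, snapshot now and then" approach can be sketched roughly as follows. This is Python rather than Squeak, and the `LoggedStore` class, its method names, the JSON encoding and the string oids are all assumptions chosen to keep the example short; a Squeak version would presumably use an image snapshot or its own serializer.

```python
import json
import os


class LoggedStore:
    """All objects live in a dict in RAM; every change is appended to a log."""

    def __init__(self, snapshot_path, log_path):
        self.snapshot_path = snapshot_path
        self.log_path = log_path
        self.objects = {}  # oid (a string, since JSON keys are strings) -> state
        self._recover()
        self._log = open(log_path, "a", encoding="utf-8")

    def _recover(self):
        # Start from the last good snapshot, if there is one.
        if os.path.exists(self.snapshot_path):
            with open(self.snapshot_path, encoding="utf-8") as f:
                self.objects = json.load(f)
        # Replay whatever was logged since that snapshot.
        if os.path.exists(self.log_path):
            with open(self.log_path, encoding="utf-8") as f:
                for line in f:
                    try:
                        oid, state = json.loads(line)
                    except ValueError:
                        break  # incomplete final line left by a crash
                    self.objects[oid] = state

    def put(self, oid, state):
        # Cheap write path: one appended line, forced to disk, then RAM.
        self._log.write(json.dumps([oid, state]) + "\n")
        self._log.flush()
        os.fsync(self._log.fileno())
        self.objects[oid] = state

    def snapshot(self):
        # Write the new snapshot beside the old one, then swap atomically,
        # so a crash mid-snapshot leaves the previous good version intact.
        tmp = self.snapshot_path + ".tmp"
        with open(tmp, "w", encoding="utf-8") as f:
            json.dump(self.objects, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, self.snapshot_path)
        # The log is now redundant; start a fresh one.
        self._log.close()
        self._log = open(self.log_path, "w", encoding="utf-8")

    def close(self):
        self._log.close()
```

Because each log entry carries the whole state for one oid, replaying an entry twice is harmless, so a crash between swapping in the snapshot and emptying the log does no damage in this sketch. Reading back is slow compared with writing, which matches the trade-off suggested above: the log is only read at recovery time, and only for the objects changed since the last snapshot.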
[snip]
MinneStore might actually work well for you right now, if you wanted something a little more generic. Added advantage there is that it's somewhat portable between most Smalltalk dialects (aside from the standard DB features).
Disadvantage is that I've never really gotten it working for me :) But I do encourage experimenting with it on at least one of the dialects available for you.
Cheers, Bijan Parsia.
squeak-dev@lists.squeakfoundation.org