OOZE and LOOM, by Ted Kaehler et al., did this kind of thing. Here's a link to the 1981 article on OOZE:<div><a href="http://www-cs-students.stanford.edu/~eswierk/misc/kaehler81/">http://www-cs-students.stanford.edu/~eswierk/misc/kaehler81/</a></div>
<div><br></div><div>It mentions LOOM but doesn't go into detail. I think the more detailed LOOM paper(s) are in the ACM digital library.<br>
<div><br></div><div>- Stephen<br><br><div class="gmail_quote">On Tue, Dec 29, 2009 at 10:10 AM, Louis LaBrunda <span dir="ltr">&lt;<a href="mailto:Lou@keystone-software.com">Lou@keystone-software.com</a>&gt;</span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;"><br>
Hello Squeak VM Guys,<br>
<br>
My name is Louis LaBrunda. I use Instantiations VA Smalltalk but dabble<br>
with Squeak from time to time.<br>
<br>
I have an outside-the-box way of implementing an object database for<br>
Smalltalk, and I would like to see whether anyone here is interested in<br>
implementing it. I understand the theory behind Smalltalk VMs (at least<br>
I think I do) but would face a steep learning curve to actually modify<br>
one. This idea doesn't require inventing or improving any technology,<br>
but it does require changes to the VM.<br>
<br>
For the purpose of describing this idea, I will deal with only one database<br>
and not go into binding to the database and other details like transaction<br>
processing and such. These things are of course important but I think they<br>
can be handled in very much standard ways that should not be changed by<br>
this means of implementing the object database.<br>
<br>
The idea is that the VM would treat the database file much like a CPU chip<br>
would treat RAM and would treat its own (the VM's) memory like a CPU chip would<br>
treat its internal (on-chip) cache. The data in memory would be linked to<br>
the data in the database in much the same way that a CPU chip's cache is<br>
linked to RAM.<br>
<br>
As I said, I'm not very knowledgeable about the internal workings of Smalltalk<br>
VMs, so much of what I am about to say is guesswork, but I think it is<br>
accurate. Objects represented in the memory of a Smalltalk VM probably<br>
take up about 12 bytes or so on 32-bit systems, more on 64-bit systems.<br>
Many of these bytes hold bits that define the class. Some of the bytes<br>
might be the value of the object if it is say a small integer or a byte or<br>
character. If the data (value) of the object is larger than will fit in a<br>
few bytes, there is a pointer to the data. If the object has instance<br>
variables that are of course other objects, there are pointers to them.<br>
<br>
A bit would be needed to indicate a persisted object and probably another<br>
bit to indicate the object is dirty (changed and therefore doesn't match<br>
the database file copy). Objects with the persisted bit off would<br>
otherwise look and be treated the same as they are now. Objects with the<br>
persisted bit on would have all their pointers replaced with offsets from<br>
the beginning of the database file (a single file containing all the<br>
persisted objects). All objects pointed to by a persisted object must also<br>
be persisted objects.<br>
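<br>
A minimal Python sketch of the two flag bits described above, assuming a hypothetical integer header word. The names PERSISTED and DIRTY and their bit positions are illustrative only and are not taken from any actual Smalltalk VM:<br>
<br>

```python
PERSISTED = 0x1  # hypothetical flag bit: the object lives in the database file
DIRTY = 0x2      # hypothetical flag bit: the in-memory copy differs from the file copy


def mark_persisted(header):
    """Return the header with the persisted bit set."""
    return header | PERSISTED


def mark_dirty(header):
    """Return the header with the dirty bit set."""
    return header | DIRTY


def clear_dirty(header):
    """Return the header with the dirty bit cleared (e.g. after a save)."""
    return header & ~DIRTY


def is_persisted(header):
    return bool(header & PERSISTED)


def is_dirty(header):
    return bool(header & DIRTY)
```

An object whose persisted bit is off never takes the offset-lookup path, so in this sketch it costs only one extra bit test.<br>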
<br>
When the VM comes across a persisted object it would use the pointers (that<br>
are now offsets within the database file) as keys into a lookup table (hash<br>
table) to find the real pointer to the data in memory. If the item is<br>
found in the lookup table the value is used as it would have been if it was<br>
in the object and all is the same. If the item is not found in the lookup<br>
table the offset into the database file is used to read the object from the<br>
database. The lookup table would then be updated to include the new item.<br>
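<br>
The lookup described above can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the class name ObjectCache, the helper make_record, and the record layout (a 4-byte little-endian length followed by the payload) are all hypothetical, and a real VM would do this in its object memory rather than in a dictionary:<br>
<br>

```python
import io
import struct


class ObjectCache:
    """Maps database-file offsets to in-memory objects, loading on a miss."""

    def __init__(self, db_file):
        self.db_file = db_file  # seekable binary file-like object
        self.table = {}         # offset -> in-memory copy of the object
        self.loads = 0          # number of reads that went to the file

    def resolve(self, offset):
        """Return the in-memory object for a file offset, faulting it in if needed."""
        obj = self.table.get(offset)
        if obj is None:
            obj = self._read(offset)
            self.table[offset] = obj  # later lookups hit the table
        return obj

    def _read(self, offset):
        # Assumed layout: 4-byte little-endian length, then that many payload bytes.
        self.db_file.seek(offset)
        (length,) = struct.unpack("<I", self.db_file.read(4))
        self.loads += 1
        return bytearray(self.db_file.read(length))


def make_record(payload):
    """Encode one record in the assumed length-prefixed layout."""
    return struct.pack("<I", len(payload)) + payload


if __name__ == "__main__":
    db = io.BytesIO(make_record(b"first") + make_record(b"second"))
    cache = ObjectCache(db)
    print(cache.resolve(0), cache.loads)  # first access reads from the file
    print(cache.resolve(0), cache.loads)  # second access is served from the table
```

The table plays the role of the cache tags in the CPU analogy: a hit avoids touching the file at all.<br>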
<br>
As far as I can tell the copies of the object in memory and in the database<br>
file can be identical (no object dumper/loader serialization). There may<br>
need to be a little bit of a wrapper in the database file but I don't think<br>
much. This should make for very quick loading and saving of objects.<br>
<br>
Probably some objects, like blocks of code, can't or shouldn't be saved to<br>
the database (I'm not sure if this is true for Squeak). But I don't think<br>
that is any different from systems that use object dumper/loader<br>
serialization.<br>
<br>
I think a low priority fork could run through the lookup table for objects<br>
with the dirty bit set and save them to the database file. A #persist (or<br>
some other good name) method could be added to #Object to force the saving<br>
of an object to the database. This would probably be implemented with a<br>
primitive but maybe not.<br>
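<br>
A rough Python sketch of the background pass over the lookup table, assuming each in-memory object records its own file offset and that updates keep the object the same size. PersistedObject and flush_dirty are hypothetical names, and scheduling the pass at low priority is left out:<br>
<br>

```python
import io


class PersistedObject:
    """In-memory copy of an object stored at a fixed offset in the database file."""

    def __init__(self, offset, data):
        self.offset = offset
        self.data = bytearray(data)
        self.dirty = False

    def write(self, data):
        # Assumption: in-place updates only, so the new value must be the same size.
        if len(data) != len(self.data):
            raise ValueError("sketch only supports same-size updates")
        self.data[:] = data
        self.dirty = True  # memory no longer matches the file copy


def flush_dirty(table, db_file):
    """Write every dirty object back at its offset; return how many were written."""
    written = 0
    for obj in table.values():
        if obj.dirty:
            db_file.seek(obj.offset)
            db_file.write(obj.data)
            obj.dirty = False
            written += 1
    return written


if __name__ == "__main__":
    db = io.BytesIO(b"aaaabbbb")
    table = {0: PersistedObject(0, b"aaaa"), 4: PersistedObject(4, b"bbbb")}
    table[4].write(b"zzzz")
    print(flush_dirty(table, db), db.getvalue())
```

A forced save such as the proposed #persist would amount to running the body of this loop for a single object.<br>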
<br>
There may be some changes needed for garbage collection to keep the lookup<br>
table up to date but I don't think that will be a big deal. Hopefully<br>
garbage collection for the database file could be handled mostly by<br>
Smalltalk code with the help of a few primitives.<br>
<br>
Well, that's it for now. I hope this has been an interesting read and not<br>
a waste of your time. If you think the idea has merit, let me know and we<br>
can discuss it further.<br>
<br>
Thank you very much for your time.<br>
<br>
Lou<br>
-----------------------------------------------------------<br>
<font color="#888888">Louis LaBrunda<br>
Keystone Software Corp.<br>
SkypeMe callto://PhotonDemon<br>
mailto:<a href="mailto:Lou@Keystone-Software.com">Lou@Keystone-Software.com</a> <a href="http://www.Keystone-Software.com" target="_blank">http://www.Keystone-Software.com</a><br>
<br>
</font></blockquote></div><br></div></div>