OOZE and LOOM, by Ted Kaehler et al., did this kind of thing.  Here&#39;s a link to the 1981 article on OOZE:<div><a href="http://www-cs-students.stanford.edu/~eswierk/misc/kaehler81/">http://www-cs-students.stanford.edu/~eswierk/misc/kaehler81/</a></div>

<div><br></div><div>It mentions LOOM but doesn&#39;t go into detail. I think the more detailed LOOM paper(s) are in the ACM Digital Library.<br>

<div><br></div><div>- Stephen<br><br><div class="gmail_quote">On Tue, Dec 29, 2009 at 10:10 AM, Louis LaBrunda <span dir="ltr">&lt;<a href="mailto:Lou@keystone-software.com">Lou@keystone-software.com</a>&gt;</span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;"><br>

Hello Squeak VM Guys,<br>

<br>

My name is Louis LaBrunda.  I use Instantiations VA Smalltalk but dabble<br>

with Squeak from time to time.<br>

<br>

I have an outside-the-box way of implementing an object database for<br>

Smalltalk that I would like to see if there is anyone here who is<br>

interested in implementing.  I understand the theory behind Smalltalk VMs<br>

(at least I think I do) but would face a steep learning curve to<br>

actually modify one.  This idea doesn&#39;t require the inventing or improving<br>

of any technology but it does require changes to the VM.<br>

<br>

For the purpose of describing this idea, I will deal with only one database<br>

and not go into binding to the database and other details like transaction<br>

processing and such.  These things are of course important but I think they<br>

can be handled in very much standard ways that should not be changed by<br>

this means of implementing the object database.<br>

<br>

The idea is that the VM would treat the database file much like a CPU chip<br>

would treat RAM and would treat its (the VM) memory like a CPU chip would<br>

treat its internal (on-chip) cache.  There would be a similar means of<br>

linking the data in memory to the data in the database as there is between<br>

linking a CPU chip&#39;s cache and RAM.<br>

<br>

As I said, I&#39;m not very knowledgeable about the internal workings of Smalltalk<br>

VMs, so much of what I am about to say is guesswork but I think it is<br>

accurate.  Objects represented in the memory of a Smalltalk VM probably<br>

take up about 12 bytes or so for 32 bit systems, more for 64 bit systems.<br>

Many of these bytes hold bits that define the class.  Some of the bytes<br>

might be the value of the object if it is say a small integer or a byte or<br>

character.  If the data (value) of the object is larger than will fit in a<br>

few bytes, there is a pointer to the data.  If the object has instance<br>

variables that are of course other objects, there are pointers to them.<br>

<br>

A bit would be needed to indicate a persisted object and probably another<br>

bit to indicate the object is dirty (changed and therefore doesn&#39;t match<br>

the database file copy).  Objects with the persisted bit off would<br>

otherwise look and be treated the same as they are now.  Objects with the<br>

persisted bit on would have all their pointers replaced with offsets from<br>

the beginning of the database file (a single file containing all the<br>

persisted objects).  All objects pointed to by a persisted object must also<br>

be persisted objects.<br>
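
<br>

The rule above can be sketched as follows.  This is only an illustration in Python, not VM code: the names Obj, mark_persisted and store_slot are hypothetical, and the two header bits are modelled as plain booleans.<br>

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Obj:
    """Hypothetical stand-in for a VM object header plus its slots."""
    persisted: bool = False   # models the proposed "persisted" bit
    dirty: bool = False       # models the proposed "dirty" bit
    slots: list = field(default_factory=list)  # references to other Obj


def mark_persisted(root):
    """Mark root and everything reachable from it as persisted.

    Newly persisted objects start out dirty because no copy of them
    exists in the database file yet.
    """
    stack = [root]
    while stack:
        obj = stack.pop()
        if obj.persisted:
            continue  # already handled; this also terminates on cycles
        obj.persisted = True
        obj.dirty = True
        stack.extend(s for s in obj.slots if isinstance(s, Obj))


def store_slot(obj, index, value):
    """Write barrier: storing into a persisted object dirties it and
    pulls the stored object into the persisted set."""
    obj.slots[index] = value
    if obj.persisted:
        obj.dirty = True
        if isinstance(value, Obj):
            mark_persisted(value)
```

Doing the check at store time is one possible design; it keeps the invariant that a persisted object never points to a non-persisted one.<br>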

<br>

When the VM comes across a persisted object it would use the pointers (that<br>

are now offsets within the database file) as keys into a lookup table (hash<br>

table) to find the real pointer to the data in memory.  If the item is<br>

found in the lookup table the value is used as it would have been if it was<br>

in the object and all is the same.  If the item is not found in the lookup<br>

table the offset into the database file is used to read the object from the<br>

database.  The lookup table would then be updated to include the new item.<br>
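
<br>

A minimal sketch of that lookup, in Python rather than VM code.  The record layout (a 4-byte slot count followed by 8-byte offsets) and the names ObjectCache, resolve and append_record are assumptions made for the example, not part of the proposal.<br>

```python
import io
import struct


def append_record(db, slots):
    """Append one object record and return its file offset.

    Assumed layout: 4-byte slot count, then one 8-byte offset per slot.
    """
    db.seek(0, io.SEEK_END)
    offset = db.tell()
    db.write(struct.pack(f"!I{len(slots)}Q", len(slots), *slots))
    return offset


class ObjectCache:
    """Maps database file offsets to objects already loaded in memory."""

    def __init__(self, db):
        self.db = db
        self.table = {}   # file offset -> in-memory object
        self.reads = 0    # number of times the file had to be read

    def resolve(self, offset):
        """Return the in-memory object for offset, loading it on a miss."""
        obj = self.table.get(offset)
        if obj is None:
            obj = self._read(offset)
            self.table[offset] = obj  # later lookups hit the table
        return obj

    def _read(self, offset):
        self.db.seek(offset)
        (count,) = struct.unpack("!I", self.db.read(4))
        slots = struct.unpack(f"!{count}Q", self.db.read(8 * count))
        self.reads += 1
        return list(slots)  # slots still hold offsets, resolved lazily


db = io.BytesIO()
leaf = append_record(db, [])
root = append_record(db, [leaf])
cache = ObjectCache(db)
root_obj = cache.resolve(root)         # miss: read from the file
leaf_obj = cache.resolve(root_obj[0])  # follow an offset like a pointer
cache.resolve(root)                    # hit: no file access
```

Here the slots stay as offsets and are resolved each time they are followed, which is the behaviour described above; a real VM might instead swap in the real pointer after the first lookup.<br>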

<br>

As far as I can tell the copies of the object in memory and in the database<br>

file can be identical (no object dumper/loader serialization).  There may<br>

need to be a little bit of a wrapper in the database file but I don&#39;t think<br>

much.  This should make for very quick loading and saving of objects.<br>

<br>

Probably some objects, like blocks of code, can&#39;t or shouldn&#39;t be saved to<br>

the database (I&#39;m not sure if this is true for Squeak).  But I don&#39;t think<br>

that is any different than systems that use object dumper/loader<br>

serialization.<br>

<br>

I think a low priority fork could run through the lookup table for objects<br>

with the dirty bit set and save them to the database file.  A #persist (or<br>

some other good name) method could be added to #Object to force the saving<br>

of an object to the database.  This would probably be implemented with a<br>

primitive but maybe not.<br>
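
<br>

The write-back pass might look roughly like this.  It is a Python sketch under simplifying assumptions: each object has a fixed-size payload that is rewritten in place at its offset, and CachedObj, flush_dirty and persist are hypothetical names.  In the proposal the loop would run in a low priority process; here it is just a function that such a process would call repeatedly.<br>

```python
import io


class CachedObj:
    """Hypothetical lookup-table entry: raw bytes plus a dirty flag."""

    def __init__(self, payload, dirty=False):
        self.payload = payload  # bytes exactly as stored in the file
        self.dirty = dirty


def persist(obj, offset, db):
    """Force one object out to the database file (the #persist idea)."""
    db.seek(offset)
    db.write(obj.payload)
    obj.dirty = False


def flush_dirty(table, db):
    """Write every dirty object in the lookup table; return the count."""
    written = 0
    for offset, obj in table.items():
        if obj.dirty:
            persist(obj, offset, db)
            written += 1
    return written


db = io.BytesIO(bytes(8))  # stand-in for an 8-byte database file
table = {
    0: CachedObj(b"abcd", dirty=True),
    4: CachedObj(b"wxyz"),  # clean, so it is skipped
}
first_pass = flush_dirty(table, db)
second_pass = flush_dirty(table, db)
```

After the first pass only the dirty entry has been written, and the second pass finds nothing left to do.<br>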

<br>

There may be some changes needed for garbage collection to keep the lookup<br>

table up to date but I don&#39;t think that will be a big deal.  Hopefully<br>

garbage collection for the database file could be handled mostly by<br>

Smalltalk code with the help of a few primitives.<br>

<br>

Well, that&#39;s it for now.  I hope this has been an interesting read and not<br>

a waste of your time.  If you think the idea has merit, let me know and we<br>

can discuss it further.<br>

<br>

Thank you very much for your time.<br>

<br>

Lou<br>

-----------------------------------------------------------<br>

<font color="#888888">Louis LaBrunda<br>

Keystone Software Corp.<br>

SkypeMe callto://PhotonDemon<br>

mailto:<a href="mailto:Lou@Keystone-Software.com">Lou@Keystone-Software.com</a> <a href="http://www.Keystone-Software.com" target="_blank">http://www.Keystone-Software.com</a><br>

<br>

</font></blockquote></div><br></div></div>