[Vm-dev] A Smalltalk object database idea

Stephen Pair stephen at pairhome.net
Tue Dec 29 15:47:18 UTC 2009


OOZE and LOOM, by Ted Kaehler et al., did this kind of thing.  Here's a link
to the 1981 article on OOZE:
http://www-cs-students.stanford.edu/~eswierk/misc/kaehler81/

It mentions LOOM, but doesn't go into detail.  I think the more detailed
LOOM paper(s) are in the ACM digital library.

- Stephen

On Tue, Dec 29, 2009 at 10:10 AM, Louis LaBrunda
<Lou at keystone-software.com> wrote:

>
> Hello Squeak VM Guys,
>
> My name is Louis LaBrunda.  I use Instantiations VA Smalltalk but dabble
> with Squeak from time to time.
>
> I have an outside-the-box way of implementing an object database for
> Smalltalk that I would like to see if there is anyone here who is
> interested in implementing.  I understand the theory behind Smalltalk VMs
> (at least I think I do) but would face a steep learning curve to
> actually modify one.  This idea doesn't require the inventing or improving
> of any technology but it does require changes to the VM.
>
> For the purpose of describing this idea, I will deal with only one database
> and not go into binding to the database and other details like transaction
> processing and such.  These things are of course important but I think they
> can be handled in very much standard ways that should not be changed by
> this means of implementing the object database.
>
> The idea is that the VM would treat the database file much like a CPU chip
> would treat RAM and would treat its (the VM's) memory like a CPU chip would
> treat its internal (on-chip) cache.  There would be a similar means of
> linking the data in memory to the data in the database as there is between
> linking a CPU chip's cache and RAM.
>
> As I said, I'm not very knowledgeable about the internal workings of
> Smalltalk VMs, so much of what I am about to say is guesswork, but I think
> it is accurate.  Objects represented in the memory of a Smalltalk VM
> probably take up about 12 bytes or so on 32-bit systems, more on 64-bit
> systems.  Many of these bytes are bits that define the class.  Some of the bytes
> might be the value of the object if it is say a small integer or a byte or
> character.  If the data (value) of the object is larger than will fit in a
> few bytes, there is a pointer to the data.  If the object has instance
> variables that are of course other objects, there are pointers to them.
>
> A bit would be needed to indicate a persisted object and probably another
> bit to indicate the object is dirty (changed and therefore doesn't match
> the database file copy).  Objects with the persisted bit off would
> otherwise look and be treated the same as they are now.  Objects with the
> persisted bit on would have all their pointers replaced with offsets from
> the beginning of the database file (a single file containing all the
> persisted objects).  All objects pointed to by a persisted object must also
> be persisted objects.
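
A minimal sketch of the two header bits and the "everything reachable must
also be persisted" rule described above, written in Python for brevity rather
than as VM code.  The bit positions, the Obj class and persist_closure are
all hypothetical names chosen for illustration; no real Squeak or VA
Smalltalk header layout is implied.

```python
# Hypothetical flag bits; real VM headers lay their bits out differently.
PERSISTED = 1 << 0
DIRTY = 1 << 1


class Obj:
    """Stand-in for a heap object: a flags word plus pointer slots."""

    def __init__(self, *slots):
        self.flags = 0
        self.slots = list(slots)


def persist_closure(root):
    """Mark root and everything reachable from it as persisted and dirty.

    Objects that are already persisted are skipped, on the assumption that
    whatever they reference was marked when they were.
    """
    marked = []
    stack = [root]
    while stack:
        obj = stack.pop()
        if obj.flags & PERSISTED:
            continue
        # Newly persisted objects have no copy in the file yet, so they
        # start out dirty.
        obj.flags |= PERSISTED | DIRTY
        marked.append(obj)
        # Non-Obj slots stand in for immediates such as small integers.
        stack.extend(s for s in obj.slots if isinstance(s, Obj))
    return marked


# Usage: a cycle (a <-> b) reachable from c, plus an unrelated object d.
a = Obj()
b = Obj(a)
a.slots.append(b)
c = Obj(b, 42)
d = Obj()
newly_marked = persist_closure(c)
```

Marking an object before visiting its slots is what keeps the walk from
looping forever on cyclic structures.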
>
> When the VM comes across a persisted object it would use the pointers (that
> are now offsets within the database file) as keys into a lookup table (hash
> table) to find the real pointer to the data in memory.  If the item is
> found in the lookup table the value is used as it would have been if it was
> in the object and all is the same.  If the item is not found in the lookup
> table the offset into the database file is used to read the object from the
> database.  The lookup table would then be updated to include the new item.
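
The lookup step above can be sketched roughly as follows.  This is only an
illustration under stated assumptions: the record format (a 32-bit slot count
followed by 32-bit slot offsets), the ObjectTable class and its resolve
method are all made up for the example, and an in-memory io.BytesIO stands in
for the database file.

```python
import io
import struct


class ObjectTable:
    """Maps database-file offsets to in-memory objects, loading on a miss."""

    def __init__(self, db_file):
        self.db_file = db_file
        self.resident = {}  # file offset -> in-memory object
        self.loads = 0      # number of reads that went to the file

    def resolve(self, offset):
        obj = self.resident.get(offset)
        if obj is None:
            # Miss: read the object from the file and remember it.
            obj = self._read(offset)
            self.resident[offset] = obj
        return obj

    def _read(self, offset):
        # Assumed record format: uint32 slot count, then that many uint32
        # slots, each holding the file offset of another object.
        self.db_file.seek(offset)
        (count,) = struct.unpack("<I", self.db_file.read(4))
        slots = struct.unpack(f"<{count}I", self.db_file.read(4 * count))
        self.loads += 1
        return list(slots)  # slots stay as offsets and are resolved lazily


# Usage: the record at offset 0 has one slot pointing at the record at
# offset 8, which has no slots.
db = io.BytesIO(struct.pack("<II", 1, 8) + struct.pack("<I", 0))
table = ObjectTable(db)
root = table.resolve(0)
child = table.resolve(root[0])
again = table.resolve(0)
```

The second resolve(0) is answered from the table without touching the file,
which is the cache-like behaviour the proposal is after.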
>
> As far as I can tell the copies of the object in memory and in the database
> file can be identical (no object dumper/loader serialization).  There may
> need to be a little bit of a wrapper in the database file but I don't think
> much.  This should make for a very quick loading and saving of objects.
>
> Probably some objects, like blocks of code, can't or shouldn't be saved to
> the database (I'm not sure if this is true for Squeak).  But I don't think
> that is any different than systems that use object dumper/loader
> serialization.
>
> I think a low-priority fork could run through the lookup table for objects
> with the dirty bit set and save them to the database file.  A #persist (or
> some other good name) method could be added to #Object to force the saving
> of an object to the database.  This would probably be implemented with a
> primitive but maybe not.
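
One pass of such a background writer might look like the sketch below.  It
assumes, purely for illustration, the same made-up record format as a 32-bit
slot count followed by 32-bit slot offsets, and that objects never change
size, so a dirty object can be rewritten in place at its offset.  Resident,
flush_dirty and the DIRTY bit position are hypothetical names, and the
scheduling of the pass (a low-priority process calling it periodically) is
left out.

```python
import io
import struct

DIRTY = 1 << 1  # hypothetical flag bit


class Resident:
    """Stand-in for an in-memory persisted object."""

    def __init__(self, slots, flags=0):
        self.slots = slots  # file offsets of the referenced objects
        self.flags = flags


def flush_dirty(resident, db_file):
    """Write every dirty object back at its offset; return how many."""
    written = 0
    for offset, obj in resident.items():
        if obj.flags & DIRTY:
            count = len(obj.slots)
            db_file.seek(offset)
            db_file.write(struct.pack(f"<I{count}I", count, *obj.slots))
            obj.flags &= ~DIRTY  # the file copy matches memory again
            written += 1
    return written


# Usage: one dirty object at offset 0 and one clean object at offset 8.
db = io.BytesIO(bytes(16))
resident = {0: Resident([8], DIRTY), 8: Resident([], 0)}
first_pass = flush_dirty(resident, db)
second_pass = flush_dirty(resident, db)
```

A #persist-style call on a single object could reuse the same write path for
just that one table entry.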
>
> There may be some changes needed for garbage collection to keep the lookup
> table up to date but I don't think that will be a big deal.  Hopefully
> garbage collection for the database file could be handled mostly by
> Smalltalk code with the help of a few primitives.
>
> Well, that's it for now.  I hope this has been an interesting read and not
> a waste of your time.  If you think the idea has merit, let me know and we
> can discuss it further.
>
> Thank you very much for your time.
>
> Lou
> -----------------------------------------------------------
> Louis LaBrunda
> Keystone Software Corp.
> SkypeMe callto://PhotonDemon
> mailto:Lou at Keystone-Software.com http://www.Keystone-Software.com
>
>