[Vm-dev] A Smalltalk object database idea

David T. Lewis lewis at mail.msen.com
Tue Dec 29 15:32:10 UTC 2009

Hello and welcome.

I think you'll find the Squeak VM to be quite adaptable to experiments
like this. All the code for the object memory is written in Smalltalk
(actually a limited subset of Smalltalk), so it is quite accessible
and relatively easy to modify.

If you have not already done so, try loading the VMMaker package
from SqueakSource, and read the class comment of ObjectMemory for
a description of the object memory organization and header formats
(I'm not sure how familiar you are with Squeak at this point, so
ask some questions if this is not clear). Also, read the "Back
to the Future" paper for general background:


On Tue, Dec 29, 2009 at 10:10:11AM -0500, Louis LaBrunda wrote:
> Hello Squeak VM Guys,
> My name is Louis LaBrunda.  I use Instantiations VA Smalltalk but dabble
> with Squeak from time to time.
> I have an outside-the-box way of implementing an object database for
> Smalltalk that I would like to see if there is anyone here who is
> interested in implementing.  I understand the theory behind Smalltalk VMs
> (at least I think I do) but would face a steep learning curve to
> actually modify one.  This idea doesn't require the inventing or improving
> of any technology but it does require changes to the VM.
> For the purpose of describing this idea, I will deal with only one database
> and not go into binding to the database and other details like transaction
> processing and such.  These things are of course important but I think they
> can be handled in very much standard ways that should not be changed by
> this means of implementing the object database.
> The idea is that the VM would treat the database file much like a CPU chip
> would treat RAM and would treat its (the VM) memory like a CPU chip would
> treat its internal (on-chip) cache.  There would be a similar means of
> linking the data in memory to the data in the database as there is between
> linking a CPU chip's cache and RAM.
> As I said, I'm not very knowledgeable about the internal workings of Smalltalk
> VMs, so much of what I am about to say is guesswork, but I think it is
> accurate.  Objects represented in the memory of a Smalltalk VM probably
> take up about 12 bytes or so for 32 bit systems, more for 64 bit systems.
> Many of these bytes hold bits that define the class.  Some of the bytes
> might be the value of the object if it is say a small integer or a byte or
> character.  If the data (value) of the object is larger than will fit in a
> few bytes, there is a pointer to the data.  If the object has instance
> variables that are of course other objects, there are pointers to them.
> A bit would be needed to indicate a persisted object and probably another
> bit to indicate the object is dirty (changed and therefore doesn't match
> the database file copy).  Objects with the persisted bit off would
> otherwise look and be treated the same as they are now.  Objects with the
> persisted bit on would have all their pointers replaced with offsets from
> the beginning of the database file (a single file containing all the
> persisted objects).  All objects pointed to by a persisted object must also
> be persisted objects.
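
The two flag bits and the "everything reachable must also be persisted"
rule can be sketched roughly as below. This is Python rather than VM
source, purely for brevity; the names Obj, PERSISTED, DIRTY and
make_persistent are hypothetical, and the bit positions have nothing to
do with the real Squeak header layout.

```python
# Hypothetical flag bits; a real VM would have to find room in the object header.
PERSISTED = 0x1
DIRTY = 0x2


class Obj:
    """Stand-in for a heap object: a flags word plus pointer slots."""

    def __init__(self, *slots):
        self.flags = 0
        self.slots = list(slots)


def make_persistent(root):
    """Set PERSISTED on root and on everything reachable from it.

    Returns the number of objects newly marked. Objects that are already
    persisted are skipped, which also makes reference cycles terminate.
    """
    pending = [root]
    count = 0
    while pending:
        obj = pending.pop()
        if obj.flags & PERSISTED:
            continue
        # Newly persisted objects have never been written, so mark them dirty too.
        obj.flags |= PERSISTED | DIRTY
        count += 1
        pending.extend(s for s in obj.slots if isinstance(s, Obj))
    return count
```

Marking is done with an explicit work list rather than recursion so that
deep object graphs do not exhaust the stack.
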
> When the VM comes across a persisted object it would use the pointers (that
> are now offsets within the database file) as keys into a lookup table (hash
> table) to find the real pointer to the data in memory.  If the item is
> found in the lookup table the value is used as it would have been if it were
> in the object and all is the same.  If the item is not found in the lookup
> table the offset into the database file is used to read the object from the
> database.  The lookup table would then be updated to include the new item.
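
The lookup-or-load step described above could look something like the
following. It is only a sketch under assumptions that are not in the
proposal: the record layout (a 4-byte little-endian length followed by
the body), the class name OffsetTable and the helper build_demo_file are
all invented here for illustration, and an in-memory io.BytesIO stands in
for the database file.

```python
import io
import struct


class OffsetTable:
    """Maps database-file offsets to objects already resident in memory."""

    def __init__(self, db_file):
        self.db_file = db_file   # any seekable binary file object
        self.resident = {}       # offset -> bytes loaded so far
        self.faults = 0          # number of reads that had to go to the file

    def resolve(self, offset):
        """Return the object stored at offset, reading it on first use."""
        obj = self.resident.get(offset)
        if obj is None:
            obj = self._read_record(offset)
            self.resident[offset] = obj   # later lookups hit the table
            self.faults += 1
        return obj

    def _read_record(self, offset):
        # Assumed layout: 4-byte little-endian length, then the body.
        self.db_file.seek(offset)
        (length,) = struct.unpack("<I", self.db_file.read(4))
        return self.db_file.read(length)


def build_demo_file(bodies):
    """Write records back to back; return the file and each record's offset."""
    f = io.BytesIO()
    offsets = []
    for body in bodies:
        offsets.append(f.tell())
        f.write(struct.pack("<I", len(body)))
        f.write(body)
    return f, offsets
```

For example, after `f, offs = build_demo_file([b"alpha", b"be"])` and
`t = OffsetTable(f)`, calling `t.resolve(offs[1])` twice reads the file
only once; the second call is answered from the table.
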
> As far as I can tell the copies of the object in memory and in the database
> file can be identical (no object dumper/loader serialization).  There may
> need to be a little bit of a wrapper in the database file but I don't think
> much.  This should make for a very quick loading and saving of objects.
> Probably some objects, like blocks of code can't or shouldn't be saved to
> the database (I'm not sure if this is true for Squeak).  But I don't think
> that is any different than systems that use object dumper/loader
> serialization.
> I think a low priority fork could run through the lookup table for objects
> with the dirty bit set and save them to the database file.  A #persist (or
> some other good name) method could be added to Object to force the saving
> of an object to the database.  This would probably be implemented with a
> primitive but maybe not.
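
A rough sketch of the write-back pass and the forced save, again in
Python for brevity. Assumptions not in the proposal: the table is a plain
dict keyed by file offset, objects are written back in place (so their
size must not change), and the names Resident, flush_dirty and persist
are hypothetical. The background process itself is left out; flush_dirty
is just the function such a process might call periodically.

```python
import io


class Resident:
    """An in-memory copy of a persisted object, with its dirty flag."""

    def __init__(self, data):
        self.data = bytearray(data)
        self.dirty = False

    def store(self, index, value):
        # Any store into the object makes it differ from the file copy.
        self.data[index] = value
        self.dirty = True


def write_back(db_file, offset, entry):
    """Overwrite the file copy at offset and clear the dirty flag."""
    db_file.seek(offset)
    db_file.write(entry.data)
    entry.dirty = False


def flush_dirty(table, db_file):
    """Write every dirty entry back; return how many were written."""
    written = 0
    for offset, entry in table.items():
        if entry.dirty:
            write_back(db_file, offset, entry)
            written += 1
    return written


def persist(table, db_file, offset):
    """Force one object out immediately, in the spirit of the proposed #persist."""
    write_back(db_file, offset, table[offset])
```

A real implementation would also need to decide what happens when an
object is modified while the flush is in progress, which this sketch
ignores.
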
> There may be some changes needed for garbage collection to keep the lookup
> table up to date but I don't think that will be a big deal.  Hopefully
> garbage collection for the database file could be handled mostly by
> Smalltalk code with the help of a few primitives.
> Well, that's it for now.  I hope this has been an interesting read and not
> a waste of your time.  If you think the idea has merit, let me know and we
> can discuss it further.
> Thank you very much for your time.
> Lou
> -----------------------------------------------------------
> Louis LaBrunda
> Keystone Software Corp.
> SkypeMe callto://PhotonDemon
> mailto:Lou at Keystone-Software.com http://www.Keystone-Software.com
