Transactions as Modules (was: Behaviors vs Modules)

Anthony Hannan ajh18 at cornell.edu
Sat Feb 23 22:36:34 UTC 2002


One thing I don't like about having protocol behavior loaded together as
a single unit is that it is likely that a program crosses many protocols
instead of just staying in one or a few of them, requiring many loads
instead of just a few.  One solution is to not have protocols, and only
load and trace default methods of selectors we are using.  But this finer
granularity would increase the number of loads even further.

Related to module functionality is database functionality, which has
largely been missing from the module discussion.  But I think the two
should be treated together since loading and saving modules is just a
special case of loading and saving objects in general.  Methods that
call each other should be grouped together so they can be loaded
together.  When we program we usually write methods that call each other
together, and when we're at a stopping point we save.  We usually put
all these methods into a single changeset.  A changeset is equivalent to
a transaction.  But instead of forgetting the transaction after the
changes are applied to the database/image, we want to keep it around as
a separate loadable entity, as a module.
	A module can be defined as a collection of objects.  Some of these
objects may shadow other objects (i.e. the shadowed object will become:
the new object when loaded).  Shadowing is how one module makes changes
to another module but without replacing it.  Later, the overriding
module (or part of it) can be merged in with the original module once
its changes are considered stable.  Every object would belong to one and
only one module.  Cross-module pointers would become disk proxy pointers
when the referenced module is not loaded.  Traversing a proxy pointer
would cause the referenced module to be loaded.  ImageSegments are a lot
like the modules described here and I would like to reuse their quick
binary format and root & outPointers structure.
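
To make the module definition above concrete, here is a minimal sketch in
Python rather than Squeak.  Everything in it is an assumption made for
illustration: the names Repository, Module and Proxy are hypothetical, the
repository is an in-memory dictionary standing in for disk, and shadowing
is modelled as a plain replacement of a named object rather than a real
become:.

```python
class Module:
    """A module is a collection of named objects (hypothetical model)."""

    def __init__(self, module_id, objects):
        self.id = module_id
        self.objects = objects


class Repository:
    """In-memory stand-in for the on-disk module store (an assumption)."""

    def __init__(self):
        # module id -> {"objects": {...}, "shadows": {(target id, name): new object}}
        self.stored = {}
        self.loaded = {}      # module id -> Module currently in the image
        self.load_count = 0   # how many modules have been loaded so far

    def store(self, module_id, objects, shadows=None):
        self.stored[module_id] = {"objects": objects, "shadows": shadows or {}}

    def load(self, module_id):
        if module_id in self.loaded:
            return self.loaded[module_id]
        record = self.stored[module_id]
        module = Module(module_id, dict(record["objects"]))
        self.loaded[module_id] = module
        self.load_count += 1
        # Shadowing: the shadowed object is replaced by the new object on load.
        for (target_id, name), replacement in record["shadows"].items():
            self.load(target_id).objects[name] = replacement
        return module


class Proxy:
    """Cross-module pointer; traversing it loads the referenced module."""

    def __init__(self, repo, module_id, name):
        self.repo = repo
        self.module_id = module_id
        self.name = name

    def resolve(self):
        return self.repo.load(self.module_id).objects[self.name]


repo = Repository()
repo.store("core", {"greeting": "hello"})
repo.store("patch", {}, shadows={("core", "greeting"): "hi"})

pointer = Proxy(repo, "core", "greeting")
before = pointer.resolve()   # traversing the proxy loads "core"
repo.load("patch")           # loading "patch" shadows core's greeting
after = pointer.resolve()
```

Note that the stored copy of "core" is left untouched here; only the
loaded copy sees the shadowing object, which is the sense in which the
overriding module changes the original without replacing it.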
	When working in your image there would always be a current module
active where all new and changed objects would be placed.  Upon commit,
the module will be written out to a repository so others can see it.  At
this point the module can stay as the current active module to be
changed further, or a new one can be started, depending on the commit
command issued (save vs commit, or something).
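
A rough Python sketch of the current-module idea follows.  The Workspace
class and its method names are hypothetical, and the split between the
two commands is only one possible reading: here I assume that save
publishes the module and keeps it active, while commit publishes it and
starts a fresh one.

```python
import itertools


class Workspace:
    """Hypothetical image with one current active module at a time."""

    def __init__(self):
        self.repository = {}            # module id -> published snapshot
        self._ids = itertools.count(1)  # source of new module ids
        self.current = self._new_module()

    def _new_module(self):
        return {"id": next(self._ids), "objects": {}}

    def record(self, name, value):
        # All new and changed objects are placed in the current module.
        self.current["objects"][name] = value

    def _publish(self):
        # Write a snapshot to the repository so others can see it.
        self.repository[self.current["id"]] = dict(self.current["objects"])

    def save(self):
        # Assumed meaning: publish, then keep changing the same module.
        self._publish()
        return self.current["id"]

    def commit(self):
        # Assumed meaning: publish, then start a new current module.
        published_id = self.current["id"]
        self._publish()
        self.current = self._new_module()
        return published_id


ws = Workspace()
ws.record("a", 1)
saved_id = ws.save()
ws.record("b", 2)
committed_id = ws.commit()
ws.record("c", 3)   # lands in the new, not yet published module
```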
	Of course there is the issue of the repository, distributed or central,
how to manage concurrency, consistency, privileges, etc.  We could use
a traditional database system with just one table called Modules where
each record is a module with fields id and imageSegmentBitStream; or we
can build our own database system.  A consistency check needed in either
system would be to make sure cross-module pointers are still good.  If I
save a module, the database has to make sure the modules it points to
haven't changed since last time I loaded them, otherwise I could have
dangling references.
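
The consistency check could look something like the Python sketch below.
It assumes, purely for illustration, that each module carries a version
number bumped on every save and that the client remembers which version
of each referenced module it loaded; ModuleStore, StaleReferenceError and
seen_versions are hypothetical names, not an existing API.

```python
class StaleReferenceError(Exception):
    """Raised when a referenced module changed since the client loaded it."""


class ModuleStore:
    """Hypothetical repository with a per-module version counter."""

    def __init__(self):
        self.versions = {}   # module id -> current version number
        self.payloads = {}   # module id -> serialized module (e.g. a bit stream)

    def commit(self, module_id, payload, seen_versions):
        """Save a module if every module it points to is unchanged.

        seen_versions maps each referenced module id to the version the
        client observed when it loaded that module.
        """
        stale = sorted(
            ref for ref, seen in seen_versions.items()
            if self.versions.get(ref) != seen
        )
        if stale:
            # Saving now could leave dangling cross-module pointers.
            raise StaleReferenceError(stale)
        self.versions[module_id] = self.versions.get(module_id, 0) + 1
        self.payloads[module_id] = payload
        return self.versions[module_id]


store = ModuleStore()
core_v1 = store.commit("core", b"core-1", {})
app_v1 = store.commit("app", b"app-1", {"core": core_v1})
core_v2 = store.commit("core", b"core-2", {})   # someone else changes core

try:
    store.commit("app", b"app-2", {"core": core_v1})  # still based on core v1
    rejected = False
except StaleReferenceError as error:
    rejected = True
    stale_modules = error.args[0]
```

This is just optimistic concurrency control applied to cross-module
pointers; a traditional database could enforce the same rule inside a
transaction.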
	One important module would be the one containing the Smalltalk
dictionary (or root category).  It would contain (proxy) pointers to all
other top-level objects available in Squeak.  So if someone wants to make
new objects/functionality available to everyone, s/he would just add a
pointer in the Smalltalk root and update its module.
Namespaces/categories/environments would be orthogonal to
modules/transactions.

I like this better than behaviors as modules.  Of course I still like
default selector behavior for mixin capability; we just don't need the
protocol grouping anymore.  What do you guys think?

Cheers,
Anthony


