[ENH][Modules] Delta Modules [was: Another version]

Thu Oct 25 11:23:33 UTC 2001

Allen Wirfs-Brock <Allen_Wirfs-Brock at Instantiations.com> wrote:
> At 12:34 PM 10/24/2001 +0100, goran.hultgren at bluefish.se wrote:
> >...
> >Allen Wirfs-Brock <Allen_Wirfs-Brock at Instantiations.com> wrote:
> > > At 09:45 AM 10/23/2001 +0100, goran.hultgren at bluefish.se wrote:
> > > >...
> > > ...
> > > translate externally?.  What is the external form of a module reference.
> > > Externally, is definedNames a list of class and global names or is it the
> > > actual source/object code for the defined entities. How about exported
> > > names. Is it a list of strings?
> >
> >Ehrm. I have no good picture of this unfortunately! I hope Henrik chimes
> >in.
> >Have you got his code by the way? I can always give you an url with the
> >image we used at OOPSLA.
> 
> Actually, I've been kind of avoiding using the code as a reference. 
> Facilities like this should be explained and understandable without 
> reference to source code.  Otherwise it is impossible to separate the 
> specification from the implementation details.

Fair enough! There are also pages on the swiki:

http://minnow.cc.gatech.edu/squeak/2042

> > > >...
> > > >- Submodule = Another module that is one of my children. Consider it a
> > > >"part of me". Especially note that a submodule of "me" has me as its
> > > >parent. This is in contrast to the external modules whose parent is
> > > >someone else completely.
> > >
> > > Note that the bi-directional linkage of submodules and parents mean that a
> > > particular module may be a submodule of exactly one parent. This would
> > > seem to significantly limit the reusability of modules. This is
> > > particularly a problem if you think about your modules in a context
> > > external to an image. In concept a "module" could contain code that is
> > > applicable in many different situations by many different parents but the
> > > tight (and early) binding of the submodule to the parent means that to
> > > reuse the code it will have to be duplicated within additional modules.
> >
> >Well, I think I say yes and no, or something... :-)
> >
> >Yes, each module has only one parent. The full name of the module is
> >derived from its place in the module hierarchy. And yes, that is a tree
> >which means a submodule only really lives in one single place.
> >(Sidenote: I like this. This is simple. More or less like in Java btw. A
> >module has a home and a name. Period.)
> 
> Wait...Java module?  Do you mean Java packages? I wouldn't characterize 
> Java packages as "modules", they are a name space mechanism.  They simply 
[SNIPPED a very good explanation on my illfated comparison with Java]

No I agree - baaad comparison. The only thing that I really meant was
that having a simple hierarchy for naming is good. And that the name
(path) also mirrors the module's place in the repository etc., as
typically Java also does by mirroring the name into a directory tree.

> >BUT... you can also "link" modules as an external module - didn't I
> >write that? Yes I did. But perhaps it wasn't clear. Or perhaps I am dead
> >wrong. Henrik, correct me here but wasn't the idea that if a module
> >needs to reuse an already existing module placed somewhere else
> >completely in the module tree it just references it as an external
> >module? I think so.
> 
> Perhaps, I guess this is where some of the complexity leads to confusion. 
> Is there really the need for both external modules and submodules and 
> module parameters.  Are their usages truly orthogonal? Will users know 
> when to choose one over the other? Couldn't this be simplified?

Well I would say that submodules and external modules are quite
different - I mean, isn't it good for a module to have one single
"home", thus giving us essentially a tree (when it comes to naming etc)
of modules? In theory these two concepts might be unifiable but in
practice a graph would be really messy and unintuitive to navigate
etc - IMHO.

The module parameters are obviously a "cool thing". I could really live
without them for starters - they feel a bit like "hey, this is something
cool that we might need...", on the other hand there might be some
really good uses that I haven't yet heard.

> Regardless, you have clarified one thing for me.  Apparently, there is a 
> module definition tree (with parent back pointers) and a module usage 
> (dependence) graph.

Yes. Touchdown! :-)

> >...
> >
> > >
> > > An overly constrained module system is essentially the same as no module
> > > system. If every module must explicitly identify the specific versions of
> > > other modules it interacts with then you have essentially gone back to
> > > having a monolithic image where any change causes a cascade of dependent
> > > changes throughout the system.
> >
> >Ok, I see your argument. But... Hmm. The Debian package system is very
> >early/tight bound and it sure doesn't mean it is useless... Anyway, I
> >think I will pass on this one and let Henrik dive in! :-)
> >
> >I need that watch that Tim Olsen (name?) has in Superman...
> 
> I think his name is Jimmy

Right. Hmmm, now who the heck is Tim Olsen then? :-) Anyway, this feels
like one of the most crucial questions and I wonder if Henrik might have
thought like this:

Ok, when the delta module is created by the developer it is created
against a specific module revision (the baseModule).
That is good information to keep. But it might not necessarily mean that
you are forbidden to apply it (like a changeset) to a newer revision of
that module. Henrik was talking about some pluggable strategy stuff that
was intended for this. The idea was that the creator of the deltamodule
could set a strategy for conflict-detection so that one simple strategy
would be "revision no of module x must equal revision no of module y" or
"revision no of module x must be higher than revision no of module y".

Anyway - viewing the deltas as diffs (or changesets - but there are
subtle differences there I think) they are easy to apply on a different
module. The comment of DeltaModule says that they actually should be
viewed as a new revision of the baseModule they modify and that the
diff-format is essentially just an optimization, but I think that
is a bit wrong. If a DeltaModule was represented as the complete
baseModule after the modification then we wouldn't know what the
modifications were without comparing with the base module revision.
Slight difference there.

Anyway - so I think that it cannot be wrong to have a Delta module know
what module revision it actually was modifying when it was created -
that seems like good information to have. The tricky part is of course
to decide if it can be activated in a different module revision. And I
think this might be one of the "loose ends" that we need to at least
have a working set of default rules for. Do you have any proposal? :-)
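
To make the "pluggable strategy" idea a bit more concrete, here is a
minimal sketch of the two simple strategies mentioned above. It is in
Python rather than Smalltalk purely for illustration, and every name in
it (DeltaModule fields, must_equal, must_be_higher, can_activate_on) is
hypothetical and not taken from Henrik's code. Revisions are modelled
as tuples such as (1, 23) for "1.23", which is also an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

Revision = Tuple[int, ...]                       # (1, 23) stands for "1.23"
Strategy = Callable[[Revision, Revision], bool]  # (target, base) -> may activate?


def must_equal(target: Revision, base: Revision) -> bool:
    """Strict rule: only the exact revision the delta was created against."""
    return target == base


def must_be_higher(target: Revision, base: Revision) -> bool:
    """Looser rule: only revisions newer than the one the delta was created against."""
    return target > base


@dataclass
class DeltaModule:
    name: str
    base_module: str              # name of the module this delta modifies
    base_revision: Revision       # revision it was created against
    strategy: Strategy = must_equal

    def can_activate_on(self, module: str, revision: Revision) -> bool:
        # The recorded base revision is only information; the strategy decides.
        return module == self.base_module and self.strategy(revision, self.base_revision)


strict = DeltaModule("Z", "Morphic", (1, 23))
loose = DeltaModule("Z", "Morphic", (1, 23), strategy=must_be_higher)
```

The point of the sketch is that recording the base revision and
deciding whether activation is allowed are two separate things: the
same recorded revision gives different answers under different
strategies.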

> >...
> > > Most systems would make a terminology distinction between a logical module
> > > and a particular "version" of the module. Common terminology would be
> > > "module version", "module revision", "module edition", etc. It would
> > > certainly help communications, to choose a term and then to be fastidious
> > > in making the distinction.
> >
> >Ok, I like "module revision" or "module version". In the Smalltalk world
> >the word "edition" and "version" might be tainted by earlier products in
> >such a way that we should perhaps stay clear of them. And the word
> >"revision" is used in CVS but the meaning there is probably intuitive
> >enough to not be a problem.
> >
> >I vote for "module revision" meaning "one specific version".
> >And when we say only "module" we should always mean the logical module.
> >
> >Agreed? :-) (pretty, pretty please...)
> 
> fine with me...
> 
> >Perhaps we should then rename those classes to ModuleRevision etc...
> >
> > > >- A Deltamodule inherits from Module and this could perhaps be
> > > >refactored with a common baseclass as is common in the Composite pattern
> > > >but I haven't looked into that issue. One fact though: Looking at the
> > > >code a Deltamodule CAN NOT have neighborModules. Those methods are
> > > >overridden. So, a Delta module can only be a leaf in the Module tree.
> > > >Henrik, what about applying the Composite pattern here? Or are there
> > > >other considerations?
> > >
> > > Why the restriction on neighborModules?  This would seem to imply that a
> > > class extension may not reference any classes (or globals) that are not
> > > already known to the parent. This would seem to extremely limit the use of
> > > deltamodules.
> >
> >Ah, I checked the code! :-) Well, a Deltamodule can not HAVE neighbors,
> >instead it defines CHANGES to its basemodule's neighbors. So I think
> >your guessed implication is not valid. Phew. :-)
> 
> Wait, not so fast.  Does the set of possible "changes" include the addition 
> of a new neighbor?  If not, you still have the problem. Regardless, I think 

I think it does.

> you may be adding unnecessary conceptual complexity.
> 
> Here's a sketch of what I think of as a prototypical "module" when I 
> consider things like this:
> 
> module "FooImplementation"
>          extends class Object with method isFoo
>          implements class Foo with assorted methods
>          references class Bar (defined in another module)
> 
> To me, this is a nice self-contained unit.  Why would I want to be thinking 
> about changing the definition of the module that defines Object.  It's not 

Eh, I wouldn't be thinking of doing that either... Ok, what would happen
is that I create a module called say "People/gh/FooImplementation". My
new class Foo is put in that.
The module system will detect that I access Bar in another module
revision so it will add that module revision as an external module to
FooImplementation.

When I add the isFoo method to Object it will detect that I am trying to
change a class not residing in my "current module" (which is
FooImplementation). It will then create a Delta module for me and place
that as a submodule in FooImplementation.

Since it knows what module revision the class Object resides in, the
delta module will have a reference to that as the "baseModule" (that is
the name of that instvar).

So we are not changing the module containing the class Object. Right?
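
The flow just described can be sketched roughly as follows. This is
Python used only for illustration; the class layout and the helper
names reference_class and add_method are assumptions of the sketch, not
the actual Squeak implementation. Only the "baseModule" idea and the
module name "People/gh/FooImplementation" come from the text above.

```python
class Module:
    def __init__(self, name, classes=()):
        self.name = name
        self.classes = {c: set() for c in classes}  # class name -> selectors
        self.submodules = []   # definition tree: things this module contains
        self.external = []     # usage graph: modules referenced but not owned


class DeltaModule:
    def __init__(self, base_module):
        self.base_module = base_module  # the module being changed
        self.added_methods = []         # (class name, selector) pairs


def reference_class(current, owner):
    """Using a class from another module records that module as external."""
    if owner is not current and owner not in current.external:
        current.external.append(owner)


def add_method(current, owner, class_name, selector):
    """Add a method while `current` is the current module."""
    if owner is current:
        current.classes[class_name].add(selector)
        return None
    # Foreign class: collect the change in a delta module that lives
    # inside the current module, never inside the owner.
    for sub in current.submodules:
        if isinstance(sub, DeltaModule) and sub.base_module is owner:
            delta = sub
            break
    else:
        delta = DeltaModule(owner)
        current.submodules.append(delta)
    delta.added_methods.append((class_name, selector))
    return delta


kernel = Module("Kernel", ["Object"])      # hypothetical home of Object
bar_home = Module("BarStuff", ["Bar"])     # hypothetical home of Bar
foo = Module("People/gh/FooImplementation", ["Foo"])

reference_class(foo, bar_home)
delta = add_method(foo, kernel, "Object", "isFoo")
```

After this, the module holding Object is untouched: it has no new
submodule and no new method, and the delta sits under
FooImplementation with the Object module as its base.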

> necessary in order to implement the above.  It also would have a broader 
> effect than I intend.  If Object's module now essentially imports Bar (or 

No, no. See above.

> Bar's module) as a neighbor then presumably this has the side-effect that 
> other submodules of Object's module could also reference Bar.  (As a 
> side point, I also don't really want to think about which module implements 
> Bar)

Perhaps you thought that the delta module gets created in the module
containing the class Object?

> A possible reason for this difference in perspective just occurred to 
> me.  Most of the Squeak modularity discussion has been in the context of 
> "modularizing" the existing Squeak image.  In other words, taking what's 
> there and chopping it up into a set of modules.  If you are looking at the 
> problem from that perspective it's easy to think of the one definitive 
> structure of modules that define the current system and to tightly bind 
> them so you are guaranteed to have the same composition that you currently 
> have in the image.  However, that really isn't the important perspective 
> going forward. You need to be able to define new modules that will be 
> composed in many different ways in different situations to assemble a wide 
> variety applications. It's really all about reusability. You need to be 
> able to reuse your modules in unanticipated ways.  This means you have to 
> be very careful about premature or unnecessary bindings.

I think we agree on that. And the "tightness" that you perceived from my
explanation might be just a mirage from the fact that the delta modules
record the exact module revision they were created against. But that
does not necessarily mean that we cannot apply them to other module
revisions.

> Like I said in the previous message.  If all the modules that make up the 
> current image are tightly bound you are no better off than you are today.

Agree.

> > > >...
> > > > > Eh, well since a Deltamodule defines a difference between two modules
> > > > > it should be able to contain a whole new class. But you would probably
> > > > > rather seldomly have one Module add classes to another module so it
> > > > > will probably be a rare case.
> > > > >
> > >
> > > Again, this seems like the common case, not the rare case.  The module
> > > that defines the Foo class also wants to add the isFoo method to Object
> > > and possibly other classes.
> >
> >Really? When you write "and possibly other classes." did you mean "and
> >add possibly other classes." or "add the isFoo method to possibly other
> >classes."?
> 
> I meant extend other classes but certainly you might need to also define 
> additional classes.  I specified my canonical test case above. Everything 
> else is pretty much an elaboration of one of the three concepts I used in 
> FooImplementation

Ok, still - I think you have misunderstood something that perhaps my
explanation above covered.
The important thing being that the "loose method" is contained in a
Delta module and that this delta module is a submodule of my
FooImplementation module. So in speech we could say that "the
FooImplementation module contains that loose method" since it contains
the delta module - I mean, the module "consists" of its whole subtree
of modules and deltamodules.

> >It doesn't seem common to me that the module Morphic typically would add
> >classes to the module Collections (or any other module), methods
> >certainly probably all over the place but new classes in OTHER modules?
> >Don't really buy that. Convince me! :-)
> 
> No, I didn't mean "add a class to the Collection module".  However, a 
> method extension to a Collection class certainly might reference a class 
> that was not currently a neighbor of the Collection module.

Yes, and that would translate into that the delta module in Morphic for
the Collection module would both contain a loose method and a new
reference to an external module that should be added to Morphic's set of
external modules. (mentioned above somewhere)

So I think I am still standing upright - even though I was a bit wobbly
in the first round... :-) :-)

> > > >...
> > > >Ok, so revising it a bit further (leaving out module parameters):
> > > >
> > > >- A Module contains global definitions including classes and can contain
> > > >other modules and Delta modules creating a tree.
> > >
> > > As explained above, it's not a tree -- it's a cyclic graph!
> >
> >Yes, my threeliner tried to be "pedagogic" in the sense that the first
> >line establishes the module tree and then I add the fact that there are
> >also crossreferences making it somewhat "graphish", but still - those
> >references are notably different.
> 
> As I mentioned above, it appears to me that there are really two overlaid 
> structures.  A definitional tree and a dependency graph.

Yes, that is correct.

> >Perhaps we should try to revise the threeliner even further then
> >(leaving out "module parameters" and changing "module" to "module
> >revision"):
> >
> >- A Module revision contains global definitions including classes and
> >can contain other module revisions and Delta module revisions creating
> >a tree.
> 
> Clarifying the meaning of the word "contain" in this context is probably 
> important.

Yes, Henrik and I were discussing what it means and we realized that it
can mean slightly different things on different levels in the module
tree. But we also settled on the fact that if you load (not activate)
any module in the tree you will load that whole subtree. This will be
rather interesting for some modules, but since you can always load a
module it is not a problem.

Examples (faking module names):

Load the module "People/gh" will load all the stuff that I have
published under my name.

Load the module "Squeak/Kernel/Collection 1.2" will load the Collection
code revision 1.2.

Load the module "Org/SqueakFoundation/Squeak 3.5/release" will load a
whole bunch of modules (like an ENVY configuration I guess or a tagged
release in CVS) that comprise the blessed version 3.5 of Squeak.

Load the module "Org/SqueakFoundation/Squeak 3.5/updates" might load a
bunch of delta modules acting as patches for bugs in Squeak 3.5. (This
might be a way of turning the current Squeak update stream into a series
of deltas instead of ChangeSets)

So, I would settle for a generic word "contains" since it can mean
different things to be a submodule of something else.
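
As a rough illustration of "loading a module loads its whole subtree"
and of the full name being derived from the place in the tree, here is
a small sketch. It is Python for illustration only; the Module class
and its path and load methods are hypothetical, and load here merely
records names rather than activating anything.

```python
class Module:
    def __init__(self, name, children=()):
        self.name = name
        self.parent = None                 # exactly one home in the tree
        self.children = list(children)
        for child in self.children:
            child.parent = self

    def path(self):
        """Full name, derived from the module's place in the hierarchy."""
        if self.parent is None:
            return self.name
        return f"{self.parent.path()}/{self.name}"

    def load(self, loaded=None):
        """Load this module and, recursively, everything it contains."""
        loaded = [] if loaded is None else loaded
        loaded.append(self.path())
        for child in self.children:
            child.load(loaded)
        return loaded


gh = Module("gh", [Module("FooImplementation")])
people = Module("People", [gh])
```

Loading "People/gh" then yields that module plus everything published
beneath it, while loading "People" yields the whole tree.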

> >- A Delta module revision can only be a leaf in the module revision tree
> >and defines the difference between two Module revisions. (changed
> >methods, added methods, added classes, removed methods or classes etc.)
> >
> >- A Module revision also references so called external module revisions
> >(used module revisions) outside of its subtree. This enables reuse of
> >module revisions across module revision tree boundaries. This actually
> >turns our simple tree into a cyclic graph but do note that a module
> >revision still only has one home (=one parent module revision) in the
> >tree.
> 
> So what if I need to define a module like this:
> 
> module "FooImplementation2"
>          extends class Object with method isFoo
>          extends class Collection with method asFoo
>          implements class Foo with assorted methods
>          references class Bar (defined in another module)
> 
> and Object and Collection are defined in two different modules (note I 
> didn't use the term "module revision" , at this level of abstraction I 
> don't really care about specific revisions).  It sounds like this extension 
> would be impossible to define because it would be a delta module that would 
> require two parents.  A similar problem would occur if I started with a 

Nope - it would turn into TWO delta modules instead. One for each of the
modules that you mess around in.
In fact, I think Henrik said that Morphic (if you run some form of Delta
module extractor thingy he has written) currently has A LOT of delta
modules... Was it over 100? Can't remember.
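
A tiny sketch of that grouping, in Python for illustration only. The
function split_into_deltas, the home_of mapping and the module names
"Kernel" and "Collections" are all made up for the example; the two
extensions are the ones from FooImplementation2.

```python
from collections import defaultdict


def split_into_deltas(extensions, home_of):
    """Group loose methods by the module that defines the extended class.

    Each key of the result corresponds to one delta module.
    """
    deltas = defaultdict(list)
    for class_name, selector in extensions:
        deltas[home_of[class_name]].append((class_name, selector))
    return dict(deltas)


# Hypothetical homes for the two classes being extended.
home_of = {"Object": "Kernel", "Collection": "Collections"}
extensions = [("Object", "isFoo"), ("Collection", "asFoo")]

deltas = split_into_deltas(extensions, home_of)
```

So no delta module ever needs two base modules; the extension set is
simply partitioned by where each class lives.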

> parent that defined two classes, created a delta module, and then I decided 
> I really needed to break the parent into two separate modules.

That is an interesting scenario though. Ok, a delta module Y with two
loose methods for two different classes in module X revision 1.0.

Then you move one of the classes in module X over to module Z thus
creating revision 1.1 of module X. Ok. The deltamodule still "knows"
that it really only applies to module X revision 1.0 and not necessarily
for later revisions. So, if module X revision 1.0 was published and then
someone tries to activate (load is fine as always) my delta module Y it
would be up to the conflict detection logic to at least say - "Oops, you
are activating a delta module that really applies to an older revision
of the base module it changes, should I proceed?". And it probably
should barf because the loose method has nowhere to go. And then there
will be a need for a nice tool to fix it! :-) I mean, we have all the
information at hand, loaded and all in the image - we just need to
figure out a reasonable action.
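
The check described above might look something like this. It is only a
sketch in Python with plain dicts; check_activation and the field names
are hypothetical, and the class names A and B stand in for the two
classes of module X.

```python
def check_activation(delta, target):
    """Return a list of problems found when activating `delta` on `target`."""
    problems = []
    if target["revision"] != delta["base_revision"]:
        problems.append(
            f"delta was created against revision {delta['base_revision']}, "
            f"target is revision {target['revision']}"
        )
    for class_name, selector in delta["loose_methods"]:
        if class_name not in target["classes"]:
            problems.append(f"{class_name}>>{selector} has nowhere to go")
    return problems


x_1_0 = {"name": "X", "revision": "1.0", "classes": {"A", "B"}}
x_1_1 = {"name": "X", "revision": "1.1", "classes": {"A"}}  # B moved to module Z

delta_y = {
    "base_revision": "1.0",
    "loose_methods": [("A", "foo"), ("B", "bar")],
}
```

Against X 1.0 nothing is reported; against X 1.1 there is both the
"older revision" warning and the orphaned loose method, which is the
information a repair tool would start from.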

> Again, my one over-riding message is that for reusability you need to defer 
> binding decisions as late as possible.

Yes, and I think we do that too - it is just that there can't be any
harm in having some more information (the specific revision of the base
module that is) available, can it?

> >...
> > > >- A Delta module can only be a leaf in the module tree and defines the
> > > >difference between two Modules. (changed methods, added methods, added
> > > >classes, removed methods or classes etc.)
> > >
> > > So to create a delta module you first must have two "editions" of a module
> > > and the creation process creates a third edition of the module (necessary
> > > to contain the child extension reference) plus an extension module?
> >
> >I think you lost me there. As it is meant to work now (I think) is that
> >when you change a class in a different module than your "own" (that you
> >are working in - think "current module" like in "current changeset") a
> >delta module revision is created and added to your own module revision.
> >Any other changes after that in that same other module will also end up
> >collected in this delta module revision.
> 
> I'm pushing to see if you really meant what you seemed to be saying. Module 
> revisions are presumably real entities.  You could store them externally, 
> compare them, etc. "defines the difference between two Module revisions" 
> says that you have two distinct module revisions already in hand. I don't 
> think this is really what you meant.
> 
> To be more concrete, your statement seems to say that if (and only if) I 
> have module revisions BazImplementation 1.0 and BazImplementation 2.0 then 
> I can create a delta module DeltaBaz 12.0 that captures the difference 
> between them.  The semantics of delta modules (as I understand them) would 

Ok, that was not what I meant, definitely not.

> also require the creation of BazImplementation 1.1 that explicitly lists 
> DeltaBaz 12.0 as a child.  BazImplementation 1.1 would then be functionally 
> equivalent to BazImplementation 2.0.

You keep putting the delta modules IN the modules that they change and
that is not what I have been saying (I hope).
They do not end up there - that would really get weird, which is
probably why you reacted! :-)

> I think what you really mean, is that you can start with just 
> BazImplementation 1.0 and instead of creating BazImplementation 2.0 to add 
> a new method you can create BazImplementation 1.1 and  DeltaBaz 12.0.

This is not what Delta modules are meant for. Not that I know at least.
They capture changes to OTHER modules - not changes to their parent
module.

Unfortunately I have lost track of what I originally was talking
about... ;-)

> Because of the fact that intermodule references are always references to 
> specific module revisions,  BazImplementation 1.1+DeltaBaz 12.0 is still not 
> precisely equivalent to BazImplementation 2.0.  You would have to 
> explicitly reference one or the other (premature binding!)

Sorry, but this is not something I follow, probably because of the
misunderstanding above.
Again, delta modules do not end up inside the module that they are meant
to modify.

> >This delta module revision is also (mostly I guess) activated from the
> >start meaning that the method just added is actually there able to run.
> >
> >This was probably not an answer to your question...
> 
> While the manner in which modules are dynamically constructed is important 
> in building a good tool set, it shouldn't be necessary to think about 
> dynamic construction in order to understand the semantics of the module 
> system.  It's quite likely that the tools for dynamic module construction 

I agree - but it is just that sometimes it can help understanding. :-)

> may require modules to pass through "illegal" or incomplete states.  This 
> is analogous to filing-in a change set with forward references.  You may 
> temporarily have Undeclared bindings but they will go away eventually.
> 
> > > >...
> > > >Obviously the Delta modules are the tricky part - but personally I just
> > > >view them as a "diff" between two modules. I do have a question to
> > > >Henrik though - the deltamodule refers to the baseModule that it applies
> > > >to but it doesn't refer to the resulting module when it has been
> > > >applied. Ok, I admit that the last sentence sounded strange, but let me
> > > >put it like this: When Delta module Z has been applied to "Morphic 1.23"
> > > >I can't really say if the result is "Morphic 1.24" or whatever? The
> > > >source is defined but the target is not. So... I am not even sure if
> > > >that is a problem, but perhaps it can be? When it comes to conflict
> > > >detection I mean. Whatever...
> > >
> > > Since I've been throwing stones I should also offer a solution.  I think
> > > what you need is to decouple extensions and parents.  You can do this by
> > > introducing a new module type, call it an extended module. So, instead of
> > > defining Module "Morphic 1.23" that must explicitly include module
> > > "Z 4.56" (and which must explicitly say it extends "Morphic 1.23") it
> > > would define the extended module "MorphicWithZ 1.0" which includes as
> > > children "Morphic 1.23" and "Z 4.56"
> >
> >I am sorry, my brain is turning into jello here. I can't give any
> >sensible response.
> >
> >But I can try to clarify my little question:
> >
> >Currently a delta module revision exactly specifies a new revision of
> >another module revision.
> >It does this effectively by defining a bunch of differences that
> >should be applied to the specifically named module revision. So... delta
> >module revision "Z 1.0", which is actually my improved homehacked
> >version of "Morphic 1.23" includes a reference to "Morphic 1.23" and
> >then a bunch of changed methods, added methods etc.
> >
> >Loading and activating "Z 1.0" will also load and activate "Morphic
> >1.23" and then apply all those changes to it.
> >
> >Now, back to my question: What is now the activated revision of Morphic
> >called? It is not 1.23. And the delta module doesn't say. It's like an
> >"open edition" in ENVY (or whatever they are called) that hasn't got a
> >revision number yet. And more importantly - is this a problem? Perhaps
> >it's just fine, I can't see the implications.
> 
> Well I think those are all good questions that I think reinforce the point 
> that you don't really mean that a delta module just captures the difference 
> between two pre-existing module revisions.

Correct - I misled you down that path somewhere in this thread. They are
NOT normally constructed from two preexisting revisions.

But since a delta module "is just a diff" - it could in theory be
constructed that way, which sort of got me thinking of the fact that a
delta module does not say what revision number the new modified
baseModule actually should have. Because if we (for some weird reason)
did create a delta module from two pre-existing module revisions then we
would know the revision number of the resulting baseModule.

One way of "covering our bases" would be to add an instvar to
DeltaModule called "resultModuleRevision" and perhaps rename
"baseModule" to "baseModuleRevision". If the resultModuleRevision was
null then the resulting revision has no revision number meaning that it
has never been published before. If it does point out a module revision
this DeltaModule has been created as a "diff" between two existing
published revisions and then we know the resulting revision when it has
been applied. This could prove to be useful if we start using delta
modules to capture changes between two existing module revisions and
start sending those around. Again - there is no such thing as "too much
information". :-)
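
In sketch form (Python for illustration only; the snake_case field
names mirror the proposed instvars, and the two helper methods are
hypothetical additions for the example):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DeltaModule:
    base_module_revision: str                     # e.g. "Morphic 1.23"
    result_module_revision: Optional[str] = None  # None: result never published

    def is_between_published_revisions(self) -> bool:
        """True if this delta is a diff between two existing revisions."""
        return self.result_module_revision is not None

    def activated_revision_name(self) -> str:
        """What the modified base module should be called once activated."""
        if self.result_module_revision is not None:
            return self.result_module_revision
        return f"{self.base_module_revision} (unpublished changes)"


home_hack = DeltaModule("Morphic 1.23")
published_diff = DeltaModule("Morphic 1.23", "Morphic 1.24")
```

With the result left empty the activated Morphic is like an open
edition without a revision number; with it filled in, the answer to
"what is the activated revision called?" is known.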

Anyway - this is a pretty "far out" question but it might have some
implications on conflict detection.

> > > >...
> > > >regards, Göran
> > >
> > > Final thoughts.  I would expect a module system to support two major goals:
> > >          1) Reusability of modules
> > >          2) Reproducibility of system configurations.
> > >
> > > These goals might appear to be conflicting but need not be. However, you
> > > have to be very careful in designing a module system to not achieve one of
> > > the goals at the expense of the other. My sense is that the design
> > > decisions in this system that support goal 2 will severely interfere with
> > > goal 1.
> >
> >Well, let us bounce this around a bit first and let Henrik explain it a
> >bit further - my impression is that he has given all this a lot of
> >thought and also read up on the problems too. He has also studied Ginzu
> >to some extent. So I have faith that it isn't as "bad" as it looks when
> >I try to explain it! :-)
> 
> As should be clear by now, it's the premature bindings that concern me.

Yes, that is the interesting question.

But I think it is more a question of how we handle the activation part
of the deltamodules - conflict detection etc.
I still do not see the "wrong" in recording the specific revision of the
baseModule - it does not mean that we can't apply it to another module
revision upon activation.

So what we need to dig into is the conflict detection upon activation -
that is where all the fun starts... :-)

> Thanks for the interaction. I like it.

Me too!

> Allen

regards, Göran