[ENH][Modules] Delta Modules [was: Another version]

Wed Oct 24 18:46:37 UTC 2001

At 12:34 PM 10/24/2001 +0100, goran.hultgren at bluefish.se wrote:
>...
>Allen Wirfs-Brock <Allen_Wirfs-Brock at Instantiations.com> wrote:
> > At 09:45 AM 10/23/2001 +0100, goran.hultgren at bluefish.se wrote:
> > >...
> > ...
> > translate externally?  What is the external form of a module reference?
> > Externally, is definedNames a list of class and global names or is it the
> > actual source/object code for the defined entities. How about exported
> > names. Is it a list of strings?
>
>Ehrm. I have no good picture of this unfortunately! I hope Henrik chimes
>in.
>Have you got his code by the way? I can always give you an url with the
>image we used at OOPSLA.

Actually, I've been kind of avoiding using the code as a reference. 
Facilities like this should be explained and understandable without 
reference to source code.  Otherwise it is impossible to separate the 
specification from the implementation details.

> > >...
> > >- Submodule = Another module that is one of my children. Consider it a
> > >"part of me". Especially note that a submodule of "me" has me as its
> > >parent. This is in contrast to the external modules whose parent is
> > >someone else completely.
> >
> > Note that the bi-directional linkage of submodules and parents means that a
> > particular module may be a submodule of exactly one parent. This would
> > seem to significantly limit the reusability of modules. This is
> > particularly a problem if you think about your modules in a context
> > external to an image. In concept a "module" could contain code that is
> > applicable in many different situations by many different parents but the
> > tight (and early) binding of the submodule to the parent means that to
> > reuse the code it will have to be duplicated within additional modules.
>
>Well, I think I say yes and no, or something... :-)
>
>Yes, each module has only one parent. The full name of the module is
>derived from its place in the module hierarchy. And yes, that is a tree
>which means a submodule only really lives in one single place.
>(Sidenote: I like this. This is simple. More or less like in Java btw. A
>module has a home and a name. Period.)

Wait...Java module?  Do you mean Java packages? I wouldn't characterize
Java packages as "modules"; they are a namespace mechanism.  They simply
provide a hierarchical naming scheme.  However, packages have no physical
manifestation in Java.  There is no entity you can point at and say this is
the package com.instantiations.jove. You can't even enumerate all the classes
that are defined in that package because it is an open-ended set. All you
can do is, for any particular class, answer the question, "Is this class
defined in the package com.instantiations.jove?"

Take a class such as Parser, defined in the package 
com.instantiations.jove.  Its real name is
com.instantiations.jove.Parser.  All Java packages do is provide a lexical 
scope where that name can be abbreviated to its shorter form Parser.
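
To make the namespace point concrete, here is a minimal sketch, written in
Python rather than Java. The helper names in_package and short_name are made
up for illustration and are not part of any Java API. It only shows that a
package name lets you answer a membership question about one fully qualified
class name and abbreviate that name; it gives you nothing to enumerate.

```python
def in_package(qualified_name, package):
    """True if the class named by qualified_name is defined directly in package."""
    prefix, _, _ = qualified_name.rpartition(".")
    return prefix == package


def short_name(qualified_name):
    """The abbreviated form usable inside the package's lexical scope."""
    return qualified_name.rpartition(".")[2]


parser = "com.instantiations.jove.Parser"
print(in_package(parser, "com.instantiations.jove"))  # membership test for one class
print(short_name(parser))                             # abbreviated name
```

Note that there is no function here that lists "all classes in the package",
because nothing in the naming scheme bounds that set.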

There has been a lot of discussion (and little agreement??) on this list 
about what we mean by a "module". Cutting through the nuances, I suspect
that most people could accept the following: A module is an *atomic* unit
of functionality (code, objects, whatever) with an independent existence
that can be incorporated into a program (image, whatever).  The atomicity
aspect is very important.  When you load a module, the normal expectation is
that you get the whole thing.

From this perspective, a Java package is clearly not a "module". The only
things in Java that fully match this definition are classes although an 
argument could also be made to consider Java "compilation units" (source 
files) as modules.

To wrap up this train of thought, let's revisit
com.instantiations.jove.Parser.  There isn't a single instance of the com
or the com.instantiations or even the com.instantiations.jove package
where this class lives.  There can be many different compositions of these
packages, some, but not all, of which contain Parser.

>BUT... you can also "link" modules as an external module - didn't I
>write that? Yes I did. But perhaps it wasn't clear. Or perhaps I am dead
>wrong. Henrik, correct me here but wasn't the idea that if a module
>needs to reuse an already existing module placed somewhere else
>completely in the module tree it just references it as an external
>module? I think so.

Perhaps. I guess this is where some of the complexity leads to confusion.
Is there really a need for external modules, submodules, and module
parameters?  Are their usages truly orthogonal? Will users know
when to choose one over the other? Couldn't this be simplified?

Regardless, you have clarified one thing for me.  Apparently, there is a 
module definition tree (with parent back pointers) and a module usage 
(dependence) graph.
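
Here is a minimal sketch of how I picture those two overlaid structures,
written in Python purely for illustration. The Module class and its
attribute names are my own invention and are not taken from Henrik's code;
"Squeak", "Collections" and "Morphic" are just example names.

```python
class Module:
    """Hypothetical model: one parent (definition tree), many uses (usage graph)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent      # back pointer: a module has at most one home
        self.submodules = []      # definition tree edges
        self.uses = []            # usage (dependence) graph edges
        if parent is not None:
            parent.submodules.append(self)

    def full_name(self):
        # The full name is derived from the module's place in the tree.
        parts = []
        node = self
        while node is not None:
            parts.append(node.name)
            node = node.parent
        return ".".join(reversed(parts))

    def use(self, other):
        # Reference another module as an "external module".
        self.uses.append(other)


root = Module("Squeak")
collections = Module("Collections", parent=root)
morphic = Module("Morphic", parent=root)
morphic.use(collections)
collections.use(morphic)  # usage edges may form a cycle; parent links cannot
```

The parent links always form a tree, while the uses links are free to form
a cyclic graph.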

>...
>
> >
> > An overly constrained module system is essentially the same as no module
> > system. If every module must explicitly identify the specific versions of
> > other modules it interacts with then you have essentially gone back to
> > having a monolithic image where any change causes a cascade of dependent
> > changes throughout the system.
>
>Ok, I see your argument. But... Hmm. The Debian package system is very
>early/tight bound and it sure doesn't mean it is useless... Anyway, I
>think I will pass on this one and let Henrik dive in! :-)
>
>I need that watch that Tim Olsen (name?) has in Superman...

I think his name is Jimmy.

>...
> > Most systems would make a terminology distinction between a logical module
> > and a particular "version" of the module. Common terminology would be
> > "module version", "module revision", "module edition", etc. It would
> > certainly help communications to choose a term and then to be fastidious
> > in making the distinction.
>
>Ok, I like "module revision" or "module version". In the Smalltalk world
>the word "edition" and "version" might be tainted by earlier products in
>such a way that we should perhaps stay clear of them. And the word
>"revision" is used in CVS but the meaning there is probably intuitive
>enough to not be a problem.
>
>I vote for "module revision" meaning "one specific version".
>And when we say only "module" we should always mean the logical module.
>
>Agreed? :-) (pretty, pretty please...)

fine with me...

>Perhaps we should then rename those classes to ModuleRevision etc...
>
> > >- A Deltamodule inherits from Module and this could perhaps be
> > >refactored with a common baseclass as is common in the Composite pattern
> > >but I haven't looked into that issue. One fact though: Looking at the
> > >code a Deltamodule CAN NOT have neighborModules. Those methods are
> > >overridden. So, a Delta module can only be a leaf in the Module tree.
> > >Henrik, what about applying the Composite pattern here? Or are there
> > >other considerations?
> >
> > Why the restriction on neighborModules?  This would seem to imply that a
> > class extension may not reference any classes (or globals) that are not
> > already known to the parent. This would seem to extremely limit the use of
> > deltamodules.
>
>Ah, I checked the code! :-) Well, a Deltamodule can not HAVE neighbors,
>instead it defines CHANGES to its base module's neighbors. So I think
>your guessed implication is not valid. Phew. :-)

Wait, not so fast.  Does the set of possible "changes" include the addition 
of a new neighbor?  If not, you still have the problem. Regardless, I think 
you may be adding unnecessary conceptual complexity.

Here's a sketch of what I think of as a prototypical "module" when I 
consider things like this:

module "FooImplementation"
         extends class Object with method isFoo
         implements class Foo with assorted methods
         references class Bar (defined in another module)

To me, this is a nice self-contained unit.  Why would I want to be thinking
about changing the definition of the module that defines Object?  It's not
necessary in order to implement the above.  It also would have a broader
effect than I intend.  If Object's module now essentially imports Bar (or
Bar's module) as a neighbor then presumably this has the side effect that
the other submodules of Object's module could also reference Bar.  (As a
side point, I also don't really want to think about which module implements
Bar.)
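
As a rough illustration of why I call this self-contained, here is the same
sketch expressed as plain data in Python. ModuleSpec, required_names and the
two placeholder method names on Foo are hypothetical; nothing here comes
from the actual module implementation.

```python
from dataclasses import dataclass, field


@dataclass
class ModuleSpec:
    name: str
    extends: dict = field(default_factory=dict)     # class name -> added method names
    implements: dict = field(default_factory=dict)  # class name -> method names
    references: set = field(default_factory=set)    # class names defined elsewhere

    def required_names(self):
        """Class names something else must supply before this module can load."""
        return (set(self.extends) | self.references) - set(self.implements)


foo_impl = ModuleSpec(
    name="FooImplementation",
    extends={"Object": ["isFoo"]},
    implements={"Foo": ["placeholderMethod1", "placeholderMethod2"]},
    references={"Bar"},
)

print(sorted(foo_impl.required_names()))
```

The module states which class names it needs, not which modules supply them,
so the decision about where Object and Bar come from can be made later.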

A possible reason for this difference in perspective just occurred to 
me.  Most of the Squeak modularity discussion has been in the context of 
"modularizing" the existing Squeak image.  In other words, taking what's 
there and chopping it up into a set of modules.  If you are looking at the 
problem from that perspective it's easy to think of the one definitive 
structure of modules that define the current system and to tightly bind 
them so you are guaranteed to have the same composition that you currently 
have in the image.  However, that really isn't the important perspective
going forward. You need to be able to define new modules that will be
composed in many different ways in different situations to assemble a wide
variety of applications. It's really all about reusability. You need to be
able to reuse your modules in unanticipated ways.  This means you have to 
be very careful about premature or unnecessary bindings.

As I said in the previous message, if all the modules that make up the
current image are tightly bound you are no better off than you are today.

> > >...
> > > > Eh, well since a Deltamodule defines a difference between two modules
> > > > it should be able to contain a whole new class. But you would probably
> > > > rather seldomly have one Module add classes to another module so it
> > > > will probably be a rare case.
> > > >
> >
> > Again, this seems like the common case, not the rare case.  The module
> > that defines the Foo class also wants to add the isFoo method to Object
> > and possibly other classes.
>
>Really? When you write "and possibly other classes." did you mean "and
>add possibly other classes." or "add the isFoo method to possibly other
>classes."?

I meant extend other classes but certainly you might need to also define 
additional classes.  I specified my canonical test case above. Everything 
else is pretty much an elaboration of one of the three concepts I used in 
FooImplementation.

>It doesn't seem common to me that the module Morphic typically would add
>classes to the module Collections (or any other module), methods
>certainly probably all over the place but new classes in OTHER modules?
>Don't really buy that. Convince me! :-)

No, I didn't mean "add a class to the Collection module".  However, a 
method extension to a Collection class certainly might reference a class 
that was not currently a neighbor of the Collection module.

> > >...
> > >Ok, so revising it a bit further (leaving out module parameters):
> > >
> > >- A Module contains global definitions including classes and can contain
> > >other modules and Delta modules creating a tree.
> >
> > As explained above, it's not a tree -- it's a cyclic graph!
>
>Yes, my threeliner tried to be "pedagogic" in the sense that the first
>line establish the module tree and then I add the fact that there are
>also crossreferences making it somewhat "graphish", but still - those
>references are notably different.

As I mentioned above, it appears to me that there are really two overlaid
structures: a definitional tree and a dependency graph.

>Perhaps we should try to revise the threeliner even further then
>(leaving out "module parameters" and changing "module" to "module
>revision"):
>
>- A Module revision contains global definitions including classes and
>can contain
>other module revisions and Delta module revisions creating a tree.

Clarifying the meaning of the word "contain" in this context is probably 
important.

>- A Delta module revision can only be a leaf in the module revision tree
>and defines the
>difference between two Module revisions. (changed methods, added
>methods, added
>classes, removed methods or classes etc.)
>
>- A Module revision also references so called external module revisions
>(used module revisions) outside
>of its subtree. This enables reuse of module revisions across module
>revision tree boundaries. This actually turns our simple tree into a
>cyclic graph but do note that a module revision still only has one home
>(=one parent module revision) in the tree.

So what if I need to define a module like this:

module "FooImplementation2"
         extends class Object with method isFoo
         extends class Collection with method asFoo
         implements class Foo with assorted methods
         references class Bar (defined in another module)

and Object and Collection are defined in two different modules (note I
didn't use the term "module revision"; at this level of abstraction I
don't really care about specific revisions).  It sounds like this extension
would be impossible to define because it would be a delta module that would 
require two parents.  A similar problem would occur if I started with a 
parent that defined two classes, created a delta module, and then I decided 
I really needed to break the parent into two separate modules.
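
A small sketch of the two-parent problem, in Python for illustration only.
The assignment of Object to a module called "Kernel" and of Collection to
"Collections" is an assumption made up for the example, as are the function
and variable names.

```python
# Hypothetical assignment of classes to the modules that define them.
defining_module = {"Object": "Kernel", "Collection": "Collections"}

# The class extensions FooImplementation2 wants to make.
extensions = {"Object": ["isFoo"], "Collection": ["asFoo"]}


def parents_needed(extensions, defining_module):
    """Return the set of base modules a single delta module would need."""
    return {defining_module[cls] for cls in extensions}


needed = parents_needed(extensions, defining_module)

# A delta module with exactly one base cannot express this when more than
# one module is needed.
fits_single_delta = len(needed) == 1
print(sorted(needed), fits_single_delta)
```

The same check fails after the fact if a parent that defined both classes is
later split in two, which is the refactoring case mentioned above.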

Again, my one overriding message is that for reusability you need to defer
binding decisions as long as possible.

>...
> > >- A Delta module can only be a leaf in the module tree and defines the
> > >difference between two Modules. (changed methods, added methods, added
> > >classes, removed methods or
> > >  classes etc.)
> >
> > So to create a delta module you first must have two "editions" of a module
> > and the creation process creates a third edition of the module (necessary
> > to contain the child extension reference) plus an extension module?
>
>I think you lost me there. As it is meant to work now (I think) is that
>when you change a class in a different module than your "own" (that you
>are working in - think "current module" like in "current changeset") a
>delta module revision is created and added to your own module revision.
>Any other changes after that in that same other module will also end up
>collected in this delta module revision.

I'm pushing to see if you really meant what you seemed to be saying. Module
revisions are presumably real entities.  You could store them externally,
compare them, etc. "defines the difference between two Module revisions"
says that you have two distinct module revisions already in hand. I don't
think this is really what you meant.

To be more concrete, your statement seems to say that if (and only if) I
have module revisions BazImplementation 1.0 and BazImplementation 2.0 then
I can create a delta module DeltaBaz 12.0 that captures the difference
between them.  The semantics of delta modules (as I understand them) would
also require the creation of BazImplementation 1.1 that explicitly lists
DeltaBaz 12.0 as a child.  BazImplementation 1.1 would then be functionally
equivalent to BazImplementation 2.0.

I think what you really mean is that you can start with just
BazImplementation 1.0 and, instead of creating BazImplementation 2.0 to add
a new method, you can create BazImplementation 1.1 and DeltaBaz 12.0.

Because intermodule references are always references to specific module
revisions, BazImplementation 1.1+DeltaBaz 12.0 is still not
precisely equivalent to BazImplementation 2.0.  You would have to
explicitly reference one or the other (premature binding!).
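
To illustrate what I mean by premature binding, here is a minimal Python
sketch. The two configurations and the function names resolve_exact and
resolve_logical are hypothetical; they only contrast a client that names a
specific revision with one that names the logical module.

```python
# Hypothetical configurations: logical module name -> (revision, delta children).
config_a = {"BazImplementation": ("2.0", [])}
config_b = {"BazImplementation": ("1.1", ["DeltaBaz 12.0"])}


def resolve_exact(config, name, revision):
    """Early binding: the client names one specific revision."""
    loaded_revision, _ = config[name]
    return loaded_revision == revision


def resolve_logical(config, name):
    """Late binding: the client names only the logical module."""
    return name in config


for config in (config_a, config_b):
    print(resolve_exact(config, "BazImplementation", "2.0"),
          resolve_logical(config, "BazImplementation"))
```

A client bound to revision 2.0 is satisfied only by the first configuration,
even though the two are functionally equivalent; a client bound to the
logical name is satisfied by both.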

>This delta module revision is also (mostly I guess) activated from the
>start meaning that the method just added is actually there able to run.
>
>This was probably not an answer to your question...

While the manner in which modules are dynamically constructed is important 
in building a good tool set, it shouldn't be necessary to think about 
dynamic construction in order to understand the semantics of the module 
system.  It's quite likely that the tools for dynamic module construction 
may require modules to pass through "illegal" or incomplete states.  This
is analogous to filing in a change set with forward references.  You may
temporarily have Undeclared bindings but they will go away eventually.
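
The forward reference analogy can be sketched like this in Python. This is
not how Squeak's file-in actually works; file_in and its bookkeeping are a
made-up simplification that only tracks which names are still unresolved
after each definition is loaded.

```python
def file_in(definitions):
    """Load (name, referenced_names) pairs in order; track unresolved names."""
    defined = set()
    undeclared = set()
    history = []
    for name, refs in definitions:
        defined.add(name)
        undeclared.discard(name)  # a forward reference is now satisfied
        undeclared.update(r for r in refs if r not in defined)
        history.append(set(undeclared))  # snapshot after this step
    return undeclared, history


# Foo refers to Bar before Bar has been defined.
final, history = file_in([("Foo", ["Bar"]), ("Bar", [])])
print(history, final)
```

The intermediate state is incomplete, but the final state is consistent, and
only the final state matters for the semantics of the module.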

> >
> > >...
> > >Obviously the Delta modules are the tricky part - but personally I just
> > >view them as a "diff" between two modules. I do have a question to
> > >Henrik though - the deltamodule refers to the baseModule that it applies
> > >to but it doesn't refer to the resulting module when it has been
> > >applied. Ok, I admit that the last sentence sounded strange, but let me
> > >put it like this: When Delta module Z has been applied to "Morphic 1.23"
> > >I can't really say if the result is "Morphic 1.24" or whatever? The
> > >source is defined but the target is not. So... I am not even sure if
> > >that is a problem, but perhaps it can be? When it comes to conflict
> > >detection I mean. Whatever...
> >
> > Since I've been throwing stones I should also offer a solution.  I think
> > what you need is to decouple extensions and parents.  You can do this by
> > introducing a new module type, call it an extended module. So, instead of
> > defining Module "Morphic 1.23" that must explicitly include module
> > "Z 4.56" (and which must explicitly say it extends "Morphic 1.23") it
> > would define the extended module "MorphicWithZ 1.0" which includes as
> > children "Morphic 1.23" and "Z 4.56"
>
>I am sorry, my brain is turning into jello here. I can't give any
>sensible response.
>
>But I can try to clarify my little question:
>
>Currently a delta module revision exactly specifies a new revision of
>another module revision.
>It does this effectively by defining a bunch of differences that
>should be applied to the specifically named module revision. So... delta
>module revision "Z 1.0", which is actually my improved homehacked
>version of "Morphic 1.23" includes a reference to "Morphic 1.23" and
>then a bunch of changed methods, added methods etc.
>
>Loading and activating "Z 1.0" will also load and activate "Morphic
>1.23" and then apply all those changes to it.
>
>Now, back to my question: What is now the activated revision of Morphic
>called? It is not 1.23. And the delta module doesn't say. It's like an
>"open edition" in ENVY (or whatever they are called) that hasn't got a
>revision number yet. And more importantly - is this a problem? Perhaps
>it's just fine, I can't see the implications.

Well, I think those are all good questions, and I think they reinforce the
point that you don't really mean that a delta module just captures the
difference between two pre-existing module revisions.

> > >...
> > >regards, Göran
> >
> > Final thoughts.  I would expect a module system to support two major goals:
> >          1) Reusability of modules
> >          2) Reproducibility of system configurations.
> >
> > These goals might appear to be conflicting but need not be. However, you
> > have to be very careful in designing a module system to not achieve one of
> > the goals at the expense of the other. My sense is that the design
> > decisions in this system that support goal 2 will severely interfere with
> > goal 1.
>
>Well, let us bounce this around a bit first and let Henrik explain it a
>bit further - my impression is that he has given all this a lot of
>thought and also read up on the problems too. He has also studied Ginzu
>to some extent. So I have faith that it isn't as "bad" as it looks when
>I try to explain it! :-)

As should be clear by now, it's the premature bindings that concern me.

Thanks for the interaction. I like it.

Allen