Namespaces and... modules (was Re: Name spaces in Spoon)

Mon May 29 09:19:32 UTC 2006

Hi folks!

First a few comments in no particular order:

- I have decided to go through my Namespaces code again and will publish
a boringly detailed description of how it works, method by method. :)

- Note that the :: syntax *is* easy to implement - I can say that
because Andreas helped me do it in the Namespaces code which is
available on SM and *works fine*. QED.

- If we go for a runtime messages approach as Dan describes then another
disadvantage is the analysis of code. Currently I am working on some
analysis code that is intended to run in the SM server that figures out
dependencies between packages based on their references and definitions
of globals. This would not really work if we move this to runtime, or at
least it definitely gets more complicated.
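
As a rough illustration of the kind of static analysis meant in the last
point, here is a minimal sketch in Python (not the actual SM server code,
which is not shown in this post). The package names, the globals and the
dictionary layout are all made up for the example; the only idea taken
from the text is that dependencies follow from which globals a package
defines and which it references.

```python
# Hypothetical package descriptions: the globals each package defines and
# the globals its code refers to. All names here are invented.
packages = {
    "Fruit":   {"defines": {"Banana", "Apple"}, "references": {"Object"}},
    "Kitchen": {"defines": {"Blender"}, "references": {"Banana", "Object"}},
    "Kernel":  {"defines": {"Object"}, "references": set()},
}


def package_dependencies(packages):
    """Map each package to the other packages that define globals it references."""
    # First pass: record which package defines each global.
    definer = {}
    for name, info in packages.items():
        for global_name in info["defines"]:
            definer[global_name] = name
    # Second pass: follow each reference back to its defining package.
    deps = {}
    for name, info in packages.items():
        deps[name] = {
            definer[g]
            for g in info["references"]
            if g in definer and definer[g] != name
        }
    return deps


def unresolved_references(packages):
    """Map each package to the referenced globals that no package defines."""
    defined = set().union(*(info["defines"] for info in packages.values()))
    return {name: info["references"] - defined for name, info in packages.items()}
```

This only works because the references can be read off the source without
running anything, which is exactly what a runtime-message approach would
take away.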

Andreas wrote: 
> You get very similar advantages when you consider all references to  
> "outside" globals be really definitions inside the module's  
> namespace itself. In other words, if module Foo would use "Array  
> new" it would really mean "Foo::Array" (e.g., the value of #Array  
> inside the module Foo) and you could populate (parametrize) that  
> via, say "Foo::Array := Collections::Array" etc. This has the added  
> advantage that there is no true "global" access to anything (e.g.,  
> no ambient authority beyond what was explicitly given to the  
> module) and that multiple modules can co-exist with different  
> parametrizations.

Let me mention that Henrik actually had stuff along these lines
(parametrization just like above) in modules 3.3a.

But considering these issues in light of my Namespaces solution it also
seems very easy to "remap" bindings at compile time - because the
references are *always* qualified fully in the source. So for example,
before filing in code that uses Foo::X, we could remap Foo:: to Bar::
and thus have Foo::X resolved to Bar::X. (note that it is pretty nice to
be able to say "Foo::" instead of "the namespace called 'Foo'" or
"namespace Foo")
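
To make the remapping idea concrete, here is a small sketch in Python. It
is not the Namespaces implementation; a regular expression stands in for
the compiler hook, so unlike a real compiler it would also rewrite
references inside strings and comments. The function names `remap_source`
and `resolve` and the nested-dictionary image are assumptions made for the
example.

```python
import re

# Matches fully qualified references such as Foo::X (assumed shape: both
# parts start with an uppercase letter).
QUALIFIED = re.compile(r"\b([A-Z]\w*)::([A-Z]\w*)")


def remap_source(source, remap):
    """Rewrite namespace prefixes before compiling, e.g. remap={"Foo": "Bar"}."""
    def replace(match):
        namespace, name = match.group(1), match.group(2)
        # Prefixes without an entry in the remap table are left untouched.
        return remap.get(namespace, namespace) + "::" + name
    return QUALIFIED.sub(replace, source)


def resolve(reference, smalltalk):
    """Look up 'Foo::X' as (Smalltalk at: #Foo) at: #X, using nested dicts."""
    namespace, name = reference.split("::")
    return smalltalk[namespace][name]
```

With `remap={"Foo": "Bar"}`, the source text `Foo::X new` becomes
`Bar::X new` before it is compiled, and it then resolves against the Bar
namespace.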

And another note: My code does not forbid good ole globals. In fact, I
would propose that we do NOT split up the current "basic" image into
any namespaces at all. If you file my Namespaces code into an image
you can then proceed working *just as you do now*. There are NO
differences. Well, ok, some small tiny differences are there at the
moment - but I intend to turn those "off" using Preferences with default
values false. So soon there will be NO differences.

I also agree with Michael Rueger (as Dan mentioned) that hierarchies
*are* complex, especially if we start making remappings like the above
possible. I urge us all to "remember the Alamo" - 3.3a modules. It was
too foreign, too "complicated", it made experienced Smalltalkers fumble
and lose the ball. And the result was that no one moved into the "New
House Of Modularity" - and thus it died. Please, do
not underestimate the problems with such "advanced" approaches -
especially if they radically change the "taste and feel" of Smalltalk.

Now... to move ahead a bit. Modules...

A simple proposal for Modules
-----------------------------------

I strongly claim that Namespaces and Modules are two different things -
even though in practice (in other languages) they are often handled by
the same mechanism, since they mostly overlap. Someone said once on
squeak-dev (one of the Alans on the list IIRC) that the only interesting
"meaning" of a module is "separately deployable unit". I very much agree
with that.

So then, if we consider a Module to be a unit of code (let's consider
only code for the moment - it doesn't hamper the idea AFAICT) then we
have:

1. References to "globals" that are not in the module itself.
2. Defined "globals" in the module.

This is like inputs and outputs. :) Of course - I am using the term
"globals" so that we can relate this to how things work today. Also note
that these are named objects - not necessarily classes - but again, let
us focus on code.

Now - if we install (=file in, create classes, compile code) a module A
it will have to resolve all foreign globals (#1) above and it will add
its own globals (#2). Monticello and regular Squeak deal with missing
globals in various ways - if I am not mistaken, an Association is
created and put in Undeclared. This means that all compiled methods
referring to, say, the nonexistent class Banana are linked to the same
Association - but with a value of nil.

And then if you install or create class Banana it will "reuse" that
Association in Undeclared (and remove it from Undeclared). So this is
how Squeak generally deals with installing code without having to worry
too much about load order. All references to Banana will refer to the
same object - regardless of whether they were compiled before or after
Banana was created.
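
The shared-binding behaviour described above can be modelled in a few
lines. This is a Python sketch of the idea only, not Squeak's actual
code: `Image`, `binding_for` and `define` are hypothetical names, and the
two dictionaries stand in for Smalltalk and Undeclared.

```python
class Association:
    """A key/value binding that compiled methods hold on to."""

    def __init__(self, key, value=None):
        self.key = key
        self.value = value


class Image:
    def __init__(self):
        self.globals = {}      # stands in for Smalltalk
        self.undeclared = {}   # stands in for Undeclared

    def binding_for(self, name):
        """Return the binding a method referring to `name` would link to."""
        if name in self.globals:
            return self.globals[name]
        if name not in self.undeclared:
            # Unknown global: park a binding with a nil (None) value.
            self.undeclared[name] = Association(name)
        return self.undeclared[name]

    def define(self, name, value):
        """Define a global, reusing a parked binding if one exists."""
        binding = self.undeclared.pop(name, None)
        if binding is None:
            binding = self.globals.get(name)
        if binding is None:
            binding = Association(name)
        binding.value = value
        self.globals[name] = binding
        return binding
```

A method "compiled" before Banana exists and one compiled afterwards both
end up holding the very same Association, which is why load order stops
mattering.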

I want to make this mechanism visible and much more tangible. I want to
be able to install a "module" A into the image without having it be
"activated" (just like Henrik had in 3.3a). Then I can look at it and
see that yes, it still is not functional because its binding with the
class Banana is still not fulfilled. Then I can install module B (which
has Banana) and see that module A is now fulfilled and should work if we
activate it together with module B.

So to summarize:

1. I want to use my Namespaces code, which more or less is "prefixes
done right". This introduces class Namespace and each prefix corresponds
to an instance of Namespace held in Smalltalk. Or expressed in
Smalltalk:

	Foo::X == ((Smalltalk at: #Foo) at: #X)
	Foo == (Smalltalk at: #Foo)
	Foo class == Namespace

2. A simple module system would be to create class Module and let each
instance represent a "piece" of the image with inputs and outputs like
above. It could have behaviors like activate, deactivate, hasAllInputs,
inputs, outputs and so on. How a Module defines its boundaries can be
discussed - perhaps it has a list of PIs, or whatever. And again, it can
have a few different implementations depending on if it is just a chunk
of object memory with named inputs/outputs or if we actually know it is
code.
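
To give point 2 some shape, here is one possible sketch of such a Module
object, written in Python so it can be run standalone. Nothing here is an
existing implementation: the method names mirror the behaviors listed
above, while the `environment` dictionary (standing in for the globals of
the image) and the `together_with` parameter are assumptions made for the
example.

```python
class Module:
    """A piece of the image with named inputs and outputs (hypothetical)."""

    def __init__(self, name, inputs, outputs):
        self.name = name
        self.inputs = set(inputs)     # foreign globals the module refers to
        self.outputs = dict(outputs)  # globals the module defines: name -> object
        self.active = False

    def has_all_inputs(self, environment, together_with=()):
        """True if every input is available in the environment, in this
        module itself, or in modules that would be activated with it."""
        available = set(environment) | set(self.outputs)
        for other in together_with:
            available |= set(other.outputs)
        return self.inputs <= available

    def activate(self, environment):
        """Publish the outputs into the environment, if inputs are fulfilled."""
        if not self.has_all_inputs(environment):
            raise RuntimeError(self.name + " has unfulfilled inputs")
        environment.update(self.outputs)
        self.active = True

    def deactivate(self, environment):
        """Withdraw the outputs from the environment again."""
        for name in self.outputs:
            environment.pop(name, None)
        self.active = False
```

With a module A that needs Banana and a module B that provides it,
`a.has_all_inputs(env)` is false on an empty environment, while
`a.has_all_inputs(env, together_with=[b])` is true. Activating B and then
A succeeds.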

I am not sure whether I made the above easy to understand, or whether it
is a good way to do it. But IMHO it seems simple, *understandable* and
rather generic. It is all about hooking a piece of object memory into
the image and making sure it can reconnect its connections in both
directions (inputs and outputs) using name lookup.

regards, Göran