Modules

Fri Feb 25 17:41:37 UTC 2005

Dan Ingalls wrote on Fri, 25 Feb 2005 10:50:19 -0800
> Modules in Squeak
> I may be wrong, but I feel it is possible to solve several problems at once
> with a good design for modules. 

This is how I like to build systems as well. To cite you (page 22 of the
"green book"):

# One way of stating the Smalltalk philosophy is to "choose a small number of
# general principles and apply them uniformly." This approach has somewhat
# of a recursive thrust, for it implies that once you've built something, you
# ought to be using it whenever possible.

The parallel efforts of the various teams just created will tend to go
against this, but I will be glad if the modules can be leveraged to help
their projects without delaying them.

But back to "several problems at once", a trivial example is virtual
memory. Squeak doesn't have it, which isn't a problem on most of the
systems it runs on, since they have their own paged virtual memory.
But if you can quickly load and unload image segments, that can be used
as an extremely simple virtual memory system which would come in handy
in a PDA which can only access the large Flash memory indirectly, for
example.

> I also feel that it is possible to keep the modularization nearly invisible to
> the casual user.

True, and this is what I like about the Traits thing or Göran's
namespaces - anyone who didn't have problems with the old system doesn't
see them at all. It is only when you try to do something different where
things would break down in plain Squeak that you have to learn the new
ideas.

It isn't easy, however. The changes tend to leak out. For example: I am
building two Smalltalk systems. The 16 bit one is a traditional closed
one where things like "ClassX allInstances" make perfect sense. The 36
bit one is modular, and in an open system that expression's result is
less useful. In particular, you can never be sure that you have seen all
the instances "out there". An outdated explanation of my "modules" can
be found in http://www.merlintec.com:8080/software/8 if anyone is
interested, but a lot of that might not be relevant for Squeak.

While I started seriously thinking about this in 1984 (I thought an
image based system was fine in a lab, but wouldn't be as nice in a
product) the only stuff in this area which I designed and was actually
implemented and tested was an OO OS for the PC AT (286, for you young
ones) in 1988.

One thing that I see as a problem in the way of properly partitioning
Squeak is that we use a single mechanism to deal with different kinds of
coupling between objects. Fixing that would require changes visible to
the users, so this is probably not the way to go for this project but
I'll explain it anyway. As Alan Cox likes to point out, in the real
world you have objects that are soldered together, others that are
bolted together, others that are wired together with connectors that
are easy to snap on/snap off, and so on. But when I do

         inst := MyHelper new.

and distribute this as text to be recompiled in another user's system,
the coupling is looser than it should be. Their environment is different
and the text could be ambiguous. I know exactly what object I am talking
about and having an intra-module pointer to it would be just right. The
other user shouldn't be bothered by the name I used at all, as I think
Craig Latta would agree.

On the other hand,

         p := HTMLParser new.

is far too tight a coupling for my taste. Replacing the text with an
inter-module pointer wouldn't really help. This kind of dependency might
allow the module I used while developing to be autoloaded into the
user's system when my module is, but is that what we want? As long as p
is the right type (understands a given set of messages) the job will get
done. If the user's system has a different, but compatible, parser then
I would like to use that instead. Some global Protocol (or Interface or
whatever) objects that acted as factories would be nice:

       p := HTMLParsingProtocol makeInstance.

would check the loaded modules, fetch one according to the local
preferences if none of the loaded ones has the needed class and finally
would get the class to create a new instance for us. There is no reason
why this shouldn't be written exactly as the previous expression (with a
Protocol instance simply replacing the class and implementing #new), of
course, and then any such change would be far less visible to the users.

In short: my suggestion is that checking all class references in the
image (and packages) and separating those that should be tightly coupled
from those that should be loose (by making the latter indirect) would
make partitioning the image much easier.

About the modules themselves, I like binary ones like imageSegments
better because they apply to all objects instead of just sources.  My
preference is also to refer to modules by some universal ID and then
have a separate service to map that to a local file name or URL from
where it can be loaded. This allows me to move and rename them without
breaking stuff. For immutable modules a good ID is simply the hash of
the contents. A sequence of immutable modules can simulate one mutable
one (TeaTime?) and again I would rather have a separate service deal
with this than hard coding it in the modules themselves.
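
As a rough illustration of keeping IDs and locations apart, here is a
sketch of a separate locator service. ModuleLocator and all of its
selectors are hypothetical, the URLs are placeholders, and it assumes
the image includes Squeak's SecureHashAlgorithm (SHA-1) class.

```smalltalk
"Hypothetical sketch: modules are named by a hash of their contents and
 a separate service maps that ID to wherever the bytes currently live."
Object subclass: #ModuleLocator
    instanceVariableNames: 'locations'
    classVariableNames: ''
    poolDictionaries: ''
    category: 'Modules-Sketch'

ModuleLocator >> locations
    "Lazily created, so the sketch does not depend on #new sending #initialize."
    ^ locations ifNil: [locations := Dictionary new]

ModuleLocator >> idFor: moduleBytes
    "An immutable module is identified by the SHA-1 of its contents."
    ^ SecureHashAlgorithm new hashMessage: moduleBytes

ModuleLocator >> register: moduleBytes at: aLocationString
    | id |
    id := self idFor: moduleBytes.
    self locations at: id put: aLocationString.
    ^ id

ModuleLocator >> relocate: anId to: aLocationString
    "Moving or renaming the file changes only this table, never the ID."
    self locations at: anId put: aLocationString

ModuleLocator >> locationOf: anId
    ^ self locations at: anId ifAbsent: [self error: 'unknown module ID']

"Workspace usage with placeholder contents and URLs."
locator := ModuleLocator new.
id := locator
    register: 'module contents go here' asByteArray
    at: 'http://example.org/modules/parser-1'.
locator relocate: id to: 'http://example.org/archive/parser-1'.
locator locationOf: id.
```

Anything that refers to the module holds only the ID, so references
survive the relocation unchanged.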

The iAPX432 (Intel's first 32 bit processor, for you young ones) got me
interested in capability based security. Having different people see
different interfaces for the same object is the most flexible way to
express capabilities. I can imagine more than one way to implement that
and for all of them you would get the ability to put extensions to one
module inside another and to load incompatible stuff simultaneously "for
free".
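
One of those ways, sketched very roughly: hand each client a forwarding
object that passes on only the selectors that client is allowed to use.
RestrictedView and its selectors are hypothetical names, and this is
just one possible approach, not a claim about how it should be done in
Squeak.

```smalltalk
"Hypothetical sketch: a view forwards only the permitted messages to
 its target, so two clients can see two interfaces for one object."
Object subclass: #RestrictedView
    instanceVariableNames: 'target allowed'
    classVariableNames: ''
    poolDictionaries: ''
    category: 'Modules-Sketch'

RestrictedView class >> on: anObject allowing: aCollectionOfSelectors
    ^ self basicNew setTarget: anObject allowed: aCollectionOfSelectors

RestrictedView >> setTarget: anObject allowed: aCollectionOfSelectors
    target := anObject.
    allowed := aCollectionOfSelectors

RestrictedView >> doesNotUnderstand: aMessage
    "Forward permitted messages; everything else fails as usual."
    (allowed includes: aMessage selector)
        ifFalse: [^ super doesNotUnderstand: aMessage].
    ^ aMessage sendTo: target

"Workspace usage: a read-only view of a collection."
stack := OrderedCollection new.
stack addLast: 3.
reader := RestrictedView on: stack allowing: #(#last).
reader last.          "forwarded to stack"
reader removeLast.    "not permitted, raises doesNotUnderstand"
```

Because the view inherits from Object, messages that Object already
understands are answered by the view itself rather than checked and
forwarded; a stricter version would need a much smaller superclass.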

One more thing ;-)  (sorry that this is so long already) - in a binary
module system I would move the sources from external files into the
image (not really - into a module). So I would have a "main" module that
would include the class objects, the compiled methods and anything else
that is needed at run time. This shouldn't be too big and would be the
only thing that most users would ever load. A "source" module would have
a set of string objects, which happen to be the sources for the stuff in
"main", which would have inter-module pointers to the strings so trying
to see the methods in a browser or debugger would automatically load
"source". I would include at least a third module, "docs". This would
have objects with far nicer documentation than just the sources, and I
would include in that a set of test methods. This documentation would be
more at a package level than method comments or class comments and could
include animations and all kinds of neat stuff, including pointers to
related objects in other modules. This would automatically be loaded
when my actions in the browser implied "serious" use of the module.
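
The inter-module pointer from "main" to "source" could behave roughly
like the following sketch, where touching the source for the first time
triggers the load. LazySource and its selectors are hypothetical, and
the block in the usage lines merely stands in for whatever would really
load the "source" module.

```smalltalk
"Hypothetical sketch: a stand-in for a source string that loads the
 real thing only when a browser or debugger first asks for it."
Object subclass: #LazySource
    instanceVariableNames: 'loadBlock string'
    classVariableNames: ''
    poolDictionaries: ''
    category: 'Modules-Sketch'

LazySource class >> loadingWith: aBlock
    ^ self basicNew setLoadBlock: aBlock

LazySource >> setLoadBlock: aBlock
    loadBlock := aBlock

LazySource >> string
    "Evaluate the load block on first use only, then keep the result."
    ^ string ifNil: [string := loadBlock value]

"Workspace usage; the block is a placeholder for loading a module."
src := LazySource loadingWith: ['answer ^ 42'].
src string.    "first use evaluates the block"
src string.    "later uses answer the cached string"
```

Users who never open a browser never evaluate the block, which matches
the idea that most people would only ever load "main".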

-- Jecel