Environments

Wed Nov 6 22:34:22 UTC 2002

After reading Alan Kay's "Is 'Software Engineering' an Oxymoron?" in
appendix B of the Croquet User Manual, it dawned on me that the solution
to our modularity problems lies in late binding.  I have always tried to
think of solutions that would not add extra levels of indirection
because of the stigma of performance impact.  But after reading Croquet
and Alan's article, I felt free to use as much indirection (late
binding) as necessary.  Machines are fast, and techniques like JIT
inlining and caches, such as the method cache, can be used to provide
temporary quick bindings.

So below is another module design, one that relies heavily on late
binding.  This design was inspired by PIE, Subjective Self (Us), and
PerspectiveS.

Environments and Variables

All words (variables and selectors) in an executing method are looked up
in the current context.  The current context means thisContext for local
variables, the stack for thread variables (a generalization of exception
handlers), and the process's current environment for environment
(global) variables and methods.  Instance variables are no longer
referenced directly by name but indirectly through accessors.  Class
variables are just environment variables whose lookup key includes their
class.  Methods are also just environment variables, with the selector
and class as the lookup key.  Methods are no longer held directly by
their class but by each environment, so they can be easily overridden.

The general algorithm for looking up an environment variable is:  Look
in the current environment and if not there recursively look in its
inherited environments.
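
A minimal sketch of this variable lookup in Python, not Squeak code.
The Environment class, its bindings and parents fields, and the example
names are all assumptions made for illustration.  This version simply
takes the first match it finds and ignores ambiguity between inherited
environments.

```python
class Environment:
    """Hypothetical model: local bindings plus inherited environments."""

    def __init__(self, name, parents=()):
        self.name = name
        self.parents = list(parents)   # inherited environments
        self.bindings = {}             # key -> value defined here

    def lookup(self, key):
        """Look here first, then recursively in inherited environments."""
        if key in self.bindings:
            return self.bindings[key]
        for parent in self.parents:
            try:
                return parent.lookup(key)
            except KeyError:
                continue
        raise KeyError(key)


# Example data (hypothetical names).
base = Environment("Base")
base.bindings["Transcript"] = "base transcript"
base.bindings["Smalltalk"] = "system dictionary"

user = Environment("User", parents=[base])
user.bindings["Transcript"] = "user transcript"   # overrides Base's binding
```

Here user.lookup("Transcript") finds the user's own binding, while
user.lookup("Smalltalk") falls through to Base.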

The general algorithm for looking up a method to execute for a message
send is:  Look in the current environment for the key composed of the
message's selector and the receiver's class, and if not there
recursively look in its inherited environments.  If still not found,
search all the environments again using the receiver's superclass in the
key, and so on.
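
The method lookup can be sketched the same way.  This is a Python
illustration under assumptions, not VM code: Environment, find, and
lookup_method are hypothetical names, Python classes stand in for
Smalltalk classes, and a failed lookup raises AttributeError where
Smalltalk would send doesNotUnderstand:.

```python
class Environment:
    """Hypothetical environment holding methods keyed by (selector, class)."""

    def __init__(self, name, parents=()):
        self.name = name
        self.parents = list(parents)
        self.methods = {}   # (selector, class) -> callable

    def find(self, key):
        """Search this environment, then its inherited environments."""
        if key in self.methods:
            return self.methods[key]
        for parent in self.parents:
            found = parent.find(key)
            if found is not None:
                return found
        return None


def lookup_method(env, selector, receiver):
    """Try the receiver's class in all environments, then each superclass."""
    cls = type(receiver)
    while cls is not None:
        method = env.find((selector, cls))
        if method is not None:
            return method
        cls = cls.__base__   # becomes None after object
    raise AttributeError(f"doesNotUnderstand: {selector}")


class Shape:
    pass


class Circle(Shape):
    pass


base = Environment("Base")
base.methods[("describe", Shape)] = lambda self: "a shape"

patch = Environment("Patch", parents=[base])
patch.methods[("describe", Circle)] = lambda self: "a circle (patched)"
```

A process running in Patch sees the patched method for circles, while a
process running in Base still gets the Shape method.  Because every
environment is searched before moving to the superclass, a method
defined for the receiver's exact class in an inherited environment wins
over a superclass method in the current environment.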

The Advantage

This late binding of selectors/variables to methods/objects allows
environments to intervene and redirect.  This is equivalent to what a
module does when it is installed.  The difference is that environments
don't need to be installed.  This allows us to keep our "single mode"
principle: no distinction between runtime and deployment/development.
This is an important discipline in Squeak and Croquet which we should
not compromise.

Many environments can reside together in the same system, and processes
can run in any environment and change environments.

Environments as Changesets

A user can run inside his own environment so any changed/new variables,
classes, and methods will be added to his environment without affecting
the originals.  He can share his environment with others, so they can
inherit from it (import it).  He can easily move out of his environment
(uninstall) and start or load another one.

Multiple Inheritance of Environments

What if a user wants to use two independent environments at the same
time?  Single inheritance would require him to order them.  I prefer
using multiple inheritance and raising an exception when lookup is
ambiguous.  A lookup is unambiguous when a linear search over every
topological ordering of the inherited environments finds the same
result.
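
One possible way to detect ambiguity, sketched in Python.  Rather than
enumerating every topological ordering, this sketch assumes an
equivalent test: collect the nearest defining environment along each
inheritance path, discard any that another candidate inherits from, and
raise if more than one remains.  The class and exception names are
hypothetical, and two different environments binding the same key are
treated as a conflict even if their values happen to be equal.

```python
class AmbiguousLookup(Exception):
    pass


class Environment:
    def __init__(self, name, parents=()):
        self.name = name
        self.parents = list(parents)
        self.bindings = {}

    def definers(self, key):
        """Nearest environments that define key along each path."""
        if key in self.bindings:
            return {self}
        found = set()
        for parent in self.parents:
            found |= parent.definers(key)
        return found

    def ancestors(self):
        """All environments this one inherits from, directly or not."""
        seen, stack = set(), list(self.parents)
        while stack:
            env = stack.pop()
            if env not in seen:
                seen.add(env)
                stack.extend(env.parents)
        return seen

    def lookup(self, key):
        candidates = self.definers(key)
        # A definer inherited by another candidate is already overridden.
        shadowed = set()
        for env in candidates:
            shadowed |= env.ancestors()
        winners = candidates - shadowed
        if not winners:
            raise KeyError(key)
        if len(winners) > 1:
            names = sorted(env.name for env in winners)
            raise AmbiguousLookup(f"{key!r} is defined in {names}")
        return next(iter(winners)).bindings[key]


# Example data (hypothetical names).
kernel = Environment("Kernel")
kernel.bindings["Object"] = "kernel Object"
graphics = Environment("Graphics", parents=[kernel])
graphics.bindings["Buffer"] = "pixel buffer"
sound = Environment("Sound", parents=[kernel])
sound.bindings["Buffer"] = "sample buffer"
user = Environment("User", parents=[graphics, sound])
```

Looking up "Object" from User is fine because both paths lead to the
same definition in Kernel.  Looking up "Buffer" raises AmbiguousLookup
until User defines its own binding.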

I use the term "inherit" for environments instead of "import" because
(to me) import implies seeing only an environment's variables but not
seeing through to its imports.  Inheritance, on the other hand, implies
seeing it and through it.

Namespaces

Environments already provide a namespace: you can only see what's in
your environment and its inherited environments.  But what if you are
inheriting two independent environments that happen to define the same
name?  To solve this I am defining a word (variable or selector) as more
than just a name: it is a location in a specific environment.  Another
environment overrides an inherited location in its own environment by
specifying the original environment and the name.  If it does not
specify the original environment, the name is considered a new location
in its own environment.  So words are unique by name and original
environment.

Selectors are defined independently of any class in an original
environment.  And classes are defined independently of any selectors in
an original environment.  Methods are then defined in any environment by
specifying the locations of their selector and class.

The Smalltalk compiler will look up words starting from the environment
the method is defined in.  It will return the location matching each
word string.  If more than one location matches, then a pop-up is given
to the user allowing him to choose the original environment.  When
displaying a method, environment names will be prefixed to words that
are ambiguous.  A menu option will allow the user to see all hidden
prefixes if he wants.
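
A rough Python sketch of words as locations.  Location, define,
resolve, and value_at are hypothetical names chosen for illustration.
resolve stands in for the compiler step that maps a word string to its
candidate locations, and the pop-up is left out: the caller just
receives the list of matches.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Location:
    origin: str   # name of the environment that first defined the word
    name: str


class Environment:
    def __init__(self, name, parents=()):
        self.name = name
        self.parents = list(parents)
        self.bindings = {}   # Location -> value

    def define(self, name, value, origin=None):
        """Bind a word; with no origin this creates a new location here."""
        location = Location(origin or self.name, name)
        self.bindings[location] = value
        return location

    def visible_locations(self):
        seen = set(self.bindings)
        for parent in self.parents:
            seen |= parent.visible_locations()
        return seen

    def resolve(self, name):
        """All visible locations matching a word string, sorted by origin."""
        matches = (loc for loc in self.visible_locations() if loc.name == name)
        return sorted(matches, key=lambda loc: loc.origin)

    def value_at(self, location):
        if location in self.bindings:
            return self.bindings[location]
        for parent in self.parents:
            try:
                return parent.value_at(location)
            except KeyError:
                continue
        raise KeyError(location)


# Example data (hypothetical names).
graphics = Environment("Graphics")
graphics.define("Buffer", "pixel buffer")
sound = Environment("Sound")
sound.define("Buffer", "sample buffer")
user = Environment("User", parents=[graphics, sound])

matches = user.resolve("Buffer")   # two locations, so the user must choose
user.define("Buffer", "my pixel buffer", origin="Graphics")   # an override
```

The override in User replaces only the Graphics location.  The Sound
location with the same name is untouched, and Graphics itself still
sees its original value.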

So now new environments can be inherited without worrying about name
clashes.  If an environment does override a word/location, then it is
intentionally overriding a specific definition, not something that
merely happens to have the same name.

Performance Optimizations

The existing VM method cache will work, but it now needs to be flushed
on process environment change or saved with each environment.  An
environment variable cache will be needed as well.  Again, it either has
to be flushed on environment change or saved with each environment.
Finally, Jitter optimizations are also specific to each environment and
will also need to be flushed or saved on environment change.  If
environments don't change often and cache hit rates are high, then
execution should not be much slower than today.
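
The two cache strategies can be compared with a small Python model.
This is only an illustration under assumptions: Process, switch_to, and
the keep_caches flag are hypothetical names, single inheritance is used
to keep the slow path short, and invalidating a cache when a binding
changes is left out.

```python
class Environment:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.bindings = {}

    def slow_lookup(self, key):
        """Uncached walk up the inheritance chain."""
        env = self
        while env is not None:
            if key in env.bindings:
                return env.bindings[key]
            env = env.parent
        raise KeyError(key)


class Process:
    def __init__(self, environment, keep_caches=True):
        self.environment = environment
        self.keep_caches = keep_caches
        self.saved = {}    # environment -> cache ("saved" strategy)
        self.cache = {}    # cache for the current environment
        self.misses = 0    # number of slow lookups performed

    def switch_to(self, environment):
        if self.keep_caches:
            # Save the current cache and restore the target's, if any.
            self.saved[self.environment] = self.cache
            self.cache = self.saved.get(environment, {})
        else:
            self.cache = {}   # "flush" strategy
        self.environment = environment

    def lookup(self, key):
        if key not in self.cache:
            self.misses += 1
            self.cache[key] = self.environment.slow_lookup(key)
        return self.cache[key]
```

With saved caches, a process that switches away and back pays no extra
slow lookups for words it has already resolved.  With flushing, every
switch starts cold, which costs time but needs no extra memory per
environment.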

Cheers,
Anthony

Environments - Extreme late binding