Environments

Tue Nov 26 14:43:42 UTC 2002

I marked this message long ago to reply to it later; here is the reply:

Anthony Hannan wrote:
> After reading Alan Kay's "Is 'Software Engineering' an Oxymoron?" in
> appendix B of the Croquet User Manual, it dawned on me that the solution
> to our modularity problems lies in late binding.  I have always tried to
> think of solutions that would not add extra levels of indirection
> because of the stigma of performance impact.  But after reading Croquet
> and Alan's article, I felt free to use as much indirection (late
> binding) as necessary.

> Machines are fast, plus techniques like JIT
> inlining and caches, such as the method cache, can be used to provide
> temporary quick bindings.

Agreed.

> 
> So below is another module design, but which significantly uses late
> binding.  This design was inspired by: PIE, Subjective Self (Us), and
> PerspectiveS.
> 
> Environments and Variables
> 
> All words (variables and selectors) in an executing method are looked up
> in the current context.  The current context means thisContext for local
> variables, the stack for thread variables (generalization of exception
> handlers), and the process's current environment for environment
> (global) variables and methods.  Instance variables are no longer
> referenced directly by name but indirectly by accessors.  Class vars are
> just environment variables that have their class included in their lookup
> key.  And methods are also just environment variables with their selector
> and class as their lookup key.  Methods are no longer held directly by
> their class, but by each environment, so they can be easily overridden.

Independent of the future of modules and their relatives, I like
- consistently using accessor methods,
- viewing and implementing method lookup by (class, selector) pairs,
- nested environments
  (with one Smalltalk environment for the current image as a start).

> 
> The general algorithm for looking up an environment variable is:  Look
> in the current environment and if not there recursively look in its
> inherited environments.
> 
> The general algorithm for looking up a method to execute for a message
> send is:  Look in the current environment for the key composed of the
> message's selector and the receiver's class, and if not there
> recursively look in its inherited environments.  If still not found,
> search all the environments again using the receiver's superclass in the
> key, and so on.
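
The two lookup rules quoted above can be sketched roughly as follows. This is only an illustration in Python, not Squeak code, and it assumes single inheritance of environments; the names Environment, lookup, lookup_method and send are invented for the sketch and appear nowhere in the proposal.

```python
class Environment:
    """A namespace that falls back to one inherited (parent) environment."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.bindings = {}  # variable name or (class, selector) -> value

    def lookup(self, key):
        """Look in this environment, then recursively in the inherited one."""
        env = self
        while env is not None:
            if key in env.bindings:
                return env.bindings[key]
            env = env.parent
        raise KeyError(key)

    def lookup_method(self, cls, selector):
        """Search all environments for the receiver's class first, and only
        then repeat the whole search with each superclass in the key."""
        for klass in cls.__mro__:
            try:
                return self.lookup((klass, selector))
            except KeyError:
                continue
        raise AttributeError(selector)


def send(env, receiver, selector, *args):
    """Late-bound message send: the method is found through the environment."""
    return env.lookup_method(type(receiver), selector)(receiver, *args)


# Hypothetical classes and methods, for illustration only.
class Animal: pass
class Dog(Animal): pass

base = Environment("base")
base.bindings[(Animal, "speak")] = lambda self: "..."
base.bindings[(Dog, "speak")] = lambda self: "Woof"

user = Environment("user", parent=base)
user.bindings[(Animal, "speak")] = lambda self: "patched"
```

Note the consequence of the search order: a Dog sent "speak" in the user environment still answers "Woof", because the (Dog, speak) key is tried in every environment before the superclass key is considered at all.
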
> 
> The Advantage
> 
> This late binding of selectors/variables to methods/objects allows
> environments to intervene and redirect.  This is equivalent to what a
> module does when it is installed.  But the difference is environments
> don't need to be installed.  This allows us to keep our "single mode"
> principle: no distinction between runtime and deployment/development. 
> This is an important discipline in Squeak and Croquet which we should
> not compromise.
> 
> Many environments can reside together in the same system, and processes
> can run in any environment and change environments.
> 
> Environments as Changesets
> 
> A user can run inside his own environment so any changed/new variables,
> classes, and methods will be added to his environment without affecting
> the originals.  He can share his environment with others, so they can
> inherit from it (import it).  He can easily move out of his environment
> (uninstall) and start or load another one.

I would merely view environments as namespaces. A user environment could be
used to temporarily switch on/off *groups* of changed methods of arbitrary
classes to compare different implementations and/or bug fixes related to
multiple methods at once. This is similar to what I've seen in the PIE paper.

Later the user environment could be merged into the inherited standard one
in case of a patch/fix or be renamed and made loadable in case of a
variant/extension.
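
As a rough sketch of this usage (Python, purely illustrative): a fix environment holds a group of changed methods, switching the current environment turns the whole group on or off, and a merge folds it into the inherited one. The names Environment, merge_into_parent and the Date methods are hypothetical, and class names are plain strings to keep the example short.

```python
class Environment:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.methods = {}  # (class name, selector) -> implementation

    def lookup(self, key):
        """Look here first, then in the inherited environments."""
        env = self
        while env is not None:
            if key in env.methods:
                return env.methods[key]
            env = env.parent
        raise KeyError(key)

    def merge_into_parent(self):
        """Fold this environment's changes into the inherited one (patch/fix case)."""
        self.parent.methods.update(self.methods)
        self.methods.clear()


standard = Environment("Standard")
standard.methods[("Date", "printOn:")] = "old printOn:"
standard.methods[("Date", "asSeconds")] = "old asSeconds"

# A group of related changes lives together in one user environment.
fix = Environment("DateFix", parent=standard)
fix.methods[("Date", "printOn:")] = "new printOn:"
fix.methods[("Date", "asSeconds")] = "new asSeconds"

current = fix        # group switched on: both new methods are seen
current = standard   # group switched off: both old methods are seen again
```
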

> 
> Multiple Inheritance of Environments
> 
> What if a user wants to use two independent environments at the same
> time?  Single inheritance would require him to order them.  I prefer
> using multiple inheritance and raising an exception when lookup is
> ambiguous.  Unambiguous means that a linear search of every topological
> sort of the inherited environments finds the same result.
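
One way to read this ambiguity rule, sketched in Python under stated assumptions: "same result" is taken to mean "same defining environment", and the topological orderings are enumerated by brute force, which is only workable for a handful of environments. Env, AmbiguousLookup, topological_orders and lookup are names invented for this sketch.

```python
from itertools import permutations


class AmbiguousLookup(Exception):
    pass


class Env:
    def __init__(self, name, parents=(), **bindings):
        self.name = name
        self.parents = tuple(parents)
        self.bindings = dict(bindings)


def ancestors(env):
    """Return env and everything it inherits from, without duplicates."""
    seen, stack = [], [env]
    while stack:
        e = stack.pop()
        if e not in seen:
            seen.append(e)
            stack.extend(e.parents)
    return seen


def topological_orders(env):
    """Yield every ordering in which each environment precedes its parents."""
    nodes = ancestors(env)
    for order in permutations(nodes):
        pos = {e: i for i, e in enumerate(order)}
        if all(pos[e] < pos[p] for e in nodes for p in e.parents):
            yield order


def lookup(env, key):
    """Linear search along every ordering; all must agree on who defines key."""
    found = []  # distinct environments that came first in some ordering
    for order in topological_orders(env):
        hit = next((e for e in order if key in e.bindings), None)
        if hit is not None and hit not in found:
            found.append(hit)
    if not found:
        raise KeyError(key)
    if len(found) > 1:
        raise AmbiguousLookup(key, [e.name for e in found])
    return found[0].bindings[key]
```

With a base environment inherited by two siblings that both define y, a user environment inheriting both siblings gets an AmbiguousLookup for y, while a name defined only in base, or redefined in the user environment itself, resolves without complaint.
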

Currently I'd prefer to stay with single inheritance, for simplicity and lookup
performance.
Simplicity also applies to how understandable the system is for human
developers.
To put it briefly and simply: it doesn't help to create a system that only
works well for creatures with an IQ (whatever that means, but that is another
discussion) > 1000.

It has to be a goal to keep it both simple and powerful!
(A counterexample is C++, which is powerful but *not* simple. Note: I know
that the comparison with Smalltalk is flawed in many ways.)

Greetings,

Stephan

> 
> I use the term "inherit" for environments instead of "import" because
> (to me) import implies seeing only an environment's variables but not
> seeing through to its imports.  Inheritance, on the other hand, implies
> seeing it and through it.
> 
> Namespaces
> 
> Environments already provide a namespace - you can only see what's in
> your environment and its inherited environments.  But what if you are
> inheriting two independent environments that happen to define the same
> name?  To solve this I am defining a word (variable or selector) as more
> than just a name: it is a location in a specific environment.  Another
> environment overrides an inherited location in its own environment by
> specifying the original environment and the name.  If it does not
> specify the original environment, the name is considered a new location
> in its environment.  So words are unique by name and original
> environment.
> 
> Selectors are defined independent of any class in an original
> environment.  And classes are defined independent of any selectors in an
> original environment.  Methods are then defined in any environment by
> specifying the locations of their selector and class.
> 
> The Smalltalk compiler will look up words starting from the environment
> the method is defined in.  It will return the location matching each
> word string.  If more than one location matches, a pop-up lets the
> user choose the original environment.  When displaying a method,
> environment names will be prefixed to words that are ambiguous.  A menu
> option will let the user reveal any hidden prefixes he wants.
> 
> So now new environments can be inherited without worrying about name
> clashes.  If an environment does override a word/location, then it is
> overriding a specific definition, which is intended, not something that
> happens by chance to have the same name.
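
A small Python sketch of words as locations, under my own assumptions about the details: a location is modeled as an (original environment, name) pair, an override names the origin explicitly, and resolving a bare name returns every candidate location so that a tool (the pop-up described above) could ask the user when there is more than one. Location, Env, define, resolve and value_at are hypothetical names, and the sketch does not check that an overridden location really exists in an inherited environment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Location:
    origin: str  # name of the environment that first defined the word
    name: str


class Env:
    def __init__(self, name, parents=()):
        self.name = name
        self.parents = tuple(parents)
        self.bindings = {}  # Location -> value

    def define(self, name, value, origin=None):
        """Without origin: a new location here.  With origin: an override."""
        loc = Location(origin or self.name, name)
        self.bindings[loc] = value
        return loc

    def visible(self):
        """This environment and, seen through it, everything it inherits."""
        seen, queue = [], [self]
        while queue:
            e = queue.pop(0)
            if e not in seen:
                seen.append(e)
                queue.extend(e.parents)
        return seen

    def resolve(self, name):
        """All distinct locations a bare name could refer to, sorted by origin."""
        locs = {loc for e in self.visible() for loc in e.bindings if loc.name == name}
        return sorted(locs, key=lambda loc: loc.origin)

    def value_at(self, loc):
        """Nearest binding of a location, searching breadth-first from here."""
        for e in self.visible():
            if loc in e.bindings:
                return e.bindings[loc]
        raise KeyError(loc)


# Two independent environments that happen to define the same name.
graphics = Env("Graphics")
graphics.define("Point", "graphics point")
geo = Env("Geo")
geo.define("Point", "geo point")

user = Env("User", [graphics, geo])
user.define("Point", "patched point", origin="Graphics")  # deliberate override
```

Here user.resolve("Point") yields two locations, one per original environment, and the override touches only the Graphics one; the Geo definition is left alone even though it has the same name.
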
> 
> Performance Optimizations
> 
> The existing VM method cache will work, but now needs to be flushed on
> process environment change or saved with each environment.  An
> environment variable cache will be needed as well.  Again it either has
> to be flushed on environment change or saved with each environment. 
> Finally, Jitter optimizations are also specific to each environment, and
> will also need to be flushed or saved on environment change.  If
> environments don't change often and cache hit rates are high, then
> execution should not be much slower than today.
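
To illustrate the "saved with each environment" option, here is a hedged Python sketch of a method cache kept per environment, so that switching back to an earlier environment finds its cache still warm. It says nothing about how the real VM method cache works; Environment, Process, switch_to and the misses counter are invented for the example, and invalidating a cache when methods are added or changed is left out.

```python
class Environment:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.methods = {}  # (class, selector) -> function

    def slow_lookup(self, cls, selector):
        """Full late-bound search: every environment, class by class."""
        for klass in cls.__mro__:
            env = self
            while env is not None:
                if (klass, selector) in env.methods:
                    return env.methods[(klass, selector)]
                env = env.parent
        raise AttributeError(selector)


class Process:
    """Holds the current environment and one method cache per environment."""

    def __init__(self, env):
        self.env = env
        self.caches = {}  # environment name -> {(class, selector): method}
        self.misses = 0   # counts how often the slow lookup had to run

    def switch_to(self, env):
        # Saving variant: caches of other environments stay in self.caches.
        # The flushing variant would call self.caches.clear() here instead.
        self.env = env

    def send(self, receiver, selector, *args):
        cache = self.caches.setdefault(self.env.name, {})
        key = (type(receiver), selector)
        method = cache.get(key)
        if method is None:
            self.misses += 1
            method = cache[key] = self.env.slow_lookup(*key)
        return method(receiver, *args)
```

The trade-off is the usual one: flushing costs a cold cache after every switch, while saving costs memory proportional to the number of environments a process has visited.
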
> 
> Cheers,
> Anthony
> 
> 

-- 
Stephan Rudlof (sr at evolgo.de)
   "Genius doesn't work on an assembly line basis.
    You can't simply say, 'Today I will be brilliant.'"
    -- Kirk, "The Ultimate Computer", stardate 4731.3

Environments - Extreme late binding