Eliminating superclass lookup in the VM (and dynamic composition of behavior)

Thu Dec 12 20:59:54 UTC 2002

At 09:26 AM 12/12/2002 +0100, Nathanael Schärli wrote:
>...
>Nevertheless, I see that while developing and experimenting with such a
>conceptual model, it would often be nice if the VM would already provide
>more flexibility, even if this additional power would be accessible from
>the image-level in a rather ad-hoc way.
>
>Nathanael

I'm making my contribution to this thread off of Nathanael's posting because 
his points seem closest to my perspective on these issues. So, here are my 
two bits worth.

I think it is a mistake to think of the language and "virtual machine" as a 
single tightly coupled unit.  Instead you want to have a number of layered 
abstractions which collectively implement your language(s).  Each layer has 
a specific purpose and should ideally only interact with its neighboring 
layers. The design and implementation of each layer can be optimized for 
its specific purpose. If you design the lower layers of this tower 
carefully then you will have much flexibility in supporting differing 
languages or language semantic variations in the upper layers.

Here is a cut at identifying a set of such layers starting from the 
"highest" and proceeding to the "lowest".  Next to each I have identified 
elements of a typical Smalltalk implementation that conceptually fit into 
each layer:

         Language Specification  (the ANSI Smalltalk standard, the
                 specification of the syntax and semantics of Smalltalk
                 programs)
         Language Specific MOP  (objects modeling classes, metaclasses,
                 methods, etc.; the bytecode compiler, etc.)
         Language Independent Object Execution Model  (the bytecode
                 instruction set, contexts, closures, continuations,
                 object structure, etc.)
         Execution Engine  (bytecode interpreter and/or jitter,
                 memory manager, etc.)
         Host ISP Architecture  (your favorite microprocessor)

You get a more rigid, less flexible system when a layer knows too much 
about other non-adjacent layers.  An example of this would be an Execution 
Engine that knows it is specifically implementing ANSI Smalltalk (or 
Smalltalk-80/Squeak or Self) and hence contains design decisions based upon 
that knowledge. If you change the language in a way that invalidates those 
decisions then the system won't work. This sort of undesirable cross layer 
coupling works both ways: if your language specification requires that the 
execution engine work in a specific way then it will be very difficult to 
innovate at the Execution Engine layer. It is very easy to blur these 
layers and create such couplings when you are building a system, as a unit, 
to support a single specific language.  This has tended to be the case for 
most Smalltalk implementations.

One way to design such layers is to focus on each layer's role and what it 
provides to its immediately adjacent layers. Here is a closer look at the 
three middle layers above.

Language Specific MOP: Provides a language specific object model that can 
be used to represent and manipulate programs and their runtime state for 
some specific language. Translates such programs to/from the Execution Model.

Language Independent Object Execution Model: Provides a set of runtime 
abstractions that can be used to represent programs in a variety of 
object-oriented languages. The runtime abstractions need to be sufficient for 
modeling the runtime semantics of all hosted languages (this mapping is 
performed by the MOP layer). The runtime abstractions also need to be 
efficiently implementable by the Execution Engine.

Execution Engine: Animates the Object Execution Model by translating its 
runtime abstractions to/from data structures and executable code for some 
specific host ISP.

The issues that have been discussed in this thread largely relate to the 
Execution Model layer. The Squeak/Smalltalk-80 instruction set has a very 
specific runtime abstraction ("Superclass lookup") for message 
binding.  This abstraction assumes a very specific set of language 
semantics and a single very Smalltalk-80 specific MOP.  The Digitalk 
Execution Model was "better" because its runtime abstraction for message 
binding was more general and less dependent upon a specific MOP. It 
supports multi-level lookup without introducing or requiring the language 
specific concept of superclass. However, the Digitalk Execution Model had 
plenty of other "unnecessary" cross layer dependencies.
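
To make the contrast concrete, here is a minimal sketch of lookup driven by a 
per-object array of binding tables rather than by a superclass chain. It is 
written in Python only so that it can be run as is. Every name in it (Obj, 
bind, STClass, binding_array, compose) is a hypothetical illustration and is 
not taken from the Digitalk or Squeak implementations. The point it tries to 
show is that the lookup routine never mentions classes: one MOP flattens a 
single-inheritance chain into the array, another builds the array by ad-hoc 
composition, and the lookup code is the same for both.

```python
# Hypothetical sketch: names and structure are illustrative assumptions,
# not the actual Digitalk or Squeak design.

class Obj:
    """Execution Model object: state plus an ordered array of binding tables."""
    def __init__(self, bindings, **state):
        self.bindings = bindings   # list of dicts mapping selector -> callable
        self.state = state


def bind(receiver, selector):
    """Multi-level lookup: scan the receiver's binding tables in order.

    Nothing here knows about classes or superclasses.
    """
    for table in receiver.bindings:
        method = table.get(selector)
        if method is not None:
            return method
    raise LookupError(f"doesNotUnderstand: {selector}")


# --- MOP 1: Smalltalk-style single inheritance --------------------------
class STClass:
    def __init__(self, name, superclass=None, methods=None):
        self.name = name
        self.superclass = superclass
        self.methods = methods or {}

    def binding_array(self):
        """Flatten the superclass chain into the array that bind() scans."""
        tables, cls = [], self
        while cls is not None:
            tables.append(cls.methods)
            cls = cls.superclass
        return tables

    def new(self, **state):
        return Obj(self.binding_array(), **state)


# --- MOP 2: dynamic composition of behavior, no classes at all ----------
def compose(*behaviors, **state):
    """Build an object directly from an ordered list of behavior tables."""
    return Obj(list(behaviors), **state)


object_cls = STClass("Object", methods={"printString": lambda self: "an Object"})
point_cls = STClass("Point", object_cls, {"x": lambda self: self.state["x"]})

p = point_cls.new(x=3)
printable = {"printString": lambda self: "composed"}
q = compose(printable, point_cls.methods, x=7)

results = [
    bind(p, "x")(p),            # found in Point's table
    bind(p, "printString")(p),  # found in Object's table
    bind(q, "x")(q),            # found in a table borrowed from Point
    bind(q, "printString")(q),  # found in the first composed table
]
print(results)
```

In this sketch the concept of "superclass" lives entirely in STClass, that is, 
in the language specific MOP; the Execution Model only sees the array.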

One specific recommendation I have is that if you want a system that is 
going to be flexible enough to implement a variety of languages or to 
implement alternative language semantics then your Execution Model probably 
needs to separate the operations for "message binding" from "method 
invocation". You also need a very flexible primitive binding mechanism that 
Execution Engines can implement efficiently without much effort. The Digitalk 
object-specific "array of bindings" is a pretty good starting point for 
thinking about such binding mechanisms.
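
As a rough illustration of why separating the two operations buys flexibility, 
the following sketch exposes binding and invocation as distinct primitives. It 
is Python rather than Smalltalk so that it runs as is, and all names (bind, 
invoke, send, delegate, and the dictionary layout of objects) are assumptions 
made for the example, not an existing API. An ordinary send is then just bind 
followed by invoke, while a different language semantics, such as 
delegation, binds in one object and invokes on another.

```python
# Hypothetical sketch; the operation names and object layout are assumptions.

def bind(bindings, selector):
    """Message binding: map a selector to a method via an array of tables."""
    for table in bindings:
        if selector in table:
            return table[selector]
    raise LookupError(selector)


def invoke(method, receiver, *args):
    """Method invocation: run an already-bound method against a receiver."""
    return method(receiver, *args)


def send(receiver, selector, *args):
    """A conventional send: bind in the receiver, then invoke on it."""
    return invoke(bind(receiver["bindings"], selector), receiver, *args)


def delegate(receiver, parent, selector, *args):
    """Delegation: bind in the parent, but invoke on the original receiver."""
    return invoke(bind(parent["bindings"], selector), receiver, *args)


def describe(self):
    return "I am " + self["name"]


parent = {"bindings": [{"describe": describe}], "name": "parent"}
child = {"bindings": [{}], "name": "child"}

normal = send(parent, "describe")                # bound and run on parent
delegated = delegate(child, parent, "describe")  # bound in parent, run on child

# Because binding is its own step, its result can be kept and reused.
cached = bind(parent["bindings"], "describe")
repeated = [invoke(cached, parent) for _ in range(2)]

print(normal, delegated, repeated)
```

Neither variant required a change to bind or invoke; only the way the upper 
layer combined them differs.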

Allen_Wirfs-Brock at Instantiations.com