Eliminating superclass lookup in the VM (and dynamic
composition of behavior)
Allen Wirfs-Brock
Allen_Wirfs-Brock at Instantiations.com
Thu Dec 12 20:59:54 UTC 2002
At 09:26 AM 12/12/2002 +0100, Nathanael Schärli wrote:
>...
>Nevertheless, I see that while developing and experimenting with such a
>conceptual model, it would often be nice if the VM would already provide
>more flexibility, even if this additional power would be accessible from
>the image-level in a rather ad-hoc way.
>
>Nathanael
I'm making my contribution to this thread off of Nathanael's posting because
his points seem closest to my perspective on these issues. So, here are my
two bits worth.
I think it is a mistake to think of the language and "virtual machine" as a
single tightly coupled unit. Instead you want to have a number of layered
abstractions which collectively implement your language(s). Each layer has
a specific purpose and should ideally only interact with its neighboring
layers. The design and implementation of each layer can be optimized for
its specific purpose. If you design the lower layers of this tower
carefully then you will have much flexibility in supporting differing
languages or language semantic variations in the upper layers.
Here is a cut at identifying a set of such layers starting from the
"highest" and proceeding to the "lowest". Next to each I have identified
elements of a typical Smalltalk implementation that conceptually fit into
that layer:
    Language Specification (the ANSI Smalltalk standard, the
        specification of the syntax and semantics of Smalltalk programs)
    Language Specific MOP (objects modeling classes, metaclasses,
        methods, etc.; the bytecode compiler, etc.)
    Language Independent Object Execution Model (the bytecode
        instruction set, contexts, closures, continuations, object
        structure, etc.)
    Execution Engine (bytecode interpreter and/or jitter,
        memory manager, etc.)
    Host ISP Architecture (your favorite microprocessor)
You get a more rigid, less flexible, system when a layer knows too much
about other non-adjacent layers. An example of this would be an Execution
Engine that knows it is specifically implementing ANSI Smalltalk (or
Smalltalk-80/Squeak or Self) and hence contains design decisions based upon
that knowledge. If you change the language in a way that invalidates those
decisions then the system won't work. This sort of undesirable cross layer
coupling works both ways: if your language specification requires that the
execution engine work in a specific way then it will be very difficult to
innovate at the Execution Engine layer. It is very easy to blur these
layers and create such couplings when you are building a system, as a unit,
to support a single specific language. This has tended to be the case for
most Smalltalk implementations.
One way to design such layers is to focus on each layer's role and what it
provides to its immediately adjacent layers. Here is a closer look at the
three middle layers above.
Language Specific MOP: Provides a language specific object model that can
be used to represent and manipulate programs and their runtime state for
some specific language. Translates such programs to/from the Execution Model.
Language Independent Object Execution Model: Provides a set of runtime
abstractions that can be used to represent programs in a variety of
object-oriented languages. The runtime abstractions need to be sufficient
for modeling the runtime semantics of all hosted languages (this mapping is
performed by the MOP layer). The runtime abstractions also need to be
efficiently implementable by the Execution Engine.
Execution Engine: Animates the Object Execution Model by translating its
runtime abstractions to/from data structures and executable code for some
specific host ISP.
The issues that have been discussed in this thread largely relate to the
Execution Model layer. The Squeak/Smalltalk-80 instruction set has a very
specific runtime abstraction ("Superclass lookup") for message
binding. This abstraction assumes a very specific set of language
semantics and a single very Smalltalk-80 specific MOP. The Digitalk
Execution Model was "better" because its runtime abstraction for message
binding was more general and less dependent upon a specific MOP. It
supports multi-level lookup without introducing or requiring the language
specific concept of superclass. However, the Digitalk Execution Model had
plenty of other "unnecessary" cross layer dependencies.
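
To make the contrast concrete, here is a minimal Python sketch of the two styles of message binding. It is an illustration under stated assumptions, not a description of how the Squeak or Digitalk implementations actually work: the names `ClassWithSuper`, `superclass_lookup`, `bindings_lookup` and `bindings_from_class_chain` are hypothetical, and method tables are modeled as plain dictionaries.

```python
class ClassWithSuper:
    """Hypothetical class object for an execution model that knows about superclasses."""

    def __init__(self, name, methods, superclass=None):
        self.name = name
        self.methods = methods          # selector -> method
        self.superclass = superclass


def superclass_lookup(cls, selector):
    """Binding rule that is hard-wired to a single-inheritance class chain."""
    while cls is not None:
        if selector in cls.methods:
            return cls.methods[selector]
        cls = cls.superclass
    return None


def bindings_lookup(bindings, selector):
    """Binding rule over an ordered array of method tables.

    The execution model only sees "an array of tables"; it has no idea
    whether the tables came from superclasses, mixins, or anything else.
    """
    for table in bindings:
        if selector in table:
            return table[selector]
    return None


def bindings_from_class_chain(cls):
    """MOP-layer helper (assumed): flatten a class chain into an array of bindings."""
    tables = []
    while cls is not None:
        tables.append(cls.methods)
        cls = cls.superclass
    return tables


# Smalltalk-style hierarchy expressed in both forms.
object_class = ClassWithSuper("Object", {"printString": lambda self: "an Object"})
point_class = ClassWithSuper("Point", {"x": lambda self: 3}, superclass=object_class)
point_bindings = bindings_from_class_chain(point_class)

# A different composition rule, built without any superclass link:
# a mixin-like table is spliced between the two class tables.
comparable_table = {"isComparable": lambda self: True}
composed_bindings = [point_class.methods, comparable_table, object_class.methods]
```

In this sketch the language-specific rule (walk the superclass chain) lives in `bindings_from_class_chain`, which belongs to the MOP layer, while `bindings_lookup` stays the same no matter how the array was composed.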
One specific recommendation I have is that if you want a system that is
going to be flexible enough to implement a variety of languages or to
implement alternative language semantics then your Execution Model probably
needs to separate the operations for "message binding" from "method
invocation". You also need a very flexible primitive binding mechanism that
Execution Engines can implement efficiently. The Digitalk
object-specific "array of bindings" is a pretty good starting point for
thinking about such binding mechanisms.
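
As a rough illustration of keeping "message binding" and "method invocation" as separate operations, consider the following Python sketch. Everything in it is an assumption made for the example: objects are dictionaries carrying a `"bindings"` array, and `bind`, `invoke` and `smalltalk_style_send` are hypothetical names, not operations from any real instruction set.

```python
def bind(bindings, selector):
    """Message binding: find the method for a selector, or None. Runs no user code."""
    for table in bindings:
        if selector in table:
            return table[selector]
    return None


def invoke(method, receiver, *args):
    """Method invocation: run a method that has already been bound."""
    return method(receiver, *args)


def smalltalk_style_send(receiver, selector, *args):
    """One language's send semantics, composed from the two primitives.

    What to do when binding fails is a language-level decision; here the
    Smalltalk convention of falling back to doesNotUnderstand: is used.
    """
    method = bind(receiver["bindings"], selector)
    if method is None:
        fallback = bind(receiver["bindings"], "doesNotUnderstand:")
        if fallback is None:
            raise LookupError(selector)
        return invoke(fallback, receiver, selector)
    return invoke(method, receiver, *args)


def increment(receiver):
    receiver["count"] += 1
    return receiver["count"]


def does_not_understand(receiver, selector):
    return "doesNotUnderstand: " + selector


counter = {
    "bindings": [{"increment": increment, "doesNotUnderstand:": does_not_understand}],
    "count": 0,
}

# Because binding is its own operation, its result can be reused:
# bind once, then invoke as often as needed.
bound_increment = bind(counter["bindings"], "increment")
invoke(bound_increment, counter)
invoke(bound_increment, counter)
```

A language with different failure semantics could reuse `bind` and `invoke` unchanged and replace only the send function, which is the kind of flexibility the separation is meant to buy.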
Allen_Wirfs-Brock at Instantiations.com