[Squeak-e] Brands and Nested VMs (was: Programming the VM)

Tue Feb 4 15:34:26 CET 2003

At 01:45 PM 2/4/2003 Tuesday, Andreas Raab wrote:
>Okay, for the slow-witted like me - can someone please restate what this
>extra pointer would be good for?! I can see various ways of "making it
>happen" (with more or less complex designs) but it would really help me if I
>would understand why we actually need "some extra pointer per object".

I'll explain a possible adaptation of the KeyKOS / EROS brand mechanism. But 
before I do, I'd like to reiterate:

* I agree with Colin that making such changes in order to support virtually 
nested virtual machines is premature. Most of the rest of Squeak-E is based 
on ideas that have already been worked out in other languages including E. 
The present notions are also quite important -- they could largely resolve 
the apparent conflict between the requirements of object-granularity 
capability security and the extreme self-malleability pioneered by the 
Smalltalks. But there's very little precedent for these techniques. The only 
precedents I'm aware of are 1) IBM's VM and its ilk (VMware), and 2) "Meta 
Interpreters for Real" by Shmuel Safra and Udi Shapiro in 'Concurrent 
Prolog: Collected Papers', 1987. (I can't find it online. Does anyone have 
access to an electronic version? I do have paper, and could scan it in if 
need be.)

#2 is encouraging as a) it was language based, b) it was understood to be 
compatible with capability security (though this isn't mentioned in the 
paper), and c) because it was how the system really worked in production. 
Nevertheless, #1 and even #2 were applied to foundations sufficiently 
different from Squeak that we can only take them as suggestive of solutions.

* Although we do need extra state, I'm not at all convinced it needs to be 
per-object rather than per-Behavior. Many of the "meta-controls" we'll seek 
we'll probably obtain by source-to-source transformation anyway, in which 
case we'll want the instance to point at the transformed class, not the 
original one. Is there any reason not to be dynamically instantiating 
Behaviors? Since the meta-controls will probably only be used at low speed, 
and since the same meta info will be shared by many instances of the same 
class, this seems like a good tradeoff.

Btw, you wouldn't need to virtualize all classes on each virtualizer 
creation. If you do it lazily, then most virtualizers should only cause the 
virtualization of a small number of classes.
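
A minimal sketch of that lazy scheme, written in Python rather than Squeak 
purely so it can be run as-is. Everything named here (Virtualizer, 
add_meta_controls, meta_info) is hypothetical, and the "transformation" is 
only a stand-in: it builds a dynamically created subclass instead of doing a 
real source-to-source rewrite.

```python
class Virtualizer:
    """Hypothetical virtualizer that creates transformed classes on demand."""

    def __init__(self, transform):
        self._transform = transform   # maps an original class to its transformed class
        self._transformed = {}        # cache: original class -> transformed class

    def virtualize(self, cls):
        # Transform a class only the first time this virtualizer encounters it.
        if cls not in self._transformed:
            self._transformed[cls] = self._transform(cls)
        return self._transformed[cls]

    def instantiate(self, cls, *args):
        # The new instance points at the transformed class, not the original.
        return self.virtualize(cls)(*args)

    def virtualized_count(self):
        return len(self._transformed)


def add_meta_controls(cls):
    # Stand-in for a source-to-source transformation: it only builds a
    # dynamically created subclass carrying one piece of shared meta info.
    return type("Virtualized" + cls.__name__, (cls,), {"meta_info": "shared per class"})


class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y


class Rectangle:
    def __init__(self, origin, corner):
        self.origin, self.corner = origin, corner


virtualizer = Virtualizer(add_meta_controls)
p1 = virtualizer.instantiate(Point, 1, 2)
p2 = virtualizer.instantiate(Point, 3, 4)
```

Both points share one transformed class, and Rectangle is never transformed 
because this virtualizer never instantiates it. The meta info lives on the 
transformed class, so it costs nothing per instance.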

An Attempt to Adapt the KeyKOS / EROS Brand mechanism to Squeak-E

(I don't know the original mechanism well enough to faithfully describe it, 
and I can't find a good description by Googling. Norm and Shap (cc'ed above), 
can you point us at a good description?)

(Btw, E uses the term "Brand" for a concept derived from the KeyKOS Brand, 
but the E concept is different, and I will not refer to it below.)

A normal object can be considered a combination of several parts:

* Its "state", or the creation-time capture of bindings from variable names 
used freely within the object's behavior, to actual variables. (In E, all 
such variables are called "instance variables". I avoided this term, since 
Squeak uses this term for a subset of these.)

* Its "behavior" -- static pure code and pure-data literals (without 
variable bindings or literal references to mutable or authority-conveying 
objects), which determine how the object responds to incoming messages.
(I'm purposely drawing a boundary between state and behavior different than 
Squeak's boundary between instance and Behavior.)

* Special state registers which primitives can specially recognize and act 
on without sending a message to the object. This would be a way to account 
for the special role played by object identity in Smalltalk or E. Although 
nothing is actually stored separately, the object's address may as 
well be considered as-if residing in a specially recognized register 
inside the object. The primitive used to implement "==" as-if obtains the 
value of this register from both receiver and argument without sending a 
message to either.

By analogy, the Brand would reside in another such special register, but 
this one would need actual storage somewhere. It encodes not the identity of 
the individual object but the identity of the object's "creator". I'll defer 
defining "creator" for now. All objects created by the same creator would 
have the same Brand, but let's assume only the creator, not the instances, 
has access to that Brand. (I don't think it's necessary, but it makes the 
story simpler.) Let's also assume default lexical contagion of the brand, so 
BlockClosures and stack-frames created by a given object share that 
object's brand.

As in Lex's proposal, an ObjectInspector instance wrapping a given object 
gives access to the state and behavior of that object. Let's say the only 
way to obtain an ObjectInspector on a given instance is:

    aBrand inspect: object ifFail: failBlock

or

    aBrand inspect: object

where the second throws an exception if it fails.

If the object's brand is aBrand, this should succeed. Otherwise, it must 
fail. In neither case does it send a message to the object.
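
The two forms above can be modelled roughly as follows, in Python rather 
than Squeak so the sketch runs as-is. The names (Brand.create, 
_BRAND_SLOT, InspectionDenied) are assumptions made for illustration, as is 
the choice to let whoever holds the Brand act as the "creator". Python 
cannot really hide the register, so this shows the protocol only, not the 
protection.

```python
_BRAND_SLOT = "_brand_register"   # hypothetical name for the special register


class InspectionDenied(Exception):
    """Raised by the one-argument form when the brand does not match."""


def _raw_slots(obj):
    # Read the slots through object.__getattribute__, bypassing any attribute
    # lookup the object's own class overrides. This is the analogue of
    # "without sending a message to the object".
    try:
        return object.__getattribute__(obj, "__dict__")
    except AttributeError:
        return {}   # objects without a slot dictionary carry no brand here


class ObjectInspector:
    """Gives access to the state and behavior of the wrapped object."""

    def __init__(self, target):
        self._target = target

    def state(self):
        slots = _raw_slots(self._target)
        return {k: v for k, v in slots.items() if k != _BRAND_SLOT}

    def behavior(self):
        return type(self._target)


class Brand:
    def create(self, cls, *args):
        # Assumption: the holder of the Brand is the creator and stamps
        # every instance it makes.
        obj = cls(*args)
        object.__setattr__(obj, _BRAND_SLOT, self)
        return obj

    def inspect(self, obj, if_fail=None):
        # aBrand inspect: object ifFail: failBlock  (or the throwing form).
        if _raw_slots(obj).get(_BRAND_SLOT) is self:
            return ObjectInspector(obj)
        if if_fail is None:
            raise InspectionDenied("object does not carry this brand")
        return if_fail()


class Account:
    def __init__(self, balance):
        self.balance = balance

    def __getattribute__(self, name):
        # Any ordinary attribute access on an Account blows up, which shows
        # that inspection never goes through the object itself.
        raise RuntimeError("the object was sent a message")


mint, other = Brand(), Brand()
acct = mint.create(Account, 100)
```

Here mint.inspect(acct) yields an inspector whose state() shows the balance, 
while other.inspect(acct, if_fail=...) runs the fail block, and neither call 
trips the RuntimeError in Account.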

Now, to possess a set of brands, one must somehow be in bed with the 
corresponding set of creators. (In KeyKOS, a set of brands is gathered 
together into a CanOpener.) With such a set of brands, you can debug the 
internals of all instances that you can reach that are made by these 
creators. A particularly interesting case is that, given a continuation, you 
can obtain open access to the corresponding stack frame iff you have the 
brand of the object whose invocation created this stack frame.
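
As a rough illustration of gathering brands together, here is a small Python 
model. CanOpener is the KeyKOS name mentioned above, but its interface here 
is invented for the sketch, as are Branded and inspect_with. A stack frame is 
modelled as just another branded record, on the assumption that it inherited 
the brand of the object whose invocation created it.

```python
class Brand:
    """Hypothetical minimal brand: an opaque token compared by identity."""


class Branded:
    """Stand-in for any object or stack frame created under some brand."""

    def __init__(self, brand, **state):
        self._brand = brand
        self._state = dict(state)


_MISS = object()   # private marker meaning "this brand did not open it"


def inspect_with(brand, obj, if_fail):
    # Succeeds only when the object's brand is exactly this brand; the slots
    # are read directly, without calling any method of the object.
    slots = object.__getattribute__(obj, "__dict__")
    if slots.get("_brand") is brand:
        return dict(slots["_state"])
    return if_fail()


class CanOpener:
    """Holds a set of brands and opens anything made under any of them."""

    def __init__(self, *brands):
        self._brands = brands

    def inspect(self, obj, if_fail):
        for brand in self._brands:
            result = inspect_with(brand, obj, lambda: _MISS)
            if result is not _MISS:
                return result
        return if_fail()


parser_brand, lexer_brand, foreign_brand = Brand(), Brand(), Brand()

# A frame created by invoking a parser object carries the parser's brand.
frame = Branded(parser_brand, receiver="aParser", pc=12)
token = Branded(lexer_brand, text="foo")
stranger = Branded(foreign_brand, secret=1)

opener = CanOpener(parser_brand, lexer_brand)
```

The opener can look inside frame and token, but stranger stays opaque because 
its brand is not in the set.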

This is a case of rights amplification: 
Opaque object + brand => ObjectInspector.

In some ways this is an example of the pure virtually nested VM story, at 
least for one level deep. A branded object could be explained as running a
pretend interpreter in which the interpreted object's state is actually kept 
in an internal object with the ObjectInspector API, which we could even 
suppose the pretend interpreter uses when interpreting the object's 
behavior. We can even explain away the "without sending a message to the 
object": We'd have to suppose the interpreter maintains a weak EQ table, 
mapping from the identity of each externally presented object it creates to 
the corresponding internal ObjectInspector it uses for managing that state.
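
The weak EQ table explanation can be sketched like this, again in Python so 
it runs as-is. WeakIdentityTable, Facade, InternalState and 
PretendInterpreter are all hypothetical names. The table is keyed by object 
identity (via id) rather than by equality, so looking an object up never 
calls any method on it, and entries vanish once the external object is 
garbage.

```python
import weakref


class WeakIdentityTable:
    """Maps objects to values by identity, without keeping the keys alive."""

    def __init__(self):
        self._entries = {}   # id(key) -> (weak reference to key, value)

    def put(self, key, value):
        key_id = id(key)
        entries = self._entries

        def forget(ref):
            # Drop the entry once the key is garbage, unless it was replaced.
            entry = entries.get(key_id)
            if entry is not None and entry[0] is ref:
                del entries[key_id]

        entries[key_id] = (weakref.ref(key, forget), value)

    def get(self, key, default=None):
        entry = self._entries.get(id(key))
        # Check identity too, in case an id was reused by a later object.
        if entry is not None and entry[0]() is key:
            return entry[1]
        return default

    def __len__(self):
        return len(self._entries)


class Facade:
    """Externally presented object: opaque, with no state of its own."""
    __slots__ = ("__weakref__",)


class InternalState:
    """Plays the role of the internal ObjectInspector holding the real state."""

    def __init__(self, **fields):
        self.fields = dict(fields)


class PretendInterpreter:
    def __init__(self):
        self._table = WeakIdentityTable()

    def create(self, **fields):
        facade = Facade()
        self._table.put(facade, InternalState(**fields))
        return facade

    def inspect(self, facade, if_fail):
        # Only the interpreter that created the facade finds it in its table.
        state = self._table.get(facade)
        return state if state is not None else if_fail()

    def live_count(self):
        return len(self._table)


vm1, vm2 = PretendInterpreter(), PretendInterpreter()
acct = vm1.create(balance=100)
```

In this model the interpreter itself stands in for the brand: vm1 can open 
acct, vm2 cannot, and nothing is ever asked of acct.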

Although the Brand is an example of a specially supportable virtualization, 
it isn't yet clear that it's a representative example. It also isn't clear 
when the other virtualizations will need their own additional state, vs 
when the one additional state variable will serve multiple purposes.

Finally, as I defined "state" and "behavior", all of the behavior is in a 
Smalltalk Behavior. Some of an object's state is already split between the 
instance and the Behavior. It's not at all clear on which side of this split 
the Brand should fall.

----------------------------------------
Text by me above is hereby placed in the public domain

        Cheers,
        --MarkM