Andreas Raab squeak-e@lists.squeakfoundation.org said:
I don't know where you heard that the object format is going to be changed in 3.5 but adding a pointer to every object is unlikely to be done unless you have a Very Good Reason(tm). It will instantly break all plugins and the added space and GC overhead is not to be taken lightly.
Last thing I heard was that there were some changes necessary for various things, and that 3.5 would be the version where these would, as far as possible, all be collected so that we'd have a break in image file format only once. But that was last October, before I went into a long dive involving directory services and other ugly business matters :-)
I've already been thinking about the implications GC-wise and space-wise. Probably some compression is possible and necessary. 90% of all objects, certainly in the beginning, will point to a default environment. A bit indicating whether there's additional environment info could save a lot of space and time. A second bit might indicate whether a short or long environment pointer is used (a 6-8 bit index into a table vs. a 32-bit pointer). Immutables probably wouldn't need to have it at all.
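A minimal sketch of the two-bit idea above, written in Python rather than against Squeak's real object header. The flag names, the 8-bit index width, and the split of non-default objects between short and long references are all assumptions made for illustration; only the 90% default figure comes from the message.

```python
# Hypothetical encoding of a per-object environment reference.
# This is NOT Squeak's header format; all names and widths are assumed.
HAS_ENV = 0b01        # bit 0: object has environment info beyond the default
LONG_ENV = 0b10       # bit 1: payload is a full pointer, not a table index
FLAG_BITS = 2
INDEX_BITS = 8        # assumed short-index width (the proposal says 6-8 bits)
POINTER_BITS = 32


def encode_env(env, default_env, short_table):
    """Return (flags, payload, payload_bits) for an object living in env."""
    if env == default_env:
        return 0, None, 0                      # both bits clear, nothing stored
    if env in short_table[: 1 << INDEX_BITS]:
        return HAS_ENV, short_table.index(env), INDEX_BITS
    # Fall back to a full reference; env itself stands in for the pointer.
    return HAS_ENV | LONG_ENV, env, POINTER_BITS


def decode_env(flags, payload, default_env, short_table):
    """Recover the environment from the flags and payload."""
    if not flags & HAS_ENV:
        return default_env
    if flags & LONG_ENV:
        return payload
    return short_table[payload]


def average_extra_bits(fraction_default, fraction_short):
    """Expected extra bits per object, ignoring word alignment."""
    fraction_long = 1.0 - fraction_default - fraction_short
    return (FLAG_BITS
            + fraction_short * INDEX_BITS
            + fraction_long * POINTER_BITS)
```

With 90% of objects in the default environment and an assumed 8% / 2% split between short and long references, `average_extra_bits(0.9, 0.08)` comes to roughly 3.3 bits per object, against 32 bits for an unconditional pointer. Real headers are word-aligned, so the actual saving would depend on where two spare bits can be found.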
Anyway, the discussion *might* be premature; however, before diving into options it would be nice to know how feasible they are. So I fear that we need to discuss ways *how* to make this happen more-or-less in parallel with the discussion *what* exactly we want to make happen (e.g., what do we want to point to).
In other words, even though it's premature from a technical viewpoint, as none of us here wants to fork Squeak (I hope) it is probably prudent to inventory the options for doing things like this as early as possible and see how/whether support can be raised. Politics, again...
Cees,
[Finally back home...]
Last thing I heard was that there were some changes necessary for various things, and that 3.5 would be the version where these would, as far as possible, all be collected so that we'd have a break in image file format only once.
I don't remember this discussion (but then I've been busy) but I thought the "big changes" were within VI4.
I've already been thinking about the implications GC-wise and space-wise. Probably some compression is possible and necessary. 90% of all objects, certainly in the beginning, will point to a default environment. A bit indicating whether there's additional environment info could save a lot of space and time. A second bit might indicate whether a short or long environment pointer is used (a 6-8 bit index into a table vs. a 32-bit pointer).
This seems both too complex (in terms of object header encoding: the three different object header formats we have are already pretty complex to handle) and overly simplified (who is to say that 6-8 bits are enough?!).
Anyway, the discussion *might* be premature; however, before diving into options it would be nice to know how feasible they are. So I fear that we need to discuss ways *how* to make this happen more-or-less in parallel with the discussion *what* exactly we want to make happen (e.g., what do we want to point to).
Okay, for the slow-witted like me - can someone please restate what this extra pointer would be good for?! I can see various ways of "making it happen" (with more or less complex designs) but it would really help me if I understood why we actually need "some extra pointer per object".
Cheers, - Andreas
At 01:45 PM 2/4/2003 Tuesday, Andreas Raab wrote:
Okay, for the slow-witted like me - can someone please restate what this extra pointer would be good for?! I can see various ways of "making it happen" (with more or less complex designs) but it would really help me if I understood why we actually need "some extra pointer per object".
I'll explain a possible adaptation of the KeyKOS / EROS brand mechanism. But before I do, I'd like to reiterate:
* I agree with Colin that making such changes in order to support virtually nested virtual machines is premature. Most of the rest of Squeak-E is based on ideas that have already been worked out in other languages including E. The present notions are also quite important -- they could largely resolve the apparent conflict between the requirements of object granularity capability security vs the extreme self-malleability pioneered by the Smalltalks. But there's very little precedent for these techniques. The only precedents I'm aware of are 1) IBM's VM and its ilk (VMWare), and 2) "Meta Interpreters for Real" by Shmuel Safra and Udi Shapiro in 'Concurrent Prolog: Collected Papers', 1987. (I can't find it online. Does anyone have access to an electronic version? I do have paper, and could scan it in if need be.)
#2 is encouraging as a) it was language based, b) it was understood to be compatible with capability security (though this isn't mentioned in the paper), and c) because it was how the system really worked in production. Nevertheless, #1 and even #2 were applied to foundations sufficiently different than Squeak that we can only take them as suggestive of solutions.
* Although we do need extra state, I'm not at all convinced it needs to be per-object rather than per-Behavior. Many of the "meta-controls" we'll seek we'll probably obtain by source-to-source transformation anyway, in which case we'll want the instance to point at the transformed class, not the original one. Is there any reason not to be dynamically instantiating Behaviors? Since the meta-controls will probably only be used at low speed, and since the same meta info will be shared by many instances of the same class, this seems like a good tradeoff.
Btw, you wouldn't need to virtualize all classes on each virtualizer creation. If you do it lazily, then most virtualizers should only cause the virtualization of a small number of classes.
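The lazy, per-Behavior approach could look roughly like the Python sketch below. It swaps in ordinary subclassing with overridden methods where the message talks about source-to-source transformation, and `Virtualizer`, `behavior_for`, `traced`, and the two demo classes are hypothetical names, not anything from Squeak or Squeak-E.

```python
class Virtualizer:
    """Hypothetical virtualizer that derives a transformed class on first use."""

    def __init__(self, name, transform):
        self.name = name
        self._transform = transform   # callable: class -> dict of overrides
        self._derived = {}            # original class -> derived class

    def behavior_for(self, cls):
        """Return the transformed class, creating it only when first needed."""
        derived = self._derived.get(cls)
        if derived is None:
            overrides = self._transform(cls)
            derived = type(f"{cls.__name__}@{self.name}", (cls,), overrides)
            self._derived[cls] = derived
        return derived

    def instantiate(self, cls, *args):
        """Create an instance that points at the transformed class."""
        return self.behavior_for(cls)(*args)

    def virtualized_classes(self):
        return set(self._derived)


class Account:
    def __init__(self, balance):
        self.balance = balance

    def describe(self):
        return f"balance {self.balance}"


class Ledger:
    def describe(self):
        return "ledger"


def traced(cls):
    """Example transform: wrap describe so the meta level can observe it."""
    original = cls.describe
    return {"describe": lambda self: "traced " + original(self)}
```

Creating `Virtualizer("debug", traced)` and instantiating only `Account` through it leaves `Ledger` untouched, so the cost is paid per class actually used and the meta info is shared by every instance of that class.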
An Attempt to Adapt the KeyKOS / EROS Brand mechanism to Squeak-E
(I don't know the original mechanism well enough to faithfully describe it, and I can't find a good description by Googling.
Norm and Shap (cc'ed above), can you point us at a good description?)
(Btw, E uses the term "Brand" for a concept derived from the KeyKOS Brand, but the E concept is different, and I will not refer to it below.)
A normal object can be considered a combination of several parts:
* Its "state", or the creation-time capture of bindings from variable names used freely within the object's behavior, to actual variables. (In E, all such variables are called "instance variables". I avoided this term, since Squeak uses this term for a subset of these.)
* Its "behavior" -- static pure code and pure-data literals (without variable bindings or literal references to mutable or authority-conveying objects), which determines how the object responds to incoming messages. (I'm purposely drawing a boundary between state and behavior different than Squeak's boundary between instance and Behavior.)
* Special state registers which primitives can specially recognize and act on without sending a message to the object. This would be a way to account for the special role played by object identity in Smalltalk or E. Although nothing is actually stored separately, the object's address may as well be considered as-if residing in a specially recognized register inside the object. The primitive used to implement "==" as-if obtains the value of this register from both receiver and argument without sending a message to either.
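As a loose analogy in Python (not Squeak), the `is` operator plays the role of the identity primitive: it compares the two operands' identities directly and never calls a method on either, whereas `==` dispatches to `__eq__`. The `Liar` class and `same_object` helper are hypothetical names used only to make the difference visible.

```python
class Liar:
    """An object whose message-level equality claims to equal everything."""

    def __init__(self):
        self.messages_received = 0

    def __eq__(self, other):
        # This is an ordinary message send: the object can answer anything.
        self.messages_received += 1
        return True

    __hash__ = object.__hash__


def same_object(a, b):
    """Identity test: reads the as-if identity register of both operands
    without invoking any method on either of them."""
    return a is b
```

Two distinct `Liar` instances compare equal under `==`, and the receiver's `messages_received` counter goes up, but `same_object` answers false and leaves both counters unchanged.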
By analogy, the Brand would reside in another such special register, but this one would need actual storage somewhere. It encodes not the identity of the individual object but the identity of the object's "creator". I'll defer defining "creator" for now. All objects created by the same creator would have the same Brand, but let's assume only the creator, not the instances, have access to that Brand. (I don't think it's necessary, but it makes the story simpler.) Let's also assume default lexical contagion of the brand, so BlockClosures and stack-frames created by a given object share that object's brand.
As in Lex's proposal, an ObjectInspector instance wrapping a given object gives access to the state and behavior of that object. Let's say the only way to obtain an ObjectInspector on a given instance is:
aBrand inspect: object ifFail: failBlock
or
aBrand inspect: object
where the second throws an exception if it fails.
If the object's brand is aBrand, this should succeed. Otherwise, it must fail. In neither case does it send a message to the object.
Now, to possess a set of brands, one must somehow be in bed with the corresponding set of creators. (In KeyKOS, a set of brands is gathered together into a CanOpener.) With such a set of brands, you can debug the internals of all instances that you can reach that are made by these creators. A particularly interesting case is that, given a continuation, you can obtain open access to the corresponding stack frame iff you have the brand of the object whose invocation created this stack frame.
This is a case of rights amplification: Opaque object + brand => ObjectInspector.
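The inspect protocol and the CanOpener can be modelled in a few lines of Python. This is only a sketch under stated assumptions: Python has no real encapsulation, the brand "register" is modelled as a module-level identity-keyed table, and `stamp`, `InspectionDenied`, and the `if_fail` keyword are hypothetical names standing in for whatever the creator and the `inspect:ifFail:` message would really use.

```python
class InspectionDenied(Exception):
    """Raised by the one-argument form when the brand does not match."""


# Models the special brand register: keyed by identity and kept outside the
# object's own state. It holds strong references, so in this sketch branded
# objects are never freed.
_BRAND_REGISTER = {}


class ObjectInspector:
    """Gives access to the state and behavior of the wrapped object."""

    def __init__(self, target):
        self._target = target

    def state(self):
        # Bypasses any attribute hooks the target's class may override.
        return dict(object.__getattribute__(self._target, "__dict__"))

    def behavior(self):
        return type(self._target)


class Brand:
    def stamp(self, obj):
        """Creator-only operation: record this brand for obj."""
        _BRAND_REGISTER[id(obj)] = (obj, self)
        return obj

    def inspect(self, obj, if_fail=None):
        """Amplify obj + brand into an ObjectInspector without calling obj."""
        entry = _BRAND_REGISTER.get(id(obj))
        if entry is not None and entry[0] is obj and entry[1] is self:
            return ObjectInspector(obj)
        if if_fail is not None:
            return if_fail()
        raise InspectionDenied("object does not carry this brand")


class CanOpener:
    """A set of brands gathered together, tried one after another."""

    def __init__(self, brands):
        self._brands = list(brands)

    def inspect(self, obj, if_fail=None):
        for brand in self._brands:
            inspector = brand.inspect(obj, if_fail=lambda: None)
            if inspector is not None:
                return inspector
        if if_fail is not None:
            return if_fail()
        raise InspectionDenied("no brand in this CanOpener matches")
```

A holder of the right brand gets an inspector back; any other brand gets the failure block's result or an exception, and in neither case is a method of the inspected object invoked.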
In some ways this is an example of the pure virtually nested VM story, at least for one level deep. A branded object could be explained as running a pretend interpreter in which the interpreted object's state is actually kept in an internal object with the ObjectInspector API, which we could even suppose the pretend interpreter uses when interpreting the object's behavior. We can even explain away the "without sending a message to the object": We'd have to suppose the interpreter maintains a weak EQ table, mapping from the identity of each externally presented object it creates to the corresponding internal ObjectInspector it uses for managing that state.
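The pretend interpreter with its weak EQ table might be sketched as follows, again in Python and with hypothetical names throughout (`PretendInterpreter`, `Facet`, `InternalState`, `inspector_for`). The table is keyed by `id()` and holds only weak references to the externally presented objects, so entries vanish once the outside world drops them.

```python
import weakref


class InternalState:
    """Stands in for the internal object with the ObjectInspector API."""

    def __init__(self, code, state):
        self.code = code      # selector -> function(state, *args)
        self.state = state    # variable name -> value


class Facet:
    """Externally presented object: it has an identity and a route back to
    the interpreter, but holds none of the interpreted object's state."""

    __slots__ = ("_send", "__weakref__")

    def __init__(self, send):
        self._send = send

    def __getattr__(self, selector):
        # Every message is handed to the interpreter for interpretation.
        return lambda *args: self._send(self, selector, *args)


class PretendInterpreter:
    def __init__(self):
        # Weak EQ table: id(facet) -> (weak reference to facet, InternalState)
        self._table = {}

    def create(self, code, **state):
        facet = Facet(self._send)
        key = id(facet)
        # The callback removes the entry when the facet is collected.
        ref = weakref.ref(facet, lambda _ref: self._table.pop(key, None))
        self._table[key] = (ref, InternalState(code, state))
        return facet

    def inspector_for(self, facet):
        """Identity lookup only: no method of the facet is invoked."""
        entry = self._table.get(id(facet))
        if entry is None or entry[0]() is not facet:
            raise KeyError("object was not created by this interpreter")
        return entry[1]

    def _send(self, facet, selector, *args):
        internal = self.inspector_for(facet)
        return internal.code[selector](internal.state, *args)

    def live_count(self):
        return len(self._table)
```

An object made with `create({"increment": fn}, count=0)` answers `increment` through the interpreter, while `inspector_for` reaches the same state by identity alone; a second interpreter that did not create the object cannot look it up.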
Although the Brand is an example of a specially supportable virtualization, it isn't yet clear that it's a representative example. It also isn't clear when the other virtualizations will need their own additional state, vs when the one additional state variable will enable multiple purposes.
Finally, as I defined "state" and "behavior", all of the behavior is in a Smalltalk Behavior. Some of an object's state is already split between the instance and the Behavior. It's not at all clear on which side of this split the Brand should fall.
---------------------------------------- Text by me above is hereby placed in the public domain
Cheers, --MarkM
At 03:34 PM 2/4/2003 Tuesday, Mark S. Miller wrote:
Finally, as I defined "state" and "behavior", all of the behavior is in a Smalltalk Behavior. Some of an object's state is already split between the instance and the Behavior. It's not at all clear on which side of this split the Brand should fall.
I withdraw my choice of words. Let's leave "Behavior" with its Smalltalk meaning -- it's the thing instances point to from their special class pointer (what is this actually called?). For what I was calling "behavior", how about "code"? An object would then be a combination of state and code. Or should we adopt the Actors term "script"? Does either have any conflicts with the Squeak universe I should know about?
---------------------------------------- Text by me above is hereby placed in the public domain
Cheers, --MarkM
At 15:34 -0800 03/02/04, Mark S. Miller wrote:
...
An Attempt to Adapt the KeyKOS / EROS Brand mechanism to Squeak-E
(I don't know the original mechanism well enough to faithfully describe it, and I can't find a good description by Googling.
Norm and Shap (cc'ed above), can you point us at a good description?)
I just wrote up the role of the brand in creation, vetting, debugging, and deletion: http://cap-lore.com/CapTheory/KK/DT.html. I would like feedback to improve the note, as usual, especially of the form "What does that mean?".