Comments on Smalltalk block closure designs, part 1

Allen Wirfs-Brock Allen_Wirfs-Brock at Instantiations.com
Thu Apr 26 05:56:44 UTC 2001


At 11:15 PM 4/25/2001 +0200, Stephan Rudlof wrote:
>Dear Allen and other Squeakers!
>
>Allen, I want to thank you for your interesting comments first!
>
>In the following I try to give some 'mapping' of the rules given by Allen to
>my proposal. This should lead to a better understanding (for me, too!) of the
>differences between these approaches. I try to describe which elements in the
>BCSBlockContexts draft correspond to these rules. Please note, that there has
>just been an update of my draft (triggered by Andreas Raab).
>
>Expression
>         '->'
>means
>         'the former corresponds roughly to the following elements from 
> "Draft for
>Block Closure Semantics for Squeak v0.7.0"'
>.
>
>Allen Wirfs-Brock wrote:
>
><snipped much of very interesting stuff>
>
> > I've now covered enough to explain how Smalltalk blocks with full closure
> > semantics can be implemented an a way that never forces reification of the
> > activation record stack:
> >
> >         Allocate all arguments and temps that are not "captured" by 
> blocks within
> > stack allocated activation records.
>
>-> blockCopy if needed; it is needed if there starts a second evaluation
>before finishing the first one;
>-> stack allocated activation records correspond to *some* of the indexable
>fields of the created BCSBlockContexts;

My statement above was talking about how the activation records (think 
MethodContext) for method and block invocations should be organized. 
Activation records are created as a side-effect of a method invocation or 
block evaluation.  blockCopy is the operation that creates a block object 
(a block closure), it has nothing to do with invocations. It appears to me 
that you are perpetuating the biggest "flaw" (from a closure semantics 
perspective) with the classic Smalltalk-80 design.  Block objects and block 
activation records are not the same things, you need two types of objects 
to represent them, not a single class that tries to be both. One way to see 
this is to look a them from a object-oriented design perspective...block 
closures and block activations have disjoint responsibilities and different 
cardinality, hence they should not be modeled by a single class:

         Block Closure
                 Responsibilities
                         Associate a pre-existing variable environment with 
a code block.
                         Initiate evaluation of the code block in the 
context of the associated variable environment
                 Cardinality
                         One per block constructor evaluation

         Block Activation Record
                 Responsibilities
                         Provide storage for the arguments and local 
variable for a particular block evaluation.
                         Record the continuation (the "return address") for 
a particular block evaluation
                 Cardinality
                         One per block closure evaluation.

For every block closure object that is created there may be N block 
activation record objects created. This cardinality mismatch is what causes 
the recursion/reentrancy problems for the classic Smalltalk-80 design and 
which forces you to use special logic to describe what happens if a 
BCSBlockContext is already active when it is evaluated.

I appreciate that fact that you need to incrementally enhance an existing 
system and keep it working as you make changes (I wonder if anybody has 
ever thought about putting a "firewall" been the development environment 
and the execution environment so you could make changes like this without 
worrying about crashing?  Oh well, a topic for another time...).  Given 
this, I would start by solve this fundamental design problem in the context 
of simple cases before worrying too much about deeply nested blocks 
requiring lexical addressing.

I'd start by considering the simplest interesting blocks with arguments but 
no references to outer scopes and no up arrow returns such as a typical 
sort bock:
         [:x :y | x >= y]

and I would develop a design that separated the block object from its 
activation and which didn't "borrow" space for arguments from its home 
context.  I would notice that all the block object for such a block needs 
to record is the location of the code (compiled method and start pc offset) 
and the number of expected arguments. In this case, such an object would 
seem to need at most three instance variables and no indexable fields. I 
would also notice that a method context (or something very similar, but I 
mean a separate method context, not the home context) would make a fine 
activation record for such a block.  If I was a disciple of Kent Beck I 
would immediately try to implement something.  Here are the steps I would take:

         1) Define the class SimpleBlock with instance variables: method, 
startPC, argCount
         2) Implement a new set of #value primitive methods for 
SimpleBlocks that allocates a new context (I'd try to get buy using 
MethodContext but if necessary would define a new class), copies the 
arguments into it, sets the method and pc from the block object and finally 
makes the context the active context.
         3) Manually dummy up a some instances of SimpleBlock to test the 
above primitive.
         4) Modify the compiler to recognize blocks that qualified as 
SimpleBlocks.  Change the block construction code generated from a 
blockCopy: to a push literal of a compiler constructed SimpleBlock object 
(since SimpleBlocks have no mutable state, one instance can be shared by 
all evaluations of a particular block constructor).
         5) Make any necessary changes to the compiler so that the code 
emitted for SimpleBlocks works when activated as in step 2 above.  This 
shouldn't involve much as basically you want it to be just like what would 
be generated if it was at the out level of a method and the block args were 
the method args.
         6) Test all of the above.

At this point you would have laid the foundation for everything else you 
need to do.  You will have an implementation that, for these simple blocks, 
is already better than the classic implementation.  You can write a 
recursive factorial using them that works. You don't have the fixtemps 
problem (does Squeak still have this?), You can create and stash a sort 
block without capturing the the entire activation chain!

Next, I would extend the above design to deal with block local variables 
such as:
         [:x | | a b|  a := x foo.  b := x bar. a baz:b]

This should be almost trivial because by using MethodContext activation 
records already know how to support temporary variables. The definition of 
SimpleBlock would probably have to be enhanced to record a local variable 
count.

Next, I would extend the design to deal with blocks that reference 
variables in the surrounding scope.  It start with blocks that reference 
out just one level:
         someMethod: arg
                 ^[self somethingElse:arg]

This is a nice text because it forces capture of the outer variables self 
and arg. It probably requires a new class of block object (ComplexBlock) 
and you can't get away with using a literal as the value of the block 
constructor.

Then I'd work on nested blocks with up-level addressing to intermediate 
levels. I'd probably start using static links for up-level addressing (most 
blocks are only one level deep anyway). Later, I might optimize using a 
partial display similar to what you describe in your design.

Finally, I'd make sure that up arrow return worked in all the above cases.


> >         All variables that are subject to capture by blocks are explicitly
> > allocated in heap allocated environment objects
>
>-> nothing like that

You can probably get by with static links that chain together activation 
records (MethodContexts) but this has the reification problem that 
ultimately is a performance killer.


> >                 (or, if "read-only", copied into the block object).
>
>-> same
>
> >                 (only activations that have captured variables need 
> allocate environment
> > objects, except for the following rule)
> >         The activation of a method that contains blocks with up arrow 
> returns
> > always allocates a heap environment object.
> >                 (it may be empty, in the sense that it contains no 
> variable, but it must
> > be allocated)
> >         Activation records reference ("point to") environment objects but
> > environment objects never point to activation records.
> >         Environment objects and block objects may reference other 
> environment objects.
>
>-> there are no environment objects at all (for returning BCSBlockContexts
>have a home pointer, see below),

Your environment objects are implicitly part of your context objects

>-> outer temps reside in indexable fields of outer contexts,
>-> outer temps are directly (oops for readonly vars) or indirectly (o/i pairs
>for read/write vars) referenced by entries in indexable fields of actual
>BCSBlockContext;
>
> >         A block (a closure) object references its enclosing 
> environment(s) (but
> > never the enclosing activation record(s))
>
>-> there is the sender pointer to the next outer BCSBlockContext;
>-> there is the home pointer to the MethodContext which has generated the
>outmost BCSBlockContext (of the BCSBlockContext chain);
>-> in addition outer contexts at arbitrary places in the BCSBlockContext chain
>may be referenced by entries in the indexable fields of the actual context
>(for read/write temps);

The home pointer should only be needed (in your design) if the block has an 
up arrow return.  Otherwise, the uplevel addressing mechanism should take 
care of everything else.




> >         A block object that implements an up arrow return captures a 
> reference to
> > the environment associated with its method activation.
> >                 (This is used as the continuation marker)
>
>-> home pointer
>
> >         An up arrow return is implemented by searching up the 
> activation stack for
> > an activation record whose environment reference is the same as the
> >                 continuation marker of the returning bock.  The 
> activation stack is
> > trimmed to that activation record and a return is forced.
> >                 (If no activation record with that environment is found 
> we have a
> > #cannotReturn: situation)
>
>-> just return to home sender
>
>
>I hope I haven't missed important things!
>
> >
> > Enough for part #1, with luck I'll still be motivated to keep writing on
> > part #2 tomorrow
> >
> > Allen
> >
>
>That'd be nice!
>
>
>Greetings,
>
>Stephan
>--
>Stephan Rudlof (sr at evolgo.de)
>    "Genius doesn't work on an assembly line basis.
>     You can't simply say, 'Today I will be brilliant.'"
>     -- Kirk, "The Ultimate Computer", stardate 4731.3





More information about the Squeak-dev mailing list