Implementing closures (was: Re: 3.9 vs. 3.10 : Closures, fixTemps)

Thu Dec 20 04:19:33 UTC 2007

Mathieu Suen wrote:
> Well so propose us your solution.

Okay, I promised not to invent a solution but it's one of "those" 
problems that I've thought about for a while in the past so let's see. 
Here is how I would address the problem of implementing full closures in 
Squeak:

1) First, I would fix the "attempt to evaluate a block that is already 
evaluated" problem. That's about ten lines of code in the VM - if you 
look at primitiveValue[withArgs] you'll find that we fail if the 
CallerIndex is non-nil. Instead of failing, I'd clone the original 
context and continue as usual.

At this point you'd be able to execute code like:

   | v |
   fac := [v > 0 ifTrue:[1] ifFalse:[v := v-1. fac value * (v+1)]].
   fac value: 10.

which allows recursive evaluation but without block args (since they get 
hacked in some terrible ways).

2) Next, I'd deal with block variables (temps and args). Instead of 
sharing the method contexts, blocks would store their own temps and 
access them via tempAt:. So that for example code like here:

   self do:[:v| | v2 | v2 := v squared. v2].

would be compiled into to:

   self do:[:v|
     thisContext push: nil. "allocate v2"
     thisContext tempAt: 2 "v2" put: (thisContext tempAt: 1 "v") squared.
     thisContext tempAt: 2.
   ].

This would be slower than what we have today but obviously, the pattern 
"thisContext tempAt:" is a prime candidate for optimization. At this 
point we'd be able to run the following correctly:

fac [:v| v = 0 ifTrue:[1] ifFalse:[ v * (fac value: v-1)].

The implementation would be pretty straightforward too, since all you'd 
have to do is to add BlockVariableNode that generates tempAt: messages 
(plus some initialization). Very, very straightforward.

3) Next, I would deal with "outer" state like here:

   self do:[:x|
     x do:[:y| x * y]
   ].

This is probably the hardest part since dealing with "non-method 
closed-over state" can be tricky due to life-time issues. Anyway, for 
starters I would just add a hidden temporary "outerContext" for all 
blocks that is just a temp like any other such that the compiler can 
generate code like:

   self do:[:x|
     x do:[:y|
       ((thisContext tempAt: 1) "outerContext" tempAt: 1 "x") *
         (thisContext tempAt: 2) "y"
     ].
   ].

At this point (I think - please correct me if I'm missing something) we 
have *complete* closure semantics in a straightforward way. What remains 
is optimization.

4) Lastly, optimization. I would start with the obvious 
push/popBlockTemp bytecodes which would bring all of the above up to par 
with current Squeak (except (3) which could still be slower). Finally, 
one might look at the life-time of closed-over state - it is certainly 
not necessary to preserve the entire outer context so there may be other 
things that could be done to fix it.

In any case, the above looks pretty simple and straightforward. Wouldn't 
you agree?

Cheers,
   - Andreas