[VM] Getting closure activation speed up to par

Anthony Hannan ajh18 at cornell.edu
Fri Sep 5 02:58:28 UTC 2003


"Andreas Raab" <andreas.raab at gmx.de> wrote:
> I've recently (as in: today ;) been reminded that we should really make
> use of the new block closures (for which primitive support will be in any
> new VMs, as they are part of VMMaker-3.6). IIRC, the only significant
> downside is activation speed, which - according to your notes - was at
> about half of what we get with BlockContexts today.
> 
> So what ways do we have to bring their activation speed closer to the speed
> of block activation today? I don't know if you've done any exhaustive
> measures to figure out where the time actually goes but my feeling is that
> there is a good chance that we may be overlooking something in general
> activation speed. Question: Do you have any feeling whatsoever, where the
> additional time is spent? Is it just context allocation? Somehow I don't
> quite think this would be the case as we should be pretty good with the
> context recycling here.

1. The Interpreter redundantly fetches receiver, method, initialIp, and
initialSp in #fetchContextRegisters: when it already has them in
#activateNewMethod.  So we can probably improve things a little, without
changing any design, just by inlining #newActiveContext: (and the
methods it calls) into #activateNewMethod and manually streamlining the
result.
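
To make that concrete, here is a rough sketch of what the streamlined
activation could look like (plain pseudo-Slang; the helper selectors and
the exact register handling are simplified guesses, not the current
Interpreter source):

activateNewMethodStreamlined
	"Hypothetical inlined version of #activateNewMethod +
	 #newActiveContext: + #fetchContextRegisters:"
	| newContext initialIp initialSp |
	newContext := self allocateOrRecycleContext.	"simplified; real selector differs"
	initialIp := self initialIPOf: newMethod.	"hypothetical helper"
	initialSp := self initialSPOf: newMethod.	"hypothetical helper"
	"copy the receiver and arguments from the caller's stack into newContext here"
	self storeContextRegisters: activeContext.
	activeContext := newContext.
	method := newMethod.
	instructionPointer := initialIp.
	stackPointer := initialSp.
	"receiver, method, ip, and sp were just computed above, so there is
	 no need to read them back out of newContext the way
	 #fetchContextRegisters: does"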

2. The slowness of independent context objects (even if recycled)
compared to stack frames comes from:
  a. Copying the receiver and arguments into the new context instead of
reusing them in place in the new frame (a toy sketch follows this
list).
  b. Poorer memory locality of contexts versus stack frames, which
hurts processor caching.
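
Here is a toy illustration of point (a), using plain Smalltalk with an
OrderedCollection standing in for a contiguous stack; it only shows
where the copy disappears, it is not real VM code:

| callerStack newContext frameBase |
callerStack := OrderedCollection withAll: #(receiver arg1 arg2).	"pushed by the caller"

"Context objects: receiver and args are copied into the fresh context,
 then popped off the caller's stack."
newContext := Array new: 8.
1 to: 3 do: [:i | newContext at: i put: (callerStack at: i)].
callerStack removeLast: 3.

"Stack frames: the callee's frame simply begins at the slot where the
 caller pushed the receiver, so the same slots are reused in place."
callerStack := OrderedCollection withAll: #(receiver arg1 arg2).
frameBase := callerStack size - 2.	"no copying, no popping"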

3. Finally, a compiler optimization could be used, with either context
objects or stack frames, to eliminate the need to pre-allocate temps.
Instead, temps can be allocated by the code itself when first assigned.
For example, in:

| a b |
a _ 1 odd.
b _ 2 even.
^ a & b

Instead of pushing two nils and then executing the following code:
 pushConstant: 1
 send: odd
 popIntoTemp: 0
 pushConstant: 2
 send: even
 popIntoTemp: 1
 pushTemp: 0
 pushTemp: 1
 send: &
 returnTop

we can avoid pushing the nils and just start executing the following code:
 pushConstant: 1
 send: odd
 pushConstant: 2
 send: even
 send: &
 returnTop

The code then becomes responsible for allocating temps and can usually
allocate each one with its initial value in one step.  Right now we
initialize temps with nil during activation; then the code pushes a
temp's initial value higher on the stack and copies it down to its
correct slot.
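
The "usually" matters: when a temp's first assignment happens only on
some paths, the compiler would still have to supply a nil (or
equivalent) for it.  A hypothetical example where the temp cannot simply
be born with its first value:

| a |
2 odd ifTrue: [a := 1].
^ a	"a may be read without ever being assigned, so its slot still needs a nil"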

> what we may be
> able to short-cut specifically for block (as opposed to: general method)
> activation?

I don't think there is anything we can do specifically for block
closures to make them faster.  They act just like regular objects,
except that a closure's value method is instance-specific.
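
For instance, nothing about ordinary use distinguishes a closure from
any other object; only the code run by #value/#value: is per instance:

| n blk |
n := 3.
blk := [:x | x + n].	"an ordinary object that has captured n"
blk value: 4	"answers 7; what #value: runs is specific to this block instance"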

> One thought that occured to me is that there's some potential that the byte
> codes for #value/#value: could make some difference. This may not be much
> but there's definitely some overhead here, so we may consider making those
> bytecodes support closure activation instead of optimizing for BlockContexts
> as they do today - in particular as this would allow us to use the
> #internalXYZ variants for activating the closure, which would get inlined
> into the VM, etc. etc. etc.

I don't think we can do much better.  A block has its captured free
vars in its indexable slots, and its method accesses them directly
using pushReceiverVar:.  Block creation uses a different method in
place of #blockCopy: (#createBlock:...), which isn't a special
bytecode, but this only affects block creation, not block activation.
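
Concretely (a sketch of the intended layout, not output from the actual
closure compiler): for a block like [:x | x + n], the captured n would
sit in one of the closure's indexable slots, and the block's method
would read it with something like:

 pushTemp: 0
 pushReceiverVar: 1
 send: +
 returnTop

where pushTemp: 0 is the block argument x and pushReceiverVar: 1 reads
the captured n out of the closure's own slot.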

> If you have any thoughts on this I'd be delighted to hear them. Making the
> closures as fast (or very close to) what we have in BlockContexts today
> would be a great step forward for (finally) getting true closures to work in
> Squeak. And perhaps be a perfect little project for 3.7alpha ;-)

There is hope.  I was able to double the send speed (according to
tinyBenchmarks) in my VI4 implementation using the three optimizations
described above.

Cheers,
Anthony


