More fun with VMs
David Farber
dfarber at numenor.com
Wed Mar 22 19:16:19 UTC 2006
Alan - Are you referring to this paper:
LAMBDA: The Ultimate GOTO
http://repository.readscheme.org/ftp/papers/ai-lab-pubs/AIM-443.pdf
Or one the others:
http://library.readscheme.org/page1.html
Sidenote: In googling for this paper, I found that Richard Gabriel
was going to publish a collection of Guy Steele's papers in book
form--but this doesn't appear to have happened. See http://
www.dreamsongspress.com. I, for one, would buy this in an instant.
david
On Mar 22, 2006, at 9:11 AM, Alan Kay wrote:
> Hi Dan --
>
> Wrt your first optimization ... The SCHEME guys used similar
> arguments in one or more of the SCHEME papers to show that there
> are many cases in which no stack has to be allocated or popped (so
> a simple goto in the code will do the job). Someone on the Squeak
> list probably has a reference to the paper or papers I'm talking
> about. Sounds like a good idea, and should work pretty well. The
> second idea sounds like it should work very well also.
>
> Cheers,
>
> Alan
>
> At 06:11 AM 3/22/2006, Dan Ingalls wrote:
>> Hi -
>>
>> I've got my little Squeak in Java running (hope to send out a link
>> soon), and I've been pondering how to make it run faster. In the
>> process, I've thought of two techniques, one of which is new (to
>> me) and the other occurred to me years ago, but I never tried it out.
>>
>> Since neither would really be all that hard to do in Squeak, I
>> thought I'd mention them here for those folks who delight in such
>> things, and with the further hope that someone might actually try
>> them out.
>>
>>
>> Lazy Activation
>> This was the next thing I was going to do for Apple Smalltalk back
>> when I got drafted to the hotel business back in 1987. The
>> essence of the idea is that the purpose of activating a context is
>> to save the execution state in case you have to do a send and,
>> conversely, you don't really need an activation if you never need
>> to do a real send.
>>
>> I had a lot of fun instrumenting the VM to figure out just how
>> many activations could be avoided in this way, and my recollection
>> is that it was roughly 50%. I believe the statistics were better
>> dynamically than statically, because there are a lot of methods
>> that, in general need to be activated, but they may begin with a
>> test such as
>> position > limit ifTrue: [^ false]
>> and for every time that this test succeeds, you can get away
>> without ever needing an activation.
>>
>> But, you say, you still need a pointer to the method bytes and a
>> stack frame, and this is true, but you don't need to allocate and
>> initialize a full context, nor to transfer the arguments. The
>> idea is that, when you hit the send, you do the lookup, find the
>> method, and then jump to a *separate copy* of the interpreter that
>> has a different set of bytecode service routines. For instance,
>> 'loadTemp' will, depending on the argument count, load from the
>> stack of the calling method (which is still the "active"
>> context). 'Push', since there is no allocated stack, pushes into
>> a static array and, eg, 'plus' does the same old add, but it gets
>> its args from the static array, and puts its result back there.
>> And if anything fancy, such as a real send, does occur, then a
>> special routine is called to do a real activation, copy this
>> static state into it appropriately, and retry the bytecode in the
>> normal interpreter.
>>
>> It's probably worth confirming the results that I remember, but I
>> wouldn't be surprised if one could almost double the speed of
>> Squeak in this manner.
>>
>>
>> Cloned Activation
>> This one I just thought of, but I can't believe someone hasn't
>> already tried it, either in squeak or some similar system. The
>> idea here is to provide a field in the method cache for an extra
>> copy of a properly initialized context for the method (ie, correct
>> frame size, method installed, pc and stack pointer set, etc).
>> Then, when a send occurs, all you have to do is an array copy into
>> blank storage, followed by a short copy of receiver and args from
>> the calling stack.
>>
>> There's a space penalty for carrying all the extra context
>> templates, of course, but I think it's not unreasonable. Also,
>> one could avoid it for all one-time code by only allocating the
>> extra clone on the second call (ie, first call gets it into the
>> method cache; second call allocates clone for the cache).
>>
>> I have little sense of how much this might help these days -- I
>> haven't looked in detail at the activation code for quite a while.
>> Obviously the worse it si right now, the more this technique might
>> help.
>>
>>
>> Mainly I just like to think about this stuff, and it occurred to
>> me that, if someone were looking for a fun experiment or two, it
>> might turn out to have some practical value. I haven't looked at
>> Exupery to know whether these things are already being done, or
>> whether they might fit well with the other techniques there, but
>> I'm sure Bryce could say right off the bat.
>>
>> - Dan
>
>
>
More information about the Squeak-dev
mailing list
|