More fun with VMs

Wed Mar 22 19:16:19 UTC 2006

Alan - Are you referring to this paper:

LAMBDA: The Ultimate GOTO
http://repository.readscheme.org/ftp/papers/ai-lab-pubs/AIM-443.pdf

Or one the others:
http://library.readscheme.org/page1.html

Sidenote: In googling for this paper, I found that Richard Gabriel  
was going to publish a collection of Guy Steele's papers in book  
form--but this doesn't appear to have happened.  See http:// 
www.dreamsongspress.com.  I, for one, would buy this in an instant.

david

On Mar 22, 2006, at 9:11 AM, Alan Kay wrote:

> Hi Dan --
>
> Wrt your first optimization ... The SCHEME guys used similar  
> arguments in one or more of the SCHEME papers to show that there  
> are many cases in which no stack has to be allocated or popped (so  
> a simple goto in the code will do the job). Someone on the Squeak  
> list probably has a reference to the paper or papers I'm talking  
> about. Sounds like a good idea, and should work pretty well. The  
> second idea sounds like it should work very well also.
>
> Cheers,
>
> Alan
>
> At 06:11 AM 3/22/2006, Dan Ingalls wrote:
>> Hi -
>>
>> I've got my little Squeak in Java running (hope to send out a link  
>> soon), and I've been pondering how to make it run faster.  In the  
>> process, I've thought of two techniques, one of which is new (to  
>> me) and the other occurred to me years ago, but I never tried it out.
>>
>> Since neither would really be all that hard to do in Squeak, I  
>> thought I'd mention them here for those folks who delight in such  
>> things, and with the further hope that someone might actually try  
>> them out.
>>
>>
>> Lazy Activation
>> This was the next thing I was going to do for Apple Smalltalk back  
>> when I got drafted to the hotel business back in 1987.  The  
>> essence of the idea is that the purpose of activating a context is  
>> to save the execution state in case you have to do a send and,  
>> conversely, you don't really need an activation if you never need  
>> to do a real send.
>>
>> I had a lot of fun instrumenting the VM to figure out just how  
>> many activations could be avoided in this way, and my recollection  
>> is that it was roughly 50%.  I believe the statistics were better  
>> dynamically than statically, because there are a lot of methods  
>> that, in general need to be activated, but they may begin with a  
>> test such as
>>         position > limit ifTrue: [^ false]
>> and for every time that this test succeeds, you can get away  
>> without ever needing an activation.
>>
>> But, you say, you still need a pointer to the method bytes and a  
>> stack frame, and this is true, but you don't need to allocate and  
>> initialize a full context, nor to transfer the arguments.  The  
>> idea is that, when you hit the send, you do the lookup, find the  
>> method, and then jump to a *separate copy* of the interpreter that  
>> has  a different set of bytecode service routines.  For instance,  
>> 'loadTemp' will, depending on the argument count, load from the  
>> stack of the calling method (which is still the "active"  
>> context).  'Push', since there is no allocated stack, pushes into  
>> a static array and, eg, 'plus' does the same old add, but it gets  
>> its args from the static array, and puts its result back there.   
>> And if anything fancy, such as a real send, does occur, then a  
>> special routine is called to do a real activation, copy this  
>> static state into it appropriately, and retry the bytecode in the  
>> normal interpreter.
>>
>> It's probably worth confirming the results that I remember, but I  
>> wouldn't be surprised if one could almost double the speed of  
>> Squeak in this manner.
>>
>>
>> Cloned Activation
>> This one I just thought of, but I can't believe someone hasn't  
>> already tried it, either in squeak or some similar system.  The  
>> idea here is to provide a field in the method cache for an extra  
>> copy of a properly initialized context for the method (ie, correct  
>> frame size, method installed, pc and stack pointer set, etc).   
>> Then, when a send occurs, all you have to do is an array copy into  
>> blank storage, followed by a short copy of receiver and args from  
>> the calling stack.
>>
>> There's a space penalty for carrying all the extra context  
>> templates, of course, but I think it's not unreasonable.  Also,  
>> one could avoid it for all one-time code by only allocating the  
>> extra clone on the second call (ie, first call gets it into the  
>> method cache; second call allocates clone for the cache).
>>
>> I have little sense of how much this might help these days -- I  
>> haven't looked in detail at the activation code for quite a while.  
>> Obviously the worse it si right now, the more this technique might  
>> help.
>>
>>
>> Mainly I just like to think about this stuff, and it occurred to  
>> me that, if someone were looking for a fun experiment or two, it  
>> might turn out to have some practical value.  I haven't looked at  
>> Exupery to know whether these things are already being done, or  
>> whether they might fit well with the other techniques there, but  
>> I'm sure Bryce could say right off the bat.
>>
>>         - Dan
>
>
>