[Vm-dev] Interpreter>>isContextHeader: optimization

Igor Stasenko siguctua at gmail.com
Mon Feb 23 06:28:14 UTC 2009


Some more observations:

#include <stdio.h>

int main(int argc , char** argv)
{
  int i;

  i = 1;
  i = ({ int y=i; int i=6; printf("foo"); 10+y+i; });
  printf("%d", i);

}

prints:
foo17

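so, to generate properly inlined code, we only need to care about name clashes
involving method arguments.
In the code above, suppose 'y' is an argument var
and the 'i' declared in the inner scope is a temporary var: 'y' is initialized from
the outer 'i' before the inner 'i' comes into scope, so the temp cannot clash.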
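
A small sketch of the one case that still needs care, assuming GCC statement
expressions are available. The function names and the 'y_arg' rename are
hypothetical, made up only for illustration; they are not what the code
generator emits today. If the caller's own variable has the same name as the
callee's argument, `int y = y;` would read the new, uninitialized 'y', so the
argument has to be bound under a fresh name:

```c
/* Hypothetical callee being inlined:
 *     int addTen(int y) { int i = 6; return 10 + y + i; }
 */

/* Safe case: the argument 'y' is bound from the caller's 'i' before the
 * inner temporary 'i' is declared, so the temp may shadow the caller's 'i'. */
int inline_temp_shadows_caller(int i)
{
    return ({ int y = i; int i = 6; 10 + y + i; });
}

/* Clash case: the caller's variable is itself called 'y'.  Writing
 * `int y = y;` would initialize the new 'y' from itself, so the argument
 * is bound under a fresh (hypothetical) name instead. */
int inline_arg_renamed(int y)
{
    return ({ int y_arg = y; int i = 6; 10 + y_arg + i; });
}
```

Called with 1, both functions compute 10 + 1 + 6, the same value as the
program above.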


2009/2/23 Igor Stasenko <siguctua at gmail.com>:
> 2009/2/23 Igor Stasenko <siguctua at gmail.com>:
>> 2009/2/23 Eliot Miranda <eliot.miranda at gmail.com>:
>>>
>>>
>>>
>>> On Sun, Feb 22, 2009 at 8:08 PM, Igor Stasenko <siguctua at gmail.com> wrote:
>>>>
>>>> 2009/2/23 Igor Stasenko <siguctua at gmail.com>:
>>>> > 2009/2/22 Eliot Miranda <eliot.miranda at gmail.com>:
>>>> >>
>>>>
>>>> [snip]
>>>>
>>>> another idea for making slang code cleaner was to introduce a
>>>> special 'C' shared variable.
>>>> So, then instead of writing something like:
>>>>
>>>> self ioRelinquishProcessorForMicroseconds: xxx.
>>>>
>>>> or even worse:
>>>>
>>>> self cCode:' ((sqInt (*) (sqInt, sqInt*, sqInt*, sqInt*, sqInt*))querySurfaceFn)
>>>>                (handle, &sourceWidth, &sourceHeight, &sourceDepth, &sourceMSB)'
>>>>                        inSmalltalk:[false]
>>>>
>>>> you merely write:
>>>>
>>>> C ioRelinquishProcessorForMicroseconds: xxx.
>>>> C querySurfaceFn: handle with: sourceWidth cReference with:
>>>> sourceHeight cReference ....
>>>
>>> this is scarily similar to Alien style FFI's, e.g. Vassili's Newspeak Windows GUI interface :)
>>>
>>>>
>>>> First, it lets the code generator know that the given message send is a raw
>>>> C function call (by taking the first keyword as the function name).
>>>> Second, it can be simulated appropriately by a simulator, since you
>>>> can assign to the 'C' pool var an object which has best-match
>>>> implementations for all such calls. And of course it helps greatly in
>>>> finding errors or typos!
>>>>
>>>> Then patterns like 'self foo' would be treated by the code generator
>>>> exclusively as a method belonging to the class in which the method
>>>> containing that code resides, without any exceptions.
>>>>
>>>> So, if you write
>>>> 'self signalSemaphore: xx' in Interpreter's method
>>>> the code generator should look up #signalSemaphore: in the Interpreter class.
>>>> And if you write 'self header' in the Oop class -- it will look up the
>>>> #header method in the Oop class, but nowhere else!
>>>
>>> I also think we should do the following:
>>> a) mangle names of selectors so each is prefixed by e.g. the capitals in the class name, so that e.g. StackInterpreter>>popStack: gets mangled to SI_popStack, and Cogit>>cog:selector: gets mangled to C_cogselector etc so one can use super in Slang.
>> +1
>>
>>> b) handle variables thusly:
>>> Slang should provide unique names for all local variables as it creates TMethods.  These variable names can simply be integers (actually their key is an integer and their value is their original name).  Since they are all unique there can be no clashes.  Variables can safely be substituted by other variables since when one replaces one variable with another it cannot possibly accidentally clash with some other variable.
>>>
>>> Later, when a TMethod is output, variables are output not as their integer key but as their value (their original name) provided it doesn't clash.  The same variable renumbering scheme can be used to resolve clashes.  i.e. the renaming is deferred until a TMethod is output, and done once, not every time one tries to inline a method.  Renaming clashes is simple.  A dictionary maps original names to sequences of integer variable keys.  The renamed variable is the original name concatenated with the index of its integer key in the sequence of keys for the original name.
>>>
>>> When inlining a method into another, one unifies the formals and actuals, assigning new variable numbers for all new variable bindings.  That should simplify the inliner considerably because the horrible variable renaming code will reduce to mapping old variable keys to new variable keys.
>>
>> Agree.
>> i never took a look at how the method inliner works in the code generator. But
>> some of its limitations are a pain.
>> Like being unable to inline a method which has cCode, or C variable
>> declarations/definitions.
>>
>> There is also some more syntactic sugar which i forgot to mention:
>>
>> methodFoo
>>  | x |
>> C initializer: (x := 5).
>> ^ x
>>
>> can produce the following:
>>
>> int methodFoo()
>> {
>>  int x=5;
>>  return x;
>> }
>>
>> and even if you inline such method:
>>
>> int result;
>> ...
>>   { int x = 5;
>>     result = x;
>>     goto l10;
>>   }
>> l10:
>>
>>
>> what is interesting about inlining: i discovered that GCC 2.95 deals
>> fine with the following code:
>>
>> #include <stdio.h>
>>
>> int main(int argc , char** argv)
>> {
>>  int i;
>>
>>  i = ({ printf("foo"); 10; });
>>  printf("%d", i);
>>
>> }
>>
>> i think we can use this for inlining (if we are using GCC everywhere, i
>> don't see why we can't use it), then we don't need to define any vars
>> in the outer scope, like the current inliner does. And we can avoid naming
>> clashes, except those where arguments to the inlined method clash with
>> temps declared in it, i.e.:
>>
>> int param;
>>  param = computeParam();
>>  return computeSomethingElse(param);
>>
>> and
>> int  computeSomethingElse( int x)
>> {
>>  int param=10;
>>   return x + param;
>> }
>> so, if we try to inline computeSomethingElse, we will have a name
>> clash: 'x' -> 'param'
>> so, if naively implemented, it will produce
>>
>> ({int param=10; param+param;})
>>
>> as the inlined code.
>>
>
> Ha, even better.
>
> Suppose some code calls a method which has to be inlined, in the following way:
>
> foo := self method: 5+i with: self bar.
>
> and the method is declared as follows:
>
> method: arg1 with: arg2
>  ^ arg1 + arg2
>
> now, to inline it we can generate:
>
> ({
>  int arg1 = 5+i;
>  int arg2 = bar();
>  arg1+arg2;
> })
>
> the only difference here is the order of evaluation: arg1, then arg2,
> while in a C call the order in which arguments are evaluated is unspecified
> (some compilers evaluate arg2 first, because the calling convention pushes
> the last argument on the stack first).
> But i think it is more correct to evaluate them in smalltalk
> order, since we are coding in smalltalk,
> so simulated code will behave the same as the compiled VM in all such cases.
> --
> Best regards,
> Igor Stasenko AKA sig.
>
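
To illustrate the evaluation-order point, here is a minimal sketch, again
assuming GCC statement expressions. All names in it (eval_trace, first_arg,
second_arg, method, call_directly, call_inlined) are hypothetical and exist
only for this example. The two argument expressions record the order in which
they run; in the inlined form the two declarations are separate statements, so
they always run arg1 first, then arg2:

```c
/* Records which argument expression ran first (hypothetical helper state). */
int eval_trace[2];
int eval_trace_len = 0;

int first_arg(void)  { eval_trace[eval_trace_len++] = 1; return 5; }
int second_arg(void) { eval_trace[eval_trace_len++] = 2; return 7; }

int method(int arg1, int arg2) { return arg1 + arg2; }

/* Ordinary C call: the order of first_arg() and second_arg() is
 * unspecified, so eval_trace may end up as {1,2} or {2,1}. */
int call_directly(void)
{
    eval_trace_len = 0;
    return method(first_arg(), second_arg());
}

/* Inlined form: each declaration is a full statement, so arg1 is always
 * evaluated before arg2, matching the smalltalk order. */
int call_inlined(void)
{
    eval_trace_len = 0;
    return ({
        int arg1 = first_arg();
        int arg2 = second_arg();
        arg1 + arg2;
    });
}

/* Returns 1 if the last call evaluated arg1 before arg2. */
int last_call_was_left_to_right(void)
{
    return eval_trace_len == 2 && eval_trace[0] == 1 && eval_trace[1] == 2;
}
```

Both forms return the same sum; only the inlined one guarantees that
last_call_was_left_to_right() holds afterwards.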



-- 
Best regards,
Igor Stasenko AKA sig.

