[Vm-dev] Re: [squeak-dev] Inlining native code in Cog

Mon Aug 16 19:24:58 UTC 2010

On 16 August 2010 21:00, Eliot Miranda <eliot.miranda at gmail.com> wrote:
>
> (redirecting to vm-dev)
>
> On Mon, Aug 16, 2010 at 1:18 AM, Igor Stasenko <siguctua at gmail.com> wrote:
>>
>> Hello, Eliot.
>>
>> I just started exploring Cog, and here's my first idea:
>> - with NativeBoost, I can actually generate native code, which can
>> be inlined into other methods.
>>
>> So whenever Cog sees that a primitive is a NativeBoost primitive
>> that is going to invoke native code, it is possible to inline this
>> method into the outer method by copying the native code into
>> the generated method's code.
>
> Right, sounds good.
>
>>
>> I remember you mentioned that primitive invocation in Cog is rather
>> inefficient, because you have to do some deoptimizations before
>> entering a primitive.
>
> No, that's not the issue.  In Cog there are two kinds of primitives, generated primitives and C primitives.  Cog generates the code for generated primitives and these are fast; their code is inlined in the start of the method immediately following the unchecked entry-point and preceding frame build.  See Cogit>>compilePrimitive and Cogit>>primitiveGeneratorOrNil.  C primitives are the conventional primitives that the interpreter can call, i.e. the contents of the PrimitiveTable plus all the named primitives.  The issue is that these C primitives must be called on the C stack since Cog has no way of knowing how much stack a C primitive uses and Cog stack pages are fixed size.  So the JIT must generate a stack switching call to invoke C primitives from machine code.  This call is very like a system call and it is certainly slower than a normal C call and hence slower than invoking a C primitive from within the interpreter.  See SimpleStackBasedCogit>>compileInterpreterPrimitive:.

Well, in most cases native code makes some C calls of its own (like FFI
code), so switching to the C stack will be necessary in 99% of cases. :(
I could introduce a second primitive, which could be used to run native
code without switching to the C stack. This applies, of course, only
when the developer can guarantee that the code can't overflow the Cog
stack. Automatic analysis of whether the code overflows the stack is
unlikely to be possible, since the developer is in full control of what
machine code to generate. It is currently too low level, and there are
no abstraction layers which could handle this.

>>
>> I wonder if it's possible to do such things.
>> The problem I see is that NativeBoost's native code works as a
>> primitive (it uses interpreterProxy
>> and its functions, and the generated code honors all VM/primitive
>> conventions), so if the primitive fails, it should
>> enter the method's bytecode.
>
> I think you'll need to generate two forms of the machine code, one that is used as NativeBoost works now, and one that is designed to be inlined into a Cog native method.

It would be cool to have some convention which would allow generating
a single version.

> You may also want to include metadata so that the JIT can add metadata for embedded object references and relocatable calls (since Cog moves native methods) to the NB code it inlines into a native method.  See Cogit's method map protocol, and Cogit>>relocateMethodsPreCompaction & Cogit>>relocateCallsAndSelfReferencesInMethod:.
>

>> This may be a major barrier, which could make native code inlining
>> highly problematic.
>>
>> What do you think about it?
>
> I think it should work well if different versions are created for the interpreter and the Cogit.  The issues are to do with knowing what are the conventions the Cogit requires and those that C requires.
> It strikes me that you may be able to use one single piece of machine code and just add metadata.  The machine code looks like:

> A NativeBoost primitive:
>     C frame building code
> start of common code:
>     code that actually implements the primitive
> start of exit code:
>     C frame tear-down and return code
> start of primitive failure code:
>     call interpreterProxy->primitiveFail/primitiveFailFor
>     jump to start of exit code
> and add metadata for Cog that specifies where "start of common code" & "start of exit code" are, where in the common code any object references or relocatable call references are (and perhaps where any jumps to "start of primitive failure code" are).

Yes, something like that.
NB code contains no direct oop/call references. Moreover, all code
is location independent.
And there is already metadata inside a compiled method: the entry
point of the native code is relative to the compiled method (if you
remember, I'm attaching it to the method's trailer).
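
To make the metadata idea concrete, here is a minimal sketch of what a
section table could look like and how an inliner might pull out the
"common code" span from it. Everything here is hypothetical: the
NBLayout record, its field names, and the placeholder bytes are
assumptions for illustration, not what NativeBoost or the Cogit
actually store.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NBLayout:
    """Hypothetical metadata: byte offsets of each section, measured
    from the start of the native code (which is itself located
    relative to the compiled method)."""
    common_start: int  # "start of common code"
    exit_start: int    # "start of exit code"
    fail_start: int    # "start of primitive failure code"
    end: int           # one past the last byte of native code


def common_code(code: bytes, layout: NBLayout) -> bytes:
    """Return the span from 'start of common code' up to, but not
    including, 'start of exit code'."""
    offsets = (0, layout.common_start, layout.exit_start,
               layout.fail_start, layout.end, len(code))
    if list(offsets) != sorted(offsets):
        raise ValueError("section offsets out of order or out of range")
    return code[layout.common_start:layout.exit_start]


# Placeholder bytes standing in for real machine code:
# F = C frame build, C = common code, X = exit code, P = failure code.
code = b"F" * 3 + b"C" * 6 + b"X" * 5 + b"P" * 4
layout = NBLayout(common_start=3, exit_start=9, fail_start=14, end=18)
inlinable = common_code(code, layout)
```

Because the code is location independent and the offsets are relative,
a table like this stays valid when the method is moved.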

> When the Cogit compiles a method containing this NB primitive it inlines from "start of common code" up to (but not including) "start of exit code" and plants a jump over its generated primitive failure code.  Make sense?

Yes, it does. But the thing is that primitive failure can be triggered
by the native code itself (an invalid argument, etc.),
so the inlined code already has jumps to the failure code
location (to call interpreterProxy->primitiveFail/primitiveFailFor).
The inliner could override the 'normal exit code', but I don't want it
to mess with rewriting all relative jumps in the common code.
It is easier to copy the common + normal exit + primitive fail code, and
then just hack the exit section to put a jump there instead of a return.
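
A rough sketch of that copy-and-patch step, under several assumptions
of mine: x86 encodings (0xE9 for a near jump with a 32-bit relative
displacement, 0x90 for NOP), an exit section at least 5 bytes long,
and a continuation that starts immediately after the copied block. It
also assumes the C frame tear-down can be dropped because the C frame
building code is not copied. The function name and the placeholder
bytes are hypothetical.

```python
import struct

JMP_REL32 = 0xE9  # x86 near jump, 32-bit relative displacement
NOP = 0x90        # x86 one-byte no-op
JMP_LEN = 5       # opcode byte + 4 displacement bytes


def inline_with_patched_exit(code: bytes, common_start: int,
                             exit_start: int, fail_start: int) -> bytes:
    """Copy common + exit + failure code and overwrite the exit section
    with a jump to the first byte after the copied block."""
    exit_len = fail_start - exit_start
    if exit_len < JMP_LEN:
        raise ValueError("exit section too small to hold a jmp rel32")
    block = bytearray(code[common_start:])
    exit_off = exit_start - common_start
    # The displacement is measured from the end of the jmp instruction.
    disp = len(block) - (exit_off + JMP_LEN)
    block[exit_off:exit_off + JMP_LEN] = (
        bytes([JMP_REL32]) + struct.pack("<i", disp))
    # Pad the remainder of the old exit section with NOPs.
    for i in range(exit_off + JMP_LEN, exit_off + exit_len):
        block[i] = NOP
    return bytes(block)


# F = C frame build, C = common code, X = exit code, P = failure code.
code = b"F" * 3 + b"C" * 6 + b"X" * 7 + b"P" * 4
patched = inline_with_patched_exit(code, common_start=3,
                                   exit_start=9, fail_start=16)
```

Since the failure code keeps its distance from the common code and
from the exit section, its existing relative jumps still land in the
right place and none of them has to be rewritten.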

Thanks for the details.

> best
> Eliot
>>
>> --
>> Best regards,
>> Igor Stasenko AKA sig.
>>
>
>
>

-- 
Best regards,
Igor Stasenko AKA sig.