[squeak-dev] [NativeBoost] Need your ideas, what platform-neutral assembler syntax you would like to have

Wed May 11 06:11:48 UTC 2011

With a single API, when one will implement another processor
instruction set, you'll win.
The naming of registers is application dependent, but these are just constants.
I wonder if Eliot gave a look at gcc?

Nicolas

2011/5/11 Igor Stasenko <siguctua at gmail.com>:
> Hi,
>
> i took a look at Cog's abstract opcode syntax.
> Check the CogRTLOpcodes class>>initialize for description,
> and check the Cogit category 'abstract instructions' at instance side
> for actual syntax.
>
> Things i don't like: register naming.
> In cog you should use following register names:
>        FPReg := -1.
>        SPReg := -2.
>        ReceiverResultReg := GPRegMax := -3.
>        TempReg := -4.
>        ClassReg := -5.
>        SendNumArgsReg := -6.
>        Arg0Reg := -7.
>        Arg1Reg := GPRegMin := -8.
>        DPFPReg0 := -9.
>        DPFPReg1 := -10.
>        DPFPReg2 := -11.
>        DPFPReg3 := -12.
>        DPFPReg4 := -13.
>        DPFPReg5 := -14.
>        DPFPReg6 := -15.
>        DPFPReg7 := -16.
>
> They are mapping 1:1 to real registers on your machine.
> But while such names could make sense in some cases for Cog,
> in NativeBoost using register named like SendNumArgsReg will be confusing..
> Okay... actually this is not a big deal, i could always use aliases
> with different names (suggestions?).
>
> But i would really like you opinion on Cog's native code syntax.
> Here how the native code assembly looks in cog:
>
> genDoubleArithmetic: arithmeticOperator preOpCheck: preOpCheckOrNil
>        "Receiver and arg in registers.
>         Stack looks like
>                return address"
>        <var: #preOpCheckOrNil declareC: 'AbstractInstruction
> *(*preOpCheckOrNil)(int rcvrReg, int argReg)'>
>        | jumpFailClass jumpFailAlloc jumpFailCheck jumpSmallInt doOp |
>        <var: #jumpFailClass type: #'AbstractInstruction *'>
>        <var: #jumpFailAlloc type: #'AbstractInstruction *'>
>        <var: #jumpSmallInt type: #'AbstractInstruction *'>
>        <var: #jumpFailCheck type: #'AbstractInstruction *'>
>        <var: #doOp type: #'AbstractInstruction *'>
>        self MoveR: Arg0Reg R: TempReg.
>        objectRepresentation genGetDoubleValueOf: ReceiverResultReg into: DPFPReg0.
>        self MoveR: Arg0Reg R: ClassReg.
>        jumpSmallInt := objectRepresentation genJumpSmallIntegerInScratchReg: TempReg.
>        objectRepresentation genGetCompactClassIndexNonIntOf: Arg0Reg into:
> SendNumArgsReg.
>        self CmpCq: objectMemory classFloatCompactIndex R: SendNumArgsReg.
>        jumpFailClass := self JumpNonZero: 0.
>        objectRepresentation genGetDoubleValueOf: Arg0Reg into: DPFPReg1.
>        doOp := self Label.
>        preOpCheckOrNil ifNotNil:
>                [jumpFailCheck := self perform: preOpCheckOrNil with: DPFPReg0 with:
> DPFPReg1].
>        self gen: arithmeticOperator operand: DPFPReg1 operand: DPFPReg0.
>        jumpFailAlloc := objectRepresentation
>                                                genAllocFloatValue: DPFPReg0
>                                                into: SendNumArgsReg
>                                                scratchReg: ClassReg
>                                                scratchReg: TempReg.
>        self MoveR: SendNumArgsReg R: ReceiverResultReg.
>        self RetN: 0.
>        "We need to push the register args on two paths; this one and the
> interpreter primitive path.
>        But the interpreter primitive path won't unless regArgsHaveBeenPushed
> is false."
>        self assert: methodOrBlockNumArgs <= self numRegArgs.
>        jumpFailClass jmpTarget: self Label.
>        preOpCheckOrNil ifNotNil:
>                [jumpFailCheck jmpTarget: jumpFailClass getJmpTarget].
>        self genPushRegisterArgsForNumArgs: methodOrBlockNumArgs.
>        jumpFailClass := self Jump: 0.
>        jumpSmallInt jmpTarget: self Label.
>        objectRepresentation genConvertSmallIntegerToIntegerInScratchReg: ClassReg.
>        self ConvertR: ClassReg Rd: DPFPReg1.
>        self Jump: doOp.
>        jumpFailAlloc jmpTarget: self Label.
>        self compileInterpreterPrimitive: (coInterpreter
>                                                                                functionPointerForCompiledMethod: methodObj
>                                                                                primitiveIndex: primitiveIndex).
>        jumpFailClass jmpTarget: self Label.
>        ^0
>
> And here an example of NativeBoost code:
>
> emitArgumentsCoercion: gen
>        " input - none,
>        output - an arguments array in remmappable oops stack top"
>        | asm proxy args argOop i |
>        asm := gen asm.
>        proxy := gen proxy.
>        args := gen reserveTemp.
>        argOop := gen reserveTemp.
>
>        asm label: 'emitArgumentsCoercion>>'.
>
>        proxy createInstanceOf: Array size: (gen fnSpec arguments size).
>        proxy pushRemappableOop: EAX.
>
>        i := 0.
>        gen fnSpec arguments do: [:arg |
>                arg type readOop: (EBP ptr + arg offset) generator: gen.
>                asm mov: EAX to: argOop.
>                proxy popRemappableOop.
>                asm mov: EAX to: args.
>                proxy storePointer: i ofObject: args withValue: argOop.
>                i := i+1.
>                proxy pushRemappableOop: args.
>        ].
>
>        asm label: '<<emitArgumentsCoercion'.
>
>        gen releaseTemps: 2.
>
> The code is of course doing different things.. but these example is
> actually to indicate a following observation:
>
> even when i will introduce platform-neutral code, in most cases
> you won't see a pure assembly language, but instead you will see a mix
> of usual smalltalk code and assembler.
>
> If you can see, in both examples above you see a direct instructions
> intermixed with another methods.
> And it is totally normal, because for obvious reasons, even for
> assembler code, there are a lot of repetitious patterns,
> and you refactor repetitive code into separate methods (a thing which
> you normally do if you follow DRY principle).
>
> So, i mean: its okay. I could change the assembler syntax to anything.
> But code will still be looking mostly the same,
> for most of code-generation routines.
> Of course, if you write a pure-assembly routine, it will be pure. But
> in practice you won't do it too often,
> because you can reuse stuff which already been done, or even use
> higher-level abstract syntax (i have one somewhere in my garage ;).
>
> But still, question remains open: should i take a Cog's assembler
> syntax as a base,
> or create own?
> The pro for Cog-syntax that if some day we move code generation from
> VM level to language,
> then we can just use existing code.
>
> But from another side, if i have tool which can produce a native code,
> from VM's perspective its completely irrelevant , where this code
> comes from and what assembler syntax were used to generate it.
>
> Also, we don't have to constrain ourselves, because Cog's assembler
> (and its implementation) were designed by taking into account that it
> should be translatable to C,
> while in NB we're always could use tricks like #perform: and block
> closures, which of course not available in slang.
>
> Sorry for long tiresome posts.. But i need your feedback and opinion.
>
> --
> Best regards,
> Igor Stasenko AKA sig.
>
>