[Vm-dev] VM Maker: VMMaker.oscog-rsf.2083.mcz

Ronie Salgado roniesalg at gmail.com
Wed Jan 11 23:31:50 UTC 2017


Hi Eliot,

> Isn't it great when one has to work around compiler bugs??  ;-)
>
This is very annoying. And for me this is not the first time. It seems that
with the heavy inlining in GCC we are putting a bit too much stress on its
register allocator.

> However, let me suggest that this is perhaps a case where a macro
> would be better.  If you added the definition of the macro
> to StackInterpreter class>>#preambleCCode you'd be able to avoid the
> overhead with compilers that can correctly inline memcpy.  If required, a
> sqPlatform.h could define a value, say DontInlineMemcpyForLowcode and then
> in the preamble you could have
>
> #if DontInlineMemcpyForLowcode
> # define memcpy(a,b,c) noinline_memcpy(a,b,c)
> #endif
>
> ?  And then noinline_memcpy could be defined in some platform support
> file, sqWin32Main.c perhaps?  The simulator's noinline_memcpy would be
> defined as <doNotGenerate>.
>
> Anyway, some way of making this platform-dependent is nice as you'll get
> better performance on the other platforms, and on x64.
>
> And yes, feel free to ignore me as this perhaps does count as a premature
> optimization.
>

I was thinking of doing something like this, but I did not know how to do
it because of Slang. I will fix it later.
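
For reference, here is a minimal C sketch of the macro approach suggested
above. It is only an illustration under stated assumptions, not the code
that is in VMMaker: DontInlineMemcpyForLowcode and noinline_memcpy are the
names from the suggestion, while NOINLINE and copy_bytes_example are
hypothetical names made up for this example. The out-of-line function uses
memmove, as the Slang workaround does.

```c
#include <stddef.h>
#include <string.h>

/* Assumption: a platform header such as sqPlatform.h would define this as 1
   only on platforms where the compiler mishandles the inlined memcpy.
   It defaults to 1 here so that the sketch exercises the workaround. */
#ifndef DontInlineMemcpyForLowcode
# define DontInlineMemcpyForLowcode 1
#endif

/* Hypothetical helper macro: ask GCC-compatible compilers not to inline. */
#if defined(__GNUC__)
# define NOINLINE __attribute__((noinline))
#else
# define NOINLINE
#endif

/* Out-of-line copy. In the VM this would live in a platform support file;
   it calls memmove, like the Slang lowcode_mem:cp:y: workaround. */
NOINLINE void *noinline_memcpy(void *dest, const void *src, size_t n)
{
    return memmove(dest, src, n);
}

/* What the interpreter preamble could contain. The #undef guards against
   C libraries that already define memcpy as a macro. */
#if DontInlineMemcpyForLowcode
# undef memcpy
# define memcpy(a,b,c) noinline_memcpy(a,b,c)
#endif

/* Hypothetical call site: with the macro active, this plain memcpy call
   compiles to a call to noinline_memcpy. */
void copy_bytes_example(char *dest, const char *source, size_t size)
{
    memcpy(dest, source, size);
}
```

On platforms that leave DontInlineMemcpyForLowcode at 0, the call sites keep
the regular memcpy and the compiler is free to inline it.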

Best regards,
Ronie

2017-01-11 16:38 GMT-03:00 Eliot Miranda <eliot.miranda at gmail.com>:

> Hi Ronie,
>
>     I see this :-)
>
> lowcode_mem: destAddress cp: sourceAddress y: bytes
> "This method is a workaround a GCC bug.
> In Windows memcpy is putting too much register pressure on GCC when used
> by Lowcode instructions"
> <inline: #never>
> <option: #LowcodeVM>
> <var: #destAddress type: #'void*'>
> <var: #sourceAddress type: #'void*'>
> <var: #bytes type: #'sqInt'>
> "Using memmove instead of memcpy to avoid crashing GCC in Windows."
> self mem: destAddress mo: sourceAddress ve: bytes
>
> Isn't it great when one has to work around compiler bugs??  ;-)
>
> However, let me suggest that this is perhaps a case where a macro
> would be better.  If you added the definition of the macro
> to StackInterpreter class>>#preambleCCode you'd be able to avoid the
> overhead with compilers that can correctly inline memcpy.  If required, a
> sqPlatform.h could define a value, say DontInlineMemcpyForLowcode and then
> in the preamble you could have
>
> #if DontInlineMemcpyForLowcode
> # define memcpy(a,b,c) noinline_memcpy(a,b,c)
> #endif
>
> ?  And then noinline_memcpy could be defined in some platform support
> file, sqWin32Main.c perhaps?  The simulator's noinline_memcpy would be
> defined as <doNotGenerate>.
>
> Anyway, some way of making this platform-dependent is nice as you'll get
> better performance on the other platforms, and on x64.
>
> And yes, feel free to ignore me as this perhaps does count as a premature
> optimization.
>
> On Tue, Jan 10, 2017 at 11:42 PM, <commits at source.squeak.org> wrote:
>
>>
>> Ronie Salgado Faila uploaded a new version of VMMaker to project VM Maker:
>> http://source.squeak.org/VMMaker/VMMaker.oscog-rsf.2083.mcz
>>
>> ==================== Summary ====================
>>
>> Name: VMMaker.oscog-rsf.2083
>> Author: rsf
>> Time: 11 January 2017, 4:42:00.330997 am
>> UUID: 2debebfc-5008-4ab3-b16d-37ab942d9bc0
>> Ancestors: VMMaker.oscog-eem.2082
>>
>> Workaround a GCC crash in Windows when building a Lowcode VM. Too much
>> register allocation pressure for calling a builtin memcpy.
>>
>> =============== Diff against VMMaker.oscog-eem.2082 ===============
>>
>> Item was changed:
>>   ----- Method: StackInterpreter>>internalPushShadowCallStackStructure:size:
>> (in category 'internal interpreter access') -----
>>   internalPushShadowCallStackStructure: structurePointer size: size
>>         <option: #LowcodeVM>
>>         shadowCallStackPointer := shadowCallStackPointer - size.
>> +       self lowcode_mem: shadowCallStackPointer cp: structurePointer y:
>> size!
>> -       self mem: shadowCallStackPointer cp: structurePointer y: size!
>>
>> Item was changed:
>>   ----- Method: StackInterpreter>>lowcodePrimitiveInt32ToPointer (in
>> category 'inline primitive generated code') -----
>>   lowcodePrimitiveInt32ToPointer
>>         <option: #LowcodeVM>    "Lowcode instruction generator"
>>         | value result |
>>         <var: #value type: #'sqInt' >
>>         <var: #result type: #'char*' >
>>         value := self internalPopStackInt32.
>>
>> +       result := self cCoerce: (self cCoerce: value to: 'uintptr_t') to:
>> 'char*'.
>> -       result := self cCoerce: value to: 'uintptr_t'.
>>
>>         self internalPushPointer: result.
>>
>>   !
>>
>> Item was changed:
>>   ----- Method: StackInterpreter>>lowcodePrimitiveMemcpy32 (in category
>> 'inline primitive generated code') -----
>>   lowcodePrimitiveMemcpy32
>>         <option: #LowcodeVM>    "Lowcode instruction generator"
>>         | source dest size |
>>         <var: #source type: #'char*' >
>>         <var: #dest type: #'char*' >
>>         <var: #size type: #'sqInt' >
>>         size := self internalPopStackInt32.
>>         source := self internalPopStackPointer.
>>         dest := self internalPopStackPointer.
>>
>> +       self lowcode_mem: dest cp: source y: size.
>> -       self mem: dest cp: source y: size.
>>
>>
>>   !
>>
>> Item was changed:
>>   ----- Method: StackInterpreter>>lowcodePrimitiveMemcpy64 (in category
>> 'inline primitive generated code') -----
>>   lowcodePrimitiveMemcpy64
>>         <option: #LowcodeVM>    "Lowcode instruction generator"
>>         | source dest size |
>>         <var: #source type: #'char*' >
>>         <var: #dest type: #'char*' >
>>         <var: #size type: #'sqLong' >
>>         size := self internalPopStackInt64.
>>         source := self internalPopStackPointer.
>>         dest := self internalPopStackPointer.
>>
>> +       self lowcode_mem: dest cp: source y: size.
>> -       self mem: dest cp: source y: size.
>>
>>
>>   !
>>
>> Item was changed:
>>   ----- Method: StackInterpreter>>lowcodePrimitiveMemcpyFixed (in
>> category 'inline primitive generated code') -----
>>   lowcodePrimitiveMemcpyFixed
>>         <option: #LowcodeVM>    "Lowcode instruction generator"
>>         | source size dest |
>>         <var: #source type: #'char*' >
>>         <var: #dest type: #'char*' >
>>         size := extA.
>>         source := self internalPopStackPointer.
>>         dest := self internalPopStackPointer.
>>
>> +       self lowcode_mem: dest cp: source y: size.
>> -       self mem: dest cp: source y: size.
>>
>>         extA := 0.
>>
>>   !
>>
>> Item was changed:
>>   ----- Method: StackInterpreter>>lowcodePrimitivePerformCallStructure
>> (in category 'inline primitive generated code') -----
>>   lowcodePrimitivePerformCallStructure
>>         <option: #LowcodeVM>    "Lowcode instruction generator"
>>         | resultPointer result function structureSize |
>>         <var: #resultPointer type: #'char*' >
>>         <var: #result type: #'char*' >
>>         function := extA.
>>         structureSize := extB.
>>         result := self internalPopStackPointer.
>>
>>         self internalPushShadowCallStackPointer: result.
>>         resultPointer := self lowcodeCalloutPointerResult: (self cCoerce:
>> function to: #'char*').
>>
>>         self internalPushPointer: resultPointer.
>>         extA := 0.
>>         extB := 0.
>>         numExtB := 0.
>> +
>>   !
>>
>> Item was changed:
>>   ----- Method: StackInterpreter>>lowcodePrimitivePointerAddConstantOffset
>> (in category 'inline primitive generated code') -----
>>   lowcodePrimitivePointerAddConstantOffset
>>         <option: #LowcodeVM>    "Lowcode instruction generator"
>>         | base offset result |
>>         <var: #base type: #'char*' >
>>         <var: #result type: #'char*' >
>>         offset := extB.
>>         base := self internalPopStackPointer.
>>
>>         result := base + offset.
>>
>>         self internalPushPointer: result.
>>         extB := 0.
>>         numExtB := 0.
>>
>>   !
>>
>> Item was added:
>> + ----- Method: StackInterpreter>>lowcode_mem:cp:y: (in category 'inline
>> primitive support') -----
>> + lowcode_mem: destAddress cp: sourceAddress y: bytes
>> +       "This method is a workaround a GCC bug.
>> +       In Windows memcpy is putting too much register pressure on GCC
>> when used by Lowcode instructions"
>> +       <inline: #never>
>> +       <option: #LowcodeVM>
>> +       <var: #destAddress type: #'void*'>
>> +       <var: #sourceAddress type: #'void*'>
>> +       <var: #bytes type: #'sqInt'>
>> +
>> +       "Using memmove instead of memcpy to avoid crashing GCC in
>> Windows."
>> +       self mem: destAddress mo: sourceAddress ve: bytes!
>>
>> Item was changed:
>>   ----- Method: StackToRegisterMappingCogit>>genLowcodePerformCallStructure
>> (in category 'inline primitive generators generated code') -----
>>   genLowcodePerformCallStructure
>>         <option: #LowcodeVM>    "Lowcode instruction generator"
>>
>>         "Push the result space"
>>         self ssNativeTop nativeStackPopToReg: TempReg.
>>         self ssNativePop: 1.
>>         self PushR: TempReg.
>>         "Call the function"
>>         self callSwitchToCStack.
>>         self MoveCw: extA R: TempReg.
>>         self CallRT: ceFFICalloutTrampoline.
>>         "Fetch the result"
>>         self MoveR: backEnd cResultRegister R: ReceiverResultReg.
>>         self ssPushNativeRegister: ReceiverResultReg.
>>         extA := 0.
>>         extB := 0.
>>         numExtB := 0.
>>
>>         ^ 0
>>
>>   !
>>
>> Item was changed:
>>   ----- Method: StackToRegisterMappingCogit>>genLowcodePointerAddConstantOffset
>> (in category 'inline primitive generators generated code') -----
>>   genLowcodePointerAddConstantOffset
>>         <option: #LowcodeVM>    "Lowcode instruction generator"
>>         | base offset |
>>         offset := extB.
>>
>>         (base := backEnd availableRegisterOrNoneFor: self liveRegisters)
>> = NoReg ifTrue:
>>                 [self ssAllocateRequiredReg:
>>                         (base := optStatus isReceiverResultRegLive
>>                                 ifTrue: [Arg0Reg]
>>                                 ifFalse: [ReceiverResultReg])].
>>         base = ReceiverResultReg ifTrue:
>>                 [ optStatus isReceiverResultRegLive: false ].
>>         self ssNativeTop nativePopToReg: base.
>>         self ssNativePop: 1.
>>
>>         self AddCq: offset R: base.
>>         self ssPushNativeRegister: base.
>>
>>         extB := 0.
>>         numExtB := 0.
>>         ^ 0
>>
>>   !
>>
>>
>
>
> --
> _,,,^..^,,,_
> best, Eliot
>