[Vm-dev] VM Maker: VMMaker.oscog-rsf.2083.mcz
Ronie Salgado
roniesalg at gmail.com
Wed Jan 11 23:31:50 UTC 2017
Hi Eliot,
> Isn't it great when one has to work around compiler bugs?? ;-)
>
This is very annoying. And for me this is not the first time. It seems that
with the heavy inlining in GCC we are putting a bit too much stress on its
register allocator.
>
> However, let me suggest that this is perhaps a case where a macro
> would be better. If you added the definition of the macro
> to StackInterpreter class>>#preambleCCode you'd be able to avoid the
> overhead with compilers that can correctly inline memcpy. If required, a
> sqPlatform.h could define a value, say DontInlineMemcpyForLowcode and then
> in the preamble you could have
>
> #if DontInlineMemcpyForLowcode
> # define memcpy(a,b,c) noinline_memcpy(a,b,c)
> #endif
>
> ? And then noinline_memcpy could be defined in some platform support
> file, sqWin32Main.c perhaps? The simulator's noinline_memcpy would be
> defined as <doNotGenerate>.
>
> Anyway, some way of making this platform-dependent is nice as you'll get
> better performance on the other platforms, and on x64.
>
> And yes, feel free to ignore me as this perhaps does count as a premature
> optimization.
>
I was thinking of doing something like this, but I did not know how to do
it because of Slang. I will fix it some time later.
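
For reference, here is a minimal sketch of how the suggested macro could look
on the C side. It is only an illustration under assumptions, not the code that
is in the VM: LOWCODE_NOINLINE and lowcode_copy_example are hypothetical names,
DontInlineMemcpyForLowcode is defaulted to 1 just to exercise the workaround
path (in the proposal it would come from sqPlatform.h), and noinline_memcpy
forwards to memmove as lowcode_mem:cp:y: does.

```c
#include <stddef.h>
#include <string.h>

/* Assumption: sqPlatform.h (or the build) would normally set this flag.
   It is defaulted to 1 here only so the sketch takes the workaround path. */
#ifndef DontInlineMemcpyForLowcode
# define DontInlineMemcpyForLowcode 1
#endif

/* Hypothetical helper macro, not part of the VM sources. */
#if defined(__GNUC__)
# define LOWCODE_NOINLINE __attribute__((noinline))
#else
# define LOWCODE_NOINLINE
#endif

/* Out-of-line copy. Like lowcode_mem:cp:y: it forwards to memmove. */
LOWCODE_NOINLINE void *
noinline_memcpy(void *dest, const void *src, size_t n)
{
    return memmove(dest, src, n);
}

#if DontInlineMemcpyForLowcode
# undef memcpy   /* some C libraries already define memcpy as a macro */
# define memcpy(a,b,c) noinline_memcpy(a,b,c)
#endif

/* Hypothetical caller standing in for a Lowcode instruction body. */
void
lowcode_copy_example(void *dest, const void *src, size_t n)
{
    memcpy(dest, src, n);   /* expands to noinline_memcpy when the flag is set */
}
```

With the flag set to 0 the macro is not defined and the caller uses the
compiler's own memcpy, so the other platforms would keep the inlined builtin.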
Best regards,
Ronie
2017-01-11 16:38 GMT-03:00 Eliot Miranda <eliot.miranda at gmail.com>:
> Hi Ronie,
>
> I see this :-)
>
> lowcode_mem: destAddress cp: sourceAddress y: bytes
> "This method is a workaround a GCC bug.
> In Windows memcpy is putting too much register pressure on GCC when used
> by Lowcode instructions"
> <inline: #never>
> <option: #LowcodeVM>
> <var: #destAddress type: #'void*'>
> <var: #sourceAddress type: #'void*'>
> <var: #bytes type: #'sqInt'>
> "Using memmove instead of memcpy to avoid crashing GCC in Windows."
> self mem: destAddress mo: sourceAddress ve: bytes
>
> Isn't it great when one has to work around compiler bugs?? ;-)
>
> However, let me suggest that this is perhaps a case where a macro
> would be better. If you added the definition of the macro
> to StackInterpreter class>>#preambleCCode you'd be able to avoid the
> overhead with compilers that can correctly inline memcpy. If required, a
> sqPlatform.h could define a value, say DontInlineMemcpyForLowcode and then
> in the preamble you could have
>
> #if DontInlineMemcpyForLowcode
> # define memcpy(a,b,c) noinline_memcpy(a,b,c)
> #endif
>
> ? And then noinline_memcpy could be defined in some platform support
> file, sqWin32Main.c perhaps? The simulator's noinline_memcpy would be
> defined as <doNotGenerate>.
>
> Anyway, some way of making this platform-dependent is nice as you'll get
> better performance on the other platforms, and on x64.
>
> And yes, feel free to ignore me as this perhaps does count as a premature
> optimization.
>
> On Tue, Jan 10, 2017 at 11:42 PM, <commits at source.squeak.org> wrote:
>
>>
>> Ronie Salgado Faila uploaded a new version of VMMaker to project VM Maker:
>> http://source.squeak.org/VMMaker/VMMaker.oscog-rsf.2083.mcz
>>
>> ==================== Summary ====================
>>
>> Name: VMMaker.oscog-rsf.2083
>> Author: rsf
>> Time: 11 January 2017, 4:42:00.330997 am
>> UUID: 2debebfc-5008-4ab3-b16d-37ab942d9bc0
>> Ancestors: VMMaker.oscog-eem.2082
>>
>> Workaround a GCC crash in Windows when building a Lowcode VM. Too much
>> register allocation pressure for calling a builtin memcpy.
>>
>> =============== Diff against VMMaker.oscog-eem.2082 ===============
>>
>> Item was changed:
>> ----- Method: StackInterpreter>>internalPushShadowCallStackStructure:size:
>> (in category 'internal interpreter access') -----
>> internalPushShadowCallStackStructure: structurePointer size: size
>> <option: #LowcodeVM>
>> shadowCallStackPointer := shadowCallStackPointer - size.
>> + self lowcode_mem: shadowCallStackPointer cp: structurePointer y:
>> size!
>> - self mem: shadowCallStackPointer cp: structurePointer y: size!
>>
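
As a rough C picture of what this method does: the shadow call stack grows
downwards, so the pointer is moved down by size first and the structure is
then copied into the space that was opened. The sketch below is only an
illustration; shadow_stack, its size and push_shadow_structure are
hypothetical stand-ins, and memmove is called directly instead of going
through the non-inlined helper.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the VM's shadow call stack state. */
static char shadow_stack[256];
static char *shadowCallStackPointer = shadow_stack + sizeof shadow_stack;

/* Push a structure: move the pointer down by size, then copy into the gap.
   Like the Slang method, this sketch does no overflow checking. */
static void
push_shadow_structure(const void *structurePointer, size_t size)
{
    shadowCallStackPointer -= size;
    memmove(shadowCallStackPointer, structurePointer, size);
}
```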
>> Item was changed:
>> ----- Method: StackInterpreter>>lowcodePrimitiveInt32ToPointer (in
>> category 'inline primitive generated code') -----
>> lowcodePrimitiveInt32ToPointer
>> <option: #LowcodeVM> "Lowcode instruction generator"
>> | value result |
>> <var: #value type: #'sqInt' >
>> <var: #result type: #'char*' >
>> value := self internalPopStackInt32.
>>
>> + result := self cCoerce: (self cCoerce: value to: 'uintptr_t') to:
>> 'char*'.
>> - result := self cCoerce: value to: 'uintptr_t'.
>>
>> self internalPushPointer: result.
>>
>> !
>>
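
The extra coercion matters in the generated C: assigning an integer to a
char* variable without a cast is something C compilers diagnose, so the value
is first converted to uintptr_t and then cast to the pointer type. A minimal
sketch, assuming sqInt is a pointer-sized signed integer (modelled here as
intptr_t) and using the hypothetical name int_to_pointer:

```c
#include <stdint.h>

/* Convert an integer taken from the stack into a pointer.
   Assumption: intptr_t stands in for the VM's sqInt. */
static char *
int_to_pointer(intptr_t value)
{
    /* integer -> uintptr_t -> char*, as in the corrected Slang method */
    return (char *)(uintptr_t)value;
}
```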
>> Item was changed:
>> ----- Method: StackInterpreter>>lowcodePrimitiveMemcpy32 (in category
>> 'inline primitive generated code') -----
>> lowcodePrimitiveMemcpy32
>> <option: #LowcodeVM> "Lowcode instruction generator"
>> | source dest size |
>> <var: #source type: #'char*' >
>> <var: #dest type: #'char*' >
>> <var: #size type: #'sqInt' >
>> size := self internalPopStackInt32.
>> source := self internalPopStackPointer.
>> dest := self internalPopStackPointer.
>>
>> + self lowcode_mem: dest cp: source y: size.
>> - self mem: dest cp: source y: size.
>>
>>
>> !
>>
>> Item was changed:
>> ----- Method: StackInterpreter>>lowcodePrimitiveMemcpy64 (in category
>> 'inline primitive generated code') -----
>> lowcodePrimitiveMemcpy64
>> <option: #LowcodeVM> "Lowcode instruction generator"
>> | source dest size |
>> <var: #source type: #'char*' >
>> <var: #dest type: #'char*' >
>> <var: #size type: #'sqLong' >
>> size := self internalPopStackInt64.
>> source := self internalPopStackPointer.
>> dest := self internalPopStackPointer.
>>
>> + self lowcode_mem: dest cp: source y: size.
>> - self mem: dest cp: source y: size.
>>
>>
>> !
>>
>> Item was changed:
>> ----- Method: StackInterpreter>>lowcodePrimitiveMemcpyFixed (in
>> category 'inline primitive generated code') -----
>> lowcodePrimitiveMemcpyFixed
>> <option: #LowcodeVM> "Lowcode instruction generator"
>> | source size dest |
>> <var: #source type: #'char*' >
>> <var: #dest type: #'char*' >
>> size := extA.
>> source := self internalPopStackPointer.
>> dest := self internalPopStackPointer.
>>
>> + self lowcode_mem: dest cp: source y: size.
>> - self mem: dest cp: source y: size.
>>
>> extA := 0.
>>
>> !
>>
>> Item was changed:
>> ----- Method: StackInterpreter>>lowcodePrimitivePerformCallStructure
>> (in category 'inline primitive generated code') -----
>> lowcodePrimitivePerformCallStructure
>> <option: #LowcodeVM> "Lowcode instruction generator"
>> | resultPointer result function structureSize |
>> <var: #resultPointer type: #'char*' >
>> <var: #result type: #'char*' >
>> function := extA.
>> structureSize := extB.
>> result := self internalPopStackPointer.
>>
>> self internalPushShadowCallStackPointer: result.
>> resultPointer := self lowcodeCalloutPointerResult: (self cCoerce:
>> function to: #'char*').
>>
>> self internalPushPointer: resultPointer.
>> extA := 0.
>> extB := 0.
>> numExtB := 0.
>> +
>> !
>>
>> Item was changed:
>> ----- Method: StackInterpreter>>lowcodePrimitivePointerAddConstantOffset
>> (in category 'inline primitive generated code') -----
>> lowcodePrimitivePointerAddConstantOffset
>> <option: #LowcodeVM> "Lowcode instruction generator"
>> | base offset result |
>> <var: #base type: #'char*' >
>> <var: #result type: #'char*' >
>> offset := extB.
>> base := self internalPopStackPointer.
>>
>> result := base + offset.
>>
>> self internalPushPointer: result.
>> extB := 0.
>> numExtB := 0.
>>
>> !
>>
>> Item was added:
>> + ----- Method: StackInterpreter>>lowcode_mem:cp:y: (in category 'inline
>> primitive support') -----
>> + lowcode_mem: destAddress cp: sourceAddress y: bytes
>> + "This method is a workaround a GCC bug.
>> + In Windows memcpy is putting too much register pressure on GCC
>> when used by Lowcode instructions"
>> + <inline: #never>
>> + <option: #LowcodeVM>
>> + <var: #destAddress type: #'void*'>
>> + <var: #sourceAddress type: #'void*'>
>> + <var: #bytes type: #'sqInt'>
>> +
>> + "Using memmove instead of memcpy to avoid crashing GCC in
>> Windows."
>> + self mem: destAddress mo: sourceAddress ve: bytes!
>>
>> Item was changed:
>> ----- Method: StackToRegisterMappingCogit>>genLowcodePerformCallStructure
>> (in category 'inline primitive generators generated code') -----
>> genLowcodePerformCallStructure
>> <option: #LowcodeVM> "Lowcode instruction generator"
>>
>> "Push the result space"
>> self ssNativeTop nativeStackPopToReg: TempReg.
>> self ssNativePop: 1.
>> self PushR: TempReg.
>> "Call the function"
>> self callSwitchToCStack.
>> self MoveCw: extA R: TempReg.
>> self CallRT: ceFFICalloutTrampoline.
>> "Fetch the result"
>> self MoveR: backEnd cResultRegister R: ReceiverResultReg.
>> self ssPushNativeRegister: ReceiverResultReg.
>> extA := 0.
>> extB := 0.
>> numExtB := 0.
>>
>> ^ 0
>>
>> !
>>
>> Item was changed:
>> ----- Method: StackToRegisterMappingCogit>>genLowcodePointerAddConstantOffset
>> (in category 'inline primitive generators generated code') -----
>> genLowcodePointerAddConstantOffset
>> <option: #LowcodeVM> "Lowcode instruction generator"
>> | base offset |
>> offset := extB.
>>
>> (base := backEnd availableRegisterOrNoneFor: self liveRegisters)
>> = NoReg ifTrue:
>> [self ssAllocateRequiredReg:
>> (base := optStatus isReceiverResultRegLive
>> ifTrue: [Arg0Reg]
>> ifFalse: [ReceiverResultReg])].
>> base = ReceiverResultReg ifTrue:
>> [ optStatus isReceiverResultRegLive: false ].
>> self ssNativeTop nativePopToReg: base.
>> self ssNativePop: 1.
>>
>> self AddCq: offset R: base.
>> self ssPushNativeRegister: base.
>>
>> extB := 0.
>> numExtB := 0.
>> ^ 0
>>
>> !
>>
>>
>
>
> --
> _,,,^..^,,,_
> best, Eliot
>