[Vm-dev] Re: Serious inlining problem

Nicolas Cellier nicolas.cellier.aka.nice at gmail.com
Thu Apr 28 07:13:42 UTC 2016


Hi Eliot, I'll check tonight.

My concern was that isComplete state is bound to TMethod.
But once isComplete it seems to have effect on inlining of each send.
If inlining is decomposed as happens for one of the message sends, the
assignment is well distributed in each ifTrue:ifFalse: branch, but once
isComplete and applied onto other message sends, it seems that the inliner
does not take the same path.

It would be great to turn workspace snippets into non regression tests, or
better specification tests. We need more hands to engineer this.

2016-04-28 4:36 GMT+02:00 Eliot Miranda <eliot.miranda at gmail.com>:

>
> Hi Nicolas,
>
> On Wed, Apr 27, 2016 at 3:11 PM, Nicolas Cellier <
> nicolas.cellier.aka.nice at gmail.com> wrote:
>
>>
>> It sounds like related to ifTrue:ifFalse: alternatives.
>>
>
> Indeed it is.  The first bug is that inlineCodeOrNilForStatement:in: is
> wrong to state that there is no direct return for a send node.  If the send
> node is part of a returning if then there is a direct return.  The second
> bug is that if one fixes this bug then exitVar:label: is too naive in
> mapping ^expr into exitVar := expr; got exitLabel.  It uses replaceNodesIn:
> which is strictly top-down, and so the replacement for ^expr ifTrue:
> [...^fu...] ifFalse: [...^bar...] will prevent replacement of either ^fu or
> ^bar. I think I have the fix.  I'm just testing now...
>
>
>
>> When entering pass 6 inlined is still correct:
>>
>>     "begin push:".
>>     objectMemory wordSize = 8 ifTrue: [object3 := self
>> positive64BitIntegerFor: vmCallbackContext asUnsignedInteger] ifFalse: [
>>         "begin positive32BitIntegerFor:".
>>         false ifTrue: [
>>             object3 := objectMemory integerObjectOf: (vmCallbackContext
>> asUnsignedInteger asUnsignedLong bitAnd: 4294967295).
>>             goto l1
>>         ] ifFalse: [
>>             object3 := self maybeInlinePositive32BitIntegerFor:
>> vmCallbackContext asUnsignedInteger.
>>             goto l1
>>         ].
>>         l1:    "end positive32BitIntegerFor:"
>>     ].
>>     stackPages longAt: sp4 := stackPointer - objectMemory wordSize put:
>> object3.
>>     stackPointer := sp4.
>>
>> The bad step happens before entering in pass 7. It correctly applies to
>> above instance of this message send with object3 := assignment distributed
>> in each ifTrue:ifFalse: returning branch:
>>
>> objectMemory wordSize = 8 ifTrue: [object3 := self
>> positive64BitIntegerFor: vmCallbackContext asUnsignedInteger] ifFalse: [
>>         "begin positive32BitIntegerFor:".
>>         false ifTrue: [
>>             object3 := objectMemory integerObjectOf: (vmCallbackContext
>> asUnsignedInteger asUnsignedLong bitAnd: 4294967295).
>>             goto l1
>>         ] ifFalse: [
>>             "begin maybeInlinePositive32BitIntegerFor:".
>>             self deny: objectMemory hasSixtyFourBitImmediates.
>>             (self cCoerceSimple: vmCallbackContext asUnsignedInteger to:
>> unsigned int) <= objectMemory maxSmallInteger ifTrue: [
>>                 object3 := objectMemory integerObjectOf:
>> vmCallbackContext asUnsignedInteger.
>>                 goto l20
>>             ].
>>             "begin eeInstantiateSmallClassIndex:format:numSlots:".
>>             objFormat3 := self firstByteFormat + (8 - 4 bitAnd: self
>> wordSize - 1).
>>             self assert: (1 >= 0 and:
>> [ClassLargePositiveIntegerCompactIndex ~= 0]).
>>             self assert: (objFormat3 < self firstByteFormat ifTrue:
>> [objFormat3] ifFalse: [objFormat3 bitAnd: self byteFormatMask]) = (self
>> instSpecOfClass: (self knownClassAtIndex:
>> ClassLargePositiveIntegerCompactIndex)).
>>             "begin allocateSmallNewSpaceSlots:format:classIndex:".
>>             self assert: 1 < self numSlotsMask.
>>             newObj3 := freeStart.
>>             numBytes3 := self baseHeaderSize + (1 <= 1 ifTrue: [8]
>> ifFalse: [1 + (1 bitAnd: 1) * self bytesPerOop]).
>>             self assert: numBytes3 \\ self allocationUnit = 0.
>>             self assert: newObj3 \\ self allocationUnit = 0.
>>             freeStart + numBytes3 > scavengeThreshold ifTrue: [
>>                 needGCFlag ifFalse: [
>>                     "begin scheduleScavenge".
>>                     needGCFlag := true.
>>                     coInterpreter forceInterruptCheck
>>                 ].
>>                 freeStart + numBytes3 > scavenger eden limit ifTrue: [
>>                     self error: 'no room in eden for
>> allocateSmallNewSpaceSlots:format:classIndex:'.
>>                     newLargeInteger3 := 0.
>>                     goto l19
>>                 ]
>>             ].
>>             self long64At: newObj3 put: (self cCoerceSimple: 1 to:
>> usqLong) << self numSlotsFullShift + objFormat3 << self formatShift +
>> ClassLargePositiveIntegerCompactIndex.
>>             freeStart := freeStart + numBytes3.
>>             newLargeInteger3 := newObj3.
>>             l19:    "end allocateSmallNewSpaceSlots:format:classIndex:".
>>             self cppIf: SPURVM ifTrue: [
>>                 self long32At: newLargeInteger3 + self baseHeaderSize + 0
>> << 2 put: (objectMemory byteSwapped32IfBigEndian: vmCallbackContext
>> asUnsignedInteger).
>>                 self long32At: newLargeInteger3 + self baseHeaderSize + 1
>> << 2 put: 0
>>             ] ifFalse: [self long32At: newLargeInteger3 + self
>> baseHeaderSize + 0 << 2 put: (objectMemory byteSwapped32IfBigEndian:
>> vmCallbackContext asUnsignedInteger)].
>>             object3 := newLargeInteger3.
>>             l20:    "end maybeInlinePositive32BitIntegerFor:".
>>             goto l1
>>         ].
>>         l1:    "end positive32BitIntegerFor:"
>>     ]
>>
>> However, at same stage, the transformation also applies to previous
>> instance of same message, but this time without correct distribution of
>> object := assignment.
>>
>>         "begin push:".
>>         "begin positiveMachineIntegerFor:".
>>         value := vmCallbackContext thunkp asUnsignedInteger.
>>         object := objectMemory wordSize = 8 ifTrue: [self
>> positive64BitIntegerFor: value] ifFalse: [
>>                     "begin positive32BitIntegerFor:".
>>                     false ifTrue: [
>>                         objectMemory integerObjectOf: (value
>> asUnsignedLong bitAnd: 4294967295).
>>                         goto l5
>>                     ] ifFalse: [
>>                         "begin maybeInlinePositive32BitIntegerFor:".
>>                         self deny: objectMemory hasSixtyFourBitImmediates.
>>                         (self cCoerceSimple: value to: unsigned int) <=
>> objectMemory maxSmallInteger ifTrue: [
>>                             objectMemory integerObjectOf: value.
>>                             goto l4
>>                         ].
>>                         "begin
>> eeInstantiateSmallClassIndex:format:numSlots:".
>>                         objFormat := self firstByteFormat + (8 - 4
>> bitAnd: self wordSize - 1).
>>                         self assert: (1 >= 0 and:
>> [ClassLargePositiveIntegerCompactIndex ~= 0]).
>>                         self assert: (objFormat < self firstByteFormat
>> ifTrue: [objFormat] ifFalse: [objFormat bitAnd: self byteFormatMask]) =
>> (self instSpecOfClass: (self knownClassAtIndex:
>> ClassLargePositiveIntegerCompactIndex)).
>>                         "begin
>> allocateSmallNewSpaceSlots:format:classIndex:".
>>                         self assert: 1 < self numSlotsMask.
>>                         newObj := freeStart.
>>                         numBytes := self baseHeaderSize + (1 <= 1 ifTrue:
>> [8] ifFalse: [1 + (1 bitAnd: 1) * self bytesPerOop]).
>>                         self assert: numBytes \\ self allocationUnit = 0.
>>                         self assert: newObj \\ self allocationUnit = 0.
>>                         freeStart + numBytes > scavengeThreshold ifTrue: [
>>                             needGCFlag ifFalse: [
>>                                 "begin scheduleScavenge".
>>                                 needGCFlag := true.
>>                                 coInterpreter forceInterruptCheck
>>                             ].
>>                             freeStart + numBytes > scavenger eden limit
>> ifTrue: [
>>                                 self error: 'no room in eden for
>> allocateSmallNewSpaceSlots:format:classIndex:'.
>>                                 newLargeInteger := 0.
>>                                 goto l6
>>                             ]
>>                         ].
>>                         self long64At: newObj put: (self cCoerceSimple: 1
>> to: usqLong) << self numSlotsFullShift + objFormat << self formatShift +
>> ClassLargePositiveIntegerCompactIndex.
>>                         freeStart := freeStart + numBytes.
>>                         newLargeInteger := newObj.
>>                         l6:    "end
>> allocateSmallNewSpaceSlots:format:classIndex:".
>>                         self cppIf: SPURVM ifTrue: [
>>                             self long32At: newLargeInteger + self
>> baseHeaderSize + 0 << 2 put: (objectMemory byteSwapped32IfBigEndian: value).
>>                             self long32At: newLargeInteger + self
>> baseHeaderSize + 1 << 2 put: 0
>>                         ] ifFalse: [self long32At: newLargeInteger + self
>> baseHeaderSize + 0 << 2 put: (objectMemory byteSwapped32IfBigEndian:
>> value)].
>>                         newLargeInteger.
>>                         l4:    "end maybeInlinePositive32BitIntegerFor:".
>>                         goto l5
>>                     ].
>>                     l5:    "end positive32BitIntegerFor:"
>>                 ].
>>         stackPages longAt: sp := stackPointer - objectMemory wordSize
>> put: object.
>>         stackPointer := sp.
>>
>> What sounds strange is that previous instance of that message was really
>> behind when entering pass 6:
>>
>>         "begin push:".
>>         object := self positiveMachineIntegerFor: vmCallbackContext
>> thunkp asUnsignedInteger.
>>         stackPages longAt: sp := stackPointer - objectMemory wordSize
>> put: object.
>>         stackPointer := sp.
>>
>> It sounds like very tricky circumstances.
>> My finding is that this is due to a global state mutation (Tmethod
>> isComplete)
>> This is used in
>>
>> inlineableSend: aNode in: aCodeGen
>>     "Answer true if the given send node is a call to a method that can be
>> inlined."
>>
>>     | m |
>>     aCodeGen maybeBreakForTestToInline: aNode in: self.
>>     aNode isSend ifFalse: [ ^false ].
>>     m := aCodeGen methodNamed: aNode selector.  "nil if builtin or
>> external function"
>>     ^m ~= nil and: [m ~~ self and: [m isComplete and: [aCodeGen
>> mayInline: m selector]]]
>>
>> Now, thanks to this global state, we inline more than the very
>> restrictive rules, but unfortunately sometimes wrongly...
>>
>>
>> 2016-04-27 21:47 GMT+02:00 Nicolas Cellier <
>> nicolas.cellier.aka.nice at gmail.com>:
>>
>>> Hi,
>>> I changed the compiler flag -Wno-unused-value to -Wunused-value because
>>> there's no reason to carry dead code. Dead code should ring a bell.
>>> it seems it uncovers a problem in code generation.
>>> An example is in code generated for sendInvokeCallbackContext:
>>>
>>>        if ((argumentCountOfMethodHeader(methodHeaderOf(GIV(newMethod))))
>>> == 4) {
>>>                 /* begin push: */
>>>                 value = ((usqInt)((vmCallbackContext->thunkp)));
>>>                 /* begin positive32BitIntegerFor: */
>>>                 /* begin maybeInlinePositive32BitIntegerFor: */
>>>                 assert(!((hasSixtyFourBitImmediates())));
>>>                 if ((((unsigned int) value)) <= (MaxSmallInteger)) {
>>>                         ((value << 1) | 1);
>>>                        ^~~~~~~~~~~~~~ PROBLEM HERE: this result should
>>> be stored in variable 'object'
>>>                         goto l4;
>>>                 }
>>>                 /* begin eeInstantiateSmallClassIndex:format:numSlots: */
>>> ...snip...
>>>                 object = newLargeInteger;
>>>         l4:     /* end maybeInlinePositive32BitIntegerFor: */;
>>>                 goto l5;
>>>
>>>         l5:     /* end positive32BitIntegerFor: */;
>>>
>>>                 longAtput((sp = GIV(stackPointer) - BytesPerWord),
>>> object);
>>>
>>> The corresponding Slang code is:
>>>
>>>     (self argumentCountOf: newMethod) = 4 ifTrue:
>>>         [self push: (self positiveMachineIntegerFor: vmCallbackContext
>>> thunkp asUnsignedInteger).
>>>
>>> With push:
>>>
>>>     | sp |
>>>     <inline: true>
>>>     <var: #sp type: #'char *'>
>>>     stackPages longAt: (sp := stackPointer - objectMemory wordSize) put:
>>> object.
>>>     stackPointer := sp
>>>
>>> With positiveMachineIntegerFor:
>>>
>>>     <var: #value type: #'unsigned long'>
>>>     <inline: true>
>>>     ^objectMemory wordSize = 8
>>>         ifTrue: [self positive64BitIntegerFor: value]
>>>         ifFalse: [self positive32BitIntegerFor: value]
>>>
>>> With positive32BitIntegerFor:
>>>
>>>     <inline: true>
>>>     <var: 'integerValue' type: #'unsigned int'>
>>>     objectMemory hasSixtyFourBitImmediates
>>>         ifTrue:
>>>             [^objectMemory integerObjectOf: (integerValue asUnsignedLong
>>> bitAnd: 16rFFFFFFFF)]
>>>         ifFalse:
>>>             [^self maybeInlinePositive32BitIntegerFor: integerValue]
>>>
>>> With maybeInlinePositive32BitIntegerFor:
>>>
>>>     <notOption: #Spur64BitMemoryManager>
>>>     <var: 'integerValue' type: #'unsigned int'>
>>>     | newLargeInteger |
>>>     self deny: objectMemory hasSixtyFourBitImmediates.
>>>        "force coercion because slang inliner sometimes incorrectly pass
>>> a signed int without converting to unsigned"
>>>        (self cCode: [self cCoerceSimple: integerValue to: #'unsigned
>>> int']
>>>             inSmalltalk: [integerValue bitAnd: 1 << 32 - 1]) <=
>>> objectMemory maxSmallInteger ifTrue:
>>>         [^objectMemory integerObjectOf: integerValue].
>>>     newLargeInteger := objectMemory
>>>                             eeInstantiateSmallClassIndex:
>>>     ...snip...
>>>     ^newLargeInteger
>>>
>>>
>>
>>
>
>
> --
> _,,,^..^,,,_
> best, Eliot
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.squeakfoundation.org/pipermail/vm-dev/attachments/20160428/6376b78b/attachment-0001.htm


More information about the Vm-dev mailing list