[Vm-dev] VM Maker: VMMaker-dtl.344.mcz

Eliot Miranda eliot.miranda at gmail.com
Wed Mar 12 17:39:51 UTC 2014


Hi David,


On Wed, Mar 12, 2014 at 10:09 AM, <commits at source.squeak.org> wrote:

>
> David T. Lewis uploaded a new version of VMMaker to project VM Maker:
> http://source.squeak.org/VMMaker/VMMaker-dtl.344.mcz
>
> ==================== Summary ====================
>
> Name: VMMaker-dtl.344
> Author: dtl
> Time: 12 March 2014, 12:57:02.35 pm
> UUID: d27f9db0-776a-4a76-aa71-2fc4aaa4885a
> Ancestors: VMMaker-dtl.343
>
> VMMaker 4.13.4
>
> Fix slang inlining regression introduced in VMMaker-dtl.339 caused by
> TMethod>>tryToInlineMethodsIn: that excluded nodes with selector
> #cCode:inSmalltalk: which must remain inlineable to support various methods
> for 32/64 bit image format from common code base, as well as for low level
> MemoryAccess in slang that must be fully inlined for performance.
>

I don't think that's the way to fix this.  Look at
Cog's MessageNode>>#asTranslatorNodeIn: and you'll see the phrase:

((sel == #cCode:inSmalltalk: "extracting here rather than in translation
allows inlining in the block."
  or: [sel == #cCode:])
 and: [arguments first isBlockNode]) ifTrue:
[| block |
 ^(block := arguments first asTranslatorNodeIn: aTMethod) statements size =
1
ifTrue: [block statements first]
ifFalse: [block]].

This inlines early and always (before CCodeGenerator inlines), which neatly
arranges that the rest of CCodeGenerator has less work to do.


> Change FloatArrayPlugin>>primitiveDivFloatArray and
> BitBltSimulation>>primitiveDisplayString to send intAtPointer and
> byteAtPointer to self rather than to interpreterProxy, because they are not
> defined in the interpreter proxy and cause compilation errors when slang
> inlining is turned off. These should be added to the interpreter proxy (or
> arguably to a hypothetical objects memory proxy) but send to self is a
> lesser evil for now.
>
> These changes restore a very small performance gain for the standard VM
> due to the inlining fix, and also restore inlining of low level
> MemoryAccess methods when added to the VM build. Inlined MemoryAccess
> methods are essentially equivalent to the C macros in sqMemoryAccess.h but
> can be run in simulation, and play nicely with C debuggers and profilers.
>
> =============== Diff against VMMaker-dtl.343 ===============
>
> Item was changed:
>   ----- Method: BitBltSimulation>>primitiveDisplayString (in category
> 'primitives') -----
>   primitiveDisplayString
>
>         | kernDelta xTable glyphMap stopIndex startIndex sourceString
> bbObj maxGlyph ascii glyphIndex sourcePtr left quickBlt |
>         <export: true>
>         <var: #sourcePtr type: 'char *'>
>         interpreterProxy methodArgumentCount = 6
>                 ifFalse:[^interpreterProxy primitiveFail].
>         kernDelta := interpreterProxy stackIntegerValue: 0.
>         xTable := interpreterProxy stackObjectValue: 1.
>         glyphMap := interpreterProxy stackObjectValue: 2.
>         ((interpreterProxy fetchClassOf: xTable) = interpreterProxy
> classArray and:[
>                 (interpreterProxy fetchClassOf: glyphMap) =
> interpreterProxy classArray])
>                         ifFalse:[^interpreterProxy primitiveFail].
>         (interpreterProxy slotSizeOf: glyphMap) = 256
> ifFalse:[^interpreterProxy primitiveFail].
>         interpreterProxy failed ifTrue:[^nil].
>         maxGlyph := (interpreterProxy slotSizeOf: xTable) - 2.
>
>         stopIndex := interpreterProxy stackIntegerValue: 3.
>         startIndex := interpreterProxy stackIntegerValue: 4.
>         sourceString := interpreterProxy stackObjectValue: 5.
>         (interpreterProxy isBytes: sourceString)
> ifFalse:[^interpreterProxy primitiveFail].
>         (startIndex > 0 and:[stopIndex > 0 and:[
>                 stopIndex <= (interpreterProxy byteSizeOf: sourceString)]])
>                         ifFalse:[^interpreterProxy primitiveFail].
>
>         bbObj := interpreterProxy stackObjectValue: 6.
>         (self loadBitBltFrom: bbObj) ifFalse:[^interpreterProxy
> primitiveFail].
>         (combinationRule = 30 or:[combinationRule = 31]) "needs extra
> source alpha"
>                 ifTrue:[^interpreterProxy primitiveFail].
>         "See if we can go directly into copyLoopPixMap (usually we can)"
>         quickBlt := destBits ~= 0 "no OS surfaces please"
>                                 and:[sourceBits ~= 0 "and again"
>                                 and:[noSource = false "needs a source"
>                                 and:[sourceForm ~= destForm "no blits onto
> self"
>                                 and:[(cmFlags ~= 0
>                                                 or:[sourceMSB ~= destMSB
>                                                 or:[sourceDepth ~=
> destDepth]]) "no point using slower version"
>                                 ]]]].
>         left := destX.
>         sourcePtr := interpreterProxy firstIndexableField: sourceString.
>         startIndex to: stopIndex do:[:charIndex|
> +               ascii := self byteAtPointer: sourcePtr + charIndex - 1.
> -               ascii := interpreterProxy byteAtPointer: sourcePtr +
> charIndex - 1.
>                 glyphIndex := interpreterProxy fetchInteger: ascii
> ofObject: glyphMap.
>                 (glyphIndex < 0 or:[glyphIndex > maxGlyph])
>                         ifTrue:[^interpreterProxy primitiveFail].
>                 sourceX := interpreterProxy fetchInteger: glyphIndex
> ofObject: xTable.
>                 width := (interpreterProxy fetchInteger: glyphIndex+1
> ofObject: xTable) - sourceX.
>                 interpreterProxy failed ifTrue:[^nil].
>                 self clipRange. "Must clip here"
>                 (bbW > 0 and:[bbH > 0]) ifTrue: [
>                         quickBlt ifTrue:[
>                                 self destMaskAndPointerInit.
>                                 self copyLoopPixMap.
>                                 "both, hDir and vDir are known to be > 0"
>                                 affectedL := dx.
>                                 affectedR := dx + bbW.
>                                 affectedT := dy.
>                                 affectedB := dy + bbH.
>                         ] ifFalse:[self copyBits]].
>                 interpreterProxy failed ifTrue:[^nil].
>                 destX := destX + width + kernDelta.
>          ].
>         affectedL := left.
>         self showDisplayBits.
>         "store destX back"
>         interpreterProxy storeInteger: BBDestXIndex ofObject: bbObj
> withValue: destX.
>         interpreterProxy pop: 6. "pop args, return rcvr"!
>
> Item was changed:
>   ----- Method: FloatArrayPlugin>>primitiveDivFloatArray (in category
> 'arithmetic primitives') -----
>   primitiveDivFloatArray
>         "Primitive. Add the receiver and the argument, both FloatArrays
> and store the result into the receiver."
>         | rcvr arg rcvrPtr argPtr length |
>         <export: true>
>         <var: #rcvrPtr type:'float *'>
>         <var: #argPtr type:'float *'>
>         arg := interpreterProxy stackObjectValue: 0.
>         rcvr := interpreterProxy stackObjectValue: 1.
>         interpreterProxy failed ifTrue:[^nil].
>         interpreterProxy success: (interpreterProxy isWords: arg).
>         interpreterProxy success: (interpreterProxy isWords: rcvr).
>         interpreterProxy failed ifTrue:[^nil].
>         length := interpreterProxy stSizeOf: arg.
>         interpreterProxy success: (length = (interpreterProxy stSizeOf:
> rcvr)).
>         interpreterProxy failed ifTrue:[^nil].
>         rcvrPtr := self cCoerce: (interpreterProxy firstIndexableField:
> rcvr) to: 'float *'.
>         argPtr := self cCoerce: (interpreterProxy firstIndexableField:
> arg) to: 'float *'.
>         "Check if any of the argument's values is zero"
>         0 to: length-1 do:[:i|
> +               ( self intAtPointer:(self cCoerce: (argPtr + i) to:
> 'char*')) = 0 ifTrue:[^interpreterProxy primitiveFail]].
> -               ( interpreterProxy intAtPointer:(self cCoerce: (argPtr +
> i) to: 'char*')) = 0 ifTrue:[^interpreterProxy primitiveFail]].
>         0 to: length-1 do:[:i|
>                 rcvrPtr at: i put: (self cCoerce: (rcvrPtr at: i) to:
> 'double') / (self cCoerce: (argPtr at: i) to: 'double').
>         ].
>         interpreterProxy pop: 1. "Leave rcvr on stack"!
>
> Item was changed:
> + ----- Method: TMethod>>copy (in category 'copying') -----
> - ----- Method: TMethod>>copy (in category 'utilities') -----
>   copy
>         "Make a deep copy of this TMethod."
>
>         ^ (self class basicNew)
>                 setSelector: selector
>                 returnType: returnType
>                 args: args copy
>                 locals: locals copy
>                 declarations: declarations copy
>                 primitive: primitive
>                 parseTree: parseTree copyTree
>                 labels: labels copy
>                 complete: complete;
>                 sharedLabel: sharedLabel;
>                 sharedCase: sharedCase;
>                 yourself
>   !
>
> Item was changed:
>   ----- Method: TMethod>>tryToInlineMethodsIn: (in category 'inlining')
> -----
>   tryToInlineMethodsIn: aCodeGen
>         "Expand any (complete) inline methods called by this method. Set
> the complete bit when all inlining has been done. Return true if something
> was inlined."
>
>         | stmtLists didSomething newStatements sendsToInline |
>         self definedAsMacro ifTrue:
>                 [complete := true.
>                  ^false].
>         didSomething := false.
>         sendsToInline := Dictionary new: 100.
>         parseTree
>                 nodesDo:
>                         [:node|
>                         (self inlineableFunctionCall: node in: aCodeGen)
> ifTrue:
>                                 [sendsToInline at: node put: (self
> inlineFunctionCall: node in: aCodeGen)]]
>                 unless: "Don't inline the arguments to asserts to keep the
> asserts readable"
>                         [:node|
> +                               node isSend
> +                                       and: [aCodeGen isAssertSelector:
> node selector]].
> -                       node isSend
> -                       and: [node selector == #cCode:inSmalltalk:
> -                               or: [aCodeGen isAssertSelector: node
> selector]]].
>
>         sendsToInline isEmpty ifFalse:
>                 [didSomething := true.
>                 self removeUnreferencedDeclarations.
>                 parseTree := parseTree replaceNodesIn: sendsToInline].
>
>         didSomething ifTrue:
>                 [writtenToGlobalVarsCache := nil.
>                 ^didSomething].
>
>         stmtLists := self statementsListsForInliningIn: aCodeGen.
>         stmtLists do:
>                 [:stmtList|
>                 newStatements := OrderedCollection new: 100.
>                 stmtList statements do:
>                         [:stmt|
>                         (self inlineCodeOrNilForStatement: stmt in:
> aCodeGen)
>                                 ifNil: [newStatements addLast: stmt]
>                                 ifNotNil: [:inlinedStmts|
>                                         didSomething := true.
>                                         newStatements addAllLast:
> inlinedStmts]].
>                 stmtList setStatements: newStatements asArray].
>
>         didSomething ifTrue:
>                 [writtenToGlobalVarsCache := nil.
>                 ^didSomething].
>
>         complete ifFalse:
>                 [self checkForCompleteness: stmtLists in: aCodeGen.
>                  complete ifTrue: [ didSomething := true ]].  "marking a
> method complete is progress"
>         ^didSomething!
>
> Item was changed:
>   ----- Method: VMMaker class>>versionString (in category 'version
> testing') -----
>   versionString
>
>         "VMMaker versionString"
>
> +       ^'4.13.4'!
> -       ^'4.13.3'!
>
>


-- 
best,
Eliot
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.squeakfoundation.org/pipermail/vm-dev/attachments/20140312/e474654a/attachment-0001.htm


More information about the Vm-dev mailing list