[squeak-dev] Re: here's a curio for compiler mavens...

Nicolas Cellier nicolas.cellier.aka.nice at gmail.com
Wed Feb 15 01:16:10 UTC 2012


1238 methods don't really need to factor pop if I also handle the case of return
Note that this does not always make a difference neither in method
length nor in execution time...
It does for example if you end with an assignment in one branch and a
return in the other, like:
    cond ifTrue: [x := 0] ifFalse: [x := -1. ^self]. self doFurtherProcessing

So the above count does not really tell which methods are worth changing...

The case of assignment in single branch is interesting because we
replace a store and a pop with a single storePop.
    cond ifTrue: [x := 0] ifFalse: [self doSome]. self doFurtherProcessing
Don't know if it gains anything when jitted though...

Nicolas

Le 15 février 2012 01:24, Nicolas Cellier
<nicolas.cellier.aka.nice at gmail.com> a écrit :
> It's not a clean image, but here are the results:
>
> Eliot code
> -> 503 methods
>
> Modified Eliot code
> doesEffectForIfRequiresAPop: encoder
>        "If both branch require a pop instruction, then the pop is factored
> after the jump.
>        If a branch is empty or ends with assignment, it don't ends with a
> pop, so we don't need to factor a pop"
>        ^arguments noneSatisfy: [:branch | (branch isJust: NodeNil) or:
> [branch statements last isAssignmentNode]].
> -> 880 methods
>
> Mine
> doesEffectForIfRequiresAPop: encoder
>        "If both branch require a pop instruction, then the pop is factored
> after the jump.
>        In this case, the translation of this node also require a final pop
> that can further be eliminated eventually."
>        ^arguments allSatisfy: [:branch | (branch isJust: NodeNil) not and:
> [branch doesEvaluatingEffectRequiresAPop: encoder]].
> -> 1202 methods (code attached)
>
> And I think I forgot the case of ReturnNode which don't require a pop...
>
> Nicolas
>
> Le 15 février 2012 00:57, Eliot Miranda <eliot.miranda at gmail.com> a écrit :
>>
>>
>> On Tue, Feb 14, 2012 at 2:55 PM, Igor Stasenko <siguctua at gmail.com> wrote:
>>>
>>> On 15 February 2012 00:50, Nicolas Cellier
>>> <nicolas.cellier.aka.nice at gmail.com> wrote:
>>> > Le 14 février 2012 23:10, Eliot Miranda <eliot.miranda at gmail.com> a
>>> > écrit :
>>> >> On Tue, Feb 14, 2012 at 5:30 AM, Nicolas Cellier
>>> >> <nicolas.cellier.aka.nice at gmail.com> wrote:
>>> >>>
>>> >>> Ah, I came up with a relatively explicit solution, but also quite
>>> >>> heavy in term of number of messages required to implement the
>>> >>> optimisation.
>>> >>> I'm curious to see your implementation :)
>>> >>
>>> >>
>>> >> similar, but should I commit it? I think mine is fine since it is only
>>> >> assignments that have storepop forms, so your generality doesn't win
>>> >> anything.  but i could be wrong.
>>> >>
>>> >
>>> > I was thinking of
>>> >
>>> > 1) nested ifTrue: ifFalse:
>>> > cond1 ifTrue: [x1 := y] ifFalse: [cond2 ifTrue: [x2 := y] ifFalse: [x3
>>> > := y]]
>>> >
>>> > elseExpr statements last is a message, not an assignment, though we
>>> > can factor the pop out...
>>> >
>>> > 2) inlined loop messages don't require a final pop...
>>> >
>>> > x <= y ifTrue: [x to: y do: [:i | (self at: i) doSome]] ifFalse: [y
>>> > to: x do: [:i | (self at: i) doSome]].
>>> >
>>> > An encoder pushing a value in both branches just to pop it after
>>> > wouldn't be that well behaved...
>>> >
>>> > I won't give any version publicly because I crashed several images in
>>> > the bootstrap process, and don't know if the bug is in the bootstrap
>>> > or in the code.
>>> >
>>> IMO, too much effort for such diminishing returns. :)
>>
>>
>> That was my thought too.  It would be interesting to quantify the benefits.
>>  I noticed some 451 methods that were improved by my simple assignment-only
>> mod.  How many additional methods get caught by yours Nicolas?
>>
>>>
>>>
>>> > Nicolas
>>> >
>>> >>  !MessageNode methodsFor: 'code generation' stamp: 'eem 2/13/2012
>>> >> 15:25'!
>>> >> ifThenElseForEffectShouldSharePop
>>> >>         | thenExpr elseExpr |
>>> >>         thenExpr := arguments at: 1.
>>> >>         elseExpr := arguments at: 2.
>>> >>         ^((thenExpr isJust: NodeNil)
>>> >>            or: [(elseExpr isJust: NodeNil)
>>> >>            or: [thenExpr statements last isAssignmentNode
>>> >>                   and: [elseExpr statements last isAssignmentNode]]])
>>> >> not
>>> >>
>>> >>         "^(thenExpr isJust: NodeNil) not
>>> >>            and: [(elseExpr isJust: NodeNil) not
>>> >>            and: [thenExpr statements last isAssignmentNode not
>>> >>                   or: [elseExpr statements last isAssignmentNode
>>> >> not]]]"! !
>>> >>
>>> >>> Also, I'm sure that multiple-jump-elimination proposed by Igor and
>>> >>> other kind of optimization would be a perfect subject for proving
>>> >>> concepts of NewCompiler IR infrastructure.
>>> >>>
>>> >>> MessageNode>>doesEffectRequiresAPop: encoder
>>> >>>        special > 0
>>> >>>                ifTrue:
>>> >>>                        [^self perform: (MacroTesters at: special)
>>> >>> with:
>>> >>> encoder].
>>> >>>        ^super doesEffectRequiresAPop: encoder
>>> >>>
>>> >>> MessageNode>>doesEffectForIfRequiresAPop: encoder
>>> >>>        "If both branch require a pop instruction, then the pop is
>>> >>> factored
>>> >>> after the jump.
>>> >>>        In this case, the translation of this node also require a final
>>> >>> pop
>>> >>> that can further be eliminated eventually."
>>> >>>        | thenExpr elseExpr |
>>> >>>        thenExpr := arguments at: 1.
>>> >>>        elseExpr := arguments at: 2.
>>> >>>        ^((thenExpr isJust: NodeNil) or: [elseExpr isJust: NodeNil])
>>> >>> not
>>> >>>                and: [(thenExpr doesEvaluatingEffectRequiresAPop:
>>> >>> encoder)
>>> >>> and:
>>> >>> [elseExpr doesEvaluatingEffectRequiresAPop: encoder]].
>>> >>>
>>> >>> BlockNode>>doesEvaluatingEffectRequiresAPop: anEncoder
>>> >>>        "Test whether a pop instruction is required to skip the value
>>> >>> on
>>> >>> the
>>> >>> stack when this block is inlined."
>>> >>>        ^statements size > 0 and: [statements last
>>> >>> doesEffectRequiresAPop:
>>> >>> anEncoder]
>>> >>>
>>> >>> AssigmentNode>>doesEffectRequiresAPop: anEncoder
>>> >>>        "Test whether a pop instruction is required to skip the value
>>> >>> on
>>> >>> the stack"
>>> >>>        ^(variable canStorePopInASingleInstruction: anEncoder) not
>>> >>>
>>> >>> InstanceVariableNode>>canStorePopInASingleInstruction: anEncoder
>>> >>>        ^true
>>> >>>
>>> >>> etc...
>>> >>>
>>> >>> Le 14 février 2012 01:27, Eliot Miranda <eliot.miranda at gmail.com> a
>>> >>> écrit
>>> >>> :
>>> >>> > OK, I fixed this (a pleasant Monday afternoon distraction) (at least
>>> >>> > I
>>> >>> > *think* I've fixed it; I still have to check the decompiler).
>>> >>> >  But... in
>>> >>> > doing so I realised I had to fix CompiledMethod>>#= since I needed
>>> >>> > to
>>> >>> > see
>>> >>> > exactly which methods the compiler change affected (just committed
>>> >>> > as
>>> >>> > Kernel-eem.669).  In doing this I noticed that recent compiler
>>> >>> > changes
>>> >>> > have
>>> >>> > not been applied by recompiling affected methods in the system.  If
>>> >>> > you
>>> >>> > look
>>> >>> > at Behavior>>#sourceMatchesBytecodeAt: it uses CompiledMethod>>#= to
>>> >>> > check
>>> >>> > that the compiler's output remains unchanged.  In the 4.3 trunk
>>> >>> > image
>>> >>> > with
>>> >>> > CompiledMethod>>#= fixed, sourceMatchesBytecodeAt: reports that
>>> >>> > some 19654
>>> >>> > methods need recompiling:
>>> >>> >
>>> >>> > (SystemNavigation new allSelect: [:m| (m methodClass
>>> >>> > sourceMatchesBytecodeAt: m selector) not]) size 19654
>>> >>> >
>>> >>> > Being selective in recompiling, via e.g.
>>> >>> >
>>> >>> > SystemNavigation new allSelect: [:m| (m methodClass
>>> >>> > sourceMatchesBytecodeAt:
>>> >>> > m selector) ifFalse: [m methodClass recompile: m selector]. false]
>>> >>> >
>>> >>> > one gets to the desired
>>> >>> >
>>> >>> > (SystemNavigation new allSelect: [:m| (m methodClass
>>> >>> > sourceMatchesBytecodeAt: m selector) not]) size 0
>>> >>> >
>>> >>> > So the question is should we add an update that does the selective
>>> >>> > recompile?  I think we should, e.g. by committing a version of
>>> >>> > Compiler
>>> >>> > that
>>> >>> > contains the selective recompile as a post-script.  Agreed?
>>> >>> >
>>> >>> > On Mon, Feb 13, 2012 at 11:58 AM, Eliot Miranda
>>> >>> > <eliot.miranda at gmail.com>
>>> >>> > wrote:
>>> >>> >>
>>> >>> >> Hi All,
>>> >>> >>
>>> >>> >>     (whitewash alert)  I just had occasion to look at the bytecode
>>> >>> >> generated by the standard compiler for HashedCollection
>>> >>> >> class>>#goodPrimeAtLeast:.  Here's the source, with the issue in
>>> >>> >> bold:
>>> >>> >>
>>> >>> >> goodPrimeAtLeast: lowerLimit
>>> >>> >> "Answer the next good prime >= lowerlimit.
>>> >>> >> If lowerLimit is larger than the largest known good prime,
>>> >>> >> just make it odd."
>>> >>> >> | primes low mid high prime |
>>> >>> >> primes := self goodPrimes.
>>> >>> >> low := 1.
>>> >>> >> high := primes size.
>>> >>> >> lowerLimit > (primes at: high) ifTrue: [
>>> >>> >> ^lowerLimit bitOr: 1 ].
>>> >>> >> [ high - low <= 1 ] whileFalse: [
>>> >>> >> mid := high + low // 2.
>>> >>> >> prime := primes at: mid.
>>> >>> >> prime = lowerLimit ifTrue: [ ^prime ].
>>> >>> >> prime < lowerLimit
>>> >>> >> ifTrue: [ low := mid ]
>>> >>> >> ifFalse: [ high := mid ] ].
>>> >>> >> (primes at: low) >= lowerLimit ifTrue: [ ^primes at: low ].
>>> >>> >> ^primes at: high
>>> >>> >>
>>> >>> >> The code for this sequence is
>>> >>> >> 58 <15> pushTemp: 5
>>> >>> >> 59 <10> pushTemp: 0
>>> >>> >> 60 <B2> send: <
>>> >>> >> 61 <9B> jumpFalse: 66
>>> >>> >> 62 <13> pushTemp: 3
>>> >>> >> 63 <81 42> storeIntoTemp: 2
>>> >>> >> 65 <92> jumpTo: 69
>>> >>> >> 66 <13> pushTemp: 3
>>> >>> >> 67 <81 44> storeIntoTemp: 4
>>> >>> >> 69 <87> pop
>>> >>> >>
>>> >>> >> where-as the following would be better:
>>> >>> >>
>>> >>> >> 58 <15> pushTemp: 5
>>> >>> >> 59 <10> pushTemp: 0
>>> >>> >> 60 <B2> send: <
>>> >>> >> 61 <9B> jumpFalse: 66
>>> >>> >> 62 <13> pushTemp: 3
>>> >>> >> 63 <82 42> popIntoTemp: 2
>>> >>> >> 65 <92> jumpTo: 69
>>> >>> >> 66 <13> pushTemp: 3
>>> >>> >> 67 <82 44> popIntoTemp: 4
>>> >>> >>
>>> >>> >> The reason is that the code generator favours using a single pop
>>> >>> >> for
>>> >>> >> both
>>> >>> >> arms of the if:
>>> >>> >>
>>> >>> >> MessageNode>>sizeCodeForIf: encoder value: forValue
>>> >>> >> | thenExpr elseExpr branchSize thenSize elseSize |
>>> >>> >> thenExpr := arguments at: 1.
>>> >>> >> elseExpr := arguments at: 2.
>>> >>> >> (forValue
>>> >>> >> or: [(thenExpr isJust: NodeNil)
>>> >>> >> or: [elseExpr isJust: NodeNil]]) not
>>> >>> >> "(...not ifTrue: avoids using ifFalse: alone during this compile)"
>>> >>> >> ifTrue:  "Two-armed IFs forEffect share a single pop"
>>> >>> >> [^super sizeCodeForEffect: encoder].
>>> >>> >> ...
>>> >>> >>
>>> >>> >> MessageNode>>emitCodeForIf: stack encoder: encoder value: forValue
>>> >>> >> | thenExpr thenSize elseExpr elseSize |
>>> >>> >> thenSize := sizes at: 1.
>>> >>> >> elseSize := sizes at: 2.
>>> >>> >> (forValue not and: [elseSize * thenSize > 0]) ifTrue:
>>> >>> >> "Two-armed IFs forEffect share a single pop"
>>> >>> >> [^super emitCodeForEffect: stack encoder: encoder].
>>> >>> >>
>>> >>> >> It would be nice if this only happened if doing so actually reduced
>>> >>> >> the
>>> >>> >> size of the generated code.
>>> >>> >> --
>>> >>> >> best,
>>> >>> >> Eliot
>>> >>> >>
>>> >>> >
>>> >>> >
>>> >>> >
>>> >>> > --
>>> >>> > best,
>>> >>> > Eliot
>>> >>> >
>>> >>> >
>>> >>> >
>>> >>> >
>>> >>>
>>> >>
>>> >>
>>> >>
>>> >> --
>>> >> best,
>>> >> Eliot
>>> >>
>>> >>
>>> >>
>>> >>
>>> >
>>>
>>>
>>>
>>> --
>>> Best regards,
>>> Igor Stasenko.
>>>
>>
>>
>>
>> --
>> best,
>> Eliot
>>
>>
>>
>>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: sharePop3.st
Type: application/octet-stream
Size: 10941 bytes
Desc: not available
Url : http://lists.squeakfoundation.org/pipermail/squeak-dev/attachments/20120215/9e8e898e/sharePop3.obj


More information about the Squeak-dev mailing list