[squeak-dev] Re: Optimizing RunArray

Igor Stasenko siguctua at gmail.com
Tue Aug 2 12:16:45 UTC 2011


Does it impacts the text rendering/processing speed?


On 2 August 2011 12:11, Nicolas Cellier
<nicolas.cellier.aka.nice at gmail.com> wrote:
> To complete myself, the fast #collect: already exists and is named
> #mapValues: except that it modifies the RunArray in place and also
> won't coalesce...
>
> I also gain a huge factor for #collect:as: be defining this method:
>
> RunArray>>fillFrom: aCollection with: aBlock
>        "Evaluate aBlock with each of aCollections's elements as the argument.
>        Collect the resulting values into self. Answer self."
>        | newRuns newValues lastLength lastValue |
>        newRuns := (Array new: aCollection size) writeStream.
>        newValues := (Array new: aCollection size) writeStream.
>        lastLength := 0.
>        lastValue := Object new.
>        aCollection do: [:each |
>                | value |
>                value := aBlock value: each.
>                lastValue = value
>                        ifTrue: [lastLength := lastLength + 1]
>                        ifFalse:
>                                [lastLength > 0
>                                        ifTrue:
>                                                [newRuns nextPut: lastLength.
>                                                newValues nextPut: lastValue].
>                                lastLength := 1.
>                                lastValue := value]].
>        lastLength > 0
>                ifTrue:
>                        [newRuns nextPut: lastLength.
>                        newValues nextPut: lastValue].
>        self setRuns: newRuns contents setValues: newValues contents
>
> [ (Array new: 1000) collect: [:e | 4 atRandom] as: RunArray] bench.
> BEFORE: '25.1 per second.'
> AFTER:  '1,080 per second.'
>
> It's worth a few lines of code.
>
> Nicolas
>
> 2011/8/2 Nicolas Cellier <nicolas.cellier.aka.nice at gmail.com>:
>> I played a bit with RunArray, and found some un-optimized features.
>> First, I don't know why RunArray is an ArrayedCollection. It cannot
>> #add: but it can #addFirst: and #addLast:.
>> It cannot #add:withOccurrences: but it can #addLast:times:. Why
>> inventing new selectors for old behaviours ?
>> These operations will cost a realloc it the last value is different,
>> so the underlying runs/values could better be an OrderedCollection if
>> these operations are used often.
>> A RunArray cannot remove at all.
>> Very weird collection species, I don't like the implementation too much.
>>
>> Then, #do: loops could be far faster. They rely on ArrayedCollection
>> which inlines do: loops with #to:do: and #at:
>> But #at: is not that fast. Scanning the runs and counting elements
>> would result in a n^2 cost.
>> Fortunately there is a cache lastIndex,lastRun,lastOffset to keep a cost n.
>> Nonetheless, all the tests cost, and the loop is suboptimal.
>> Let use see:
>>
>> version 1:
>> RunArray>>fastDo: aBlock
>>        runs with: values do: [:r :v |
>>                r timesRepeat: [aBlock value: v]].
>>
>> | tmp |
>> tmp := ((Array new: 1000) collect: [:e | 4 atRandom]) as: RunArray.
>> {
>> [ tmp do: [:e |]] bench.
>> [ tmp fastDo: [:e |]] bench.
>> }
>> #('3,220 per second.' '6,290 per second.')
>>
>> But timesRepeat: is slow, it is unoptimized by the compiler and costs
>> a message send.
>> I think we should implement BlockClosure>>repeat: and optimize that
>> call in Compiler.
>> But let's not do it, and rather inline by ourself:
>>
>> version 2:
>>        runs with: values do: [:r :v |
>>                1 to: r do: [:i | aBlock value: v]].
>>
>> | tmp |
>> tmp := ((Array new: 1000) collect: [:e | 4 atRandom]) as: RunArray.
>> {
>> [ tmp do: [:e |]] bench.
>> [ tmp do2: [:e |]] bench.
>> }
>> #('3,070 per second.' '25,500 per second.')
>>
>> We can even inline the with:do: loop itself:
>> version 3:
>>        1 to: runs size do: [:i |
>>                | r v |
>>                v := values at: i.
>>                r := runs at: i.
>>                [( r := r - 1) >= 0]
>>                        whileTrue: [aBlock value: v]].
>>
>> | tmp |
>> tmp := ((Array new: 1000) collect: [:e | 4 atRandom]) as: RunArray.
>> {
>> [ tmp do: [:e |]] bench.
>> [ tmp do2: [:e |]] bench.
>> }
>> #('3,370 per second.' '32,200 per second.')
>>
>> Now the operation I wanted to use was reverseDo: so I implemented:
>> RunArray>>fastReverseDo: aBlock
>>        | i |
>>        i := runs size.
>>        [i > 0]
>>                whileTrue:
>>                        [ | r v |
>>                        v := values at: i.
>>                        r := runs at: i.
>>                        i := i - 1.
>>                        [( r := r - 1) >= 0]
>>                                whileTrue: [aBlock value: v]].
>> | tmp |
>> tmp := ((Array new: 1000) collect: [:e | 4 atRandom]) as: RunArray.
>> {
>> [ tmp reverseDo: [:e |]] bench.
>> [ tmp reverseDo2: [:e |]] bench.
>> }
>>  #('83.9 per second.' '32,600 per second.')
>>
>> Ouch! The cache is missing a lot of indices, and our loop turns into a n^2 cost.
>> I know, premature optimization bla bla bla, but a factor x400 is worth
>> some inlining no?
>>
>> I guess these features are never used.
>> By now RunArray is kind of private utility for Text implementation.
>> But it could / should be generic.
>>
>> I also have proposals for count: / select: / collect:. etc...
>> It would be to evaluate the block only once per group of values.
>> For example
>> RunArray>>collect: aBlock
>>        "Beware, the block will be evaluated only once per group of values."
>>        ^(self class runs: (runs collect: aBlock) contents values: values
>> copy) coalesce
>> But that's controversial, it would make the RunArray behave
>> differently if the block has side effects...
>>
>> | i tmp tmp2 tmp3 |
>> tmp := ((Array new: 1000) collect: [:e | 4 atRandom]).
>> i := 0.
>> tmp2 := tmp collect: [:e | i := i + 1].
>> i := 0.
>> tmp3 := (tmp as: RunArray) collect: [:e | i := i + 1].
>> tmp2 = tmp3 asArray
>> ==> false
>>
>> Nicolas
>>
>
>



-- 
Best regards,
Igor Stasenko AKA sig.



More information about the Squeak-dev mailing list