[squeak-dev] Collection's #fold: vs #reduce:
Eliot Miranda
eliot.miranda at gmail.com
Tue Nov 2 21:42:23 UTC 2010
Hi Juan,
On Tue, Nov 2, 2010 at 2:35 PM, Juan Vuletich <juan at jvuletich.org> wrote:
> Eliot Miranda wrote:
>
>>
>>
>> On Tue, Nov 2, 2010 at 1:50 PM, Juan Vuletich <juan at jvuletich.org<mailto:
>> juan at jvuletich.org>> wrote:
>>
>> Eliot Miranda wrote:
>>
>>
>>
>> On Tue, Nov 2, 2010 at 1:23 PM, Juan Vuletich
>> <juan at jvuletich.org <mailto:juan at jvuletich.org>
>> <mailto:juan at jvuletich.org <mailto:juan at jvuletich.org>>> wrote:
>>
>> Nicolas Cellier wrote:
>>
>>
>> I'd say which selector are implemented in Pharo and Cuis ?
>> We should better converge.
>>
>> Nicolas
>>
>> Hi folks,
>>
>> Cuis doesn't include any of them yet. I can add whatever people
>> prefer. What I'd do different is the implementation:
>>
>> fold: aBinaryBlock
>>
>> "Evaluate the block with the first two elements of the
>> receiver,
>> then with the result of the first evaluation and the
>> next element,
>> and so on. Answer the result of the final evaluation.
>> If the
>> receiver
>> is empty, raise an error. If the receiver has a single
>> element,
>> answer
>> that element."
>> "
>> #('if' 'it' 'is' 'to' 'be' 'it' 'is' 'up' 'to' 'me')
>> fold: [:a
>> :b | a, ' ', b]
>> "
>> | noPreviousValue |
>> noPreviousValue := Object new. "something that can't be in
>> the receiver"
>> ^self inject: noPreviousValue into: [ :previousValue :each |
>> previousValue == noPreviousValue
>> ifTrue: [ each ]
>> ifFalse: [ aBinaryBlock value: previousValue
>> value: each ]]
>>
>> This is easier to understand, and it also makes clear the
>> relation
>> between #fold: (or #reduce:) and #inject:into: .
>>
>>
>> I disagree. inject:into: is not particularly easy to
>> understand, whereas both the existing fold: and reduce: are
>> understandable in terms of do:.
>>
>>
>> I say this is easier to understand, given that #inject:into: is
>> already understood. #inject:into: and #fold: / #reduce: are very
>> close in what they do. So close, that it is best to show how they
>> differ. That's what my implementation does. Showing the use of
>> #inject:into: (with proper names for block args) is a didactic
>> bonus.
>>
>>
>> Also the use of inject:into: isn't really buying you anything
>> since the logic in the block within the inject:into: usage is
>> as complex as that within fold: or reduce:. Further, this /is/
>> a lot slower because of the extra level of evaluation (there's
>> an extra block activation around each element).
>>
>> best,
>> Eliot
>>
>>
>> No, it is not slower:
>> [100000 timesRepeat: [#('if' 'it' 'is' 'to' 'be' 'it' 'is' 'up'
>> 'to' 'me') fold: [:a :b | a, ' ', b]]] timeToRun 2710 | 879
>> [100000 timesRepeat: [#('if' 'it' 'is' 'to' 'be' 'it' 'is' 'up'
>> 'to' 'me') foldx: [:a :b | a, ' ', b]]] timeToRun 2723 | 874
>> [100000 timesRepeat: [#('if' 'it' 'is' 'to' 'be' 'it' 'is' 'up'
>> 'to' 'me') reduce: [:a :b | a, ' ', b]]] timeToRun 2780 | 881
>>
>>
>> Be very careful to run these under the same conditions. You'd need to run
>> each of these immediately following a GC to get meaningful results. Also
>> use 1 to: n do for the outer loop since timesRepeat: uses do: and so the
>> measurement harness is probably a significant fraction of the run-time.
>> Also measure the single-element cost. BTW, I would use (1 to: 10) fold:
>> [:a :b| a + b] to reduce the overhead of string concatenation & allocation
>> and focus more on the costs of the various fold: implementations.
>>
>
> You're right wrt performance (only tried Cog on this example):
> Smalltalk garbageCollect. [1 to: 200000 do: [ :i | (1 to: 100) fold: [:a
> :b| a + b] ]] timeToRun 1367
> Smalltalk garbageCollect. [1 to: 200000 do: [ :i | (1 to: 100) foldx: [:a
> :b| a + b] ]] timeToRun 1656
> Smalltalk garbageCollect. [1 to: 200000 do: [ :i | (1 to: 100) reduce:
> [:a :b| a + b] ]] timeToRun 1306
Yes, alas one has to be hyper-careful. For example if I run the following
in my current image:
Smalltalk garbageCollect.
{ [1 to: 1000000 do: [ :i | (1 to: 100) fold: [:a :b| a + b] ]] timeToRun.
[1 to: 1000000 do: [ :i | (1 to: 100) foldAlt: [:a :b| a + b] ]]
timeToRun.
[1 to: 1000000 do: [ :i | (1 to: 100) reduce: [:a :b| a + b] ]] timeToRun
}
chances are that the second one will be fastest, irrespective of which
implementation is used. So change the order to foldAlt:, fold: reduce: and
fold: will win, etc.
Levente is very careful to measure the implementations precisely, so we can
trust his numbers.
cheers,
Eliot
>
>
>
>>
>>
>> #foldx: is my implementation. In all cases, the first number is
>> regular interpreter Squeak 4.2.4beta1U.app, the second number is
>> Cog Squeak 5.8b10-1.app on a 1.6GHz Mac mini.
>>
>> The argument about complexity is more aesthetic than anything.
>>
>>
>> Indeed; and so a matter of taste. My aesthetics end up preferring fold:
>> over the inject:into: based one :/
>>
>>
>> My version doesn't need assignment to outer temps, and I see that
>> 'cleaner' and closer to FP.
>>
>>
>>
>> But the assignment to the outer temp is hidden in inject:into:; it's still
>> there ;)
>>
>
> :)
>
> Cheers,
> Juan Vuletich
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.squeakfoundation.org/pipermail/squeak-dev/attachments/20101102/35250c83/attachment.htm
More information about the Squeak-dev
mailing list
|