[squeak-dev] Collection's #fold: vs #reduce:

Juan Vuletich juan at jvuletich.org
Tue Nov 2 21:35:13 UTC 2010


Eliot Miranda wrote:
>
>
> On Tue, Nov 2, 2010 at 1:50 PM, Juan Vuletich <juan at jvuletich.org 
> <mailto:juan at jvuletich.org>> wrote:
>
>     Eliot Miranda wrote:
>
>
>
>         On Tue, Nov 2, 2010 at 1:23 PM, Juan Vuletich
>         <juan at jvuletich.org <mailto:juan at jvuletich.org>
>         <mailto:juan at jvuletich.org <mailto:juan at jvuletich.org>>> wrote:
>
>            Nicolas Cellier wrote:
>
>
>                I'd say which selector are implemented in Pharo and Cuis ?
>                We should better converge.
>
>                Nicolas
>                
>
>            Hi folks,
>
>            Cuis doesn't include any of them yet. I can add whatever people
>            prefer. What I'd do different is the implementation:
>
>            fold: aBinaryBlock
>
>              "Evaluate the block with the first two elements of the
>         receiver,
>               then with the result of the first evaluation and the
>         next element,
>               and so on.  Answer the result of the final evaluation.
>         If the
>            receiver
>               is empty, raise an error. If the receiver has a single
>         element,
>            answer
>               that element."
>              "
>              #('if' 'it' 'is' 'to' 'be' 'it' 'is' 'up' 'to' 'me')
>         fold: [:a
>            :b | a, ' ', b]
>              "
>              | noPreviousValue |
>              noPreviousValue := Object new.    "something that can't be in
>            the receiver"
>              ^self inject: noPreviousValue into: [ :previousValue :each |
>                  previousValue == noPreviousValue
>                      ifTrue: [ each ]
>                      ifFalse: [ aBinaryBlock value: previousValue
>         value: each ]]
>
>            This is easier to understand, and it also makes clear the
>         relation
>            between #fold: (or #reduce:) and #inject:into: .
>
>
>         I disagree.   inject:into: is not particularly easy to
>         understand, whereas both the existing fold: and reduce: are
>         understandable in terms of do:.
>
>
>     I say this is easier to understand, given that #inject:into: is
>     already understood. #inject:into: and #fold: / #reduce: are very
>     close in what they do. So close, that it is best to show how they
>     differ. That's what my implementation does. Showing the use of
>     #inject:into:  (with proper names for block args) is a didactic
>     bonus.
>
>
>         Also the use of inject:into: isn't really buying you anything
>         since the logic in the block within the inject:into: usage is
>         as complex as that within fold: or reduce:. Further, this /is/
>         a lot slower because of the extra level of evaluation (there's
>         an extra block activation around each element).
>
>          
>         best,
>         Eliot
>
>
>     No, it is not slower:
>       [100000 timesRepeat: [#('if' 'it' 'is' 'to' 'be' 'it' 'is' 'up'
>     'to' 'me') fold: [:a :b | a, ' ', b]]] timeToRun 2710 | 879
>       [100000 timesRepeat: [#('if' 'it' 'is' 'to' 'be' 'it' 'is' 'up'
>     'to' 'me') foldx: [:a :b | a, ' ', b]]] timeToRun 2723 | 874
>       [100000 timesRepeat: [#('if' 'it' 'is' 'to' 'be' 'it' 'is' 'up'
>     'to' 'me') reduce: [:a :b | a, ' ', b]]] timeToRun 2780 | 881
>
>
> Be very careful to run these under the same conditions.  You'd need to 
> run each of these immediately following a GC to get meaningful 
> results.  Also use 1 to: n do for the outer loop since timesRepeat: 
> uses do: and so the measurement harness is probably a significant 
> fraction of the run-time.  Also measure the single-element cost.  BTW, 
> I would use (1 to: 10) fold: [:a :b| a + b] to reduce the overhead of 
> string concatenation & allocation and focus more on the costs of the 
> various fold: implementations.

You're right wrt performance (only tried Cog on this example):
    Smalltalk garbageCollect. [1 to: 200000 do: [ :i | (1 to: 100) fold: 
[:a :b| a + b] ]] timeToRun 1367
    Smalltalk garbageCollect. [1 to: 200000 do: [ :i | (1 to: 100) 
foldx: [:a :b| a + b] ]] timeToRun 1656
    Smalltalk garbageCollect. [1 to: 200000 do: [ :i | (1 to: 100) 
reduce: [:a :b| a + b] ]] timeToRun 1306

>
>
>
>     #foldx: is my implementation. In all cases, the first number is
>     regular interpreter Squeak 4.2.4beta1U.app, the second number is
>     Cog Squeak 5.8b10-1.app on a 1.6GHz Mac mini.
>
>     The argument about complexity is more aesthetic than anything.
>
>
> Indeed; and so a matter of taste.  My aesthetics end up preferring 
> fold: over the inject:into: based one :/
>
>  
>
>     My version doesn't need assignment to outer temps, and I see that
>     'cleaner' and closer to FP.
>
>
>
> But the assignment to the outer temp is hidden in inject:into:; it's 
> still there ;)

:)

Cheers,
Juan Vuletich



More information about the Squeak-dev mailing list