[squeak-dev] The defaullt implementation of isEmpty might do too much work

Nicolas Cellier nicolas.cellier.aka.nice at gmail.com
Tue Oct 25 21:35:41 UTC 2016


Hi, Chris, Monty, Tim,
did you see the implementation of Collection>>size?

size
    "Answer how many elements the receiver contains."

    | tally |
    tally := 0.
    self do: [:each | tally := tally + 1].
    ^ tally

Let's forget a minute about specific subclasses.
Does iterating over the whole collection is really the best idea for
implementing isEmpty?
When we implement a message in Collection, we should think in term of
Collection, not Interval, Set nor Array.

We're talking of several things here:
- universality of the default implementation:
  we don't need to know about the size to tell if empty
   not all collection have a known size
- optimization:
  old implementation is fast for collections having an optimized size,
catastrophic for others.
  new implementation is reasonably fast, and full optimization can be
restored with a few 1-liner.
- cleverness:
  IMO this is not more clever than, and quite logically connected to
implementation of size

I agree that having several implementations for the sake of optimization is
not ideal.
That's what would have curbed my enthusiasm.
Though I doubt about any noticeable slowdown in anything else than
micro-benchmarks.



2016-10-25 22:49 GMT+02:00 Eliot Miranda <eliot.miranda at gmail.com>:

> Hi Tobias, Hi All,
>
> On Tue, Oct 25, 2016 at 1:03 PM, Tobias Pape <Das.Linux at gmx.de> wrote:
>
>>
>> On 25.10.2016, at 21:45, Chris Muller <asqueaker at gmail.com> wrote:
>>
>> > Folks there is NO BASIS at the level of Collection for assuming that
>> > do: [ : each | ^
>> > false ] is faster than ^self size=0.  In fact, the proper assumption
>> > is the opposite -- that #size is the optimized method, not #do:.
>>
>> I don't understand on what _either_ performance-assumption is grounded.
>> Since I've apparently not been around since the early years of either
>> ST-80 or Squeak, can someone enlighten me here?
>>
>> Both ways seem somewhat reasonable:
>>  `self size = 0` is simple, easy to understand on first sight and said
>> here to be more expensive than
>
>  `self do: [:ea |  ^false]`, which is clever[1], possibly needs a
>> double-take (or comment), but has a certain appeal of wholeness[2].
>>
>
> self size = 0 doesn't work for infinite collections.  The suggested
> implementation of isEmpty does indeed require a comment (I didn't implement
> it) but it does work for infinite collections. The suggested implementation
> os isEmpty is also faster than elf size = 0 for any collection of
> sufficient size.
>
> For example:
>
> | b n |
> b := Bag someInstance.
> n := 100000.
> { b size. [1 to: n do: [:ign| b isEmptyOId]] timeToRun. [1 to: n do:
> [:ign| b isEmpty]] timeToRun } #(20402 3293 21)
>
> and the cut off is quite low:
>
>
> | n |
> n := 10000.
> ((1 to: 10) collect:
> [:iguana|
> (1 to: SmallInteger maxVal) detect:
> [:i|
> b := (1 to: i) asBag.
> [1 to: n do: [:ign| b isEmptyOId]] timeToRun > [1 to: n do: [:ign| b
> isEmpty]] timeToRun]]) average asFloat 4.9
>
> So for collections that are 5 elements or more the "clever" implementation
> is faster.
>
> Best regards
>>         -Tobias
>>
>> [1]: sadly, cleverness in software typically is a drawback.
>> [2]: I mean, the minimalism that is possible by just using/providing #do:
>> somehow conveys closure/completeness
>>
>
>
> _,,,^..^,,,_
> best, Eliot
>
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.squeakfoundation.org/pipermail/squeak-dev/attachments/20161025/a9eab4b4/attachment.htm


More information about the Squeak-dev mailing list