[squeak-dev] Re: Ideas about sets and dictionaries

Fri Nov 13 12:42:01 UTC 2009

On Fri, 13 Nov 2009, Bert Freudenberg wrote:

>
> On 13.11.2009, at 08:39, Andreas Raab wrote:
>
>> Igor Stasenko wrote:
>>> 2428 run, 2406 passes, 0 expected failures, 15 failures, 7 errors, 0
>>> unexpected passes
>>> After applying changes to sets using nil wrappers [1]:
>>> 2428 run, 2406 passes, 0 expected failures, 15 failures, 7 errors, 0
>>> unexpected passes
>>> After adding changes to sets using negative tally[2]:
>>> 2428 run, 2406 passes, 0 expected failures, 15 failures, 7 errors, 0
>>> unexpected passes
>>
>> Those are great results!
>>
>>> [1] http://bugs.squeak.org/file_download.php?file_id=3829&type=bug
>>
>> Yeah... seeing the code I like the wrapper solution even better. It's just so elegant. Virtually no overhead, nicely dealing with all sorts of nestings, having the option for future extensions (weak elements, collection elements etc). I think I've just promoted that to my top choice ;-)
>>
>> Seriously folks, look at that code. It's a great solution.
>>
>> Cheers,
>>  - Andreas
>>
>
> I agree it's elegant. Unless someone actually puts a nil in the Set, its inner structure is exactly the same as before. From the description it sounded like any element would be wrapped, but of course only nil gets wrapped (and instances of SetElement itself).
>
> There is a performance decrease due to the additional message sends. The following micro benchmark demonstrates it:
>
> | r a b s |
> r := Random seed: 1234567.
> a := (1 to: 100000) collect: [:i | r nextInt: 1000000].
> b := (1 to: 100000) collect: [:i | r nextInt: 1000000].
> s := Set new.
> Smalltalk garbageCollectMost.
> [	a do: [:each | s add: each].
> 	b do: [:each | s includes: each].
> 	b do: [:each | s like: each].
> 	a do: [:each | s remove: each ifAbsent: []].
> ] timeToRun
>
> I'm purposely not including my results, do measure yourself if you care (but don't spoil the fun for others).
>
> However, unless someone can demonstrate this affects a macro benchmark, I'd choose this for its generality and minimal compatibility risk.

I was about to publish my measurements when I read your mail.
I could write a macro benchmark that could abuse the "weakness" of this 
implementation and would show real slowdown, but it's easier to decrease 
the performance penalty of the implementation to almost 0.
Btw how often do you use Set >> #like:?

Levente

>
> +1 from me :)
>
> - Bert -
>
>
>