[squeak-dev] [Pharo-dev] Why do we have SmallDictionary?
Max Leske
maxleske at gmail.com
Fri Jun 8 07:30:45 UTC 2018
(Sorry, hit send accidentally).
On 8 Jun 2018, at 9:02, Marcus Denker wrote:
>> On 8 Jun 2018, at 08:46, Max Leske <maxleske at gmail.com> wrote:
>>
>> Hi,
>>
>> I was messing around with SmallDictionary when I suddenly realised
>> that I can't find a single reason to use it over a normal Dictionary.
>> While its name and class comment imply that it is somehow an
>> optimised Dictionary, I don't see any measurement where that actually
>> holds up. The following was run in a Pharo 7 image on a recent VM
>> (see below):
>>
>> | d |
>> d := SmallDictionary new.
>> d sizeInMemory. "24"
>> [100000 timesRepeat: [
>> 1 to: 100 do: [ :i | d at:i put: i] ] ] timeToRun. "0:00:00:05.226"
>>
>> [100000 timesRepeat: [
>> d at: 48 ifAbsent: [] ] ] timeToRun. "0:00:00:00.041"
>>
>>
>>
>> | d |
>> d := Dictionary new.
>> d sizeInMemory. "16"
>> [100000 timesRepeat: [
>> 1 to: 100 do: [ :i | d at:i put: i] ] ] timeToRun. "0:00:00:00.385"
>> [100000 timesRepeat: [
>> d at: 48 ifAbsent: [] ] ] timeToRun. "0:00:00:00.006"
>>
>>
> can you try with <10 elements?
The results are in!
Note that I had to increase the repetitions by a factor or 100.
| d |
d := SmallDictionary new.
[10000000 timesRepeat: [
1 to: 5 do: [ :i | d at:i put: i] ] ] timeToRun. "0:00:00:02.532"
[10000000 timesRepeat: [
d at: 4 ifAbsent: [] ] ] timeToRun. "0:00:00:00.686"
| d |
d := Dictionary new.
[10000000 timesRepeat: [
1 to: 5 do: [ :i | d at:i put: i] ] ] timeToRun. "0:00:00:02.102"
[10000000 timesRepeat: [
d at: 4 ifAbsent: [] ] ] timeToRun. "0:00:00:00.567"
So SmallDictionary is indeed very fast for only few elements. Once the
SmallDictionary has to grow its performance decreases rapidly, or at
least the grow operation takes a long time.
Dictionary still outperforms SmallDictionary though, so I guess the only
reason left is space efficency.
>>
>> Is anyone aware of a reason for hanging on to SmallDictionary? I'm
>> also curious to know how SmallDictionary came to be. There must have
>> been some advantage over Dictionary at some point in the past.
>>
> It came from RB… the idea was that there are (in the Refactoring
> engine) a lot of very small dictionaries with <10 elements.
> The idea is that for such dictionaries, the overhead of hashing was
> higher than just linear search.
>
> I am sure I did a benchmark back then, but of course this could have
> changed in the meantime.
>
> It would be nice to redo the benchmark.
>
> For size: does #sizeInMemory take the internal data into account? e.g
> for Dictionary there is an Array that contains Associations, you need
> to count those, too.
No, I didn't check for that initially. When I did, it became apparent
that the larger the dictionary the more space you save with
SmallDictionary.
>
> In the end this is a hack anyway: if needed Dictionary could do things
> like this transparently… without the user having to decide. And the
> question is if this is not just a bad side effect of the particular
> implementation.
Good point. Sounds like what Stefan Marr wrote in the paper where he
performed a field study of collection libraries ;)
Max
>
> Marcus
More information about the Squeak-dev
mailing list
|