How I wasted one our of my time :-)
Daniel Vainsencher
danielv at netvision.net.il
Thu Jun 15 21:13:12 UTC 2000
I like this kind of exploration too:
Stefan Matthias Aust <sma at 3plus4.de> wrote:
> bag performed so poorly. By average, six symbols should have the same
> identical hash value. Let us see:
And that's not even it's worst effect - having the first 4096 places in
the value array taken will cause the scanFor: method to scan ahead
through a few thousand entries for the first free place, *for every new
or missing key*.
Look at IdentityDictionary>>scanFor:, try
TimeProfileBrowser onBlock: [
| b |
b := IdentityDictionary new.
Smalltalk allClasses do: [:e | e selectors do: [:s | b at: s put: (b
at: s ifAbsent: [0]) + 1]]]
and weep. 22% of time is spent getting one specific classes selectors
because "selectors" uses a Set.
> is already sufficient as now the values are much better spread over the
> dictionary and resolving the average of 6 clashes is fast enough (this
> theoretically allows for up to 8 clashes or some 32000 symbols).
Yup, that's basically what IDDicts do. PluggableDictionary
integerDictionary shows another way to do this, that works also for
large integers.
> It might be a good idea to do this by default for all symbols. I'm not
> sure whether there's any part of the Smalltalk source dependent on the fact
> that hash values are < 4096. Anybody how knows may speak up.
Might be a good idea to simply use one of these tricks all the hashing
data structures...
Anyone know a good objection?
Anyway, food for thought:
On my system,
#(Set IdentitySet Dictionary IdentityDictionary) collect: [:e | c _
(Smalltalk at: e). {c allInstances size. (c allInstances select: [:i | i
size > colSize]) size}]
gets the following results -
size > 100 => #(#(2640 40) #(2268 0) #(1401 10) #(2644 9))
size > 10 => #(#(2638 906) #(2268 0) #(1401 167) #(2644 136))
size > 5 => #(#(2638 1594) #(2268 0) #(1396 368) #(2644 286))
I wonder at what size hashing becomes faster than a linear search...
[Into the abyss]
Nice sleuthing, sma. Yummi updates... ;-)
> bye
> --
> Stefan Matthias Aust // Bevor wir fallen, fallen wir lieber auf
More information about the Squeak-dev
mailing list
|