How I wasted one our of my time :-)

Daniel Vainsencher danielv at netvision.net.il
Thu Jun 15 21:13:12 UTC 2000


I like this kind of exploration too:
Stefan Matthias Aust <sma at 3plus4.de> wrote:
> bag performed so poorly.  By average, six symbols should have the same 
> identical hash value.  Let us see:
And that's not even it's worst effect - having the first 4096 places in
the value array taken will cause the scanFor: method to scan ahead
through a few thousand entries for the first free place, *for every new
or missing key*.

Look at IdentityDictionary>>scanFor:, try 
TimeProfileBrowser onBlock: [
 | b |
   b := IdentityDictionary new.
   Smalltalk allClasses do: [:e | e selectors do: [:s | b at: s put: (b
at: s  ifAbsent: [0]) + 1]]]
and weep. 22% of time is spent getting one specific classes selectors
because "selectors" uses a Set.

> is already sufficient as now the values are much better spread over the 
> dictionary and resolving the average of 6 clashes is fast enough (this 
> theoretically allows for up to 8 clashes or some 32000 symbols).
Yup, that's basically what IDDicts do. PluggableDictionary
integerDictionary shows another way to do this, that works also for
large integers.

> It might be a good idea to do this by default for all symbols.  I'm not 
> sure whether there's any part of the Smalltalk source dependent on the fact 
> that hash values are < 4096.  Anybody how knows may speak up.
Might be a good idea to simply use one of these tricks all the hashing
data structures...
Anyone know a good objection?

Anyway, food for thought:
On my system,
#(Set IdentitySet Dictionary IdentityDictionary) collect: [:e | c _
(Smalltalk at: e). {c allInstances size. (c allInstances select: [:i | i
size > colSize]) size}]
gets the following results -
size > 100 =>	#(#(2640 40) 	#(2268 0) 	#(1401 10) 	#(2644 9))
size > 10 =>	#(#(2638 906) 	#(2268 0) 	#(1401 167) 	#(2644 136))
size > 5 =>	#(#(2638 1594) 	#(2268 0) 	#(1396 368) 	#(2644 286))
I wonder at what size hashing becomes faster than a linear search...

[Into the abyss]
Nice sleuthing, sma. Yummi updates... ;-) 


> bye
> --
> Stefan Matthias Aust  //  Bevor wir fallen, fallen wir lieber auf





More information about the Squeak-dev mailing list