[ENH] HashCollisionsCheck

Stephan Rudlof sr at evolgo.de
Mon Aug 26 00:55:02 UTC 2002


Lex Spoon wrote:
> This kind of tool is neat.

:-)

> I get only four reported sets that have any
> collisions, which is nice at first.
> 
> Two odd things came up.  First, this seems really low.  I believe the
> issue is that it only counts when hashes *exactly* match, and not when
> hashes are the same after a mod operation.

The reason for counting only exact matches was an earlier discussion about
hash functions: I wanted to know where the really bad hash functions are.

> Tweaking your code with a
> "\\ theRelevantCollection size" in the right place gives me 6300 sets
> with collisions, which seems more realistic.

I have updated my changeset accordingly. Look for:
	[ENH] HashCollisionsCheck
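
To make the difference between the two counts concrete, here is a minimal
sketch in Python (not the Smalltalk code from the changeset). The function
names and the table_size parameter are hypothetical; table_size stands in for
whatever "theRelevantCollection size" evaluates to in the tweak quoted above.
Python's % on a positive divisor behaves like Smalltalk's \\ here.

```python
from collections import Counter


def exact_collisions(hashes):
    """Count elements whose full hash equals that of an earlier element."""
    counts = Counter(hashes)
    # A group of n equal hashes contributes n - 1 collisions.
    return sum(n - 1 for n in counts.values())


def slot_collisions(hashes, table_size):
    """Count elements whose hash modulo table_size repeats an earlier one."""
    # table_size is an assumed stand-in for the collection's size.
    return exact_collisions(h % table_size for h in hashes)


# Four distinct hashes: no exact collisions at all ...
sample = [3, 11, 19, 42]
print(exact_collisions(sample))
# ... but 3, 11 and 19 all reduce to 3 modulo 8, so two of them collide.
print(slot_collisions(sample, 8))
```

A set can therefore look clean under the exact-match check and still have
many elements competing for the same slot, which would fit the jump from
four reported sets to 6300.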


> 
> Second, I found a couple of Identity dictionaries whose #size is 4 but
> which have only 3 keys.  Very strange!
> 
> For anyone interested, the bad sets in my system seem to be:
> 
> 	1. Celeste's message collections hash badly.  This is disturbing since
> they use a customized PluggableSet.  Maybe the customized function needs
> to be reexamined!
> 
> 	2. Lots of method dictionaries hash badly.
> 
> 
> Overall, it is quite nice to have tools like this to help improve hash
> functions.  Thanks for posting it,

> Stephen!

Stephan. ;-)

> If I recall correctly, you
> sent the last changeset of this nature around, too, several years ago.

Looking in my 'Posted' directory, I found a primitive method that sped up
String>>hash. It is now in the system in a similar form (made by Andreas Raab).


Greetings,

Stephan

> 
> 
> -Lex
> 
> 


-- 
Stephan Rudlof (sr at evolgo.de)
   "Genius doesn't work on an assembly line basis.
    You can't simply say, 'Today I will be brilliant.'"
    -- Kirk, "The Ultimate Computer", stardate 4731.3
