On 28 May 2012 17:43, Colin Putney colin@wiresong.com wrote:
On 2012-05-28, at 7:59 AM, Frank Shearar wrote:
That would be a nice property, but hash functions are not injective. If they were, then either the codomain would be too large (not SmallInteger in our case, which makes hashing impractical) or there were no need for the use of hashing at all, since there were no collisions.
Sure, collisions mean that you can have a ~= b and yet a hash = b hash. Nevertheless, CompiledMethod define: #hash as: [^ 1] satisfies the above test, so on its own it's insufficient. I wouldn't ask for testing CM x CM - {CompiledMethods whose hashes collide} !
So, essentially, you want a test that ensures that we have a high-quality hash function. That would be nice, but I'm not sure it fits into the structure of a unit test. Porting Andres' hash function tools to Squeak would probably be the best way to do that. Short of that, I'd suggest a simple smoke test - say, asserting that there are few collisions between the methods of TestCase or something like that.
I just want a test demonstrating that, even if it's just for carefully constructed CompiledMethods, sometime cmA hash ~= cmB hash when cmA ~= cmB. At the moment the test thoroughly demonstrates half of the hash's behaviour - when things are =, their hashes are =.
I agree that demonstrating the collision rate of the hash function is beyond the scope of a unit test.
frank
Colin