<br><br><div class="gmail_quote">On Mon, May 28, 2012 at 11:13 PM, Frank Shearar <span dir="ltr">&lt;<a href="mailto:frank.shearar@gmail.com" target="_blank">frank.shearar@gmail.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">


<div><div>On 28 May 2012 17:43, Colin Putney &lt;<a href="mailto:colin@wiresong.com" target="_blank">colin@wiresong.com</a>&gt; wrote:<br>

&gt;<br>

&gt; On 2012-05-28, at 7:59 AM, Frank Shearar wrote:<br>

&gt;<br>

&gt;&gt;&gt; That would be a nice property, but hash functions are not injective. If they<br>

&gt;&gt;&gt; were, then either the codomain would be too large (not SmallInteger in our<br>

&gt;&gt;&gt; case, which makes hashing impractical) or there would be no need for<br>

&gt;&gt;&gt; hashing at all, since there would be no collisions.<br>

&gt;&gt;<br>

&gt;&gt; Sure, collisions mean that you can have a ~= b and yet a hash = b<br>

&gt;&gt; hash. Nevertheless, CompiledMethod define: #hash as: [^ 1] satisfies<br>

&gt;&gt; the above test, so on its own it&#39;s insufficient. I wouldn&#39;t ask for<br>

&gt;&gt; testing CM x CM  - {CompiledMethods whose hashes collide} !<br>

&gt;<br>

&gt; So, essentially, you want a test that ensures that we have a high-quality hash function. That would be nice, but I&#39;m not sure it fits into the structure of a unit test. Porting Andres&#39; hash function tools to Squeak would probably be the best way to do that. Short of that, I&#39;d suggest a simple smoke test - say, asserting that there are few collisions between the methods of TestCase or something like that.<br>


<br>

</div></div>I just want a test demonstrating that, even if it&#39;s just for carefully<br>

constructed CompiledMethods, sometimes cmA hash ~= cmB hash when cmA ~=<br>

cmB. At the moment the test thoroughly demonstrates half of the hash&#39;s<br>

behaviour - when things are =, their hashes are =.<br>

<br>

I agree that demonstrating the collision rate of the hash function is<br>

beyond the scope of a unit test.<br></blockquote><div><br></div><div>How about adding a count that demands, e.g., that the number of distinct hashes is at least half the number of distinct CompiledMethods? For example:</div><div>


<br></div><div><div>testHash</div><div><span style="white-space:pre-wrap">        </span>| ai |</div><div><span style="white-space:pre-wrap">        </span>ai := CompiledMethod allInstances.</div><div>

<span style="white-space:pre-wrap">        </span>ai do:</div><div><span style="white-space:pre-wrap">                </span>[:a|</div><div><span style="white-space:pre-wrap">                </span>ai do:</div>

<div><span style="white-space:pre-wrap">                        </span>[:b|</div><div><span style="white-space:pre-wrap">                        </span>a = b ifTrue: [self assert: a hash = b hash]]].</div></div><div><span style="white-space:pre-wrap">        </span>self assert: (ai collect: [:cm| cm hash]) asSet size * 2 &gt;= ai asSet size</div>
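<div><br></div><div>A cheaper variant, along the lines of the smoke test Colin suggests, might apply the same count to the methods of a single class instead of to every CompiledMethod in the image. The following is only a sketch: the selector testHashSmoke and the choice of TestCase are assumptions made for illustration, and it has not been run against a real image.</div>
<pre>testHashSmoke
	"Hypothetical smoke test, not from the thread: check only TestCase's own methods so the test stays cheap."
	| methods hashes |
	methods := TestCase selectors asArray collect: [:sel | TestCase &gt;&gt; sel].
	hashes := (methods collect: [:cm | cm hash]) asSet.
	"With a constant hash such as [^ 1], hashes size is 1, so this fails once TestCase has more than two distinct methods."
	self assert: hashes size * 2 &gt;= methods asSet size</pre>
<div>This keeps the work linear in the number of methods checked, at the cost of saying nothing about methods outside the chosen class.</div>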


</div>-- <br>best,<div>Eliot</div><br>