<div dir="ltr">Hi Andres,<div class="gmail_extra"><br><div class="gmail_quote">On Tue, Apr 18, 2017 at 11:06 AM, Andres Valloud <span dir="ltr"><<a href="mailto:avalloud@smalltalk.comcastbiz.net" target="_blank">avalloud@smalltalk.comcastbiz.net</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><br>

The hashing technique is just a multiplicative hash function.  The multiplication technique is equivalent to splitting AX into AH and AL, and using the distributive rule to (AH * 2^8 + AL) * (BH * 2^8 + BL). The choice of 1664525 is due to Mark van der Gulik, he took the constant out of a Knuth table of good linear congruency pseudo-RNG factors.  It's been 17 years since that, I do not currently think the pseudo-RNG relationship is that relevant.  In the mean time, as hashing goes, it's good enough.</blockquote><div><br></div><div>I'm looking for more of a reference, say to a paper which describes the technique, or a reference implementation on the web.  Remember that one important proper of 1664525 is that it belongs to a set of integers for which n bitCount * 2 between: n highBit and: n highBit + 2, i.e. that about half the bits are set.  Further, the bits are reasonably well distributed throughout, unlike for example 16rF0F0F0F0F.</div><div><br></div><div>I'll look up the table in my Knuth vol 3.</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><span class="gmail-"><br>

<br>

On 4/18/17 10:01 , Eliot Miranda wrote:<br>

</span><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><span class="gmail-">

Hi Andres,<br>

<br>

On Tue, Apr 18, 2017 at 1:52 AM, Andres Valloud<br>

<<a href="mailto:avalloud@smalltalk.comcastbiz.net" target="_blank">avalloud@smalltalk.comcastbiz<wbr>.net</a><br></span><div><div class="gmail-h5">

<mailto:<a href="mailto:avalloud@smalltalk.comcastbiz.net" target="_blank">avalloud@smalltalk.com<wbr>castbiz.net</a>>> wrote:<br>

<br>

<br>

    Yes, the SqR initials are mine.  That's too bad about the comments,<br>

    at the very least there should be something like this:<br>

<br>

    "This code performs a multiplication by 1664525 mod 2^28 without<br>

    overflowing into large integers."<br>

<br>

<br>

Thanks.  I'll update the primitives appropriately.  Could you supply a<br>

reference both for the technique and the choice of  1664525 in<br>

particular?  I'd like to include that info in the comments too.<br>

<br>

<br>

    On 4/18/17 1:48 , Levente Uzonyi wrote:<br>

<br>

        The methods in SmallInteger and LargePositiveInteger have the<br>

        initials<br>

        SqR (that's yours IIRC), but they have no comment.<br>

        The variant in String is originally from 2001 and has never had that<br>

        comment.<br>

<br>

        Levente<br>

<br>

        On Mon, 17 Apr 2017, Andres Valloud wrote:<br>

<br>

<br>

            On 4/17/17 16:45 , Eliot Miranda wrote:<br>

<br>

                FYI, hash multiply is<br>

                hashMultiply<br>

                | low |<br>

                low := self bitAnd: 16383.<br>

                ^(16r260D * low + ((16r260D * (self bitShift: -14) +<br>

                (16r0065 * low)<br>

                bitAnd: 16383) * 16384))<br>

                bitAnd: 16r0FFFFFFF<br>

<br>

                and when written as a primitive is<br>

                hashMultiply<br>

                | low |<br>

                low := self bitAnd: 16383.<br>

                ^(16r260D * low + (16r260D * (self bitShift: -14) +<br>

                (16r0065 * low) *<br>

                16384))<br>

                bitAnd: 16r0FFFFFFF<br>

                because we don't care about the multiply by 16384<br>

                overflowing; when<br>

                written as a primitive it won't overflow into a<br>

                LargePositiveInteger.<br>

<br>

<br>

            Hopefully this doesn't mean the primitive got implemented by<br>

            actually<br>

            doing these operations verbatim.  As you have probably seen,<br>

            that<br>

            convoluted arithmetic is done this way in Smalltalk only to<br>

            simulate a<br>

            32x32 multiplication bit-anded down to 28 bits without<br>

            overflowing<br>

            into large integers (the original code from August 2000 had<br>

            my initials).<br>

<br>

            At a sufficiently low level such as C, all that complexity<br>

            is just an<br>

            unsigned multiplication by 1664525.  The image code should<br>

            still have<br>

            a comment to that effect, did it get lost?<br>

<br>

            Andres.<br>

<br>

<br>

<br>

<br>

<br>

--<br>

_,,,^..^,,,_<br>

best, Eliot<br>

</div></div></blockquote>

</blockquote></div><br><br clear="all"><div><br></div>-- <br><div class="gmail_signature"><div dir="ltr"><div><span style="font-size:small;border-collapse:separate"><div>_,,,^..^,,,_<br></div><div>best, Eliot</div></span></div></div></div>

</div></div>