[Vm-dev] Cog Primitive Performance
Eliot Miranda
eliot.miranda at gmail.com
Tue Apr 18 19:08:22 UTC 2017
Hi Andres,
On Tue, Apr 18, 2017 at 11:06 AM, Andres Valloud <
avalloud at smalltalk.comcastbiz.net> wrote:
>
> The hashing technique is just a multiplicative hash function. The
> multiplication technique is equivalent to splitting AX into AH and AL, and
> using the distributive rule to (AH * 2^8 + AL) * (BH * 2^8 + BL). The
> choice of 1664525 is due to Mark van der Gulik, he took the constant out of
> a Knuth table of good linear congruency pseudo-RNG factors. It's been 17
> years since that, I do not currently think the pseudo-RNG relationship is
> that relevant. In the mean time, as hashing goes, it's good enough.
I'm looking for more of a reference, say to a paper which describes the
technique, or a reference implementation on the web. Remember that one
important proper of 1664525 is that it belongs to a set of integers for
which n bitCount * 2 between: n highBit and: n highBit + 2, i.e. that about
half the bits are set. Further, the bits are reasonably well
distributed throughout, unlike for example 16rF0F0F0F0F.
I'll look up the table in my Knuth vol 3.
>
> On 4/18/17 10:01 , Eliot Miranda wrote:
>
>> Hi Andres,
>>
>> On Tue, Apr 18, 2017 at 1:52 AM, Andres Valloud
>> <avalloud at smalltalk.comcastbiz.net
>> <mailto:avalloud at smalltalk.comcastbiz.net>> wrote:
>>
>>
>> Yes, the SqR initials are mine. That's too bad about the comments,
>> at the very least there should be something like this:
>>
>> "This code performs a multiplication by 1664525 mod 2^28 without
>> overflowing into large integers."
>>
>>
>> Thanks. I'll update the primitives appropriately. Could you supply a
>> reference both for the technique and the choice of 1664525 in
>> particular? I'd like to include that info in the comments too.
>>
>>
>> On 4/18/17 1:48 , Levente Uzonyi wrote:
>>
>> The methods in SmallInteger and LargePositiveInteger have the
>> initials
>> SqR (that's yours IIRC), but they have no comment.
>> The variant in String is originally from 2001 and has never had
>> that
>> comment.
>>
>> Levente
>>
>> On Mon, 17 Apr 2017, Andres Valloud wrote:
>>
>>
>> On 4/17/17 16:45 , Eliot Miranda wrote:
>>
>> FYI, hash multiply is
>> hashMultiply
>> | low |
>> low := self bitAnd: 16383.
>> ^(16r260D * low + ((16r260D * (self bitShift: -14) +
>> (16r0065 * low)
>> bitAnd: 16383) * 16384))
>> bitAnd: 16r0FFFFFFF
>>
>> and when written as a primitive is
>> hashMultiply
>> | low |
>> low := self bitAnd: 16383.
>> ^(16r260D * low + (16r260D * (self bitShift: -14) +
>> (16r0065 * low) *
>> 16384))
>> bitAnd: 16r0FFFFFFF
>> because we don't care about the multiply by 16384
>> overflowing; when
>> written as a primitive it won't overflow into a
>> LargePositiveInteger.
>>
>>
>> Hopefully this doesn't mean the primitive got implemented by
>> actually
>> doing these operations verbatim. As you have probably seen,
>> that
>> convoluted arithmetic is done this way in Smalltalk only to
>> simulate a
>> 32x32 multiplication bit-anded down to 28 bits without
>> overflowing
>> into large integers (the original code from August 2000 had
>> my initials).
>>
>> At a sufficiently low level such as C, all that complexity
>> is just an
>> unsigned multiplication by 1664525. The image code should
>> still have
>> a comment to that effect, did it get lost?
>>
>> Andres.
>>
>>
>>
>>
>>
>> --
>> _,,,^..^,,,_
>> best, Eliot
>>
>
--
_,,,^..^,,,_
best, Eliot
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.squeakfoundation.org/pipermail/vm-dev/attachments/20170418/7e5e7271/attachment.html>
More information about the Vm-dev
mailing list