[squeak-dev] WideString hash is way slower than ByteString hash.

Wed Sep 29 23:05:19 UTC 2010

On 15 May 2010 02:43, Igor Stasenko <siguctua at gmail.com> wrote:
> On 15 May 2010 02:35, Eliot Miranda <eliot.miranda at gmail.com> wrote:
>> The cardinal rule of running benchmarks is to compare apples to apples.
>>  You've compared apples to oranges, i.e. an optimized reimplementation of
>> WideString>>hash that eliminates the mapping of codes to characters, against
>> the vanilla Squeak implementation.  You need to at least compare the NB
>> implementation against
>> WideString methods for comparison
>> fastHash
>> | stringSize hash low |
>> stringSize := self size.
>> hash := ByteString identityHash bitAnd: 16rFFFFFFF.
>> 1 to: stringSize do: [:pos |
>> hash := hash + (self wordAt: pos).
>> "Begin hashMultiply"
>> low := hash bitAnd: 16383.
>> hash := (16r260D * low + ((16r260D * (hash bitShift: -14) + (16r0065 * low)
>> bitAnd: 16383) * 16384)) bitAnd: 16r0FFFFFFF.
>> ].
>> ^ hash
>> | s n |
>> s := (WideString with: (Character value: 16r55E4)) , 'abcdefghijklmno'.
>> n := 100000.
>> { [1 to: n do: [:i| s fastHash. s fastHash. s fastHash. s fastHash. s
>> fastHash. s fastHash. s fastHash. s fastHash. s fastHash. s fastHash]]
>> timeToRun.
>>  [1 to: n do: [:i| s hash. s hash. s hash. s hash. s hash. s hash. s hash. s
>> hash. s hash. s hash]] timeToRun. }
>>      #(829 1254)
>> So your measurements tell us nothing about a general comparison of NB
>> against the Squeak VM or Cog.  They only demonstrate (unsurprisingly) that a
>> loop summing integers in an array goes PDQ.  On the other hand my numbers
>> showed Cog 10x faster than the Squeak interpreter when executing exactly the
>> same bytecode.
>
> Yes, of course you're right.
> But I didn't compare it with Cog, because I can't.
> And NB is not for translating bytecodes into native code;
> it is for authoring primitives in native code.
> So my comparison shows how much faster it could be if implemented as a primitive.
>
> I can only imagine how much faster the whole thing would be if we
> cross-bred NB and Cog,
> where Cog would serve as the Smalltalk optimizer and NB would serve for
> manual optimizations
> of heavy number crunching :)
>

It's now time to verify things, since I can now run native code under Cog:

|ss s t1 t2 t3 |
s := (WideString with: (Character value: 16r55E4)) , 'abcdefghijklmno'.
ss := (String with: $z) , 'abcdefghijklmno'.

10 timesRepeat: [ s := s , s. ss := ss , ss ].
self assert: (s hash = s nbHash).
t1 := [1000 timesRepeat: [s hash]] timeToRun.
t2 := [1000 timesRepeat: [s nbHash]] timeToRun.
t3 := [1000 timesRepeat: [ss hash]] timeToRun.
{ t1. t2. t3 }

Squeak VM:

 #(7929 89 125)

Cog VM:

 #(1256 88 99)
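
To put the two result arrays side by side, here is a small workspace sketch that only redoes the arithmetic on the timings quoted above (the variable names are mine, chosen for illustration):

```smalltalk
| squeak cog |
"Timings in milliseconds, copied from the runs above, in the order:
 s hash, s nbHash, ss hash."
squeak := #(7929 89 125).
cog := #(1256 88 99).
{ (squeak first / squeak second) rounded.   "stock hash vs nbHash on the Squeak VM"
  (cog first / cog second) rounded.         "stock hash vs nbHash on Cog"
  (squeak first / cog first) rounded }      "stock hash, Squeak VM vs Cog"
```

With these numbers nbHash comes out roughly 89 times faster than the stock WideString hash on the Squeak VM and roughly 14 times faster on Cog, while Cog itself runs the stock hash about 6 times faster than the interpreter.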

However, in tight loops, where the loop's payload is comparable to the
payload of the function itself, Cog is the winner:

|s|
s := WideString with: (Character value: 16r55E4).
[10000000 timesRepeat: [s hash]] timeToRun.
1307

|s|
s := WideString with: (Character value: 16r55E4).
[ s hash ] bench
'2,840,000 per second.'

|s|
s := WideString with: (Character value: 16r55E4).
[10000000 timesRepeat: [s nbHash]] timeToRun.

1438

|s|
s := WideString with: (Character value: 16r55E4).
[ s nbHash ] bench
 '2,710,000 per second.'

Obviously, because it can't inline the native code generated by NB :)
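
One way to see how much of those tight-loop timings is loop and send overhead rather than hashing is to time an empty loop and subtract it. This is only a sketch: it assumes an image where nbHash is installed on WideString (it is not part of a stock image), that timeToRun answers an integer number of milliseconds, and the variable names are made up for illustration.

```smalltalk
| s n baseline perCallNs |
s := WideString with: (Character value: 16r55E4).
n := 10000000.
"Cost of the bare loop, in milliseconds."
baseline := [n timesRepeat: []] timeToRun.
"Converts a total in milliseconds into nanoseconds per call, net of the loop."
perCallNs := [:ms | (ms - baseline) * 1000000 / n].
{ (perCallNs value: [n timesRepeat: [s hash]] timeToRun) asFloat.
  (perCallNs value: [n timesRepeat: [s nbHash]] timeToRun) asFloat }
```

For reference, the raw figures above work out to about 131 ns per iteration for hash (1307 ms / 10,000,000) and about 144 ns for nbHash (1438 ms / 10,000,000), before any loop overhead is subtracted.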

-- 
Best regards,
Igor Stasenko AKA sig.