2010/9/29 Levente Uzonyi <leves@elte.hu>
On Thu, 30 Sep 2010, Igor Stasenko wrote:

On 15 May 2010 02:43, Igor Stasenko <siguctua@gmail.com> wrote:
On 15 May 2010 02:35, Eliot Miranda <eliot.miranda@gmail.com> wrote:
The cardinal rule of running benchmarks is to compare apples to apples.
 You've compared apples to oranges, i.e. an optimized reimplementation of
WideString>>hash that eliminates the mapping of codes to characters, against
the vanilla Squeak implementation.  You need to at least compare the NB
implementation against
WideString methods for comparison
fastHash
| stringSize hash low |
stringSize := self size.
hash := ByteString identityHash bitAnd: 16rFFFFFFF.
1 to: stringSize do: [:pos |
hash := hash + (self wordAt: pos).
"Begin hashMultiply"
low := hash bitAnd: 16383.
hash := (16r260D * low + ((16r260D * (hash bitShift: -14) + (16r0065 * low)
bitAnd: 16383) * 16384)) bitAnd: 16r0FFFFFFF.
].
^ hash
| s n |
s := (WideString with: (Character value: 16r55E4)) , 'abcdefghijklmno'.
n := 100000.
{ [1 to: n do: [:i| s fastHash. s fastHash. s fastHash. s fastHash. s
fastHash. s fastHash. s fastHash. s fastHash. s fastHash. s fastHash]]
timeToRun.
 [1 to: n do: [:i| s hash. s hash. s hash. s hash. s hash. s hash. s hash. s
hash. s hash. s hash]] timeToRun. }
     #(829 1254)
ASo your measurements tell us nothing about a general comparison of NB
against the Squeak VM or Cog.  They only demonstrate (unsurprisingly) that a
loop summing integers in an array goes PDQ.  On the other hand my numbers
showed Cog 10x faster than the Squeak interpreter when executing exactly the
same bytecode.

Yes, of course you're right.
But i didn't compared it with Cog, because i can't.
And NB is not for translating bytecodes into native code,
it is for authoring a primitives using native code.
So, my comparison is how faster it could be , if implemented primitively.

I can only imagine, how faster the whole thing would be if we
cross-hatch NB and Cog,
where Cog will serve as a smalltalk optimizer , and NB will serve for
manual optimizations
for heavy numerical crunching :)


Its now time to verify things, since now i can run native code under Cog:

|ss s t1 t2 t3 |
s := (WideString with: (Character value: 16r55E4)) , 'abcdefghijklmno'.
ss := (String with: $z) , 'abcdefghijklmno'.

10 timesRepeat: [ s := s , s. ss := ss , ss ].
self assert: (s hash = s nbHash).
t1 := [1000 timesRepeat: [s hash]] timeToRun.
t2 := [1000 timesRepeat: [s nbHash]] timeToRun.
t3 := [1000 timesRepeat: [ss hash]] timeToRun.
{ t1. t2. t3 }

Squeak VM:

#(7929 89 125)

You should update your image, because WideString's hash is now ~2x faster than before with SqueakVM. I get #(3822 119) for t1 and t3.



Cog VM:

#(1256 88 99)


However, in tight loops, where loop's payload comparable to function's
payload itself, Cog is a winner:

|s|
s := WideString with: (Character value: 16r55E4).
[10000000 timesRepeat: [s hash]] timeToRun.
1307

|s|
s := WideString with: (Character value: 16r55E4).
[ s hash ] bench
'2,840,000 per second.'

|s|
s := WideString with: (Character value: 16r55E4).
[10000000 timesRepeat: [s nbHash]] timeToRun.

1438

|s|
s := WideString with: (Character value: 16r55E4).
[ s nbHash ] bench
'2,710,000 per second.'

obviously, because it can't inline a native code generated by NB :)

Can Cog inline message sends? I thought it can't ATM.

Right.  But the thought is that because Cog does inline machine-code for certain primitives (actually it generates machine-code versions of core primitives) that with a little work NB would be able to generate the machine code for primitives that Cog's code generator doesn't handle.  

Igor will correct me if I'm wrong but I think the idea is thus.  NB-generated code needs to be stored somewhere in a CompiledMethod where Cog can find it, along with some link-edit metadata to allow Cog to fix-up data references in the code etc.  If Cog finds a method containing NB code it copies this code into the start of the method (where primitive code is compiled) and does some simple link-editing on the code.  Now the Cog method starts with NB code which operates ;like a primitive, jumping to the start of the Cog method proper if it fails.

If we're talking about NB FFI calls then there's additional compilations.  A stack switch is required from Cog's stack pages to the real C stack, and back on return, etc.  But I'm encouraged that Igor has already got NB going on Cog.

Btw, can you send a message from NB code?

As I understand it, no.

best
Eliot
 


Levente



--
Best regards,
Igor Stasenko AKA sig.