[ANN] Nile 0.9.0

Sat Jun 9 20:54:19 UTC 2007

Hi Andreas,

thank you very much for your comments. Please help me a bit more by
answering these questions:

2007/6/9, Andreas Raab <andreas.raab at gmx.de>:
> The real "lesson" of this benchmark is to only trust your micro
> benchmarks as far as you can throw them ;-) Given that both
> implementations use primitiveNext and given that the code is designed to
> guarantee a hit in the VM's at-cache there shouldn't be any difference
> whatsoever.
>
> The difference we're seeing here (which made me investigate the matter
> more closely) should also be an indicator that there may be something
> wrong with the benchmarking process. Running it repeatedly gives:
>
> Squeak: 38.9  35.6  30.5  39.2  31.8
> Nile:   36.2  33.9  37.9  40.8  31.3
> Delta:   -7%   -5%   19%    4%   -1%
>
> meaning there is a 20% difference within five runs which seems extremely
> high for a microbenchmark that's just a primitive call. I think you'll
> have to fix the benchmarks to give more consistent results if you want
> to make any claims about relative speed improvements.

I noticed these variations too, but I don't understand why they happen.
Could you please help me investigate?
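
To make the variation visible, something like the following workspace
snippet could be a starting point. It is only a sketch: the workload
(reading a one-million-character string with ReadStream>>next) and the
number of runs are assumptions chosen for illustration, not the
benchmark code shipped with Nile.

```smalltalk
"Hypothetical workspace snippet: time the same block several times
 and report the spread between the fastest and the slowest run."
| runs times stream fastest slowest |
runs := 5.
times := (1 to: runs) collect: [ :i |
    "Collect garbage first so every run starts from a similar heap state."
    Smalltalk garbageCollect.
    Time millisecondsToRun: [
        "Stand-in workload, not the real Nile benchmark."
        stream := ReadStream on: (String new: 1000000 withAll: $a).
        [ stream atEnd ] whileFalse: [ stream next ] ] ].
fastest := times inject: times first into: [ :a :b | a min: b ].
slowest := times inject: times first into: [ :a :b | a max: b ].
"Assumes the slowest run takes at least one millisecond."
Transcript
    show: 'times (ms): ', times printString; cr;
    show: 'spread: ', ((slowest - fastest) * 100 // slowest) printString, '%'; cr.
```

If the spread stays in the 20% range even for such a simple loop, the
noise probably comes from the measurement setup rather than from either
implementation.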

> > next:
> >    Squeak: 98.2
> >    Nile:     158.2
> >    38% faster
>
> On my machine:
>
>         Squeak result: 90.3
>         Nile result: 130.5
>         Comparison: 31%
>
> The lesson here is (at least for me) that after trying to wrap my brain
> around the code in NSStringReader>>next: I'm willing to give up the
> speed. A nice example for what can be done if you understand byte code
> execution but by no means production code (pity the bugger who at some
> point will need to understand that the "0-position" is required since
> position cannot occur on the right-hand side of that expression ;-)

I do not understand this paragraph.

> Oh, and of course "no comments == not helpful" in particular when it
> comes to that level of optimization. And while having tests is great,
> having 73 out of 87 classes without a single line of explanation (class
> comment) is pretty pathetic.

I'm really in favor of a lot of comments. If you look at the main
traits, they should be heavily commented. Unfortunately, I had a
strong deadline and all the comments I wrote are in an article and not
in the code. This will be corrected soon.

> > nextPut:
> >    Squeak: 24.9
> >    Nile:       42.4
> >    41% faster
>
> On my machine:
>
>         Squeak result: 44.8
>         Nile result: 71.4
>         Comparison: 37%
>
> Oddly, this benchmark scores the same with or without the primitive in
> WriteStream>>nextPut: ... which is pretty strange if you ask me. I have
> a suspicion that the primitiveNextPut hasn't been used in a long time
> and may need to be rewhacked to perform properly.

I noticed that too, and Andrew P. Black sent me a mail about it:

"
[...]A lot of time was going into WriteStream>>nextPut: , which is not
unexpected.  But most of that time was being spent in
isOctetCharacter, which WAS unexpected.[...] It seems to me that this
means that the primitive is failing.

I tried inserting

       PutCount := PutCount + 1.

immediately after the primitive pragma, and then tried printIt on:

       s := String streamContents: [ :str |
               PutCount := 0.
               10000 timesRepeat: [ str nextPut: $q ] ].
       PutCount  ==> 10000

So, the primitive is always failing?  Why?  String new: 100 now
creates a ByteString.  Is it the case that the primitive is still
checking for an instance of * String * in the Stream's collection?
"

So I think there is a real problem here. Fixing it would probably
improve Squeak's speed considerably.
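
Andrew's hypothesis can be checked from a workspace. The following is a
minimal sketch, assuming an image where ByteString is a subclass of
String (as in Squeak 3.8 and later); it only inspects classes and says
nothing about what the primitive actually tests inside the VM.

```smalltalk
"Hypothetical check: which class does String new: really answer?"
| collection |
collection := String new: 100.
"Name of the concrete class of the new instance."
Transcript show: collection class name; cr.
"Is it a String or an instance of one of its subclasses?"
Transcript show: (collection isKindOf: String) printString; cr.
"Is it an instance of exactly the class String?"
Transcript show: (collection class == String) printString; cr.
```

If the last line prints false while the second prints true, a primitive
that insists on the exact class String would reject the stream's
collection every time.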

> > nextPutAll:
> >    Squeak: 115.9
> >    Nile: 120.2
> >    4% faster
>
> On my machine:
>
>         Squeak result: 117.8
>         Nile result: 114.8
>         Comparison: -3%
>
> Not really much of a lesson here other than if both implementations take
> "reasonable care" they'll likely end up with similar speed.
>
> So all in all, interesting benchmarks but you need to fix the variation
> issue. With the variations we're seeing, all but the nextPut: benchmark
> (which I suspect suffers from a broken nextPut: primitive) could fall
> either way so there isn't really much of claim to be made here.

In fact, these benchmarks were done to verify that Nile was at least
as fast as Squeak. I didn't want people to complain because Nile was
slower and this was due to traits... Now, I can say that Nile is as
fast as Squeak even with the better design.

I would really appreciate help improving Nile. Don't forget that you
can always commit directly if you want to:
http://www.squeaksource.com/Nile/ is writable by anybody.

Bye

-- 
Damien Cassou