EndOfStream unused

Andreas Raab andreas.raab at gmx.de
Wed Nov 7 01:34:32 UTC 2007


nicolas cellier wrote:
> Wrong.

What exactly is wrong? That it does stack searches? That it evaluates 
handler blocks? That it will cause pain due to unforeseen interactions? 
That it is slow? Let's start there. How about a little benchmark:

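ReadStream subclass: #ReadStreamWithNil
	instanceVariableNames: ''
	classVariableNames: ''
	poolDictionaries: ''
	category: 'EndOfStream-Benchmark'	"category name is arbitrary"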

ReadStreamWithNil>>next
	<primitive: 65>
	position >= readLimit
		ifTrue: [^EndOfStream signal]
		ifFalse: [^collection at: (position := position + 1)]


And now:

streamClass := ReadStream. "vs. ReadStreamWithNil"
data := (1 to: 5) asArray.
[1 to: 100000 do:[:i|
	stream := streamClass on: data.
	[stream next == nil] whileFalse.
]] timeToRun.

If you run this trivial little benchmark, the results should be 
enlightening: With ReadStream it completes (on my box) within 280 msecs. 
With ReadStreamWithNil it completes in 1617 msecs. That is a factor of 
6x in speed. Even if you change it to 100 elements in data you are 
*still* at half of the speed (1645 vs. 3172 ms).

> VW does support it, and concerning efficiency, they are not that crazy.

VW isn't Squeak. If you think that VW's and Squeak's exception handling 
have comparable performance, allow me to laugh heartily. Besides, VW is 
a *lot* faster to begin with, so the overhead is less noticeable in 
real applications (though I'm sure the overhead is measurable).

> True, there is a stack scan, but ONLY ONCE at the end of the stream.
> If the stream is long enough, it will save A LOT of atEnd tests.

Err, only if the stream *has* atEnd tests. Most code that is concerned 
with efficient stream reads today goes like this:

   [(value := stream next) == nil] whileFalse:[].

No atEnd tests are saved in that situation. But even if we change our 
benchmark to, e.g.,

data := (1 to: 5) asArray.
[1 to: 100000 do:[:i|
	stream := ReadStream on: data.
	[stream atEnd] whileFalse:[stream next].
]] timeToRun.

it comes in at 325 msecs on my box, which is only 30% worse than the 
"naked" stream next == nil test and about 4x *faster* than using 
EndOfStream. And if you extend data to 100 elements it still comes in 
right in the middle of the other variants (2500 msecs).
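
To compare all three variants in one run, a workspace sketch along the 
following lines could be used. The run block and its labels are 
illustrative names only, and it assumes ReadStreamWithNil is defined as 
shown earlier. The extra block call per iteration adds the same small 
overhead to every variant, so the absolute numbers will differ from the 
ones quoted here and only the ratios are meaningful.

```smalltalk
| data run |
data := (1 to: 5) asArray.
"run: times 100000 passes of readLoop over a fresh stream of the given class"
run := [:label :streamClass :readLoop | | time |
	time := [1 to: 100000 do: [:i |
		readLoop value: (streamClass on: data)]] timeToRun.
	Transcript show: label; show: ': '; print: time; show: ' msecs'; cr].

"nil-terminated read on a plain ReadStream"
run value: 'next == nil' value: ReadStream
	value: [:s | [s next == nil] whileFalse].
"same loop, but the end of the stream goes through EndOfStream signal"
run value: 'EndOfStream' value: ReadStreamWithNil
	value: [:s | [s next == nil] whileFalse].
"explicit atEnd test before every read"
run value: 'atEnd' value: ReadStream
	value: [:s | [s atEnd] whileFalse: [s next]].
```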

> This is called optimistic programming.

And what I do is called "measuring" ;-)

> Imagine that I ask you at each step "Are we arrived?"; you would not 
> bear a very long walk, would you? That's what the atEnd test is doing.

True, when it's there. But unless you change exception handling it's 
often (in particular for internal streams) still a *lot* faster since EH 
is expensive in tight loops.
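
For reference, the exception-driven style under discussion would look 
roughly like the sketch below. It assumes ReadStreamWithNil as defined 
above (so that next signals EndOfStream at the end); the variable names 
are illustrative. The loop itself has no atEnd test, but leaving it 
requires signaling, finding the handler on the stack and unwinding, and 
that cost is paid once per stream.

```smalltalk
| stream sum |
stream := ReadStreamWithNil on: (1 to: 5) asArray.
sum := 0.
"Read without any end test; the handler is the only way out of the loop"
[[true] whileTrue: [sum := sum + stream next]]
	on: EndOfStream
	do: [:ex | ex return: nil].
sum	"total of the elements read before the signal"
```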

> In following mail and at http://bugs.squeak.org/view.php?id=6755 I 
> already noticed possible exception handling problem that caused Marcus 
> to retract this change. This is because EndOfStream were declared an 
> Error instead of a Notification.

Have you actually *tried* it? Because you may be in for a nasty little 
surprise. I'm not sure if this problem is going to bite you or not but 
from the code it looks like it should so try it - it is just about 
*exactly* the kind of thing that goes wrong for "no good reason" when 
you change something as fundamental as this.

> And testing next == nil is a ugly hack i reject because i have some 
> collections with some nil.

Then use atEnd. That's what it's for.
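
A minimal sketch of what that looks like for a collection that 
legitimately contains nil (the sample data and variable names are made 
up for illustration):

```smalltalk
| stream nils |
"Array with:... keeps the nil elements unambiguous"
stream := ReadStream on: (Array with: 1 with: nil with: 3 with: nil).
nils := 0.
"atEnd decides when to stop, so a nil element is just another value"
[stream atEnd] whileFalse: [
	stream next == nil ifTrue: [nils := nils + 1]].
nils	"number of nil elements seen"
```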

> I want for example to use
>     aCollection lazily
>         collect: [:e | e color];
>         select: [:e | e notNil]
> using LazyStreams iterating only once.
> Absence of EndOfStream notification is just spoiling the game.

I don't know. I find it hard to imagine an atEnd test that could 
possibly be as costly as running the EH machinery. It's certainly worth 
measuring before conjecturing about it.

> But nevermind, I will just publish on VW public store where I will find 
> crazy guys interested.

As you like. If you are ever interested in having a serious discussion 
about the topic I'll be waiting here.

Cheers,
   - Andreas


