EndOfStream performance cost [was EndOfStream unused]

nicolas cellier ncellier at ifrance.com
Fri Nov 9 22:29:21 UTC 2007


Yes, as expected, most are of the FileStream kind.
However, using:

   SystemNavigation new browseAllCallsOn: #next and: #==

I found some other cases that might read past the end, for example:
- Number class>>#readExponent:base:from:
- PositionableStream>>#next:into:startingAt: which has some senders...
- etc...
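
For illustration, here is a minimal sketch of the two loop styles in 
question (the variable names and the sample string are hypothetical, not 
taken from the image). The first loop never reads past the end; the 
second relies on next answering nil there, which is the kind of 
past-the-end read being hunted here:

    | s c |
    "atEnd-guarded loop: stops before reading past the end"
    s := ReadStream on: 'abc'.
    [s atEnd] whileFalse: [Transcript show: s next asString].

    "nil-sentinel loop: the last call to next reads past the end"
    s := ReadStream on: 'abc'.
    [(c := s next) == nil] whileFalse: [Transcript show: c asString].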

So I'm now trying to instrument with a tally...
It's hacky, but seems to work:

tally := MessageTally new.
tally spyEvery: 100 on: ['em ezilaitini ot si siht' reverse].
tally class: World class method: #doOneCycle.
    "middle lines begin"
tallyEnd := false.
[(Delay forSeconds: 300) wait. tallyEnd := true] fork.
[[World doOneCycle. Processor yield. tallyEnd] whileFalse]
		on: EndOfStream
		do: [:exc | tally tally: exc signalerContext by: 1.
			exc resume].
(StringHolder new contents:
		(String streamContents: [:s | tally report: s]))
	openLabel: 'EndOfStream Spy Results'.
    "middle lines end"
tally close.

If you do not execute tally close and then execute the middle lines again, 
I think the results will accumulate.

I did indeed find Number readFrom: being called from the Compiler, but got 
only 40 tallies from ReadStream in 5 minutes...
Gasp, I did not find the gold mine of pastEnd reads...

If someone wants to help, load ConnectEndOfStream-M6755-nice.1.cs 
(http://bugs.squeak.org/file_download.php?file_id=3064&type=bug) from 
http://bugs.squeak.org/view.php?id=6755, execute the above hack in a 
workspace, then carry on with your favourite activities...

In case of a severe slowdown, please post the results to me or to the list.
Thanks

Nicolas

Paolo Bonzini wrote:
>> - Most String processing loops use an atEnd loop (the upper-level ReadStream
>> does not use the == nil trick; client code might, however, use it in a few places).
> 
> I found only one that doesn't, in InflateStream>>#atEnd, using this code:
> 
> (CompiledMethod allInstances
>   select: [ :m | | s |
>      s := m getSource asString.
>      ('*next*' match: s ) and: [
>         ('*next == nil*' match: s )  or: [
>         ('*next isNil*'  match: s ) or: [
>         '*next notNil*' match: s ]]]])
> 
> Here is the source code:
> 
>     atEnd
>     "Note: It is possible that we have a few bits left,
>     representing just the EOB marker. To check for
>     this we must force decompression of the next
>     block if at end of data."
>     super atEnd ifFalse:[^false]. "Primitive test"
>     (position >= readLimit and:[state = StateNoMoreData])
>         ifTrue:[^true].
>     self next == nil ifTrue:[^true].
>     position := position - 1.
>     ^false
> 
> and in this same class, we have:
> 
>     pastEndRead
>     state = StateNoMoreData ifTrue:[^nil].
>     ...
>     ^self next
> 
>     next
>     <primitive: 65>
>     position >= readLimit
>         ifTrue: [^self pastEndRead]
>         ifFalse: [^collection at: (position := position + 1)]
> 
> So, the "self next == nil" could be written "self pastEndRead == nil". 
> Or even better, one could simply rewrite "atEnd and next" like this:
> 
>     atEnd
>     position < readLimit ifTrue: [^false].
>     "for speed, we inline pastEndRead with these changes:"
>     state = StateNoMoreData ifTrue:[^true].
>     ...
>     ^false
> 
>     next
>     <primitive: 65>
>     "position >= readLimit test inlined for speed"
>     (position >= readLimit and: [self atEnd]) ifTrue: [ ^nil ].
>     ^collection at: (position := position + 1)
> 
> Actually, this implementation of #next could be moved up to ReadStream, 
> so that no override is necessary in InflateStream.  So, it looks like 
> the only explicit usage (in my image) of "self next == nil" ought to be 
> eliminated anyway.
> 
> In some cases, the "self next == nil" test is used implicitly, for example:
> 
> Number>>readExponent:base:
>     ('edq' includes: aStream next) ifFalse: [^ nil].
>     ...
> 
> This code however does not seem to be written with speed in mind.
> 
> Paolo
> 
> 



