Simple String Question.

Bijan Parsia bparsia at email.unc.edu
Thu May 16 12:55:24 UTC 2002


Let me preface my remarks with the profiling injunction. Profile! There,
whew.

Second, unless you presize the collection you're WriteStreaming over, you'll
get copying there too. When you fill your stream's collection, it will
swap in a new, larger one. At some point, IIRC, it will grow by doubling
(which may be too little for a series of small additions, or too large
with a big one).
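
A minimal sketch of presizing, assuming you know (or can cheaply
compute) the final size up front. The names parts, total and stream are
just illustrative workspace temps, not anything from the image:

```smalltalk
"Presize the stream's collection so no growing (and copying) is needed."
| parts total stream |
parts := #('abc' 'def' 'ghi' 'jkl').
"Sum the sizes of the pieces to get the final length."
total := parts inject: 0 into: [:sum :each | sum + each size].
"The stream writes into a String that is already big enough."
stream := WriteStream on: (String new: total).
parts do: [:each | stream nextPutAll: each].
stream contents
```

Whether the extra pass to compute the size pays off is, again, something
to profile.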

Hmm. Looking at WriteStream>>pastEndPut:, I see that the grow algo is:
	(collection size max: 20) min: 20000

So it puts an upper limit on the doubling: once the collection is past
20000 elements, it grows in 20000-element chunks. Again, this may not be
what you want.
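
To see what that rule does, here is a small simulation. It assumes the
expression above is the number of elements added at each grow step, and
that the collection starts out empty (a real stream may start larger).
No actual stream is involved; size and sizes are just workspace temps:

```smalltalk
"Simulate successive collection sizes under the quoted growth rule."
| size sizes |
size := 0.
sizes := OrderedCollection new.
[size < 50000] whileTrue: [
	"Grow by the current size, but at least 20 and at most 20000."
	size := size + ((size max: 20) min: 20000).
	sizes add: size].
sizes
```

Under those assumptions the sizes go 20, 40, 80, ... 10240, 20480, and
then 40480, 60480: doubling up to 20480, then fixed 20000 steps.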

On Thu, 16 May 2002 goran.hultgren at bluefish.se wrote:
[snip]
> In fact, two concatenations of small Strings seem to be more efficient
> using #, but after that Streams seem to win out (Note: I haven't really
> looked why, this is just observation), try Alt-p on the following
> expressions (my timings included after arrow):
> 
> Time millisecondsToRun: [10000 timesRepeat: ['abc', 'def', 'ghi',
> 'jkl']]     -> 276 284 295 263 268
> 
> Time millisecondsToRun: [10000 timesRepeat: [String streamContents: [:s
> | s nextPutAll: 'abc'; nextPutAll:'def'; nextPutAll: 'ghi'; nextPutAll:
> 'jkl']]]  -> 260 265 271 285 279
> 
> Actually a draw! :-) But if we change it and remove one concat...

On my slow Mac, it varies a lot (prolly due to GC or background cycle
issues).

> Time millisecondsToRun: [10000 timesRepeat: ['abc', 'def', 'ghi']] ->
> 179 164 160 159 176 
> 
> Time millisecondsToRun: [10000 timesRepeat: [String streamContents: [:s
> | s nextPutAll: 'abc'; nextPutAll:'def'; nextPutAll: 'ghi']]] -> 224 223
> 223 220 218
> 
> ...the #, seems to win. So Streams don't seem to always be faster! :-)

Yes. The rule of thumb should be something like "for building *large*
strings out of a *fair number* of concats". Concating two large strings
into a non-presized stream will (prolly) be slower than just #,ing
them. (Because you'll have to grow the Stream's collection a lot.)

It makes me wonder if a copy-on-write type concat would be an overall win
for the system. My guess is that there's a lot more #,ing on strings than
#at:put:ing. This wouldn't work well for other sorts of collection (e.g.,
ordered collection) I'd imagine.

It's prolly even the case (profile!) that a #, of two strings, even
big ones, is faster than a presized Stream, as you don't have any
Stream-related overhead.
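
A rough way to check that guess, in the same style as the timings quoted
above. The string sizes and repeat count are arbitrary choices of mine,
and I make no claim about which side wins on your machine:

```smalltalk
"Time #, on two big strings against a presized WriteStream."
| a b s viaComma viaStream |
a := String new: 100000 withAll: $a.
b := String new: 100000 withAll: $b.
viaComma := Time millisecondsToRun: [100 timesRepeat: [a , b]].
viaStream := Time millisecondsToRun: [
	100 timesRepeat: [
		"Presized, so the stream never has to grow."
		s := WriteStream on: (String new: a size + b size).
		s nextPutAll: a; nextPutAll: b.
		s contents]].
"Answer both timings (in milliseconds) for comparison."
Array with: viaComma with: viaStream
```

Run it a few times, since GC and background activity will make the
numbers jump around.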

Thanks for the catch, though. I've grown a bit sloppy about my Stream
recommendations....

Cheers,
Bijan Parsia.



