Simple String Question.

goran.hultgren at bluefish.se
Thu May 16 14:11:41 UTC 2002


Bijan Parsia <bparsia at email.unc.edu> wrote:
> Let me preface my remarks with the profiling injunction. Profile! There,
> whew.

:-)

> Second, unless you presize the collection you're WriteStreaming over, you'll
> get copying there too. When you fill your stream's collection, it will
> swap in a new, larger one. At some point, IIRC, it will grow by doubling
> (which may be too little for a series of small additions, or too large
> with a big one).
> 
> Hmm. Looking at WriteStream>>pastEndPut:, I see that the grow algo is:
> 	(collection size max: 20) min: 20000

I actually sent a small fix to that method to Ted yesterday, with double
the performance. :-)
I raised the cap to 1000000 per Ted's recommendation, and "my" fix
actually makes the growing faster by not using #,.
Growing the buffer with #, is not so smart. Guess why? :-)
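
For readers who want to try this in a workspace, here is a minimal sketch of
one plausible reading of the hint. It is not the actual fix sent to Ted; the
temporaries (old, extra, viaComma, viaDirect) are made up for illustration.
The idea is that growing with #, first allocates a blank extension and then a
second, combined collection, whereas growing directly needs only one
allocation.

```smalltalk
"Workspace sketch, not the real WriteStream>>pastEndPut: code.
 All temporaries here are hypothetical names."
| old extra viaComma viaDirect |
old := String new: 100 withAll: $x.
extra := 100.

"Growing with #, : a throwaway blank string of size extra is allocated,
 then #, allocates the combined string and copies both operands into it."
viaComma := old , (String new: extra).

"Growing directly: one allocation of the final size, then one copy of
 the existing contents into the front of it."
viaDirect := (String new: old size + extra)
	replaceFrom: 1 to: old size with: old startingAt: 1.

"Both results have the same size and contents."
viaComma size = viaDirect size
```

Whether the second form is measurably faster on a given image is something
to profile rather than assume.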

> So it puts an upper limit on the doubling, growing in 20000 chunks past
> 20000. Again, this may not be what you want.
> 
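
The growth rule quoted above can be played through in a workspace. This is
only a sketch: it assumes the quoted expression is the number of slots added
on each grow, and the starting size of 100 and the names size and steps are
chosen for illustration.

```smalltalk
"Simulate successive buffer sizes under the quoted growth rule.
 Assumption: (size max: 20) min: 20000 is the increment per grow."
| size steps |
size := 100.
steps := OrderedCollection new.
[size < 100000] whileTrue: [
	size := size + ((size max: 20) min: 20000).
	steps add: size].
steps asArray
```

Under these assumptions the sizes double from 200 up to 25600 and then
advance in steps of 20000 (45600, 65600, 85600, 105600), which matches the
description of capped doubling.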
> On Thu, 16 May 2002 goran.hultgren at bluefish.se wrote:
[Yadayada]
> Yes. The rule of thumb should be something like "for building *large*
> strings out of a *fair number* of concats". Concating two large strings
> into a non-presized stream will (prolly) be slower than just #,ing
> them. (Because you'll have to grow the Stream's collection a lot.)

Yes, but also note that #streamContents: starts with a "WriteStream on:
(self new: 100)", so that doesn't explain it in my sample code.
Of course, by now I should look at the code and probably trivially
discover why #, is faster, but... ;-)
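
In the spirit of the profiling advice, a comparison along these lines can be
run in a workspace. It is a rough sketch, not the sample code referred to
above: the string sizes, the repeat count and the temporaries are arbitrary
choices, and the timings will vary by image and machine.

```smalltalk
"Time concatenating two large strings three ways.
 Sizes and repeat count are arbitrary; all temporaries are hypothetical."
| a b viaComma viaStream viaPresized |
a := String new: 500000 withAll: $a.
b := String new: 500000 withAll: $b.

"Plain concatenation with #,"
viaComma := Time millisecondsToRun: [
	100 timesRepeat: [a , b]].

"Default stream, which starts small and has to grow"
viaStream := Time millisecondsToRun: [
	100 timesRepeat: [
		String streamContents: [:s | s nextPutAll: a; nextPutAll: b]]].

"Stream over a collection presized to the final length"
viaPresized := Time millisecondsToRun: [
	100 timesRepeat: [
		(WriteStream on: (String new: a size + b size))
			nextPutAll: a;
			nextPutAll: b;
			contents]].

"Milliseconds for each variant, in the order above"
Array with: viaComma with: viaStream with: viaPresized
```

Evaluating the last expression with "print it" shows the three timings side
by side.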

> It makes me wonder if a copy-on-write type concat would be an overall win
> for the system. My guess is that there's a lot more #,ing on strings than
> #at:put:ing. This wouldn't work well for other sorts of collection (e.g.,
> ordered collection) I'd imagine.
> 
> It's even prolly the case (profile!) that a #, of two strings, even
> big ones, is faster than even a pre-sized Stream, as you don't have any
> Stream related overhead.
> 
> Thanks for the catch, though. I've grown a bit sloppy about my Stream
> recommendations....

I had always "thought" (without really thinking) that a Stream
implementation was faster in ALL cases, but I got curious a few months
ago and noticed this little fact (that it is not). We are doing tons
of streaming and String concats in our project.

> Cheers,
> Bijan Parsia.

Cheers, Göran


