[squeak-dev] news from the Xtream front

Nicolas Cellier nicolas.cellier.aka.nice at gmail.com
Tue Dec 8 02:02:43 UTC 2009


To give a concrete view of what further improvement we might get beyond
the excellent changes from Levente, I just tried this in the latest trunk,
with the latest Xtream version:

{
[| tmp |
 tmp := (MultiByteFileStream readOnlyFileNamed: (SourceFiles at: 2) name)
            ascii;
            wantsLineEndConversion: false;
            converter: UTF8TextConverter new.
 1 to: 10000 do: [:i | tmp upTo: Character cr].
 tmp close] timeToRun.
[| tmp |
 tmp := ((StandardFileStream readOnlyFileNamed: (SourceFiles at: 2) name)
            readXtream ascii buffered
            decodeWith: (UTF8TextConverter new installLineEndConvention: nil))
            buffered.
 1 to: 10000 do: [:i | tmp upTo: Character cr].
 tmp close] timeToRun.
}

#(1395 84)

The first (1395 ms) is the recently optimized trunk version. Unfortunately,
with MultiByteFileStream at work, you get a looong one-by-one decoding.
The second (84 ms, roughly 16 times faster) is the Xtream version with
crafted #buffered sends.
Hardly believable what you can do with a utf8ToSqueak-like hack and a buffer...

Of course, this version is optimized only for the case of ASCII source
encoded in UTF-8 (the easy case, but the most common one for
source files).
I don't know what happens when it encounters a multi-byte UTF-8 char...
... all I know is that performance in that case is likely a disaster
(my code is a bit stupid, but it's too late to correct it now)
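
For readers who want the idea spelled out, here is a minimal sketch of an
ASCII fast path over a byte buffer. It is not the Xtream code and not what
trunk does: decodeBuffer is a hypothetical name, the block assumes
well-formed UTF-8 (no validation, no truncated sequences), and it only
relies on basic ByteArray, String and WriteStream protocol.

```smalltalk
"Sketch only, assuming well-formed UTF-8 input.
 decodeBuffer is a hypothetical name, not part of Xtream or trunk."
| decodeBuffer |
decodeBuffer := [:bytes | | out i n start byte code extra |
    out := WriteStream on: (String new: bytes size).
    i := 1.
    n := bytes size.
    [i <= n] whileTrue: [
        "Fast path: find the longest run of bytes below 128
         and copy it to the output in one go."
        start := i.
        [i <= n and: [(bytes at: i) < 128]] whileTrue: [i := i + 1].
        i > start ifTrue: [
            out nextPutAll: (bytes copyFrom: start to: i - 1) asString].
        "Slow path: decode a single multi-byte sequence by hand."
        i <= n ifTrue: [
            byte := bytes at: i.
            byte < 16rE0
                ifTrue: [extra := 1. code := byte bitAnd: 16r1F]
                ifFalse: [byte < 16rF0
                    ifTrue: [extra := 2. code := byte bitAnd: 16r0F]
                    ifFalse: [extra := 3. code := byte bitAnd: 16r07]].
            "Each continuation byte contributes its low 6 bits."
            1 to: extra do: [:k |
                code := (code bitShift: 6)
                    + ((bytes at: i + k) bitAnd: 16r3F)].
            out nextPut: (Character value: code).
            i := i + extra + 1]].
    out contents].
```

With pure ASCII input the slow path never runs, so the cost is one scan
plus one bulk copy per buffer. Input that mixes in many multi-byte
characters keeps falling back to the per-character branch, which is
where the speed-up would be lost.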

Oh, maybe Levente will just port the idea to trunk tomorrow, so I can
have a bit more rest ;)

Cheers

Nicolas


