[squeak-dev] news from the Xtream front

Nicolas Cellier nicolas.cellier.aka.nice at gmail.com
Tue Dec 8 15:19:41 UTC 2009


2009/12/8 Levente Uzonyi <leves at elte.hu>:
> On Tue, 8 Dec 2009, Nicolas Cellier wrote:
>
>>
>> Oh yes, like this ?
>>
>> | file |
>> [file := MultiByteFileStream newFileNamed: 'mbfs_skip.tst'.
>> file ascii; wantsLineEndConversion: false; converter: UTF8TextConverter new.
>> file nextPutAll: 'Ceci doit changé'.
>> file skip: -1. "Oops - grammatically incorrect"
>> file nextPutAll: 'er'.
>> file close.
>>
>> file := StandardFileStream oldFileNamed: 'mbfs_skip.tst'.
>> file ascii.
>> file contentsOfEntireFile.]
>>        ensure: [file close.
>>                FileDirectory default deleteFileNamed: 'mbfs_skip.tst'].
>> -> 'Ceci doit chang?er' "Oops squeakly incorrect"
>>
>> Ah Ah, MultiByteFileStream lets us see a stream of encoded characters,
>> but positions over a stream of bytes...
>> The programmer's only choice is to put marks (by inquiring aMBFS
>> position) and restore the position using these marks...
>>
>
> Well, this part is broken, but the current fileIn/fileOut code relies on
> this bug/"feature", otherwise it would be easy to fix it in the utf8 case.
> Actually I was thinking about CompiledMethod >> #getPreambleFrom:at: or even
> worse PositionableStream >> #backChunk.
>

Oh, I see... It seems we're lucky to use a delimiter with charCode < 128 !
Among several alternatives:
1) make a generic PositionableXtreamWrapper that memorizes the source
position at some marks (at each buffer, for example).
2) make a reverseXtreamWrapper
...
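
To illustrate why a delimiter with charCode < 128 is lucky: in UTF-8,
every byte of a multi-byte sequence is >= 128, so a backward scan over
raw bytes cannot mistake part of an encoded character for the chunk
delimiter $!. A minimal sketch, assuming squeakToUtf8 is available in
the image (any UTF-8 encoder would do):

```smalltalk
| utf8 bang |
"Assumption: String>>squeakToUtf8 exists in this image."
utf8 := 'changé!' squeakToUtf8 asByteArray.
bang := $! asciiValue.
"Walk backward over the raw bytes until the delimiter is found.
 Both bytes of the encoded é are >= 128, so neither can match."
(utf8 size to: 1 by: -1) detect: [:i | (utf8 at: i) = bang]
```

With a delimiter above 127 the same byte-wise scan would have to decode
characters first, since the delimiter's own encoding would span several
bytes.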

>> Making something simple out of current MultiByteFileStream mess is a
>> challenge I don't even want to take, but you seem a bit tougher than
>> me.
>>
>
> I think the current performance of MultiByteFileStream is acceptable for
> general use. According to my measurements the greatest bottleneck is
> WriteStream >> #nextPut: for typical operations.
>
>
> Levente
>

You mean streaming on a collection? Didn't someone correct the nextPut:
primitive recently?
Without this primitive, avoid isOctetCharacter and co; ByteString
at:put: handles that...
See Xtream implementation:

{
[| ws |
	ws := (String new: 10000) writeStream.
	1 to: 20000 do: [:i | ws nextPut: $0]] bench.
[| ws |
	ws := (String new: 10000) writeXtream.
	1 to: 20000 do: [:i | ws nextPut: $0]] bench.
}
#('86.4789294987018 per second.' '128.374325134973 per second.')
1.5x speed up is already something...

Otherwise, you'll have to look at a higher level to see whether you can
use a buffered technique and nextPutAll: instead. That would be a
major speed up (10x or more).
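
Something like this could be used to compare the two approaches on a
plain WriteStream. It is only a sketch: the chunk size of 1000 is an
arbitrary choice for illustration, and the actual ratio will depend on
the image and VM.

```smalltalk
| chunk |
"Hypothetical buffer: 1000 characters written in one nextPutAll: call."
chunk := String new: 1000 withAll: $0.
{
[| ws |
	ws := (String new: 10000) writeStream.
	"20000 single-character writes"
	1 to: 20000 do: [:i | ws nextPut: $0]] bench.
[| ws |
	ws := (String new: 10000) writeStream.
	"same 20000 characters, written as 20 chunks"
	1 to: 20 do: [:i | ws nextPutAll: chunk]] bench.
}
```

Both blocks write the same number of characters, so the two bench
results can be compared directly.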

Nicolas



