TextConverter changes (was: Re: [squeak-dev] Re: The Trunk: Collections-HenrikSperreJohansen.335.mcz)

Levente Uzonyi leves at elte.hu
Mon Mar 15 23:56:24 UTC 2010


On Sun, 14 Mar 2010, Andreas Raab wrote:

> On 3/13/2010 11:27 AM, Levente Uzonyi wrote:
>> I had a different plan to attack this problem:
>
> Go for it. This one was simple because it's basically just replacing nextPut: 
> with nextPutAll: which is a no-brainer to me. If you've got something even 
> better, that's even better! :-)

Before I can do that, I have to replicate the implementation of 
#utf8ToSqueak and friends from String and its subclasses in the 
TextConverters. With the code currently in the Inbox, infinite recursion 
occurs if ByteString's Latin1Utf8Map is not initialized. Moving the 
implementation would also allow us to remove those class variables from 
ByteString.

I also think that TextConverters could be cleaned up a bit. For example, 
UTF8TextConverter has two instance variables (currentCharSize and 
forceToEncodingTag) which are unused. There are methods which reference 
them, but those either have no senders or just set the values without 
ever reading them. I think those are just remnants and I'm about to 
remove them.
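
A check along these lines can be run in a workspace to see which methods 
touch the two variables and whether those methods have any senders. It is 
only a sketch: it assumes the image provides Behavior>>whichSelectorsAccess:, 
Behavior>>whichSelectorsStoreInto: and SystemNavigation>>allCallsOn:, it 
looks only at methods defined in UTF8TextConverter itself (not in 
subclasses), and the results depend on the image it is run in.

```smalltalk
"Sketch: list the methods that touch each suspect instance variable.
 The selectors used here are assumed to exist in the image."
#('currentCharSize' 'forceToEncodingTag') do: [ :ivar |
	| accessors writers |
	"All selectors whose methods read or write the variable."
	accessors := UTF8TextConverter whichSelectorsAccess: ivar.
	"The subset of selectors whose methods assign to it."
	writers := UTF8TextConverter whichSelectorsStoreInto: ivar.
	Transcript
		show: ivar;
		show: ' accessed by: '; show: accessors asArray printString;
		show: ' stored by: '; show: writers asArray printString;
		cr.
	"For each accessor, count how many methods in the image send it."
	accessors do: [ :selector |
		Transcript
			show: '  '; show: selector printString;
			show: ' senders: ';
			show: (SystemNavigation default allCallsOn: selector) size printString;
			cr ] ]
```

If every accessor is also a writer and reports zero senders, the variable 
is a reasonable candidate for removal. Senders found by selector name may 
belong to unrelated classes, so any hits still need to be checked by hand.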


Levente

>
> Cheers,
>  - Andreas
>
>> 
>> Installer squeak
>> project: 'inbox';
>> install: 'Multilingual-ul.102.mcz';
>> install: 'Collections-ul.336.mcz'
>> 
>> Benchmark:
>> 
>> ((1 to: 10) detectSum: [ :run |
>> |converter|
>> converter := UTF8TextConverter new.
>> [ 1 to: 50000 do: [ :index |
>> 'abcaskjdhfáasiugbvsipruvnaséipvunasivunapiívunasieunó'
>> convertToWithConverter: converter ] ] timeToRun ]) / 10.0
>> "old ==> 355.6 new ==> 273.0"
>> 
>> 
>> Levente
>> 
>> On Sat, 13 Mar 2010, commits at source.squeak.org wrote:
>> 
>>> Andreas Raab uploaded a new version of Collections to project The Trunk:
>>> http://source.squeak.org/trunk/Collections-HenrikSperreJohansen.335.mcz
>>> 
>>> ==================== Summary ====================
>>> 
>>> Name: Collections-HenrikSperreJohansen.335
>>> Author: HenrikSperreJohansen
>>> Time: 12 March 2010, 3:38:49.316 pm
>>> UUID: 8b8f9b48-feb7-7d4d-9458-dcd334ad6e81
>>> Ancestors: Collections-ar.334
>>> 
>>> Faster String>>convertFromWithConverter: from Pharo.
>>> 
>>> Useful in f.ex. asVMPathName.
>>> 
>>> Test:
>>> 
>>> [|converter|
>>> converter := UTF8TextConverter new.
>>> 1 to: 50000 do: [:ix |
>>> 'abcćřĺaskjdhfasiugbvsipruvnasipvunasivunapivunasieun'
>>> convertToWithConverter: converter]] timeToRun
>>> 
>>> =============== Diff against Collections-ar.334 ===============
>>> 
>>> Item was changed:
>>> ----- Method: String>>convertToWithConverter: (in category
>>> 'converting') -----
>>> + convertToWithConverter: converter
>>> + converter
>>> + ifNil: [^ self].
>>> + ^ String
>>> + new: self size
>>> + streamContents: [:writeStream |
>>> - convertToWithConverter: converter
>>> -
>>> - converter ifNil: [ ^self ].
>>> - ^String new: self size streamContents: [ :stream |
>>> - | character |
>>> - 1 to: self size do: [ :index |
>>> converter
>>> + nextPutAll: self toStream: writeStream;
>>> + emitSequenceToResetStateIfNeededOn: writeStream]!
>>> - nextPut: (self at: index)
>>> - toStream: stream ].
>>> - converter emitSequenceToResetStateIfNeededOn: stream ]!
>>> 
>>> 
>>> 
>> 
>> 
>> 
>
>
>

