[squeak-dev] leadingChar question

Levente Uzonyi leves at elte.hu
Sat Apr 30 01:51:29 UTC 2011


There were no objections, so the code is in the Trunk now.


Levente

On Tue, 26 Apr 2011, Levente Uzonyi wrote:

> No response, so I uploaded Collections-ul.440 and Multilingual-ul.141 to the 
> Inbox. In addition to the previously described ideas, I implemented the 
> various copy methods for Character, because Characters have not been unique 
> since Squeak 3.8. The tests are green.
>
>
> Levente
>
> On Fri, 22 Apr 2011, Levente Uzonyi wrote:
>
>> Hi,
>> 
>> I think we found a bug, but I'm interested in your opinion before "fixing"
>> it. Some TextConverters (e.g. ISO88592TextConverter) implement 
>> #leadingChar. The problem is that this #leadingChar is added to all decoded 
>> characters. Since character equality takes leadingChar into account, these 
>> decoded characters will never be equal to Unicode characters. The following 
>> example returns false, because the carriage return (13) will be decoded as 
>> (Character value: 58720269):
>> 
>> (String cr convertFromWithConverter: ISO88592TextConverter new) = String cr
>> 
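>> To see where 58720269 comes from, here is a small workspace sketch using 
>> plain Integer arithmetic. It assumes that the leadingChar is stored above 
>> the low 22 bits of the character value; the leadingChar of 14 is inferred 
>> from the number itself, not read from the converter.
>> 
>> ```smalltalk
>> "Assumption: character value = (leadingChar bitShift: 22) + code."
>> (14 bitShift: 22) + 13.        "58720269, the value quoted above"
>> 58720269 bitShift: -22.        "14, the leadingChar part"
>> 58720269 bitAnd: 16r3FFFFF.    "13, the code of the carriage return"
>> ```
>> 
>> So the decoded character carries the right code (13), but its non-zero 
>> leadingChar makes its value differ from that of String cr's character.
>> 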
>> The current system (Collections, Compiler, etc.) assumes that the first 256 
>> characters are unique and ignores the variants of these characters that 
>> have a non-zero leadingChar.
>> 
>> So, I think we should change Character class >> #leadingChar:code: to 
>> ignore its first argument when the second is less than 256.
>> 
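>> A minimal sketch of what that could look like as a class-side method on 
>> Character. This is only an illustration: the argument names and the 22-bit 
>> shift are assumptions, and any range checks are left out.
>> 
>> ```smalltalk
>> leadingChar: leadChar code: code
>>     "Sketch only. For the first 256 codes the leadChar is ignored,
>>      so the result is the same as (Character value: code)."
>>     code < 256 ifTrue: [^self value: code].
>>     "Assumed layout: the leadingChar sits above the low 22 bits."
>>     ^self value: (leadChar bitShift: 22) + code
>> ```
>> 
>> With something like this, a decoded carriage return would be the same as 
>> the one in String cr, whichever converter produced it.
>> 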
>> Also, I think only TextConverters of CJKV languages should implement 
>> #leadingChar, because AFAIK only the characters of those languages are 
>> unified.
>> 
>> What do you think?
>> 
>> Cheers,
>> Levente
>> 
>> 
>
>


