[squeak-dev] leadingChar question
Levente Uzonyi
leves at elte.hu
Thu Apr 21 23:30:50 UTC 2011
Hi,
I think we found a bug, but I'm interested in your opinion before "fixing"
it. Some TextConverters (e.g. ISO88592TextConverter) implement
#leadingChar. The problem is that this #leadingChar is added to all
decoded characters. Since character equality takes leadingChar into
account, these decoded characters will never be equal to Unicode
characters. The following example returns false, because the carriage
return (13) will be decoded as (Character value: 58720269):
(String cr convertFromWithConverter: ISO88592TextConverter new) = String cr
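
To see where 58720269 comes from, here is a minimal workspace sketch. It
assumes that the character value holds the code point in the low 22 bits
and the leadingChar in the bits above them; the variable names are only
illustrative. Under that assumption the value splits into leadingChar 14
and code 13, since 14 * 4194304 + 13 = 58720269.

```smalltalk
"Sketch only: split the value quoted above into its two fields.
 Assumes a 22-bit code point with the leadingChar stored above it."
| value leadingChar code |
value := 58720269.
leadingChar := value bitShift: -22.     "upper bits: language tag"
code := value bitAnd: 16r3FFFFF.        "lower 22 bits: code point"
Transcript
	showln: 'leadingChar: ', leadingChar printString;
	showln: 'code: ', code printString.
```

The code part is still 13, but the whole value differs from 13, which is
why the comparison with String cr fails.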
The current system (Collections, Compiler, etc.) assumes that the first 256
characters are unique and doesn't care about the variants of these
characters which have non-zero leadingChar.
So, I think we should change Character class >> #leadingChar:code: to
ignore its first argument when the second is less than 256.
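
A rough sketch of the proposed rule, written as a workspace block on
plain integers so that it does not touch the real Character class. The
block name makeValue and the 22-bit shift are assumptions made for
illustration, not the actual method body.

```smalltalk
"Hypothetical sketch of the proposed rule, not the real implementation.
 Computes the integer value a character would get."
| makeValue |
makeValue := [:leadChar :code |
	code < 256
		ifTrue: [code]                              "ignore leadChar below 256"
		ifFalse: [(leadChar bitShift: 22) + code]]. "tag everything else"
makeValue value: 14 value: 13.     "13, the same value as the plain cr"
makeValue value: 14 value: 300.    "58720556, still carries the tag"
```

With this rule, every character below 256 gets one value no matter which
converter decoded it, which is what the rest of the system assumes.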
Also, I think only TextConverters of CJKV languages should implement
#leadingChar, because AFAIK only the characters of those languages are
unified.
What do you think?
Cheers,
Levente