[Newbies] Re: Character #asciiValue vs #charCode
nicolas.cellier.aka.nice at gmail.com
Sat Jan 8 15:31:59 UTC 2011
Sean P. DeNigris <sean <at> clipperadams.com> writes:
> #asciiValue - could there be an ascii character with a leadingChar, or will
> this always be 0 for non-eastern characters? Should there be any error
> checking - what is the meaning of ascii value for a non-ascii char?
I would simply leave asciiValue as it is.
In the method comment, I would:
1) encourage restricting its use to legacy code,
2) warn that the behavior is undefined if the character is not in the ASCII set.
I don't know whether there can be ASCII characters with a leadingChar ~= 0,
but we had better not worry too much about it...
Legacy code should only deal with ByteString.
A ByteString can't hold any character with a leadingChar ~= 0 anyway.
> "In Squeak Character encoding, bits above 16r3FFFFF don't encode the
> character, but hold information about the language environment and the
> encoding which should be used to interpret the charCode. The background of
> which is Han unification (http://en.wikipedia.org/wiki/Han_unification)."
> How's that as a method comment? Is it really "In Squeak... encoding..." or
> does this apply to unicode in general?
Sure, IMO the whole thing deserves a good class comment too.
Maybe the method comments should refer to the class comment.
Very few people understand the issue unless they have been exposed to
Asian typographic problems.
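
To make the proposed comment concrete, here is a rough sketch of the bit layout it describes, written in Python rather than Smalltalk purely as an illustration. It assumes only what the quoted comment says: the low bits up to 16r3FFFFF (22 bits) hold the charCode and everything above holds the leadingChar. The function names are hypothetical and are not Squeak selectors, and the ASCII test is just one possible reading of "in the ASCII set".

```python
# Illustrative sketch only: models the layout described in the quoted comment,
# not Squeak's actual Character implementation.

CHAR_CODE_BITS = 22            # 16r3FFFFF is 22 one-bits
CHAR_CODE_MASK = 0x3FFFFF      # 16r3FFFFF in Smalltalk notation


def char_code(value: int) -> int:
    """Return the low 22 bits, i.e. the part that encodes the character."""
    return value & CHAR_CODE_MASK


def leading_char(value: int) -> int:
    """Return the bits above 16r3FFFFF (language environment / encoding tag)."""
    return value >> CHAR_CODE_BITS


def compose(leading: int, code: int) -> int:
    """Pack a leadingChar and a charCode into a single integer value."""
    return (leading << CHAR_CODE_BITS) | (code & CHAR_CODE_MASK)


def is_plain_ascii(value: int) -> bool:
    """Assumed definition: leadingChar is 0 and the charCode is below 128."""
    return 0 <= value < 128
```

With this layout, compose(5, 0x4E2D) keeps charCode 0x4E2D and leadingChar 5, while a value such as compose(1, 65) has an ASCII charCode but is not "plain ASCII" in the sense above. That second case is exactly where the meaning of asciiValue becomes unclear.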