[Newbies] Re: Character #asciiValue vs #charCode

nicolas cellier nicolas.cellier.aka.nice at gmail.com
Sat Jan 8 15:31:59 UTC 2011


Sean P. DeNigris <sean <at> clipperadams.com> writes:

> 
> 
> #asciiValue - could there be an ASCII character with a leadingChar, or will
> this always be 0 for non-eastern characters?  Should there be any error
> checking - what is the meaning of the ASCII value for a non-ASCII char?
> 

I would simply leave asciiValue as is.
In the method comment, I would:
1) encourage restricting its use to legacy code,
2) warn that the behavior is undefined if the character is not in the ASCII set.

I don't know whether there can be ASCII characters with a leadingChar ~= 0,
but we had better not worry too much about it...
Legacy code should only deal with ByteString,
and a ByteString can't hold any character with a leadingChar ~= 0 anyway.
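
To make the ByteString argument concrete, here is a minimal sketch. It is written in Python rather than Smalltalk and is only an illustration: it assumes the layout described in the #leadingChar comment quoted in this thread (bits above 16r3FFFFF hold the leadingChar, the bits below hold the charCode). The function names are hypothetical and are not the Squeak implementation.

```python
# Assumed layout (from the #leadingChar comment quoted in this thread):
# the low 22 bits (mask 16r3FFFFF) hold the charCode,
# the bits above them hold the leadingChar.
CHAR_CODE_BITS = 22
CHAR_CODE_MASK = (1 << CHAR_CODE_BITS) - 1  # 0x3FFFFF, i.e. 16r3FFFFF


def leading_char(value):
    """Return the bits above the charCode (hypothetical helper)."""
    return value >> CHAR_CODE_BITS


def char_code(value):
    """Return the low 22 bits of the value (hypothetical helper)."""
    return value & CHAR_CODE_MASK


# A byte can only hold 0..255, so every element of a byte string
# necessarily has a leadingChar of 0.
byte_leading_chars = {leading_char(v) for v in range(256)}
```

Under that assumption, byte_leading_chars contains only 0, which is why code that sticks to ByteString never sees a non-zero leadingChar.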

> #leadingChar
> "In Squeak Character encoding, bits above 16r3FFFFF don't encode the
> character, but hold information about the language environment and the
> encoding which should be used to interpret the charCode. The background of
> which is Han unification (http://en.wikipedia.org/wiki/Han_unification)."
> 
> How's that as a method comment?  Is it really "In Squeak... encoding..." or
> does this apply to Unicode in general?
> 
> Sean


Sure, IMO the whole thing deserves a good class comment too.
Maybe the method comments should refer to the class comment.
Very few people understand the issue
unless they have been exposed to Asian typographic problems.
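
As an illustration of what such a class comment could explain, here is a small sketch (again in Python, with hypothetical names) of how one unified Han code point could be tagged with two different language environments. The leadingChar numbers 1 and 2 are made up for the example and are not actual Squeak language tags; the 22-bit split is the one assumed from the quoted #leadingChar comment.

```python
CHAR_CODE_BITS = 22  # assumed: charCode occupies the low 22 bits (16r3FFFFF)


def make_value(leading_char, char_code):
    """Combine a leadingChar and a charCode into one integer (hypothetical helper)."""
    if not 0 <= char_code < (1 << CHAR_CODE_BITS):
        raise ValueError("charCode does not fit in 22 bits")
    return (leading_char << CHAR_CODE_BITS) | char_code


# U+76F4 is a unified Han ideograph whose preferred glyph differs by region.
HAN_CODE_POINT = 0x76F4

value_a = make_value(1, HAN_CODE_POINT)  # made-up tag for one language environment
value_b = make_value(2, HAN_CODE_POINT)  # made-up tag for another
```

The two values share the same charCode but differ as integers, so the extra bits can carry the hint about which language environment should be used to render the character.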

Nicolas


