[squeak-dev] Re: how to create an UTF-8 character
Andreas Raab
andreas.raab at gmx.de
Sat Sep 27 17:14:39 UTC 2008
Philippe Marschall wrote:
> 2008/9/27 stephane ducasse <stephane.ducasse at free.fr>:
>> do I understand correctly that such an aString is a sequence of unicode
>> codepoints?
>
> Plus leading char. If you look at UTF8TextConverter it will give every
> incoming character with an index higher than 255 the language of the
> image. I don't need to explain why this is problematic in the context
> of a web application, do I?
Actually, it *is* worthwhile to explain this. The problem is that, since
UTF-8 has no notion of a leading char, there is no way to tag
incoming data correctly. The leading char will be taken from the running
image, so an image running in the US (like our servers) will tag input
coming from Chinese browsers as Latin1. In these situations the leading
char isn't just useless, it is actively misleading.
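
To make the mis-tagging concrete, here is a minimal sketch in Python rather than Squeak. It is not the actual UTF8TextConverter code. It assumes a character value is packed as a leading char in the high bits above a 22-bit code point, and the tag values LATIN1 and SIMPLIFIED_CHINESE as well as all function names are hypothetical, chosen only for illustration.

```python
# Sketch only: models a "tagged" character as leading char + code point.
# Assumption: the code point occupies the low 22 bits and the leading
# char sits in the bits above it.
LEADING_CHAR_SHIFT = 22
CODE_MASK = (1 << LEADING_CHAR_SHIFT) - 1

# Hypothetical language tags, for illustration only.
LATIN1 = 0
SIMPLIFIED_CHINESE = 6


def tag(code_point, leading_char):
    """Attach a leading char to a code point above 255."""
    if code_point <= 255:
        return code_point  # low characters carry no language tag
    return (leading_char << LEADING_CHAR_SHIFT) | code_point


def decode_utf8_tagged(data, image_leading_char):
    """Decode UTF-8 bytes and tag every character above 255.

    UTF-8 carries code points only, so the only tag available is
    the one belonging to the image doing the decoding.
    """
    return [tag(ord(ch), image_leading_char) for ch in data.decode("utf-8")]


def leading_char(value):
    """Extract the language tag from a tagged character value."""
    return value >> LEADING_CHAR_SHIFT


def code_point(value):
    """Extract the plain code point from a tagged character value."""
    return value & CODE_MASK


# The same bytes from a Chinese browser, decoded in two different images.
data = "\u4e2d".encode("utf-8")
in_us_image = decode_utf8_tagged(data, LATIN1)
in_cn_image = decode_utf8_tagged(data, SIMPLIFIED_CHINESE)
```

Both results hold the same code point, but the tag reflects where the server runs rather than where the text came from, so the two values compare as different characters.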
Cheers,
- Andreas