[squeak-dev] UTF-8

Philippe Marschall philippe.marschall at gmail.com
Sun Mar 29 10:13:34 UTC 2009


2009/3/29 Janko Mivšek <janko.mivsek at eranova.si>:
> Philippe Marschall wrote:
>> Michael Rueger:
>>> Pierre-Edouard PORTIER wrote:
>
>>>> But I would like to be able to *see* utf-8 characters inside the squeak
>>>> environment.
>
>>> Are you sure you are not confusing "utf-8" with "unicode"? utf-8 is
>>> just one way of encoding unicode (characters).
>>> You can import utf-8 encoded characters/strings, but once inside
>>> Squeak they are kept as unicode characters.
>
>> Plus leadingChar, which causes a lot of problems for web applications.
>
> We don't have any problems with Squeak Unicode in Aida/Web apps,
> probably because we strictly use Unicode internally,

You cannot do that. Squeak stores the language of a character in
every character. In a web application you don't know the language of
the input, and utf-8 certainly doesn't contain it. You could take the
language of the image, but that is arbitrary and has no relation to
the input. You could also set the language of a character to Unicode
(255), but that only works for non-Latin-1 characters; the Latin-1
characters are interned and all have leadingChar 0. Did I already
mention that the leadingChar is used for #=? So no, I don't believe
you.
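
To make the #= point concrete, here is a rough sketch in Python (not Squeak code) of a character value that packs a language tag above the code point. The 22-bit split and all function names are assumptions made for illustration, as is the tag 5 standing in for "some other language"; only the Unicode tag 255 and the rule that Latin-1 characters keep leadingChar 0 come from the text above.

```python
# Toy model: a character value is (leadingChar << CODE_BITS) | codePoint.
# CODE_BITS = 22 is an assumed layout, not taken from the message above.
CODE_BITS = 22
CODE_MASK = (1 << CODE_BITS) - 1


def make_char(leading_char, code):
    """Pack a language tag and a code point into one integer value."""
    if not 0 <= leading_char < 256:
        raise ValueError("leadingChar out of range")
    if not 0 <= code <= CODE_MASK:
        raise ValueError("code point out of range")
    return (leading_char << CODE_BITS) | code


def leading_char(value):
    """Extract the language tag."""
    return value >> CODE_BITS


def char_code(value):
    """Extract the plain Unicode code point."""
    return value & CODE_MASK


def decode_utf8(data, tag):
    """Decode UTF-8 bytes, tagging only non-Latin-1 code points.

    Latin-1 characters (code points 0..255) always get tag 0,
    mirroring the interning described above.
    """
    text = data.decode("utf-8")
    return [make_char(tag if ord(ch) > 0xFF else 0, ord(ch)) for ch in text]


# The same code point with two different tags: equality on the packed
# value fails even though the Unicode code point is identical.
from_web = make_char(255, 0x4E2D)   # tagged "Unicode"
from_image = make_char(5, 0x4E2D)   # hypothetical other language tag
print(from_web == from_image)                        # False
print(char_code(from_web) == char_code(from_image))  # True
```

If equality compares the whole packed value, a string decoded from a web request and a literal typed in the image can differ even when they contain the same Unicode text, unless both sides are normalized to the same tag first.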

Cheers
Philippe


