[squeak-dev] UTF-8

Philippe Marschall philippe.marschall at gmail.com
Sun Mar 29 13:50:21 UTC 2009


2009/3/29 Janko Mivšek <janko.mivsek at eranova.si>:
> Philippe Marschall wrote:
>> Janko Mivšek:
>
>>> We don't have any problems with Squeak Unicode in Aida/Web apps,
>>> probably because we strictly use Unicode internally,
>
>> You cannot do that. Squeak stores the language of a character in
>> every character. In a web application you don't know the language of
>> the input, and UTF-8 certainly doesn't contain it. You could take the
>> language of the image, but that is arbitrary and has no relation to the
>> input. You could also set the language of a character to unicode (255),
>> but that only works for non-Latin-1 characters; Latin-1 characters are
>> interned and all have leadingChar 0. Did I already mention that the
>> leadingChar is used for #=? So no, I don't believe you.
>
> Well, you should believe me, I have proof!
>
> Look at this Aida/Scribo multilingual demo served from a Squeak image:
> http://demo.bioskop.fr/wiki/wiki.html, see especially the Japanese and
> Russian text. Even Japanese URLs are working correctly:
> http://demo.bioskop.fr/wiki/%E3%83%86%E3%82%B9%E3%83%88.html

That's just the external representation; it tells you absolutely nothing
about the internal representation and the implementation. I could easily
get the same result on a Squeak 3.7.
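
To illustrate, here is a minimal sketch in Python (not Squeak code; the
standard urllib module is used purely for illustration). It shows that
the URL above is plain percent-encoded UTF-8, so decoding it requires no
knowledge of how the server stores characters internally:

```python
from urllib.parse import unquote

# Path segment copied from the demo URL quoted above.
encoded = "%E3%83%86%E3%82%B9%E3%83%88"

# unquote() percent-decodes and interprets the bytes as UTF-8 by default.
decoded = unquote(encoded)

# These are the bytes on the wire, whatever the server does internally.
raw = decoded.encode("utf-8")

print(decoded, raw.hex())
```

Any server that passes these bytes through unchanged produces the same
page, which is why the demo says nothing about the internal Strings.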

> As for the leading character, I don't even know what that is, except in
> theory. That is, I never encountered this character as a problem when
> porting Aida and its i18n support to Squeak.

How can you seriously say everything is working fine when in practice
you can't say what is happening and don't know how Strings and
Characters work in Squeak? I find that quite dubious hyping.
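
The leadingChar behaviour described in the quoted paragraph further up
can be modelled roughly as follows. This is a hypothetical Python model,
not Squeak code: the 22-bit shift, the function name and the language
tag 5 are assumptions made for illustration only.

```python
# Assumption: the language tag is stored in the bits above the code point.
LEADING_CHAR_SHIFT = 22
UNICODE_LEADING_CHAR = 255  # the "unicode" tag mentioned in the quoted text


def char_value(code_point: int, leading_char: int) -> int:
    """Hypothetical model of the integer value that #= compares."""
    if code_point < 256:
        # Latin-1 characters are interned, so their tag is always 0.
        leading_char = 0
    return (leading_char << LEADING_CHAR_SHIFT) | code_point


# Same non-Latin-1 code point, two different language tags.
te_tagged = char_value(0x30C6, 5)  # 5 is an arbitrary tag for this sketch
te_unicode = char_value(0x30C6, UNICODE_LEADING_CHAR)

# Same Latin-1 code point, two different requested tags.
a_tagged = char_value(0x41, 5)
a_unicode = char_value(0x41, UNICODE_LEADING_CHAR)

print(te_tagged == te_unicode)
print(a_tagged == a_unicode)
```

In this model, two characters with the same code point compare unequal
as soon as their tags differ, unless they are Latin-1.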

Cheers
Philippe


