UTF8 Squeak

Yoshiki Ohshima yoshiki at squeakland.org
Thu Jun 7 21:11:36 UTC 2007


  Hi, Janko,

> >> 1. internally everything is in 16-bit Unicode, without any additional
> >>     encoding info attached to strings
> > 
> >   If they use 16 bits per char, how do they deal with surrogate pairs?
> 
> I looked once again and there is actually a FourByteString too. This 
> probably answers your question.

  Probably, yes.

  So, my question to you is this: if you had a system with an 8-bit
ByteString and a 32-bit WideString in the year 2007, would you add a
class to represent 16-bit strings to that system?
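
  To make the surrogate-pair point concrete, here is a small sketch in
Python (not Squeak code; the helper names utf16_units and utf32_units
are made up for illustration).  It only shows how many fixed-width units
a single character needs at 16 bits versus 32 bits, using Python's
standard codecs.

```python
# Illustrative only: these helpers are hypothetical and are not part of
# Squeak or VisualWorks.

def utf16_units(ch):
    """Return the 16-bit code units UTF-16 uses to encode the string ch."""
    data = ch.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]

def utf32_units(ch):
    """Return the 32-bit code units UTF-32 uses to encode the string ch."""
    data = ch.encode("utf-32-be")
    return [int.from_bytes(data[i:i + 4], "big") for i in range(0, len(data), 4)]

bmp = "\u3042"          # HIRAGANA LETTER A, inside the Basic Multilingual Plane
astral = "\U00020000"   # a CJK ideograph in Plane 2, outside the BMP

print([hex(u) for u in utf16_units(bmp)])     # one 16-bit unit
print([hex(u) for u in utf16_units(astral)])  # two units: a surrogate pair
print([hex(u) for u in utf32_units(astral)])  # one 32-bit unit
```

  A 16-bit string class therefore cannot index characters outside the
BMP one unit at a time, while a 32-bit representation can.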

> VW also support Japanese locale well.

  Oh, yes.  I know it.  In fact, the internationalization of
VisualWorks was done by a company that is my former employer.  (The
work was done well before I joined, though.)  I have seen some of the
apps and developers of that system.

  However, there is a reason we call our stuff m17n instead of i18n.
It may still be an aspiration, but supporting one language at a time
(a sort of locale-based idea) is not enough for "real"
multilingualization, where you would like to mix strings from
different languages freely.

-- Yoshiki


