UTF8 Squeak
Janko Mivšek
janko.mivsek at eranova.si
Thu Jun 7 21:49:29 UTC 2007
Hi Yoshiki,
Yoshiki Ohshima wrote:
>>>> 1. internally everything is in 16-bit Unicode, without any additional
>>>> encoding info attached to strings
>>> If they use 16 bits per char, how do they deal with surrogate pairs?
>> I looked once again and there is actually a FourByteString too. This
>> probably answers your question.
>
> Probably, yes.
>
> So, the question to you is that if you have a system with 8-bit
> ByteString and 32-bit WideString in year 2007, would you add a class
> to represent 16-bit string to that system?
I would say yes, because for most countries 16-bit is enough and 32-bit
is then just a waste of memory. And I just noticed that WideString is
actually fixed to 4 bytes. I would therefore think about renaming it to
FourByteString and adding a TwoByteString (or similar names). For the user
these are always Strings anyway, just as SmallIntegers and LargeIntegers
are always Integers.
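
To make the memory argument concrete, here is a minimal sketch in Python (not Squeak or VW code). It assumes a fixed-width representation with 1, 2 or 4 bytes per character and picks the narrowest one that can hold every code point of a string. The function names are hypothetical and only serve the illustration.

```python
def narrowest_width(text: str) -> int:
    """Bytes per character (1, 2 or 4) needed to store every
    code point of `text` at a fixed width."""
    highest = max(map(ord, text), default=0)
    if highest <= 0xFF:
        return 1      # fits an 8-bit string (ByteString-like)
    if highest <= 0xFFFF:
        return 2      # fits a 16-bit string (TwoByteString-like)
    return 4          # needs 32 bits (FourByteString-like)


def fixed_width_size(text: str) -> int:
    """Payload size in bytes when stored at the narrowest width."""
    return narrowest_width(text) * len(text)


if __name__ == "__main__":
    for sample in ["Squeak", "Mivšek", "日本語", "𝄞"]:
        # Compare the narrowest fixed width against always using 4 bytes.
        print(ascii(sample),
              narrowest_width(sample),
              fixed_width_size(sample),
              4 * len(sample))
```

Under these assumptions, a string is stored at 4 bytes per character only when it contains a code point above U+FFFF, which is the case a 16-bit class would otherwise have to handle with surrogate pairs.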
>
>> VW also support Japanese locale well.
>
> Oh, yes. I know it. In fact, the internationalization of
> VisualWorks was done by a company that is my former employer. (The
> work was done way before I joined, though). I have seen some apps and
> developers of the system.
>
> However, there is a reason to call our stuff m17n, instead of i18n.
> It might still be only an aspiration, but supporting one language at
> a time (the locale-based idea, sort of) is not enough for "real"
> multilingualization, where you would like to mix strings from
> different languages freely.
I strongly agree, and therefore a well thought-out effort to solve i18n
well in Squeak is a must. For me as well, because I still need to find out
how to port Aida/Web i18n support to Squeak ...
Best regards
Janko
--
Janko Mivšek
AIDA/Web
Smalltalk Web Application Server
http://www.aidaweb.si