UTF8 Squeak
Yoshiki Ohshima
yoshiki at squeakland.org
Thu Jun 7 22:06:34 UTC 2007
Janko,
> > So, the question to you is that if you have a system with 8-bit
> > ByteString and 32-bit WideString in year 2007, would you add a class
> > to represent 16-bit string to that system?
>
> I would say yes, because for most countries 16-bit is enough and 32-bit
> is then just a waste of memory. And I just noticed that WideString is
> actually fixed to 4 bytes. I would therefore think about renaming it to
> FourByteString and adding TwoByteString (or similar names). For the user these
> are always Strings anyway, as SmallIntegers and LargeIntegers are always
> Integers.
The deal is similar in Squeak, too. The system automatically coerces
between WideString and ByteString, so the user doesn't have to deal
with the distinction most of the time.
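To make the idea of automatic coercion concrete, here is a minimal sketch in Python (not Squeak code). The class FlexString and its methods are hypothetical names invented for illustration; the sketch only assumes the behaviour described above, namely that a narrow string becomes a wide one when a character that needs more than 8 bits is stored into it.

```python
class FlexString:
    """Hypothetical string that widens its storage on demand."""

    def __init__(self, text):
        self.code_points = [ord(ch) for ch in text]
        # Start narrow (1 byte per character) unless some character needs more.
        self.bytes_per_char = 4 if any(cp > 0xFF for cp in self.code_points) else 1

    def put(self, index, char):
        # Storing a character above 8 bits widens the whole string silently;
        # the caller keeps using the same object.
        if ord(char) > 0xFF:
            self.bytes_per_char = 4
        self.code_points[index] = ord(char)

    def text(self):
        return "".join(chr(cp) for cp in self.code_points)


s = FlexString("abc")
s.put(1, "\u3042")  # HIRAGANA LETTER A, does not fit in 8 bits
print(s.bytes_per_char, s.text())
```

The caller never names a narrow or wide class; the representation is a private detail of the string.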
Adding a 16-bit string is surely an option. At the same time, there is a
similar but different point of view: "because for most users 8-bit is enough and
the 32-bit version is used not so frequently anyway". There is no "right"
answer, only different trade-offs. (That is why this problem is
interesting^^;)
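The memory side of the trade-off is simple arithmetic, sketched here in Python. The helper storage_bytes is a hypothetical name, and it counts only the character data at a fixed width, ignoring any object header.

```python
def storage_bytes(text, bytes_per_char):
    """Bytes needed for the characters alone at a fixed width."""
    limit = 1 << (8 * bytes_per_char)
    if any(ord(ch) >= limit for ch in text):
        raise ValueError("a character does not fit in this width")
    return len(text) * bytes_per_char


ascii_text = "hello"
kana_text = "\u3053\u3093\u306b\u3061\u306f"  # five hiragana characters

for width in (1, 2, 4):
    print(width, storage_bytes(ascii_text, width))

# The kana text cannot use the 8-bit form at all.
for width in (2, 4):
    print(width, storage_bytes(kana_text, width))
```

For text that fits in 16 bits, the 16-bit form halves the cost of the 32-bit form; for text that fits in 8 bits, neither wide form is needed, which is the other point of view.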
And actually, adding a more general character object that doesn't rely
on a particular bit representation (and therefore can go beyond
32 bits), and making strings arrays of such characters, will be
better eventually.
-- Yoshiki