UTF8 Squeak

Colin Putney cputney at wiresong.ca
Mon Jun 11 18:00:45 UTC 2007


On Jun 11, 2007, at 4:04 AM, Janko Mivšek wrote:

> Anyone can definitely stay with UTF8 encoded strings in plain
> ByteString or subclass to UTF8String by himself. But I don't know
> why we need to have UTF8String as part of the string framework. Just
> because of meaning? Then we also need to introduce an ASCIIString :)

> I think that preserving simplicity is also an important goal. We  
> need to find a general yet simple solution for Unicode Strings,  
> which will be good enough for most uses, as is the case for numbers  
> for instance. We deal with more special cases separately. I claim  
> that pure Unicode strings in Byte, TwoByte or FourByteString is  
> such a general support. UTF8String is already a specific one.

Ok, so what you're saying is this: ByteString, TwoByteString and  
FourByteString are good enough for most purposes. Web developers
and anyone else that needs to work with other encodings should roll  
their own solutions, so as not to burden the rest of the community  
with clutter caused by support for other encodings, or even hooks to  
make such things easy to integrate with the base string code.

Is that a fair characterization of your position?

Colin

