3.7 moving to beta tomorrowish
avi at beta4.com
Wed Mar 31 00:09:26 UTC 2004
On Mar 30, 2004, at 3:57 PM, Bill Schwab wrote:
> An overly blunt way to look at unicode is that it offers us an
> opportunity to double the storage requirements for all of our text.
> In fact, one device that I have encountered uses "unicode" (it likely
> predates the standards), and ends up doing precisely that - each
> character it sends is followed by a gratuitous zero, in a world where
> every byte truly counts thanks to bandwidth restrictions.
> I understand the value of unicode, and want Squeak to embrace it.
> However, is unicode something that many of us would want to disable
> most of the time? I ask because, if true, we might want another
> solution to the underscore/:= collision.
No. I'm far from expert in this area, but my understanding is this:
UTF-8, which is the most common encoding for Unicode, uses a single
byte to encode code points 0-127, which are exactly the ASCII values.
Larger values are encoded as multiple bytes, each with the high bit
set. So for 7-bit ASCII text, UTF-8 doesn't take any extra space.
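
To make the size argument concrete, here is a minimal sketch in Python
(not Squeak code). The helper names utf8_size and utf16le_size are
made up for illustration, and treating the device Bill describes as
something like UTF-16 little-endian is an assumption based on his
"followed by a gratuitous zero" description.

```python
def utf8_size(text: str) -> int:
    """Number of bytes needed to store text as UTF-8."""
    return len(text.encode("utf-8"))


def utf16le_size(text: str) -> int:
    """Number of bytes needed to store text as UTF-16 little-endian (no BOM)."""
    return len(text.encode("utf-16-le"))


# Plain 7-bit ASCII source text: one byte per character in UTF-8.
source = "x := 3 + 4"
print(len(source), utf8_size(source), utf16le_size(source))

# In UTF-16-LE every ASCII character is followed by a zero byte,
# which matches the doubling described in the quoted message.
print(source[:1].encode("utf-16-le"))

# A character outside ASCII, here U+2190 (leftwards arrow), takes
# several bytes in UTF-8, and every one of them has the high bit set.
arrow = "\u2190"
arrow_bytes = arrow.encode("utf-8")
print(arrow_bytes, all(b & 0x80 for b in arrow_bytes))
```

For the ASCII line the UTF-8 size equals the character count, while
the UTF-16 size is twice that. Only the non-ASCII arrow costs extra
bytes in UTF-8.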