3.7 moving to beta tomorrowish

Avi Bryant avi at beta4.com
Wed Mar 31 00:09:26 UTC 2004


On Mar 30, 2004, at 3:57 PM, Bill Schwab wrote:

> An overly blunt way to look at unicode is that it offers us an 
> opportunity to double the storage requirements for all of our text.  
> In fact, one device that I have encountered uses "unicode" (it likely 
> predates the standards), and ends up doing precisely that - each 
> character it sends is followed by a gratuitous zero, in a world where 
> every byte truly counts thanks to bandwidth restrictions.

> I understand the value of unicode, and want Squeak to embrace it.  
> However, is unicode something that many of us would want to disable 
> most of the time?  I ask because, if true, we might want another 
> solution to the underscore/:= collision.

No.  I'm far from expert in this area, but my understanding is this: 
UTF-8, which is the most common encoding for Unicode, uses a single 
byte to encode the values 0-127, which are the same in Unicode and 
ASCII.  Larger values are encoded as multiple bytes, each with the 
high bit set.  So for 7-bit ASCII text, UTF-8 doesn't take any extra 
space.
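
A quick way to check this is to compare encoded sizes directly.  The 
sketch below is only an illustration, in Python rather than Squeak; 
the helper name encoded_sizes is made up, and using UTF-16LE to stand 
in for the "zero after every character" device is my assumption about 
what that device is doing.

```python
# Compare encoded sizes of the same text under UTF-8 and a 16-bit encoding.

def encoded_sizes(text):
    """Return (utf8_bytes, utf16_bytes) for the given string."""
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))

ascii_text = "x := 3 + 4"
utf8 = ascii_text.encode("utf-8")

# Pure 7-bit ASCII: the UTF-8 output is byte-for-byte the ASCII text.
assert utf8 == ascii_text.encode("ascii")
assert all(b < 0x80 for b in utf8)

# A 16-bit encoding puts a zero byte after every ASCII character.
assert "abc".encode("utf-16-le") == b"a\x00b\x00c\x00"

# Values above 127 use multiple bytes, each with the high bit set.
arrow = "\u2190"  # LEFTWARDS ARROW, chosen only as an example
assert all(b >= 0x80 for b in arrow.encode("utf-8"))

print(encoded_sizes(ascii_text))  # (10, 20)
print(encoded_sizes(arrow))       # (3, 2)
```

So the doubling only shows up with the fixed 16-bit form; the cost of 
UTF-8 is paid per character, and only for the characters above 127.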

Avi



