multilingual Squeak (Re: Must _ go like the Dodo?)

Marcel Weiher marcel at system.de
Tue Mar 16 12:38:20 UTC 1999


>   Suppose there is an (imaginary) multilingual Smalltalk.
> On the system, the instances of Character should carry
> enough information about the character itself.  But, as I
> wrote above, if the internal representation would be
> Unicode, this couldn't be true.

Apple/NeXT's Yellow Box has a nice way of handling this.  Strings
are defined to be in a certain encoding, with encoding objects
specifying just which encoding that is.  There are methods for
(a) accessing the characters in the string's 'native' encoding, (b)
converting the string to another encoding, (c) accessing it as Unicode,
and (d) getting the string's encoding.  There are private subclasses to
deal with common formats (ASCII and other eight-bit encodings,
Unicode) efficiently.
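
To make the four operations concrete, here is a minimal sketch in
Python.  The class and method names (EncodedString, native_bytes,
as_unicode, converted_to) are made up for illustration and are not
the Yellow Box API; going through Unicode for every conversion is
also just an assumption of this sketch.

```python
class EncodedString:
    """Hypothetical sketch: a string that remembers its own encoding."""

    def __init__(self, data: bytes, encoding: str):
        self._data = data          # raw bytes in the string's native encoding
        self._encoding = encoding  # name of that encoding, e.g. "latin-1"

    @property
    def encoding(self) -> str:
        # (d) get the string's encoding
        return self._encoding

    def native_bytes(self) -> bytes:
        # (a) access the characters in the native encoding
        return self._data

    def as_unicode(self) -> str:
        # (c) access the contents as Unicode
        return self._data.decode(self._encoding)

    def converted_to(self, encoding: str) -> "EncodedString":
        # (b) convert to another encoding (assumption: via Unicode);
        # raises UnicodeEncodeError if the target cannot represent a character
        return EncodedString(self.as_unicode().encode(encoding), encoding)


s = EncodedString("Grüße".encode("latin-1"), "latin-1")
u = s.converted_to("utf-16-le")
```

Here s and u hold the same text but different bytes, and each one
still knows which encoding its bytes are in.  The private subclasses
mentioned above would be specialisations of such a class, for
instance one that skips the decode step when the data is plain ASCII.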

This way, there need not be a common encoding scheme for all
characters, although it may be advantageous to stick to 7-bit ASCII
for some system-level stuff.  Fonts handle the mapping of
character codes to glyphs depending on the character encoding and
their own code->glyph maps.
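
A rough sketch of the font side, again in Python with hypothetical
names (Font, glyph_for) and made-up glyph names: the font keeps one
code->glyph table per encoding, so the same character code can
select different glyphs.

```python
class Font:
    """Hypothetical font: one code->glyph table per supported encoding."""

    def __init__(self, glyph_maps):
        # glyph_maps: {encoding name: {character code: glyph name}}
        self._glyph_maps = glyph_maps

    def glyph_for(self, code, encoding):
        table = self._glyph_maps.get(encoding)
        if table is None:
            raise LookupError(f"font has no glyph map for {encoding!r}")
        # Fall back to a placeholder glyph for codes the table lacks.
        return table.get(code, ".notdef")


font = Font({
    "latin-1": {0x41: "A", 0xC1: "Aacute"},
    "koi8-r": {0x41: "A", 0xC1: "cyrillic_a_small"},
})

latin_glyph = font.glyph_for(0xC1, "latin-1")
koi8_glyph = font.glyph_for(0xC1, "koi8-r")
```

Code 0xC1 comes out as a different glyph under each encoding, while
the 7-bit ASCII code 0x41 maps to the same glyph in both.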

Marcel




