UTF8 Squeak

subbukk subbukk at gmail.com
Fri Jun 8 08:40:39 UTC 2007

On Friday 08 June 2007 1:25 am, Yoshiki Ohshima wrote:
> > Well, UTF8 is just an encoding of Unicode code points, so Squeak will
> > have to support Unicode. Its language and tools will need to handle
> > Unicode code points and UTF8 streams. Internally, whether code points or
> > UTF8 encoding is used would depend on the context.
>   Why do you get the impression that Squeak doesn't support it?
Squeak's Unicode/UTF8 support seemed incomplete. I couldn't get Squeak on 
Linux to take in ½ or π. How about:
a) Use Unicode chars in literals and text fields. I should be able to write 
math equations in PluggableText.
b) Use Unicode chars in names (object, method, variable, symbols). Children 
should be able to name their scripts and variables in their language in 
c) See fallback glyphs for Unicode. Like four hex digits laid out 2x2 in a 
small box the same height as the current font. It works much better than [] 
d) Have Buttons that generate Unicode. This could be used to build soft 
keyboards. (cf. PopUpMenu>>readKeyboard uses asciiValue :-().
e) Use Modal input - codes coming in from Sensors could be button presses 
(e.g. ESC, hotkeys to switch keyboard layouts) or multilingual text 
f) See 'current language' indicator in input fields. Handling backspace will 
be language dependent.
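
To make item (c) concrete, here is a minimal sketch of the 2x2 layout, written in Python rather than Squeak. The function name fallback_glyph_rows is hypothetical, and the sketch only computes which hex digits go on which row of the box; drawing them at the height of the current font is left out.

```python
def fallback_glyph_rows(code_point):
    """Return the (top, bottom) rows of a 2x2 hex-digit fallback box.

    Hypothetical helper: only covers code points that fit in four hex
    digits (U+0000 to U+FFFF), as in the layout described above.
    """
    if not 0 <= code_point <= 0xFFFF:
        raise ValueError("sketch only covers four-hex-digit code points")
    digits = f"{code_point:04X}"   # e.g. 0x3C0 -> "03C0"
    return digits[:2], digits[2:]  # top row, bottom row


top, bottom = fallback_glyph_rows(ord("π"))
print(top)     # 03
print(bottom)  # C0
```

A reader who sees the box 03/C0 can at least look up U+03C0, which an empty box does not allow.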
> Using UTF-8 internally throughout the system would be a challenge,
> especially thinking about that the overloaded methods like at:,
> at:put: and all of these have to be disambiguated as to what it means.
at:put: is a random access operation, and UTF-8 is not meant for such purposes. 
UTF-8 works well for streams of characters, and fixed-width Unicode code points 
work well for random access and lookup. This is what I meant when I said it 
would depend on context. Then 
there are mixed streams like keyboard input. I could be reading button 
presses (like Enter for OK) or reading in a stream of characters in a text 
field. We may need in-stream character codes to switch modes and language.
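
The difference between the two representations for at:-style access can be sketched as follows. This is Python, not Squeak, and utf8_char_at is a hypothetical helper, not an existing API. With UTF-8 bytes, finding the n-th character means scanning lead bytes from the start; with an array of code points, it is a plain index.

```python
def utf8_char_at(data, index):
    """Return the character at 0-based `index` in UTF-8 encoded `data`.

    Hypothetical helper: walks the buffer from the start, because the
    byte offset of a character depends on the widths of all before it.
    """
    if index < 0:
        raise IndexError("negative index")
    pos = 0
    seen = 0
    while pos < len(data):
        lead = data[pos]
        if lead < 0x80:
            width = 1              # 0xxxxxxx: single byte
        elif lead >> 5 == 0b110:
            width = 2              # 110xxxxx: two-byte sequence
        elif lead >> 4 == 0b1110:
            width = 3              # 1110xxxx: three-byte sequence
        elif lead >> 3 == 0b11110:
            width = 4              # 11110xxx: four-byte sequence
        else:
            raise ValueError("unexpected continuation or invalid byte")
        if seen == index:
            return data[pos:pos + width].decode("utf-8")
        pos += width
        seen += 1
    raise IndexError("index out of range")


text = "a½πb"
encoded = text.encode("utf-8")          # 6 bytes for 4 characters
code_points = [ord(c) for c in text]    # one entry per character

print(utf8_char_at(encoded, 2))         # π, found by scanning
print(chr(code_points[2]))              # π, found by direct indexing
```

An at:put: on the UTF-8 form is worse still, since replacing a one-byte character with a two-byte one shifts everything after it.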

I am still coming up to speed on Squeak multilingual support and these 
observations are based on my explorations so far. It is quite possible that I 
may have missed something.

Regards .. Subbu
