Unicode support (File names was Re: Warning: Large Babel translation)

Yoshiki Ohshima Yoshiki.Ohshima at acm.org
Sat Nov 22 23:13:04 UTC 2003


  Hello,

> If we decide that even this is too much effort, then it seems better to
> have no compatibility and simply fork the VM.  Old images use old VM's
> and new images use new ones.  In that case, is there anything but UTF-8
> we'd want to use for the  new VM's ?

  Probably UTF-32 for keyboard input, at least.

  Still, I like the idea of a byte-transparent VM for the next-gen VM.
Even with UTF-8, issues such as the handling of precomposed versus
decomposed characters would be better handled in the image.

  One reason is that saying "UTF-8" still doesn't specify a unique
byte sequence.  One platform can return a different sequence of bytes
for the "same" string than another does.  We should think about the
policy as we implement a more sophisticated text rendering engine, etc.
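
  To make the "same string, different bytes" point concrete, here is a
minimal sketch in Python (not Squeak code).  It assumes the difference
comes from Unicode normalization form, precomposed (NFC) versus
decomposed (NFD); the helper name same_text is hypothetical and only
for illustration.

```python
import unicodedata

# The "same" character, e with acute accent, in two normalization forms.
precomposed = unicodedata.normalize("NFC", "\u00e9")  # one code point: U+00E9
decomposed = unicodedata.normalize("NFD", "\u00e9")   # two: U+0065 + U+0301

# Both are valid UTF-8, yet the byte sequences differ.
nfc_bytes = precomposed.encode("utf-8")  # b'\xc3\xa9'
nfd_bytes = decomposed.encode("utf-8")   # b'e\xcc\x81'


def same_text(a: bytes, b: bytes) -> bool:
    """Compare two UTF-8 byte strings after normalizing both to NFC.

    Hypothetical helper: it shows the kind of policy the image, rather
    than the VM, would have to apply.
    """
    norm_a = unicodedata.normalize("NFC", a.decode("utf-8"))
    norm_b = unicodedata.normalize("NFC", b.decode("utf-8"))
    return norm_a == norm_b


if __name__ == "__main__":
    print(nfc_bytes == nfd_bytes)           # raw byte comparison
    print(same_text(nfc_bytes, nfd_bytes))  # comparison after normalization
```

  A byte-transparent VM would pass either sequence through untouched,
and the choice of normalization would stay a policy decision in the
image.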

  Another reason is that the VM maintainers still would not be able to
test a given language reliably unless they know that language.  It
would be very frustrating for newcomers if they found that a bug in
the VM prevents them from using Squeak, and the maintainer says "It
uses Unicode, so it should work.  I don't care."  etc.

-- Yoshiki



More information about the Squeak-dev mailing list