[squeak-dev] m17n simplification questions

Yoshiki Ohshima yoshiki at vpri.org
Wed Sep 2 17:53:23 UTC 2009


At Tue, 01 Sep 2009 21:53:13 -0700,
Andreas Raab wrote:
> 
> Hi Yoshiki (and everyone else knowledgeable in m17n) -
> 
> I've been looking through some of the m17n stuff to simplify things and 
> noticed some parts that I really don't know if they're still used or 
> not. I don't want to remove them if they're used but I want to make sure 
> we're not carrying dead weight (and some of it seems obsolete):
> 
> * HandMorph's CompositionManager: There is ImmAbstractPlatform, ImmWin32 
> and ImmX11. Are these still in use and functional? Should we continue to 
> support them?

  Yes, and yes.  I probably should put the plugin code up somewhere.
The Unix VM supports it (or used to; I haven't tried it with the
latest version).

> * LanguageEnvironment converters: Is there any reason to assume that we 
> will ever need to support any encodings other than UTF8/Unicode for the 
> VM/image interface? Should we just get rid of all of these different 
> converter methods and use the UTF8/Unicode conversions directly, i.e., 
> instead of:
> 
>    converter := LanguageEnvironment defaultFileNameConverter.
>    squeakPathName := vmPathString convertFromWithConverter: converter.
> 
> the code becomes:
> 
>    squeakPathName := vmPathString utf8ToSqueak.

  For file names in general, that is OK by now.

  The complication is reading the file names inside a zip file, where
the name interpretation has to be special.  Zip files being created
now, and those created in the past, use Shift-JIS for the archive
members' names (I wonder whether it is still 8859-1 in Western
Europe?).  The #defaultSystemConverter variant should stay for this
purpose.
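
  To make the zip case concrete, here is a minimal sketch in Python
(not Squeak code).  The function name and its parameters are
hypothetical.  It assumes the archive reader already knows whether the
member has the zip UTF-8 name flag (general purpose bit 11) set, and
that names without the flag use a legacy encoding chosen by the
caller, Shift-JIS in this example.

```python
def decode_member_name(raw: bytes, utf8_flag: bool,
                       legacy_encoding: str = "shift_jis") -> str:
    """Decode the stored name of a zip archive member.

    raw             -- the name bytes exactly as stored in the archive
    utf8_flag       -- True if general purpose bit 11 is set (name is UTF-8)
    legacy_encoding -- assumed encoding for members without that flag
    """
    if utf8_flag:
        return raw.decode("utf-8")
    # No UTF-8 flag: fall back to the legacy (system) encoding.
    return raw.decode(legacy_encoding)


# A name as a Shift-JIS based archiver would have stored it.
original = "日本語.txt"
stored = original.encode("shift_jis")

# Decoding with the legacy converter recovers the name.
recovered = decode_member_name(stored, utf8_flag=False)

# Treating the same bytes as UTF-8 fails outright.
try:
    stored.decode("utf-8")
    utf8_worked = True
except UnicodeDecodeError:
    utf8_worked = False
```

  The point is only that the converter for archive member names has to
be selectable separately from the one used for ordinary file names.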

> * Converter classes: If the answer to the previous question is that we 
> use UTF8/Unicode consistently, is there any reason whatsoever to keep 
> the clipboard or keyboard interpreter classes? (we're talking a *lot* of 
> classes here; keyboard interpreter has 15 subclasses; clipboard 
> interpreter 12 etc).

  The only reason would be to manage the language tag for some CJK
languages.

> * EncodedCharSet: Are any encodings other than Unicode currently in use? 
> Do we need to explicitly support domestic CJK encodings given that we 
> have Unicode + language tag?

  - Because Unicode doesn't offer round-trip conversion from/to some
    of these encodings, one stance that Squeak's m17n alludes to, and
    that some other systems such as Ruby's m17n and Gauche Scheme's
    mechanism try to take, is to allow non-Unicode encoded characters
    to be stored in a manner similar to what we did with the language
    tag, and to ensure that the input and output of these strings
    stay consistent.  I would kind of like to keep that ability.

  - There are even Etoys projects created in the old days that use
    JIS X 0208.  If, in the future, we want to be able to load them
    into a possible Etoys on mainstream Squeak, we would probably
    rather keep them.
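
  The tagging idea in the first point can be sketched as follows, in
Python rather than Squeak.  Everything here is an assumption made for
illustration: the 22-bit code field, the tag values, and the function
names are hypothetical and are not taken from the image.  The sketch
only shows that a character stored as (charset tag, native code) comes
back out unchanged, because it never passes through Unicode.

```python
CODE_BITS = 22                      # assumed width of the code field
CODE_MASK = (1 << CODE_BITS) - 1

# Hypothetical tag values, for illustration only.
UNICODE_TAG = 0
JISX0208_TAG = 1


def pack(tag: int, code: int) -> int:
    """Combine a charset tag and a native character code into one integer."""
    if tag < 0 or not 0 <= code <= CODE_MASK:
        raise ValueError("tag or code out of range")
    return (tag << CODE_BITS) | code


def unpack(value: int) -> tuple[int, int]:
    """Split a packed value back into (tag, code)."""
    return value >> CODE_BITS, value & CODE_MASK


# A JIS X 0208 code (row 16, cell 1) is stored as-is under its own tag.
jis_char = pack(JISX0208_TAG, 0x3021)
tag, code = unpack(jis_char)

# The same numeric code under a different tag is a distinct character.
other_char = pack(UNICODE_TAG, 0x3021)
```

  On output, the tag tells the writer which encoding the code belongs
to, so the bytes that were read in can be written back exactly.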

-- Yoshiki


