multilingual Squeak

ohshima at is.titech.ac.jp
Sun Mar 21 12:03:36 UTC 1999


  I'm sorry for my slow response.

> > From: "Michael S. Klein" <mklein at alumni.caltech.edu>
> >
> > What about comparing two strings?
> >
> > If you have two strings, each with one character that is the same code
> > point in Unicode, but the strings are in different encodings, are the
> > strings equal or not?
> >
> > In other words, for example (from the first Han character in Unicode) 
> >
> > is U+4e00  =  G(5027) ?
> > is G(5027)  = J(1676) ?
> >
> > U means Unicode 2.0
> > G means GB 2312-80
> > J means JIS X 0208-1990

  In the system I have in mind, they are 'not equal' to each
other.  I think this is reasonable.
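
  To make the distinction concrete, here is a rough Python sketch (not
Squeak code, and not the actual implementation).  It assumes each
character carries a tag for its coded character set; the class
TaggedChar, the tag names "U", "G", "J" and the helper kuten_to_unicode
are hypothetical names made up for this illustration.  The row/cell
numbers are the ones quoted above, decoded through Python's standard
"gb2312" and "euc_jp" codecs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaggedChar:
    """A character identified by its character set tag and its code in that set."""
    charset: str  # hypothetical tags: "U" (Unicode), "G" (GB 2312), "J" (JIS X 0208)
    code: int     # code point for "U"; row * 100 + cell for "G" and "J"


def kuten_to_unicode(kuten: int, codec: str) -> str:
    """Decode a row/cell number by building its two-byte EUC form."""
    row, cell = divmod(kuten, 100)
    return bytes([0xA0 + row, 0xA0 + cell]).decode(codec)


u = TaggedChar("U", 0x4E00)
g = TaggedChar("G", 5027)
j = TaggedChar("J", 1676)

# Tagged comparison: the charset is part of the identity, so characters
# from different sets do not compare equal even if they look the same.
tagged_equal = (u == g) or (g == j) or (u == j)

# Transcoded comparison: after mapping everything to Unicode, all three
# quoted codes land on the same code point.
transcoded_equal = (
    kuten_to_unicode(5027, "gb2312")
    == kuten_to_unicode(1676, "euc_jp")
    == chr(0x4E00)
)

print("equal when tagged:    ", tagged_equal)
print("equal when transcoded:", transcoded_equal)
```

  Which of the two answers is "right" is a design choice; the sketch
only shows that they differ.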
  
> > For the Eurocentric amongst us (most of the list),
> >
> > is a Unicode $r   the same as an ASCII $r  ?
> > is an English 'r' the same as a French 'r' ?

  Characters that can be mapped consistently into the 'A area'
may be mapped to that area.

> > In defence of Unicode, it preserves round trip transcoding.
> > It would be my standard of choice.

  Actually, "J -> U -> J" transcoding does not preserve the
original text.  The two 'reverse solidus' characters are mapped
to the same code point, so the mapping can't be reversed.
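
  The loss can be sketched in a few lines of Python.  The table below
is illustrative only: I am assuming the two characters in question are
the ASCII reverse solidus (0x5C) and the JIS X 0208 one (0x2140), and
that the mapping table sends both to U+005C.  Real vendor tables differ
on this point, and the names TO_UNICODE, build_inverse and round_trip
are made up for the example.

```python
# Illustrative mapping from (charset, code) to a Unicode code point.
# Assumption: both reverse solidus characters map to U+005C.
TO_UNICODE = {
    ("ASCII", 0x5C): 0x005C,         # reverse solidus
    ("JIS X 0208", 0x2140): 0x005C,  # reverse solidus (assumed mapping)
    ("JIS X 0208", 0x306C): 0x4E00,  # first Han character, row 16 cell 76
}


def build_inverse(table):
    """Build a Unicode -> source table; on a collision the first entry wins."""
    inverse = {}
    for source, code_point in table.items():
        inverse.setdefault(code_point, source)
    return inverse


FROM_UNICODE = build_inverse(TO_UNICODE)


def round_trip(source):
    """Map a source character to Unicode and back again."""
    return FROM_UNICODE[TO_UNICODE[source]]


# Characters that do not come back as themselves after the round trip.
lossy = [source for source in TO_UNICODE if round_trip(source) != source]
print(lossy)
```

  Whichever entry the inverse table keeps, the other one cannot be
recovered, because the forward mapping is not one-to-one.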

                                             OHSHIMA Yoshiki
                Dept. of Mathematical and Computing Sciences
                               Tokyo Institute of Technology 




