Remaining to-do items for 3.7

Yoshiki Ohshima Yoshiki.Ohshima at acm.org
Fri Feb 20 01:56:04 UTC 2004


  Ned,

> Don't know. It looks like you could say: \x{2fffffff} as an escape sequence.

  Does this mean you can go up to four bytes (three and three quarters)
of UTF-8?  Kind of an interesting number^^; I guess this probably means
you can go beyond the BMP...  I'll look at it when I have to use it^^;
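
  As a rough check of the arithmetic above, here is a small Python
sketch. It is not from the original thread, and the helper name
original_utf8_length is made up for illustration. It counts the bits in
0x2fffffff and, assuming the original six-byte UTF-8 scheme of RFC 2279
(later restricted to four bytes by RFC 3629), how many bytes that value
would need.

```python
# Largest code point in the Basic Multilingual Plane.
BMP_MAX = 0xFFFF

def original_utf8_length(code_point):
    """Bytes needed under the original (RFC 2279) UTF-8 scheme.

    Hypothetical helper for illustration only; that scheme covered
    values of up to 31 bits using at most six bytes.
    """
    limits = [0x7F, 0x7FF, 0xFFFF, 0x1FFFFF, 0x3FFFFFF, 0x7FFFFFFF]
    for length, limit in enumerate(limits, start=1):
        if code_point <= limit:
            return length
    raise ValueError("value does not fit in 31 bits")

value = 0x2FFFFFFF
bits = value.bit_length()            # significant bits in the value
raw_bytes = bits / 8                 # the "three and three quarters" figure
beyond_bmp = value > BMP_MAX         # True if outside the BMP
encoded_length = original_utf8_length(value)

print(bits, raw_bytes, beyond_bmp, encoded_length)
```

  Under these assumptions the value itself is 30 bits wide, which is
where "three and three quarters" bytes comes from, while its encoded
form in the old scheme would be longer than four bytes.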

> >   It shouldn't be too hard to support it from m17n Squeak.  It would
> > give me one less reason to use Ruby now and then^^; However, we're
> > going to lose the ability to Japanese/Chinese sensitive search, hmm.
> 
> Could you decorate the strings before passing them to the regex engine and 
> then strip them after (i.e. add tags)?

  I doubt it.  Unicode (4.0) doesn't define any meaningful semantics
for language tags, so the common tools are unlikely to accept that
kind of non-standard input.

-- Yoshiki


