[squeak-dev] Re: a diacritics free version of a string
stephan at stack.nl
Wed Jun 3 11:23:47 UTC 2009
Philippe wrote:
> The Unicode solution would be to do normalization with full
> decomposition and then a regex on \p{InCombiningDiacriticalMarks} and
> replace it with an empty string or something similar.
I don't think that is enough. The replacement is language-dependent:
in German, o-umlaut is replaced by oe, but in Dutch the equivalent is plain o.
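
To make the point concrete, here is a rough sketch in Python (not Squeak) of the decompose-and-strip approach with a language-specific step in front of it. The function name strip_diacritics and the LANGUAGE_OVERRIDES table are made up for illustration, and the table covers only the German umlauts; a real one would need input from someone who knows each language's conventions. The range U+0300..U+036F is the Combining Diacritical Marks block that the regex above refers to.

```python
import unicodedata

# Hypothetical per-language replacement table (illustration only).
# Only the German umlaut convention mentioned in this thread is listed.
LANGUAGE_OVERRIDES = {
    "de": {
        "\u00e4": "ae", "\u00f6": "oe", "\u00fc": "ue",
        "\u00c4": "Ae", "\u00d6": "Oe", "\u00dc": "Ue",
    },
}


def is_combining_diacritic(ch):
    """True if ch lies in the Combining Diacritical Marks block."""
    return 0x0300 <= ord(ch) <= 0x036F


def strip_diacritics(text, language=None):
    """Return text without diacritics, honouring language overrides first."""
    overrides = LANGUAGE_OVERRIDES.get(language, {})
    # Compose first so the precomposed keys in the override table match
    # even when the input arrives in decomposed form.
    composed = unicodedata.normalize("NFC", text)
    replaced = "".join(overrides.get(ch, ch) for ch in composed)
    # Full canonical decomposition, then drop the combining marks.
    decomposed = unicodedata.normalize("NFD", replaced)
    return "".join(ch for ch in decomposed if not is_combining_diacritic(ch))


if __name__ == "__main__":
    print(strip_diacritics("sch\u00f6n", "de"))         # German rule: oe
    print(strip_diacritics("co\u00f6rdinatie", "nl"))   # no override: plain o
```

Without a language argument the function falls back to the generic stripping, which is exactly the case where the German result comes out wrong.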
Stephan