New subject: a diacritics free version of a string

3 Jun 2009


Philippe wrote:
...
The Unicode solution would be to do normalization with full
decomposition and then a regex on \p{InCombiningDiacriticalMarks} and
replace it with an empty string or something similar.
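
For reference, the approach described above might look roughly like this in Java. This is only a sketch: the class and method names (DiacriticsStripper, strip) are made up for illustration, and it assumes the standard java.text.Normalizer class (Java 6 or later).

```java
import java.text.Normalizer;

public class DiacriticsStripper {
    // Hypothetical helper: decompose to NFD so that each accented letter becomes
    // a base letter followed by combining marks, then delete every character in
    // the Combining Diacritical Marks block (U+0300..U+036F).
    public static String strip(String input) {
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{InCombiningDiacriticalMarks}+", "");
    }

    public static void main(String[] args) {
        // \u00e9 is e with acute accent; Unicode escapes avoid source-encoding issues.
        System.out.println(strip("caf\u00e9")); // prints "cafe"
    }
}
```

Note that this only removes marks that decompose into that one block. Letters with no canonical decomposition, such as the German sharp s (U+00DF) or o with stroke (U+00F8), pass through unchanged.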
I don't think that is enough. I think the right replacement is language-dependent:
o-umlaut is replaced by oe in German, but the equivalent in Dutch is o.
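
One way to handle this might be to apply a per-language replacement table before stripping the remaining marks. The sketch below is only an illustration: the class name LocaleAwareStripper, its strip method, and the small German table are all hypothetical and deliberately incomplete, and Map.of assumes Java 9 or later.

```java
import java.text.Normalizer;
import java.util.Locale;
import java.util.Map;

public class LocaleAwareStripper {
    // Hypothetical, incomplete table of conventional German replacements.
    private static final Map<String, String> GERMAN = Map.of(
        "\u00e4", "ae", "\u00f6", "oe", "\u00fc", "ue",
        "\u00c4", "Ae", "\u00d6", "Oe", "\u00dc", "Ue",
        "\u00df", "ss");

    public static String strip(String input, Locale locale) {
        // Compose first so that the precomposed keys in the table can match.
        String text = Normalizer.normalize(input, Normalizer.Form.NFC);
        if ("de".equals(locale.getLanguage())) {
            for (Map.Entry<String, String> e : GERMAN.entrySet()) {
                text = text.replace(e.getKey(), e.getValue());
            }
        }
        // Generic fallback for everything else: decompose and drop combining marks.
        String decomposed = Normalizer.normalize(text, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{InCombiningDiacriticalMarks}+", "");
    }

    public static void main(String[] args) {
        System.out.println(strip("K\u00f6ln", Locale.GERMAN));                        // prints "Koeln"
        System.out.println(strip("co\u00f6peratie", Locale.forLanguageTag("nl")));    // prints "cooperatie"
    }
}
```

The caller has to know the language of the text; nothing in the string itself says whether an o-umlaut is German or Dutch.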
Stephan

Re: a diacritics free version of a string