UTF8 as default text encoding? (was: Re: m17n ready to go) - Squeak-dev

30 Jul 2004


Ned Konz wrote:
On Thursday 29 July 2004 2:14 pm, Yoshiki Ohshima wrote:
Again, the default assumption is that the String will hold text -- even
though there's nothing in it yet! It seems to me that the default
converter for this stream should be the Latin1TextConverter. If a
particular user of a String has a need for or knowledge of a particular
encoding, they can change the converter.

No. If the default is Latin1TextConverter, there would be more
problems.

Like what? If everyone who wants text is specifying the type (like you
suggest below) there shouldn't be any problems.
However, I don't think it's right to introduce new and incompatible
character conversion semantics on the existing file API.

The rule of thumb is that if you open a file, you should think about
whether it is text or binary, and if it is text, you should think about
how it is interpreted.

Sure. And the authors of the code that was broken had done that when
they wrote it.

This boils down to the question of whether Latin1 or UTF8 should be the
default text encoding. Looking backward, Latin1 is probably the choice,
whereas looking forward, UTF8 is surely to be preferred.
I personally prefer UTF8.
But perhaps this decision might be postponed?
Hannes
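
The Latin1-versus-UTF8 trade-off discussed in this thread can be illustrated with a small sketch. It is written in Python rather than Squeak, and Python's built-in codecs only stand in for the converters named above; nothing here is Squeak's TextConverter API, and the sample string is an arbitrary choice. It shows why Latin1 looks like the backward-compatible default (every byte maps to exactly one character, so decoding never fails) and why a wrong guess behaves differently in each direction.

```python
# Illustrative only: Python codecs stand in for the converters discussed
# in the thread; this is not Squeak's TextConverter API.

text = "Gr\u00fc\u00dfe"  # "Gruesse" spelled with u-umlaut and sharp s

utf8_bytes = text.encode("utf-8")      # each non-ASCII character takes two bytes
latin1_bytes = text.encode("latin-1")  # one byte per character

# Latin-1 maps every byte value to a character, so decoding never fails.
# UTF-8 data read as Latin-1 is therefore silently misinterpreted.
misread = utf8_bytes.decode("latin-1")

# UTF-8 is stricter: a lone Latin-1 byte such as 0xFC is not valid UTF-8,
# so the mismatch is reported instead of passing unnoticed.
try:
    latin1_bytes.decode("utf-8")
    utf8_rejected = False
except UnicodeDecodeError:
    utf8_rejected = True

# Latin-1 round-trips arbitrary bytes unchanged, which is why it behaves
# like the older byte-per-character strings.
all_bytes = bytes(range(256))
round_trip_ok = all_bytes.decode("latin-1").encode("latin-1") == all_bytes

print(len(latin1_bytes), len(utf8_bytes))
print(repr(misread))
print(utf8_rejected, round_trip_ok)
```

Under these assumptions, a Latin1 default keeps existing byte-oriented code working but hides encoding mistakes, while a UTF8 default surfaces them as errors. That is one way to read the "backward" versus "forward" framing above.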