On 31.07.2009, at 18:53, K. K. Subramaniam wrote:
On Friday 31 Jul 2009 9:49:29 pm Bert Freudenberg wrote:
$ squeak -help | grep UTF
  -pathenc <enc>   set encoding for pathnames (default: UTF-8)
  -textenc <enc>   set encoding for external text (default: UTF-8)
This is the VM setting for the encoding of file names and the clipboard. It has nothing to do with the contents of files. The contents are not interpreted by the VM.
All major Linux distros have settled on UTF-8 as default encoding for text files. Isn't it time we switched too? It could help flush out encoding bugs in Squeak code. CJK locales can use textenc overrides (or this could be set in the startup code).
One issue is that interpreting any file as UTF-8 is not binary-safe. When I wrote that we interpret files as Latin-1, this actually meant we do no conversion at all: each byte in the file corresponds directly to one character in the file view. After all, this is not a text file editor.
- Bert -
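
The binary-safety point can be illustrated with a short sketch. It is written in Python rather than Squeak, purely as an illustration, and makes no claim about how the VM or the image implements the file view; the variable names are made up for this example. It shows that decoding arbitrary bytes as Latin-1 always succeeds and round-trips exactly, while decoding the same bytes as UTF-8 either fails or loses information.

```python
# Every possible byte value, 0x00 through 0xFF, standing in for an
# arbitrary binary file.
data = bytes(range(256))

# Latin-1 maps each byte to the code point with the same value, so
# decoding never fails and re-encoding returns the original bytes.
text = data.decode("latin-1")
latin1_round_trips = text.encode("latin-1") == data

# Strict UTF-8 decoding rejects byte sequences that are not valid UTF-8.
try:
    data.decode("utf-8")
    utf8_strict_ok = True
except UnicodeDecodeError:
    utf8_strict_ok = False

# Lenient UTF-8 decoding substitutes U+FFFD for invalid bytes, so the
# original bytes can no longer be recovered.
lossy = data.decode("utf-8", errors="replace")
utf8_round_trips = lossy.encode("utf-8") == data

print(latin1_round_trips, utf8_strict_ok, utf8_round_trips)
# prints: True False False
```

In other words, a one-byte-per-character view can display any file without altering it, whereas a UTF-8 view would need some fallback for files that are not valid UTF-8.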