File names was Re: Warning: Large Babel translation
Hannes Hirzel
hannes.hirzel.squeaklist at bluewin.ch
Sun Nov 16 22:38:06 UTC 2003
Yoshiki Ohshima wrote:
> Well, most of the non-English, non-Unicode encodings are more or
> less compatible with ASCII^^;
Hoping the following clarifies a point...
UTF-8 is compatible with ASCII not only on the encoding level (assignment
of code numbers), but also on the physical level (sequence of bytes).
Every ASCII string can be considered UTF-8 encoded already.
This is not the case for e.g. UTF-16. The code numbers of an
English-only text correspond to the ASCII codes, but the sequence of
bytes does not: UTF-16 stores each of these characters in two bytes.
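A small illustration of the difference between the two levels, as a sketch
only. It uses Python rather than Squeak, the sample string is arbitrary, and
big-endian UTF-16 without a byte-order mark is an assumption made for the
example:

```python
# Arbitrary English-only sample text (an assumption for illustration).
text = "Squeak"

ascii_bytes = text.encode("ascii")
utf8_bytes = text.encode("utf-8")
# Big-endian UTF-16 without a byte-order mark, chosen for the example.
utf16_bytes = text.encode("utf-16-be")

# Encoding level: the code numbers equal the ASCII codes.
code_points = [ord(ch) for ch in text]
print(code_points == list(ascii_bytes))

# Physical level: UTF-8 yields the same bytes as ASCII, UTF-16 does not.
print(ascii_bytes == utf8_bytes)
print(ascii_bytes == utf16_bytes)

# Each character occupies two bytes in UTF-16.
print(len(ascii_bytes), len(utf16_bytes))
```

With the little-endian form ("utf-16-le") the two bytes of each character
are swapped, which is one more reason the byte sequence matters across
platforms.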
The physical level is important, I think, when we speak of VMs and
compatibility across platforms. ASCII is the only encoding for data
exchange that has worked universally over the last 40 years (from a general
user's point of view). UTF-8 might become the "ASCII" of the 21st century.
Hannes
Links:
http://en2.wikipedia.org/wiki/UTF-8
http://en2.wikipedia.org/wiki/UTF-16