Unix VM path encodings
Andreas Raab
andreas.raab at gmx.de
Sun Dec 30 12:09:16 UTC 2007
Yoshiki Ohshima wrote:
>> Hm ... lemme try this ... ah, interesting. It appears
>> that I can make the Umlauts work on Unix correctly if and only if:
>> * I fix the above method to return UTF8TextConverter in every case [*1]
>> * I use -pathenc MacRoman -textenc MacRoman
>> Which makes no sense to me since neither the path nor the text encoding
>> is MacRoman but it appears to work. Huh?
>
> Yes, on the Unix VM, another historical mishap caused it; "MacRoman"
> still means "no conversion", so that if the image passes a UTF-8 string,
> the UTF-8 string is passed to system calls unchanged.
Playing around a little, it appears as if the Unix VM always converts
path names with the assumption that Squeak uses MacRoman in the image
and only -pathenc affects the translation between file system and the
image (i.e., -textenc has *no* effect on path name translation
whatsoever). Can someone confirm this? It would explain why -pathenc
MacRoman works (since like you say it's really the "no conversion" flag)
if combined with a proper file name converter in the image.
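
To make the hypothesis concrete, here is a small Python sketch of the behavior I think I am seeing. It is only a model under that assumption, not the VM's actual code; the function name image_to_fs is made up for illustration, and the real VM may well differ in detail.

```python
# Hypothetical model of the suspected Unix VM behavior; not the VM's real code.
# Assumption: the VM treats path bytes coming from the image as MacRoman and
# converts them to the -pathenc encoding, except that "MacRoman" itself
# means "no conversion".

def image_to_fs(path_bytes: bytes, pathenc: str) -> bytes:
    """Translate a path from the image to the file system (illustrative)."""
    if pathenc == "MacRoman":
        # "No conversion": bytes go to the system calls untouched.
        return path_bytes
    # Otherwise decode as MacRoman, then re-encode in the path encoding.
    return path_bytes.decode("mac_roman").encode(pathenc)


name = "Übung.txt"

# Legacy image: the string is held in MacRoman, -pathenc UTF-8 converts it.
legacy = image_to_fs(name.encode("mac_roman"), "utf-8")

# Image with a UTF-8 file name converter, -pathenc MacRoman: passed through.
passthrough = image_to_fs(name.encode("utf-8"), "MacRoman")

# Image with a UTF-8 file name converter, -pathenc UTF-8: the UTF-8 bytes are
# misread as MacRoman and encoded a second time, which garbles the Umlaut.
double_encoded = image_to_fs(name.encode("utf-8"), "utf-8")

print(legacy == name.encode("utf-8"))
print(passthrough == name.encode("utf-8"))
print(double_encoded == name.encode("utf-8"))
```

If this model is right, only the first two cases hand the file system a correct UTF-8 name, which would match what I observe with -pathenc MacRoman plus UTF8TextConverter in the image.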
>> [*1] And that of course reminds me that nobody has really made any
>> comment on why the hell we still deal with all of these nonsensical
>> legacy encodings and don't just go straight to UTF-8 in the VM interface
>> which would simplify *lots* of cruft in the code.
>
> Well, nobody has tried to change stuff on all the platforms at once.
> Windows is doing ok with the 3.10 VM and the OLPC Etoys image (there is
> still code that deals with older VMs... the typical installation for people
> is to install stuff from squeakland.org and then use the Etoys image).
What encoding options are being used on OLPC? Do non-ascii file names,
clipboard, drag and drop etc. work on OLPC?
Cheers,
- Andreas