[Vm-dev] Unix VM path encodings
Andreas Raab
andreas.raab at gmx.de
Sun Dec 30 07:32:04 UTC 2007
Hi -
Due to a bug reported against Qwaq Forums I needed to look into how the
Unix VM encodes file and path names and got terribly confused. My test
case was to create a file with an umlaut in its name ("Jürgen") and to
see what both Squeak and the Unix shell report with varying settings of
-pathenc and -textenc.
I started with the assumption that, since the file system I was running
this on is UTF-8, the default settings (-textenc MacRoman and -pathenc
UTF-8) ought to be correct. However, the result was very surprising. The
file name was reported incorrectly both in the file list and by
the OS - the file list reported "J?" (truncated after the question mark)
and the Unix shell reported "J?rgen" but with a "funky ?" (the glyph is
hard to describe without a screenshot; it was neither an umlaut nor a
regular question mark).
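
To make the symptom concrete, here is a minimal Python sketch of how the same name looks when its bytes are written in one encoding and read back in another. It is illustrative only: it does not reproduce the VM's actual conversion code, and the guess that the "funky ?" is a replacement character produced by a UTF-8 reader is an assumption, not something I have verified.

```python
# Illustrative sketch only; this is NOT the Squeak Unix VM's conversion logic.
name = "Jürgen"

utf8_bytes = name.encode("utf-8")          # ü becomes the two bytes 0xC3 0xBC
macroman_bytes = name.encode("mac_roman")  # ü becomes the single byte 0x9F
latin1_bytes = name.encode("latin-1")      # ü becomes the single byte 0xFC

# Assumption: a UTF-8 reader (e.g. a shell in a UTF-8 locale) is handed
# MacRoman bytes. 0x9F is not a valid UTF-8 sequence on its own, so it is
# replaced by U+FFFD, which many terminals draw as an odd-looking "?".
shell_view = macroman_bytes.decode("utf-8", errors="replace")

# The reverse mismatch: UTF-8 bytes read as if they were MacRoman.
image_view = utf8_bytes.decode("mac_roman")

print(utf8_bytes, macroman_bytes, latin1_bytes)
print(repr(shell_view))
print(repr(image_view))
```

Dumping the raw bytes of the directory entry (for example with `ls | od -c`) would show which of these byte sequences actually ended up on disk.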
Playing with the settings I could not find any combination that resulted
in a consistent representation for all the different views - either the
Unix shell was off or Squeak's view was off no matter how I set those
encodings. Can someone explain to me how I need to set these values to
get a consistent view of file names from both Squeak and Unix?
Cheers,
- Andreas