[bug linux/windows] 'higher' ascii fileOut

Lex Spoon lex at cc.gatech.edu
Tue Oct 19 10:47:28 UTC 1999


Another strategy would be to switch Squeak to the ISO Latin-1
character set that most of the Internet uses.  Okay, I'm biased :)

If you want to support multiple character sets well, it would probably be
worth looking at the "Multilingual" project already underway:

	http://minnow.cc.gatech.edu/squeak.756


Okay, "underway" might be an overstatement, but it's probably worth at
least reading the ideas generated in that group.

Lex


michalstarke <Michal.Starke at lettres.unige.ch> wrote:
> 
> dear all:
> 
> here comes the daily bug ;-).
> 
> On Windows and Linux, "higher ascii" characters such as, say, e-acute can
> be typed in correctly, but do not survive a trip to/from the filesystem.
> Simple tests are:
> 
> (i) evaluate: 
> (FileStream newFileNamed: 'test.txt')
> 	nextPutAll: 'désolé, ça déconne';
> 	close
> and try to look at the resulting file with a native text editor.
> 
> (ii) vice-versa: open a fileList on a text file on your (Linux/Windows)
> system that contains such characters produced by a native text-editing
> facility. Again, no-go.
> 
> 
> The simplest mechanism to handle that would seem to be (bottom-up): 
> - create instances of the CharacterSet class for Windows/Linux,
> - use them to add a number of converters to the Character class
>   (asWinAscii, asIsoWhatever),
> - use these converters in the FileStream subclasses that do the
>   input/output, in a way similar to the current CrLfFileStream.
> 
> (One problem with this is that code is now duplicated: the VM code
> that translates keyboard input does the same conversion as an input
> file-stream would. To solve this, maybe keyboard translation can be done
> at the image level. There is already a convenient hook for this in
> 'InputSensor.characterForKeycode:', which is called from
> InputSensor.keyboardPeek/keyboard anyway (and if that is too slow, a
> primitive can be put there). That way all in/out conversion would be in
> one place with seemingly minimal disruption to the current system, and
> the VM is simplified as a side effect.)
> 
> Does that sound like a reasonable strategy (before full
> internationalisation efforts come to fruition)?
> 
> michal
> 
> ps. once the above mechanism is in place, it can easily be used for
> clipboard-content conversion too, as a bonus.
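
For illustration, the failure described above and the proposed
stream-level conversion can be sketched outside Squeak. This is a
minimal Python sketch, not Squeak code. It assumes the image keeps text
in Mac Roman and that native Windows/Linux editors expect ISO Latin-1;
ConvertingWriter is a hypothetical name standing in for a
CrLfFileStream-style FileStream subclass.

```python
import io

# Assumption: text inside the image is Mac Roman, while native editors
# on Windows/Linux read the file as ISO Latin-1.
INTERNAL = "mac_roman"
EXTERNAL = "latin-1"


class ConvertingWriter:
    """Hypothetical wrapper that converts bytes on their way to a file,
    loosely analogous to what a CrLfFileStream-style subclass would do."""

    def __init__(self, raw, internal=INTERNAL, external=EXTERNAL):
        self.raw = raw
        self.internal = internal
        self.external = external

    def write_internal(self, data: bytes) -> None:
        # Decode from the image's character set, re-encode for the platform.
        text = data.decode(self.internal)
        self.raw.write(text.encode(self.external))


text = "désolé, ça déconne"
internal_bytes = text.encode(INTERNAL)   # what the image would hold

# Test (i) without conversion: the raw internal bytes land in the file,
# and a Latin-1 editor misreads every accented character.
unconverted = io.BytesIO()
unconverted.write(internal_bytes)
broken = unconverted.getvalue().decode(EXTERNAL)

# Same write through the converting wrapper: the text survives.
converted = io.BytesIO()
ConvertingWriter(converted).write_internal(internal_bytes)
fixed = converted.getvalue().decode(EXTERNAL)

print(repr(broken))
print(repr(fixed))
```

A reading-side wrapper would do the reverse (decode as the external
set, encode as the internal one), and the same pair of conversions
could serve the clipboard case mentioned in the ps.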




