m17n ready to go

Yoshiki Ohshima Yoshiki.Ohshima at acm.org
Thu Jul 29 21:14:41 UTC 2004


  Hello,

> > As an example, the SqueakMap checkpoints are stored as compressed text. The
> > SqueakMap loader does something like:
> > 
> >         contents := (self directory oldFileNamed: fname) ascii upToEnd unzipped.
> >         stream := (RWBinaryOrTextStream with: contents) reset.
> > 
> > With these changes, though, oldFileNamed: returns a MultiByteFileStream. Which
> > would be OK if its converter was the Latin1TextConverter (which maps bytes to
> > characters 1:1), but it's not. It is, instead, a UTF8TextConverter.
> 
> Same thing happens in ChangeList when trying to read a gzipped file.
> 
> 	zipped _ GZipReadStream on: (FileStream readOnlyFileNamed: fullName).
> 	unzipped _ ReadStream on: zipped contents asString.
> 	ChangeList browseStream: unzipped
> 
> FileStream readOnlyFileNamed: returns a MultiByteFileStream and
> GZipReadStream fails.

  You can always specify your converter.  In this case, something like

         contents := (self directory oldFileNamed: fname) ascii upToEnd unzipped.
         stream := (MultiByteBinaryOrTextStream with: contents) reset.
         stream converter: Latin1TextConverter new.

should do it.

> > This would seem to be knowledge that only the user of that file
> > would have.

  And the user can specify it.

> > Again, the default assumption is that the String will hold text -- even though
> > there's nothing in it yet! It seems to me that the default converter for this
> > stream should be the Latin1TextConverter. If a particular user of a String
> > has a need for or knowledge of a particular encoding, they can change the
> > converter.

  No.  If the default were Latin1TextConverter, there would be more
problems.

> > However, I don't think it's right to introduce new  and incompatible character
> > conversion semantics on the existing file API.

  The rule of thumb is that if you open a file, you should think about
whether it is text or binary, and if it is text, you should think about
how it is to be interpreted.
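
  As a rough sketch of that rule (untested; the file names are made up,
and it assumes the stream returned by readOnlyFileNamed: answers to
binary and converter: the way MultiByteFileStream is used above):

         | file bytes text |

         "Binary data, e.g. a gzipped checkpoint: ask for raw bytes,
          so no character conversion is applied."
         file := FileStream readOnlyFileNamed: 'checkpoint.gz'.
         file binary.
         bytes := file upToEnd.
         file close.

         "Text in a known encoding: name the converter explicitly
          instead of relying on the default."
         file := FileStream readOnlyFileNamed: 'notes.txt'.
         file converter: Latin1TextConverter new.
         text := file upToEnd.
         file close.

  Either way, the decision is made by the code that opens the file,
which is the only place that knows what the file holds.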

-- Yoshiki


