CrLfFileStream as default?

lex at cc.gatech.edu
Thu Oct 29 21:23:58 UTC 1998


"R. A. Harmon" <harmonra at webname.com> wrote:

> I think with the following proposed set of conventions that PILT could be
> effectively achieved and the odd "bite" handled on a case by case basis in
> some standard way.
> 
>         - Internally all lines end with a carriage return and contain no other
>           interline spacing (carriage return, form feed, line feed, and
>           vertical tab).
>         - External text lines end with the platform default unless explicitly
>           set to some other line end.
>         - External binary lines must deal explicitly with interline spacing
>           characters.
>         - External text is the open default.
>         - All objects that have lines and internal stuff that depends on them
>           (like Text string and runs) must have conversion methods for
>           internal to external and back if used made external.
>         - You run into something that doesn't follow the convention, send in
>           a fix or at least point it out.


This sounds basically like what CrLfFileStream does, if you set it as the default concreteStream.  For binary files, it does nothing: you're on your own.  For "ascii" files, it converts CR, LF, and CRLF into CR on input, and it saves output according to some consistent external convention.  The overall effect is that for ascii-mode files, internal strings have CR's, and external files have whatever the platform convention is.
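As a rough illustration of that conversion (written in Python rather than Squeak, and not the actual CrLfFileStream code), the two directions might look like the sketch below. The function names `to_internal` and `to_external` are made up for this example, and the default output convention is assumed to be whatever `os.linesep` reports.

```python
import os

CR = "\r"  # the internal line end described above


def to_internal(external: str) -> str:
    """Convert CRLF, lone LF, and lone CR line ends to a single CR."""
    # Handle CRLF first so the pair does not turn into two CRs.
    return external.replace("\r\n", CR).replace("\n", CR)


def to_external(internal: str, line_end: str = os.linesep) -> str:
    """Rewrite internal CR line ends using one external convention."""
    return internal.replace(CR, line_end)
```

For example, `to_internal("a\r\nb\nc\rd")` yields a string whose only line ends are CRs, whichever convention each line used originally.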

UNLESS, the user is dealing with file positions.  If the external file has CRLF's in it, then a single character internally can become two characters externally.  So what should happen to "the" file position?

One answer, the way it is done right now, is to let "the position" jump by more than 1.  This is basically what ANSI C does with text files.
Another answer would be to calculate a "virtual position" and map that back and forth to the actual file.  This can require reading the entire beginning of a file, though, and sucks for large files.  It's very clean though.

Or yet another answer is to simply disable messing with file positions for ascii files.  If you really care about file positions, then maybe you are dealing with a binary file that happens to contain text?
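To make the cost of the "virtual position" option concrete, here is a minimal Python sketch (again, not Squeak code). It assumes a single-byte encoding and a hypothetical helper name, `external_offset`. The point is that finding the real offset for an internal position means scanning everything before it, since every CRLF pair seen so far shifts the answer by one.

```python
def external_offset(raw: bytes, virtual_pos: int) -> int:
    """Map an internal character position to a byte offset in the raw file.

    Assumes one byte per character; a CRLF pair counts as one internal
    character (a CR) but occupies two bytes externally.
    """
    offset = 0
    for _ in range(virtual_pos):
        if offset >= len(raw):
            raise ValueError("virtual position is past the end of the file")
        if raw[offset:offset + 2] == b"\r\n":
            offset += 2  # one internal CR, two external bytes
        else:
            offset += 1
    return offset
```

With `raw = b"ab\r\ncd\r\nef"`, internal position 3 (the "c") maps to byte offset 4, which is also the kind of jump by more than 1 that the first option exposes directly.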

The second solution is upwardly compatible from the third: if positions start out disabled, virtual positions can be added later without breaking anything.  Once you start down the path of positions not making sense, though, you get stuck with it.  So basically it's a choice between the first way (easy, but with lower semantics) and the second+third way (hard, but you shouldn't be doing it), and then a choice as to whether saying you can't do it is acceptable.

So in my opinion, the only things stopping adoption of CrLfFileStream as a default are:

	1. a decision on the file positions issue (I vote for "it's illegal", migrating towards "it's expensive and unadvised")
	2. automatically choosing output line endings based on what the current platform is.  This could be done by putting a "CrLfFileStream guessLineEndConvention" in SystemDictionary.processStartupList.
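The startup-time guess in item 2 could be as simple as a table lookup.  The Python sketch below is only an illustration: the platform names in the table and the function name `guess_line_end_convention` are assumptions made for this example, not anything Squeak defines, and unknown platforms fall back to `os.linesep`.

```python
import os

# Hypothetical platform names mapped to their usual text line ends.
LINE_END_BY_PLATFORM = {
    "Win32": "\r\n",
    "unix": "\n",
    "Mac OS": "\r",
}


def guess_line_end_convention(platform_name=None):
    """Pick the external line end to use for text files on this platform."""
    if platform_name in LINE_END_BY_PLATFORM:
        return LINE_END_BY_PLATFORM[platform_name]
    # Unknown or unspecified platform: trust the host's own default.
    return os.linesep
```

Running something like this once at startup and storing the result would give every ascii-mode output stream the same external convention.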


Oh, and maybe it could be renamed, too.  CrLf isn't very descriptive, especially on a Unix machine that doesn't have any CrLf delimited files.  "TextFileStream", maybe?

Lex





More information about the Squeak-dev mailing list