CrLfFileStream as default?
Michael S. Klein
mklein at alumni.caltech.edu
Sat Oct 31 23:28:18 UTC 1998
> I don't think there is any reason for "guessLineEndConvention" in the
> approach I propose and if it guesses wrong (especially on an already
> anomalous file), CrLfFileStream seems to produce anomalies that I don't
> think are caused just by cut-and-paste.
Sometimes you may want to guess, sometimes you may want a rigid line end
policy.
> The native platform line termination conventions I know of are as follows:
>
> DOS/Windows on x86 CrLf
> UNIX Lf
> Mac Cr
Smalltalks use Cr. There is also Unicode, which has explicit, distinct
line and paragraph separators (U+2028 and U+2029).
Personally, I think the whole idea of "Control Characters" is perverse.
Line end conventions are just the best-known symptom of this perversity.
> So when reading external text, all interline spacing (carriage return, form
> feed, line feed, and vertical tab) characters are handled as follows:
>
> Cr - add Cr to internal collection, if followed by Lf then read and
> ignore Lf.
> Lf - if preceded by Cr then ignore Lf else add Cr to internal
> collection.
> Ff and VT - ignore.
This works sometimes, but some of us actually use the Ffs and VTs that
other people place in text.
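
For concreteness, here is a minimal sketch of the quoted rules, written in
Python rather than Smalltalk only for brevity. The function name
normalize_line_ends and the drop_ff_vt flag are made up for illustration;
the flag is there so that Ff and VT can be kept when somebody needs them.

```python
CR, LF, FF, VT = "\r", "\n", "\f", "\v"

def normalize_line_ends(text, drop_ff_vt=True):
    """Convert external text to an internal form that uses Cr only.

    Hypothetical helper: follows the quoted rules one character at a time.
    """
    out = []
    prev_was_cr = False
    for ch in text:
        if ch == CR:
            # Cr: always kept; remember it so a following Lf is swallowed.
            out.append(CR)
            prev_was_cr = True
            continue
        if ch == LF:
            # Lf: ignored right after a Cr, otherwise becomes a Cr.
            if not prev_was_cr:
                out.append(CR)
        elif ch in (FF, VT):
            # Ff and VT: dropped by default, kept on request.
            if not drop_ff_vt:
                out.append(ch)
        else:
            out.append(ch)
        prev_was_cr = False
    return "".join(out)
```

Calling normalize_line_ends(text, drop_ff_vt=False) keeps form feeds and
vertical tabs, which is the case I care about.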
> This does require reading external text a character at a time, but doesn't
> seem prohibitively expensive.
First make it work... then make it fast.
> CrLfFileStream also doesn't deal with the problems like that of runs
> breaking in Text instances. It seemed to me that a lot of this kind of
> anomaly exists when just about any of the classes start reading and writing
> to external devices -- ports, disks, etc.
Yeah, strings are deceptively easy to externalize.
As far as line end convention goes, I think the important thing to do is
to factor out the handling into a Policy object. Otherwise the streaming
code just gets all krufted up with different cases.
If somebody wants a different policy, they add a new class instead of
futzing with the convoluted code.
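
A rough sketch of what I mean, again in Python for brevity. None of these
class or method names (LineEndPolicy, RigidPolicy, GuessingPolicy,
to_internal, to_external) exist in Squeak; they are assumptions made up to
show the shape of the idea. The internal form is assumed to use Cr only.

```python
class LineEndPolicy:
    """Hypothetical base class: maps external text to Cr-only text and back."""

    def to_internal(self, external):
        raise NotImplementedError

    def to_external(self, internal):
        raise NotImplementedError


class RigidPolicy(LineEndPolicy):
    """Always uses one fixed external terminator, e.g. "\r\n" or "\n"."""

    def __init__(self, terminator):
        self.terminator = terminator

    def to_internal(self, external):
        return external.replace(self.terminator, "\r")

    def to_external(self, internal):
        return internal.replace("\r", self.terminator)


class GuessingPolicy(LineEndPolicy):
    """Guesses the terminator from the first line end it reads."""

    def __init__(self, default="\r"):
        self.terminator = default

    def _guess(self, external):
        for i, ch in enumerate(external):
            if ch == "\r":
                # Cr followed by Lf means CrLf, a bare Cr means Mac style.
                return "\r\n" if external[i + 1:i + 2] == "\n" else "\r"
            if ch == "\n":
                return "\n"
        return self.terminator  # no line end seen: keep the current one

    def to_internal(self, external):
        self.terminator = self._guess(external)
        return external.replace(self.terminator, "\r")

    def to_external(self, internal):
        # Write back using whatever convention was guessed on reading.
        return internal.replace("\r", self.terminator)


def read_text(raw, policy):
    """Streaming code stays free of special cases; it just asks the policy."""
    return policy.to_internal(raw)
```

The streaming code only ever calls to_internal and to_external, so guessing
versus a rigid convention is a matter of which policy instance you hand it.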
-- Mike Klein