Extending FileList with CrLf

Daniel Vainsencher danielv at netvision.net.il
Wed Jul 23 00:29:53 UTC 2003


Ned Konz <ned at bike-nomad.com> wrote:
> I think that it's quite valid to think that some files are "text".
> 
> Which means to me that they should be written in whatever the native 
> text format is.
Sure some files are text, the question is simply "who decides what
encoding to expect/use in text files". The answer is either - 
A. A stream class, and then we need to decide at the system level, what
it will default to.
B. The application.

I think it should be B. But assuming A (the current situation), I prefer
defaults that are naive to ones that are too smart. The current state
says "By default, the system will assume and use the Squeak native
encoding in text files". If assumptions are wrong, the application/user
can fix them. The proposed state says "By default, I will try to detect
the file's convention when reading, and when creating, use the local
platform's". Note that the proposed state has the following interesting
effects -
A. The definition of created text files depends on the currently used
OS. For everything I do in Squeak, I try to consider my platform to be
Squeak. The OS it is currently hosted by is irrelevant, and should not
define my conventions for me. 
B. The definition of created files is not consistent with the
expectation from read files. This means that reading a file's contents
and writing it modified into a new one will result in a file with
different conventions, without anyone being the wiser.

The first effect would be irritating for me, but I can cope. I consider
the second to be confusing and complex, bordering on malicious to the
naive user. I don't know that any platform is so inconsistent (though
I'm sure some actually are). 
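
To make effect B concrete, here is a minimal sketch in Python. It is not
Squeak code, and the helper names (read_lines_detecting, write_lines) are
hypothetical, invented only for illustration. It assumes the proposed
behaviour: accept any line ending on read, use the host platform's on write.

```python
# Hypothetical helpers for illustration; not Squeak's FileList or stream API.

def read_lines_detecting(data):
    """Proposed-style reader: accept CR, LF, or CRLF as a line ending."""
    text = data.decode("ascii")
    return text.replace("\r\n", "\n").replace("\r", "\n").split("\n")


def write_lines(lines, line_end):
    """Proposed-style writer: line_end stands in for the host OS convention."""
    return line_end.join(lines).encode("ascii")


original = b"one\rtwo\rthree"          # CR line ends, as Squeak uses natively
lines = read_lines_detecting(original)
lines[1] = "TWO"                        # the only intended modification

copy_on_windows = write_lines(lines, "\r\n")
copy_on_unix = write_lines(lines, "\n")

# Neither copy keeps the convention of the file it was derived from.
print(copy_on_windows)
print(copy_on_unix)
```

The same read-modify-write sequence yields different bytes depending on
the host, and in neither case the convention of the original file.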

> Which also means that on reading we should be more robust with respect 
> to differing line-ending conventions. Even if you have a mixture of 
> them in a file (which can happen when you append text with one kind 
> of line endings to a file you moved from another system).
I emphatically disagree - this would be even worse. Should an image
reading program, when detecting a corrupted jpeg image, try to interpret
the rest of it as a BMP? Exceptional formats are also an application level
concern. At the stream level, we should have a fixed expectation of a
specific encoding, and reading illegal strings for that encoding should
cause an exception. How this is handled should be left to the
application.
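
A rough sketch of what I mean, again in Python rather than Squeak. The
names LineEndingError and read_lines_strict are hypothetical, and CR as
the default is an assumption standing in for "the one fixed convention":

```python
# Hypothetical strict reader for illustration; not an existing Squeak class.

class LineEndingError(ValueError):
    """Raised when data violates the reader's fixed line-end convention."""


def read_lines_strict(data, line_end=b"\r"):
    """Split data on line_end; any other CR or LF byte is an error."""
    # After removing every legal terminator, no CR or LF may remain.
    leftover = data.replace(line_end, b"")
    for bad in (b"\r", b"\n"):
        if bad in leftover:
            raise LineEndingError(
                "found %r outside the expected terminator %r" % (bad, line_end)
            )
    return [part.decode("ascii") for part in data.split(line_end)]


# The application, not the stream, decides what to do about a bad file.
try:
    read_lines_strict(b"one\rtwo\r\nthree")
    outcome = "accepted"
except LineEndingError:
    outcome = "rejected by application"
print(outcome)
```

The stream only reports the violation. Whether to reject the file, retry
with another convention, or ask the user stays with the application.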

Daniel


