Linefeeds in Squeak source code

Stephan B. Wessels stephan.wessels at sdrc.com
Tue May 11 21:14:29 UTC 1999


That sounds like a lot of work even though it should fly.  I understand the
problem.  Smalltalkers like having change logs they can read with text editors
so the base design makes a lot of sense from a point of view of easy recovery
and utility.  You are correct, the problem we recently saw was in the
FTP/SEA/ZIP user friendly tools.  Perhaps your technique of out-friendlying the
friendly tools is a wise one.

The use of Squeak has to be supported by simple download-and-go techniques.
When I tell people about Squeak they immediately want to go to the web site
squeak.org and get their own version to play with.  Then we run into this
platform problem and it looks like a toy.

Tom Burns wrote:

> > From: Stephan B. Wessels [mailto:stephan.wessels at sdrc.com]
> >
> > Seems like arbitrary removal of LFs is suspect.  I thought Unix systems
> > use LFs as line delimiters, Macs use CR and DOS derivatives use CR/LF
> > pairs.
>
> That they do, however the significance of this is lessened since the files
> in question aren't really text files but rather binary files accessed
> through direct indexing.  The problem lies in that automated tools (like FTP
> and some extraction tools) think the files actually are text since they look
> like text.  ("It smells like text, looks like text, tastes like text, must
> be text!")
>
> My vote would be to add a footprint to the sources files that can easily be
> checked and is otherwise harmless.  I would suggest something like (bear
> with me here) 0x22 0x0D 0x20 0x0A 0x20 0x0D 0x0A 0x22 0x21 0x21 which
> translates to <quote> <cr> <space> <lf> <space> <cr> <lf> <quote> <bang>
> <bang>.  My reasoning:
>
> - The quote delimiters make the footprint "invisible" to any Smalltalk
> compiling.  The bangs on the end delimit it in "chunk" format, so filing the
> footprint in just results in the evaluation of a comment.
>
> - It's small and of fixed length, suitable for quick checking, and along
> the lines of Dan's suggestion of checking the first 200 characters
>
> - Any translation done by a tool such as FTP would be immediately noticeable
> as at least one of the <CR>, <LF>, or <CR><LF> items would become mangled
>
> - This could easily and harmlessly be added to the front of any fileout, if
> that was desired.
>
> Of course, this would mean altering the existing sources files, but that
> could be done automatically as well - the lack of a footprint can be
> detected and the sources regenerated with the footprint and an automatic LF
> removal at the same time.
>
> t
> --
> Tom Burns -- tburns at appnet.net -- http://www.aisys.com
> AppNet Midwest -- http://www.appnet.net
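
For illustration only, the footprint check Tom describes could be sketched
roughly as follows.  This is Python rather than Squeak, and the names
FOOTPRINT and classify are hypothetical, not anything in the image.  The
byte order follows the symbolic description (quote, CR, space, LF, space,
CR, LF, quote, bang, bang), and the "mangled" test is just one assumed way
of spotting a footprint whose line ends were translated in transit.

```python
import re

# Assumed footprint: quote, CR, space, LF, space, CR, LF, quote, bang, bang.
FOOTPRINT = b'"\r \n \r\n"!!'

# A footprint whose line ends were rewritten still looks like a quoted
# run of CR/LF/space characters followed by two bangs.
_MANGLED = re.compile(rb'"[\r\n ]*"!!')


def classify(data: bytes) -> str:
    """Report whether a sources file starts with the footprint.

    Returns "intact" if the exact bytes are present, "mangled" if a
    footprint-like comment is present with altered line ends, and
    "missing" if no footprint is found at the start of the file.
    """
    if data.startswith(FOOTPRINT):
        return "intact"
    if _MANGLED.match(data):
        return "mangled"
    return "missing"


if __name__ == "__main__":
    good = FOOTPRINT + b"Object subclass: #Foo"
    # Simulate a text-mode transfer that turns every LF into CR LF.
    translated = good.replace(b"\n", b"\r\n")
    print(classify(good), classify(translated))
```

A file reported as "missing" would be a candidate for the automatic
regeneration step suggested above; one reported as "mangled" was probably
transferred in text mode and should be fetched again in binary mode.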




