[squeak-dev] Re: problems with line separators in Linux (Nicolas
nicolas.cellier.aka.nice at gmail.com
Fri Jun 11 22:26:30 UTC 2010
2010/6/11 Ralph Boland <rpboland at gmail.com>:
>> 10) This 6.a) strategy could eventually replace 2.a), but it does not
>> have to, and we didn't go this way...
>> So both Squeak 4.1 and Pharo 1.1 are not any worse than Squeak always
>> has been in this respect.
> Except that now the conversion of Lf in Linux files to Cr in Squeak no longer
> occurs and this breaks things such as Menu labels. Thus things that used
> to work now don't.
I don't see what change could cause this problem...
The recent commit should solve the menu problem in presence of LF leakage.
>> 11) Strategy 6.a) DOES NOT replace 2.b). If our down-chain
>> applications are line-ending sensitive, then WE must take care of producing
>> the expected convention.
>> So my opinion is that 6.a) did not make our life worse.
>> On the contrary, Squeak and Pharo are moving toward what I would call
>> a better behaved I.T. world citizen.
>> They now offer an API to handle line-endings transparently inside the image.
>> This is at the price of not-so-much complexity, and no noticeable slow down.
>> But now we have to learn new idioms (and I don't see nextLine as more
>> complex than upTo: Character cr)...
>> ... and apply them where due (like parsing menu specs) to obtain a
>> homogeneous behaviour: goal 3)
>> We still have to take care of 2.b), and a bit less of 2.a) once 4.b) is
>> achieved.
>> And maybe in the future, we will be able to get rid of 2.b) too when
>> all applications are line-ending-insensitive.
>> In the meantime, nothing prevents us from improving 2.a) and 2.b) to
>> avoid LF leaking into or CR leaking out of the image.
>> But until strategy 2) is perfect, we just act as one of the bad
>> world citizens perpetuating line-ending problems.
>> IMO reaching goal 3) is easier than reaching goal 2).
>> That's only my personal opinion, but it's based on years of pragmatically
>> using apps that behave badly with line endings and trying to program a bit better
>> There are alternative possible strategies, like in CUIS: display a boxed
>> [LF] explicitly in text editors so as to provide visual control to
>> Not sure I sold my POV. It's quite opposite to your proposition.
>> You don't have to adhere, but at least you have some rationale.
> I consider getting 2a) and 2b) both quite important to work and
> much more important than 6). I suggest getting 2) to work first
> and then worry about 6).
Yes, they are important!
But I bet making Squeak immune to line endings in parallel is an easier task.
This is because we are in control of in-image behaviour, but not that
much of external world standards.
I see the tasks more as parallel.
> Also, beware with 6) that you don't want to fileIn a file from Linux
> (or other operating system) and then fileOut the same file
> only to find that a diff or cmp of the original and new versions of
> the file reports that they are different. Similarly you don't want to fileOut a
> and then file it in again only to find that any diff-like utility now
> reports that the original and new versions are different because the
> Cr-Lf representation of line separators has changed.
> Of course, if you do 6) properly, it will make this problem less
> likely rather than
Sure, I even got problems because an LF was missing on the last line,
upsetting some Unix tools...
Making Squeak immune to line ending conventions does not solve this
problem at all.
Also note that Smalltalk code is still using #cr to produce line
breaks in many places inside the image. I don't think that would be
easily changed, and I'm not at all sure we should take this path.
So we have to take care of platform conventions/network conventions at
least on output and this is supposed to be handled by
Alternatively, you can also secure your tool chain using proper filters.
That's not that hard in Unix (dos2unix, tr '\r' '\n', ...).
Also check whether diff has the right options to deal with
foreign line endings...
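
As a rough illustration of the filter idea, here is a minimal Python sketch (not Squeak code, and not the behaviour of any existing tool) that maps CR, LF and CRLF to LF before comparing two byte strings, so that content differing only in line-ending convention compares equal. The names normalize_eol and same_modulo_eol are made up for this example.

```python
def normalize_eol(data: bytes) -> bytes:
    """Map CRLF and lone CR to LF; leave every other byte untouched."""
    # Replace CRLF first, otherwise each CRLF would turn into two LFs.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


def same_modulo_eol(a: bytes, b: bytes) -> bool:
    """True when a and b differ at most in their line-ending convention."""
    return normalize_eol(a) == normalize_eol(b)


if __name__ == "__main__":
    squeak_side = b"foo\rbar\r"    # CR-terminated, as produced inside the image
    unix_side = b"foo\nbar\n"      # LF-terminated, as found on Linux
    print(same_modulo_eol(squeak_side, unix_side))
```

Note that such a filter only hides convention differences: a missing terminator on the last line still shows up as a real difference.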
> Beware too that any files, not necessarily .st files or Squeak files,
> that use both Cr and Lf with distinct meanings will expect both to be
> loaded or filed out without any conversion of either character.
I'm not aware of any such case. I presume this is rare.
Since this is application dependent, it cannot be Smalltalk's job to
make a wrong guess based on host platform.
Instead it relies entirely on the programmer's knowledge.
All we can do is provide facilities for handling line endings
transparently, and that is currently available through the MultiByteFileStream API.
Also the #crlf and #lf facilities were added recently where #cr was
implemented, so the programmer can be in control.
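
To make the two halves of that idea concrete, here is a small Python sketch, offered only as an analogy: it is not the MultiByteFileStream API, and next_line, write_lines and the EOL table are hypothetical names. Reading accepts any of the three conventions (in the spirit of nextLine), while writing uses whichever convention the programmer asks for (in the spirit of #cr, #lf and #crlf).

```python
import io

# Hypothetical convention names, loosely mirroring #cr, #lf and #crlf.
EOL = {"cr": "\r", "lf": "\n", "crlf": "\r\n"}


def next_line(stream):
    """Return the next line without its terminator, or None at end of stream.

    CR, LF and CRLF are all accepted as line terminators.
    """
    chars = []
    while True:
        ch = stream.read(1)
        if ch == "":
            # End of stream: return a pending unterminated line, else None.
            return "".join(chars) if chars else None
        if ch == "\n":
            return "".join(chars)
        if ch == "\r":
            # Swallow an LF that immediately follows the CR (CRLF case).
            pos = stream.tell()
            if stream.read(1) != "\n":
                stream.seek(pos)
            return "".join(chars)
        chars.append(ch)


def write_lines(stream, lines, convention="lf"):
    """Write each line followed by the explicitly chosen terminator."""
    eol = EOL[convention]
    for line in lines:
        stream.write(line + eol)


if __name__ == "__main__":
    # newline="" asks StringIO not to translate line endings itself.
    source = io.StringIO("one\rtwo\r\nthree\nfour", newline="")
    lines = []
    while (line := next_line(source)) is not None:
        lines.append(line)
    target = io.StringIO(newline="")
    write_lines(target, lines, "crlf")
    print(repr(target.getvalue()))
```

The point of the split is that input tolerance can be automatic, while the output convention stays an explicit decision of the programmer rather than a guess based on the host platform.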
> How can I be cheerful when after 6 years of using Squeak I am still running
> into these problems! :-( :-)
Sure, it's a shame!
But you know things are getting more complex because network
conventions are not necessarily those of our local file system/local OS.
So basing the whole strategy on a guess about the host platform can't work...
My bet is that more and more tools will be able to handle mixed conventions.
> Ralph Boland