[squeak-dev] support of various line ends in trunk

Nicolas Cellier nicolas.cellier.aka.nice at gmail.com
Mon Nov 16 23:15:51 UTC 2009


2009/11/16 Juan Vuletich <juan at jvuletich.org>:
> Hi Nicolas,
>
> Nicolas Cellier wrote:
>>
>> 1] ABOUT RECENT CHANGES IN TRUNK:
>>
>> The st-80 line end policy was really simple (always CR).
>> It was the best possible choice as long as we stayed in the Smalltalk world.
>> However, I was kind of fed up with all the hacks for converting
>> to/from various line end flavours, half of which do not work...
>>
>> - Since communicating with the external world is vital for my own view of
>> Squeak
>> - Since it is far simpler to handle the zoo of line delimiters in the
>> Kernel
>>  (CompositionScanner / DisplayScanner / String / Stream)
>> I just added this support in trunk.
>>
>> Now, we should be able to import any line termination transparently in
>> the image.
>> For exporting, nothing changed, we still have to care, no magic here,
>> this is driven by external applications requirements.
>>
>
> I think you got this one wrong. In Cuis, in a workspace you can tell the
> line ending of each line (cr, lf or crlf) and you can actually type all
> three. Please try it! Use <Enter>, <Shift-Enter> and <Cmd/Alt-Enter>. This
> way you can edit a text file, and keep it consistent. Otherwise, if you edit
> an existing file that was edited with a Unix or Windows editor and add CRs
> to it, it will use more than one convention without you realizing. Showing
> all in the same way is misleading. Different Strings should look different
> in the editor!
>

Hi Juan,

1) Having the possibility to handle mixed conventions does not mean we
are forced to use it!
We can continue to guessLineEndConvention; I did not change this.
For writing the file back, either a guess or an explicit requirement
will do as well; I did not break anything...
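
As an illustration of "guess on read, then write back with the guess or
an explicit requirement", here is a minimal sketch in Python rather than
Smalltalk. The names guess_line_end and write_back are hypothetical, and
the majority-vote rule is an assumption; this is not how
guessLineEndConvention is implemented in Squeak.

```python
import re

def guess_line_end(text, default="\r"):
    """Return the most frequent terminator in text (hypothetical helper).

    The majority-vote rule is an assumption made for this sketch.
    """
    counts = {"\r\n": 0, "\r": 0, "\n": 0}
    # The alternation tries CR-LF first, so a pair counts once, as a pair.
    for match in re.finditer(r"\r\n|\r|\n", text):
        counts[match.group()] += 1
    best = max(counts, key=counts.get)
    # Fall back to the default (CR, the st-80 policy) when no terminator exists.
    return best if counts[best] > 0 else default

def write_back(lines, line_end):
    """Join lines using one explicit convention for the whole file."""
    return "".join(line + line_end for line in lines)
```

The caller stays in control: it can pass the guessed terminator to
write_back, or override it with whatever the external application needs.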

2) I have plenty of mixed-convention files in the Windows world, not
created by Squeak.
I cannot guessLineEndConvention on such files.
Generally Squeak makes a real mess and introduces spurious empty lines in this case.
The lineEndTransparent is my best option for reading these files.
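
To make the problem concrete, here is a small sketch in Python (not
Smalltalk, and not the code in trunk). The function name
split_lines_transparent and the sample string are made up for the
example. It contrasts treating every CR and every LF as its own line
break with treating CR, LF and CR-LF as one terminator each.

```python
import re

def split_lines_transparent(text):
    """Split on CR-LF, lone CR, or lone LF, each counted as ONE line end."""
    lines = re.split(r"\r\n|\r|\n", text)
    # A trailing terminator produces an empty last piece; drop it.
    if lines and lines[-1] == "":
        lines.pop()
    return lines

# A file mixing all three conventions (invented sample data).
mixed = "one\r\ntwo\nthree\rfour\r\n"

# Treating CR and LF as independent line breaks: every CR-LF pair
# yields an extra, spurious empty line.
naive = re.split(r"[\r\n]", mixed)

# Treating all three as equivalent terminators gives the four real lines.
transparent = split_lines_transparent(mixed)

print(naive)
print(transparent)
```

The only design point is the order of the alternation: CR-LF must be
tried before a lone CR, otherwise the pair is counted twice.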

3) For displaying, I think we generally don't care whether a CR, LF or
CR-LF is used internally.
What are the semantics of these characters?
To me they all mean the same, no difference; I'm not a typewriter.
BUT SEEING CR-LF DISPLAYED SOMETIMES AS TWO LINES AND LF AS NO LINE
BREAK IS AWFUL.

4) In case we have to compare strings with different lineEndConventions,
I suggest the comparing tools make the appropriate conversion first. I
don't think this is going to be a real problem.
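
A comparing tool could do this conversion in a few lines. The sketch
below is in Python for illustration only; normalize_line_ends and
same_text are hypothetical names, not existing Squeak selectors.

```python
def normalize_line_ends(text, line_end="\n"):
    """Rewrite CR-LF, lone CR and lone LF to a single chosen terminator."""
    # Collapse CR-LF first so the pair is not turned into two line ends.
    unified = text.replace("\r\n", "\n").replace("\r", "\n")
    return unified.replace("\n", line_end)

def same_text(a, b):
    """Compare two strings while ignoring which line end convention each uses."""
    return normalize_line_ends(a) == normalize_line_ends(b)
```

For example, same_text("a\r\nb\rc", "a\nb\nc") is true, while a string
with a genuinely empty line ("a\n\nb") still differs from "a\r\nb".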

5) In case we do care about differences because of external requirements,
and we want to visualize them,
then a specialized DisplayScanner could display a boxed glyph for CR
and LF, and an arrow for TAB
(better than this empty box that doesn't tell which character is underneath).
That should be an option of ParagraphEditor.
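
The idea can be mocked up on plain strings, without any scanner. This
Python sketch is only an illustration: show_invisibles is a hypothetical
name, and the choice of the Unicode control pictures U+240D and U+240A
and the arrow U+2192 is an assumption, not a proposal for the actual
glyphs.

```python
import re

# Visible stand-ins for invisible characters (an assumed choice of glyphs).
GLYPHS = {"\r": "\u240d", "\n": "\u240a", "\t": "\u2192"}

def show_invisibles(text):
    """Mark each CR, LF and TAB with a glyph, keeping one break per line end."""
    def mark(match):
        token = match.group()
        if token == "\t":
            return GLYPHS["\t"]
        # Show exactly which characters ended the line, then break once.
        return "".join(GLYPHS[ch] for ch in token) + "\n"
    return re.sub(r"\r\n|\r|\n|\t", mark, text)

print(show_invisibles("name\tvalue\r\nunix line\nold mac line\r"))
```

Each displayed line then ends with the glyphs of its own terminator, so
a file that mixes conventions is visible at a glance.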

6) I'll have a look at Cuis (need some time...)

Nicolas

>> To take advantage of the new possibilities, just use:
>> - (String>>linesDo:) rather than searching indexOf: Character cr
>> - (Stream>>nextLine) rather than upTo: Character cr
>>
>> There might be some LF/CR-LF support lacking here and there (there are
>> so many #cr senders...), but that shouldn't be hard to fix.
>>
>> 2] IMPORTANT NOTE AND QUESTION:
>>
>> SocketStream>>nextLine does insist on finding a CR-LF pair.
>> This is used in some major protocols.
>> But I find this abusive, and would like to change the default behavior
>> to that of Stream.
>> It would be a nice property for a SocketStream to behave like a
>> FileStream or an ExternalStream.
>> Should I proceed ?
>>
>> Nicolas
>
> Cheers,
> Juan Vuletich
>
>


