[squeak-dev] support of various line ends in trunk

Juan Vuletich juan at jvuletich.org
Tue Nov 17 00:00:02 UTC 2009


Hi Nicolas,

Nicolas Cellier wrote:
> 2009/11/16 Juan Vuletich <juan at jvuletich.org>:
>   
>> Hi Nicolas,
>>
>> Nicolas Cellier wrote:
>>     
>>> 1] ABOUT RECENT CHANGES IN TRUNK:
>>>
>>> st-80 line end policy was really simple (always CR).
>>> It was the best choice possible as long as staying in Smalltalk world.
>>> However, I was kind of fed-up with all the hacks for converting
>>> to/from various line end flavours, half of which don't work...
>>>
>>> - Since communicating with external world is vital for my own view of
>>> Squeak
>>> - Since it is far simpler to handle the zoo of line delimiters in
>>> Kernel
>>>  (CompositionScanner / DisplayScanner / String / Stream)
>>> I just added this support in trunk.
>>>
>>> Now, we should be able to import any line termination transparently in
>>> the image.
>>> For exporting, nothing changed, we still have to care, no magic here,
>>> this is driven by external applications requirements.
>>>
>>>       
>> I think you got this one wrong. In Cuis, in a workspace you can tell the
>> line ending of each line (cr, lf or crlf) and you can actually type all
>> three. Please try it! Use <Enter>, <Shift-Enter> and <Cmd/Alt-Enter>. This
>> way you can edit a text file, and keep it consistent. Otherwise, if you edit
>> an existing file that was edited with a Unix or Windows editor and add CRs
>> to it, it will use more than one convention, without you realizing. Showing
>> all in the same way is misleading. Different Strings should look different
>> in the editor!
>>
>>     
>
> Hi Juan,
>
> 1) Having the possibility to handle mixed conventions does not mean we
> are forced to use it !
>   

Sure. I didn't say otherwise.

> We can continue to guessLineEndConvention, I did not change this.
> For writing the file back, either a guess or an explicit requirement
> might do as well, I did not break anything...
>   

I didn't say you broke anything. I'm saying this is not the best we can 
do. I think the way Cuis does it is better: show the user the real 
contents of the string, and let them do whatever they want.

> 2) I have plenty of mixed conventions files in windows world, not
> created by Squeak.
> I cannot guessLineEndConvention on such files.
> Generally Squeak makes a real mess and introduces spurious empty lines in this case.
> The lineEndTransparent is my best option for reading these files.
>   

I think the best is to see all those different line endings, and be able 
to convert them to whatever you want with one keystroke. In Cuis, ctrl-u 
converts all line ends to cr. It is trivial to add options for the other 
conventions.
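
As a rough illustration of the kind of conversion described above, here is a minimal Python sketch. It is not Cuis or Squeak code, and the function name `convert_line_ends` is hypothetical. It treats CR-LF, CR and LF each as a single line end and rewrites them all to one chosen convention (CR by default, as with the ctrl-u conversion).

```python
import re

# Match CR-LF first so the pair counts as one line end, then a lone CR or LF.
_LINE_END = re.compile(r"\r\n|\r|\n")


def convert_line_ends(text, target="\r"):
    """Return text with every CR-LF, CR or LF replaced by target."""
    # A function is used as the replacement so target is inserted literally.
    return _LINE_END.sub(lambda match: target, text)


mixed = "one\r\ntwo\nthree\rfour"
print(repr(convert_line_ends(mixed)))          # every line end becomes CR
print(repr(convert_line_ends(mixed, "\n")))    # every line end becomes LF
print(repr(convert_line_ends(mixed, "\r\n")))  # every line end becomes CR-LF
```

Matching the CR-LF pair before the single characters is what avoids the spurious empty lines that appear when a mixed file is split on CR and LF independently.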

> 3) For displaying, I think we generally don't care whether a CR, LF or
> CR-LF is used internally
> What are the semantics of these characters?
>   

I don't know. For different people they could have different semantics. 
The system should not make a decision for the user. Computers look 
really silly when they try to be smart...

> To me they all mean the same, no difference, I'm not a typewriter.
> BUT SEEING CR-LF DISPLAYED SOMETIMES AS TWO LINES AND LF AS NO LINE
> BREAK IS AWFUL
>   

Indeed. Check Cuis.

> 4) In case we have to compare strings with different lineEndConventions,
> I suggest the comparing tools make the appropriate conversion first. I
> don't think this is going to be a real problem.
>   

Sure.

> 5) In case we do care about differences because of external requirements,
> and we want to visualize them,
> then a specialized DisplayScanner could display a boxed glyph for CR
> and LF, and an arrow for TAB.
> (better than this empty box that doesn't tell which character is under)
> That should be an option of ParagraphEditor
>   

I disagree. That should be the default behavior, as in Cuis. The user 
should be in command.
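
To make the idea concrete, here is a small Python sketch that renders each line end and tab as a visible marker, so strings with different line ends look different. It is an illustration only, not how Cuis or a DisplayScanner does it; the marker strings and the name `show_whitespace` are made up for this example.

```python
# Hypothetical markers; any visible glyphs would do.
MARKERS = {"\r": "<CR>", "\n": "<LF>", "\t": "<TAB>"}


def show_whitespace(text):
    """Replace CR, LF and TAB with visible markers.

    A real line break is emitted after each line end so the result
    still reads as separate lines.
    """
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\r" and text[i + 1:i + 2] == "\n":
            # CR-LF is a single line end, shown as a pair of markers.
            out.append(MARKERS["\r"] + MARKERS["\n"] + "\n")
            i += 2
            continue
        if ch in ("\r", "\n"):
            out.append(MARKERS[ch] + "\n")
        elif ch == "\t":
            out.append(MARKERS[ch])
        else:
            out.append(ch)
        i += 1
    return "".join(out)


print(show_whitespace("a\tb\r\nc\rd\ne"))
```

With this kind of display, a file that mixes conventions shows a different marker at the end of each line, so the mix is visible before the file is saved.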

> 6) I'll have a look in Cuis (need some time...)
>   

Great!

> Nicolas
>   

Cheers,
Juan Vuletich

>>> To take advantage of the new possibilities, just use:
>>> - (String>>linesDo:) rather than searching indexOf: Character cr
>>> - (Stream>>nextLine) rather than upTo: Character cr
>>>
>>> There might be some LF/CR-LF support lacking here and there (there are
>>> so many #cr senders...), but that shouldn't be hard to fix.
>>>
>>> 2] IMPORTANT NOTE AND QUESTION:
>>>
>>> SocketStream>>nextLine does insist on finding a CR-LF pair.
>>> This is used in some major protocols.
>>> But I find this abusive, and would like to change the default behavior
>>> to that of Stream.
>>> This would be a nice property that a SocketStream behaves like a
>>> FileStream or an ExternalStream.
>>> Should I proceed ?
>>>
>>> Nicolas
>>>       
>> Cheers,
>> Juan Vuletich
>>
>>
>>     
>
>
>   



