[OT] MultiByteFileStream (was: Re: [Vm-dev] VM Maker: CMakeVMMaker-StasenkoIgor.104.mcz)

Levente Uzonyi leves at elte.hu
Tue May 17 14:14:18 UTC 2011


On Tue, 17 May 2011, Igor Stasenko wrote:

>
> On 17 May 2011 13:50, Levente Uzonyi <leves at elte.hu> wrote:
>>
>> On Tue, 17 May 2011, Igor Stasenko wrote:
>>
>>>
>>> On 16 May 2011 20:21, Levente Uzonyi <leves at elte.hu> wrote:
>>>>
>>>> On Mon, 16 May 2011, squeak-dev-noreply at lists.squeakfoundation.org wrote:
>>>>
>>>>>
>>>>> Igor Stasenko uploaded a new version of CMakeVMMaker to project VM
>>>>> Maker:
>>>>> http://www.squeaksource.com/VMMaker/CMakeVMMaker-StasenkoIgor.104.mcz
>>>>>
>>>>> ==================== Summary ====================
>>>>>
>>>>> Name: CMakeVMMaker-StasenkoIgor.104
>>>>> Author: StasenkoIgor
>>>>> Time: 16 May 2011, 10:33:56 am
>>>>> UUID: 68959e6f-18b1-304c-a3de-2b84a1f2c93b
>>>>> Ancestors: CMakeVMMaker-IgorStasenko.103
>>>>>
>>>>> workaround the MultiByteCrapStream, which converting line endings twice
>>>>> on
>>>>> windows.
>>>>
>>>> It's not crap, just broken.
>>>
>>> Well, i don't know better wording for seeing the mess in this
>>> implementation. It contains too much bells and whistles and therefore
>>> its very easy to break it.
>>
>> Tests can greatly decrease the chance of breaking it.
>>
>
> yes, i submitted a test yesterday.
> http://code.google.com/p/pharo/issues/detail?id=4236
>
> it written in a way to reproduce an issue when running on any system
> (not just Windows).

I also have some tests which cover more cases.

>
>>>
>>> I don't understand why it patching the UTF8Converter and turns it into
>>> line-ending converter..
>>> as to me, UTF8Converter have nothing to do with line ends, and such
>>> abuse of text converters should be strictly prohibited.
>>
>> The reason is performance, and the implementation is transparent. It also
>> simplifies MultiByteFileStream.
>>
>
> what about clarity? The problem is that this code is so optimized that
> it is no longer transparent. It is opaque for my eyes.
> I cannot fix it, because it looks very brittle and it is hard to get
> what it does, with all those branches, ifNotNils etc etc in almost
> every method there.

MultiByteFileStream is not clean yet, but it will be.

>
> And i cannot be sure if my fix won't break something else, because
> there is no clear separation of concerns:
>
> - use UTF8Converter for utf8 conversion and NOTHING else.
> - and let stream to filter input and replace cr character(s) to
> whatever it wants them to be
>
>
> And what about simplification?
>
> Do you think this is how a _simple_ method should look like?
>
> nextPut: aCharacter
> 	aCharacter isInteger
> 		ifTrue: [ ^ super nextPut: aCharacter ].
> 	(wantsLineEndConversion == true and: [ lineEndConvention notNil ])
> "#doConversion is inlined here"
> 		 ifTrue: [
> 			aCharacter = Cr
> 				ifTrue: [ converter nextPutAll: (LineEndStrings at:
> lineEndConvention) toStream: self ]
> 				ifFalse: [ converter nextPut: aCharacter toStream: self ].
> 			^aCharacter ].
> 	^ self converter nextPut: aCharacter toStream: self
>
> 4 branches.. while i, by following your premise, would expect a single
> line here:
>
> nextPut: aCharacter
>   ^ converter nextPut: aCharacter toStream: self
>
> because if you decided to delegate line ends conversion to converter
> (for whatever reason) , then you don't have to do anything outside of
> it.

Exactly. What I also have in my mind is to remove the #isBinary checks 
from the TextConverters and put them back to MultiByteFileStream, where 
they belong to IMHO.

>
>>>
>>>> I'll integrate the fix for it into Squeak soon.
>>>> If you're impatient you can fix it yourself, or use something else.
>>>>
>>>
>>> Well, i can't use something else, because its VMMaker which using
>>> CrLfFileStream for producing source files output.
>>> Maybe it worth to just use FileStream in binary mode and deal with
>>> line ends internally, to not leave chance for
>>> clever text streams to interfere.
>>
>> CrLfFileStream is not used, since Squeak 3.x. References to it should be
>> rewritten to use FileStream instead. Using binary mode to work around a bug
>> is bad idea, fixing the bug is the way to go.
>>
>
> But why you stating false statements? CrLfFileStream is used, at least
> by VMMaker.
> And i was never took a deep look on a file streams to clearly say if
> it still used or can be removed.

It's not a false statement, print it: CrLfFileStream new. In Squeak 
CrLfFileStream is deprecated and if I'm not mistaken the same is true in 
Pharo.

>
> So, can you clarify, can we deprecate and then completely remove
> CrLfFileStream from image
> and fix all references to it with FileStream instead?
> Because i like putting useless cra.. (err stuff) into trash bin.

See above.


Levente

>
>>> But then again, it means time, which i haven't because i was focused
>>> on completely different problem.
>>
>> I'll upload my fix to Squeak Trunk this week. Applying the changes to Pharo
>> probably won't work out of the box, because IIRC some issues were solved
>> differently in the past.
>>
>
> i'm not sure.
>
>>
>> Levente
>>
>
>
>
> -- 
> Best regards,
> Igor Stasenko AKA sig.
>


More information about the Vm-dev mailing list