[OT] MultiByteFileStream (was: Re: [Vm-dev] VM Maker: CMakeVMMaker-StasenkoIgor.104.mcz)

Igor Stasenko siguctua at gmail.com
Tue May 17 19:51:54 UTC 2011


On 17 May 2011 20:27, Eliot Miranda <eliot.miranda at gmail.com> wrote:
>
> Hi Levente,
>
> On Tue, May 17, 2011 at 7:14 AM, Levente Uzonyi <leves at elte.hu> wrote:
>>
>> On Tue, 17 May 2011, Igor Stasenko wrote:
>>
>>>
>>> On 17 May 2011 13:50, Levente Uzonyi <leves at elte.hu> wrote:
>>>>
>>>> On Tue, 17 May 2011, Igor Stasenko wrote:
>>>>
>>>>>
>>>>> On 16 May 2011 20:21, Levente Uzonyi <leves at elte.hu> wrote:
>>>>>>
>>>>>> On Mon, 16 May 2011, squeak-dev-noreply at lists.squeakfoundation.org wrote:
>>>>>>
>>>>>>>
>>>>>>> Igor Stasenko uploaded a new version of CMakeVMMaker to project VM
>>>>>>> Maker:
>>>>>>> http://www.squeaksource.com/VMMaker/CMakeVMMaker-StasenkoIgor.104.mcz
>>>>>>>
>>>>>>> ==================== Summary ====================
>>>>>>>
>>>>>>> Name: CMakeVMMaker-StasenkoIgor.104
>>>>>>> Author: StasenkoIgor
>>>>>>> Time: 16 May 2011, 10:33:56 am
>>>>>>> UUID: 68959e6f-18b1-304c-a3de-2b84a1f2c93b
>>>>>>> Ancestors: CMakeVMMaker-IgorStasenko.103
>>>>>>>
>>>>>>> workaround the MultiByteCrapStream, which converting line endings twice
>>>>>>> on
>>>>>>> windows.
>>>>>>
>>>>>> It's not crap, just broken.
>>>>>
>>>>> Well, i don't know better wording for seeing the mess in this
>>>>> implementation. It contains too much bells and whistles and therefore
>>>>> its very easy to break it.
>>>>
>>>> Tests can greatly decrease the chance of breaking it.
>>>>
>>>
>>> yes, i submitted a test yesterday.
>>> http://code.google.com/p/pharo/issues/detail?id=4236
>>>
>>> it written in a way to reproduce an issue when running on any system
>>> (not just Windows).
>>
>> I also have some tests which cover more cases.
>>
>>>
>>>>>
>>>>> I don't understand why it patching the UTF8Converter and turns it into
>>>>> line-ending converter..
>>>>> as to me, UTF8Converter have nothing to do with line ends, and such
>>>>> abuse of text converters should be strictly prohibited.
>>>>
>>>> The reason is performance, and the implementation is transparent. It also
>>>> simplifies MultiByteFileStream.
>>>>
>>>
>>> what about clarity? The problem is that this code is so optimized that
>>> it is no longer transparent. It is opaque for my eyes.
>>> I cannot fix it, because it looks very brittle and it is hard to get
>>> what it does, with all those branches, ifNotNils etc etc in almost
>>> every method there.
>>
>> MultiByteFileStream is not clean yet, but it will be.
>>
>>>
>>> And i cannot be sure if my fix won't break something else, because
>>> there is no clear separation of concerns:
>>>
>>> - use UTF8Converter for utf8 conversion and NOTHING else.
>>> - and let stream to filter input and replace cr character(s) to
>>> whatever it wants them to be
>>>
>>>
>>> And what about simplification?
>>>
>>> Do you think this is how a _simple_ method should look like?
>>>
>>> nextPut: aCharacter
>>>        aCharacter isInteger
>>>                ifTrue: [ ^ super nextPut: aCharacter ].
>>>        (wantsLineEndConversion == true and: [ lineEndConvention notNil ])
>>> "#doConversion is inlined here"
>>>                 ifTrue: [
>>>                        aCharacter = Cr
>>>                                ifTrue: [ converter nextPutAll: (LineEndStrings at:
>>> lineEndConvention) toStream: self ]
>>>                                ifFalse: [ converter nextPut: aCharacter toStream: self ].
>>>                        ^aCharacter ].
>>>        ^ self converter nextPut: aCharacter toStream: self
>>>
>>> 4 branches.. while i, by following your premise, would expect a single
>>> line here:
>>>
>>> nextPut: aCharacter
>>>  ^ converter nextPut: aCharacter toStream: self
>>>
>>> because if you decided to delegate line ends conversion to converter
>>> (for whatever reason) , then you don't have to do anything outside of
>>> it.
>>
>> Exactly. What I also have in my mind is to remove the #isBinary checks from the TextConverters and put them back to MultiByteFileStream, where they belong to IMHO.
>>
>>>
>>>>>
>>>>>> I'll integrate the fix for it into Squeak soon.
>>>>>> If you're impatient you can fix it yourself, or use something else.
>>>>>>
>>>>>
>>>>> Well, i can't use something else, because its VMMaker which using
>>>>> CrLfFileStream for producing source files output.
>>>>> Maybe it worth to just use FileStream in binary mode and deal with
>>>>> line ends internally, to not leave chance for
>>>>> clever text streams to interfere.
>>>>
>>>> CrLfFileStream is not used, since Squeak 3.x. References to it should be
>>>> rewritten to use FileStream instead. Using binary mode to work around a bug
>>>> is bad idea, fixing the bug is the way to go.
>>>>
>>>
>>> But why you stating false statements? CrLfFileStream is used, at least
>>> by VMMaker.
>>> And i was never took a deep look on a file streams to clearly say if
>>> it still used or can be removed.
>>
>> It's not a false statement, print it: CrLfFileStream new. In Squeak CrLfFileStream is deprecated and if I'm not mistaken the same is true in Pharo.
>
> I'm just trying to eliminate the explicit references to CrLfFileStream in the Cog VMMaker, but I've hit a snag.  e.g. there's this idiom:
> InterpreterPlugin class methods for translation
> storeString: s onFileNamed: fileName
> "Store the given string in a file of the given name."
> | f |
> f := CrLfFileStream forceNewFileNamed: fileName.
> f nextPutAll: s.
> f close.
> that I'd like to replace with
> VMMaker class methods for utilities
> newFileStream
> "Always output files in unix lf format.
> A single format is friendlier to e.g. external version control systems.
> The Microsoft and old MacOS classic C compilers all accept lf format files."
> ^MultiByteFileStream new
> lineEndConvention: #lf;
> yourself
> InterpreterPlugin class methods for translation
> storeString: s onFileNamed: fileName
> "Store the given string in a file of the given name."
> | f |
> f := VMMaker newFileStream forceNewFileNamed: fileName.
> f nextPutAll: s.
> f close.
> but only class-side methods understand forceNewFileNamed:, and hence one would have to write
>
>
> InterpreterPlugin class methods for translation
> storeString: s onFileNamed: fileName
> "Store the given string in a file of the given name."
> | f |
> f := MultiByteFileStream forceNewFileNamed: fileName.
> f lineEndConvention: #lf.
> f nextPutAll: s.
> f close
> which is a tad tedious and distributed.  I could have a set of convenience methods in VMMaker class, forceNewFileStream: oldFileStream etc.  There could be subclasses of MultiByteFileStream purely for instance creation (MultiByteLfFileStream et al).  Or...?  Any thoughts?

But why you want to explicitly control line ends?
Because if you do it like that its not too different to what i have
proposed: use a stream in binary mode and deal
with line ends inside VMMaker and don't bother with what
MultiByteFileStream provides in this regard.


> best,
> Eliot
>>

-- 
Best regards,
Igor Stasenko AKA sig.


More information about the Vm-dev mailing list