[OT] MultiByteFileStream (was: Re: [Vm-dev] VM Maker: CMakeVMMaker-StasenkoIgor.104.mcz)

Eliot Miranda eliot.miranda at gmail.com
Tue May 17 18:27:58 UTC 2011


Hi Levente,

On Tue, May 17, 2011 at 7:14 AM, Levente Uzonyi <leves at elte.hu> wrote:

>
> On Tue, 17 May 2011, Igor Stasenko wrote:
>
>
>> On 17 May 2011 13:50, Levente Uzonyi <leves at elte.hu> wrote:
>>
>>>
>>> On Tue, 17 May 2011, Igor Stasenko wrote:
>>>
>>>
>>>> On 16 May 2011 20:21, Levente Uzonyi <leves at elte.hu> wrote:
>>>>
>>>>>
>>>>> On Mon, 16 May 2011, squeak-dev-noreply at lists.squeakfoundation.orgwrote:
>>>>>
>>>>>
>>>>>> Igor Stasenko uploaded a new version of CMakeVMMaker to project VM
>>>>>> Maker:
>>>>>> http://www.squeaksource.com/VMMaker/CMakeVMMaker-StasenkoIgor.104.mcz
>>>>>>
>>>>>> ==================== Summary ====================
>>>>>>
>>>>>> Name: CMakeVMMaker-StasenkoIgor.104
>>>>>> Author: StasenkoIgor
>>>>>> Time: 16 May 2011, 10:33:56 am
>>>>>> UUID: 68959e6f-18b1-304c-a3de-2b84a1f2c93b
>>>>>> Ancestors: CMakeVMMaker-IgorStasenko.103
>>>>>>
>>>>>> workaround the MultiByteCrapStream, which converting line endings
>>>>>> twice
>>>>>> on
>>>>>> windows.
>>>>>>
>>>>>
>>>>> It's not crap, just broken.
>>>>>
>>>>
>>>> Well, i don't know better wording for seeing the mess in this
>>>> implementation. It contains too much bells and whistles and therefore
>>>> its very easy to break it.
>>>>
>>>
>>> Tests can greatly decrease the chance of breaking it.
>>>
>>>
>> yes, i submitted a test yesterday.
>> http://code.google.com/p/pharo/issues/detail?id=4236
>>
>> it written in a way to reproduce an issue when running on any system
>> (not just Windows).
>>
>
> I also have some tests which cover more cases.
>
>
>
>>
>>>> I don't understand why it patching the UTF8Converter and turns it into
>>>> line-ending converter..
>>>> as to me, UTF8Converter have nothing to do with line ends, and such
>>>> abuse of text converters should be strictly prohibited.
>>>>
>>>
>>> The reason is performance, and the implementation is transparent. It also
>>> simplifies MultiByteFileStream.
>>>
>>>
>> what about clarity? The problem is that this code is so optimized that
>> it is no longer transparent. It is opaque for my eyes.
>> I cannot fix it, because it looks very brittle and it is hard to get
>> what it does, with all those branches, ifNotNils etc etc in almost
>> every method there.
>>
>
> MultiByteFileStream is not clean yet, but it will be.
>
>
>
>> And i cannot be sure if my fix won't break something else, because
>> there is no clear separation of concerns:
>>
>> - use UTF8Converter for utf8 conversion and NOTHING else.
>> - and let stream to filter input and replace cr character(s) to
>> whatever it wants them to be
>>
>>
>> And what about simplification?
>>
>> Do you think this is how a _simple_ method should look like?
>>
>> nextPut: aCharacter
>>        aCharacter isInteger
>>                ifTrue: [ ^ super nextPut: aCharacter ].
>>        (wantsLineEndConversion == true and: [ lineEndConvention notNil ])
>> "#doConversion is inlined here"
>>                 ifTrue: [
>>                        aCharacter = Cr
>>                                ifTrue: [ converter nextPutAll:
>> (LineEndStrings at:
>> lineEndConvention) toStream: self ]
>>                                ifFalse: [ converter nextPut: aCharacter
>> toStream: self ].
>>                        ^aCharacter ].
>>        ^ self converter nextPut: aCharacter toStream: self
>>
>> 4 branches.. while i, by following your premise, would expect a single
>> line here:
>>
>> nextPut: aCharacter
>>  ^ converter nextPut: aCharacter toStream: self
>>
>> because if you decided to delegate line ends conversion to converter
>> (for whatever reason) , then you don't have to do anything outside of
>> it.
>>
>
> Exactly. What I also have in my mind is to remove the #isBinary checks from
> the TextConverters and put them back to MultiByteFileStream, where they
> belong to IMHO.
>
>
>
>>
>>>>  I'll integrate the fix for it into Squeak soon.
>>>>> If you're impatient you can fix it yourself, or use something else.
>>>>>
>>>>>
>>>> Well, i can't use something else, because its VMMaker which using
>>>> CrLfFileStream for producing source files output.
>>>> Maybe it worth to just use FileStream in binary mode and deal with
>>>> line ends internally, to not leave chance for
>>>> clever text streams to interfere.
>>>>
>>>
>>> CrLfFileStream is not used, since Squeak 3.x. References to it should be
>>> rewritten to use FileStream instead. Using binary mode to work around a
>>> bug
>>> is bad idea, fixing the bug is the way to go.
>>>
>>>
>> But why you stating false statements? CrLfFileStream is used, at least
>> by VMMaker.
>> And i was never took a deep look on a file streams to clearly say if
>> it still used or can be removed.
>>
>
> It's not a false statement, print it: CrLfFileStream new. In Squeak
> CrLfFileStream is deprecated and if I'm not mistaken the same is true in
> Pharo.


I'm just trying to eliminate the explicit references to CrLfFileStream in
the Cog VMMaker, but I've hit a snag.  e.g. there's this idiom:

InterpreterPlugin class methods for translation
storeString: s onFileNamed: fileName
"Store the given string in a file of the given name."

| f |
f := CrLfFileStream forceNewFileNamed: fileName.
f nextPutAll: s.
f close.

that I'd like to replace with

VMMaker class methods for utilities
newFileStream
"Always output files in unix lf format.
A single format is friendlier to e.g. external version control systems.
The Microsoft and old MacOS classic C compilers all accept lf format files."

^MultiByteFileStream new
lineEndConvention: #lf;
yourself

InterpreterPlugin class methods for translation
storeString: s onFileNamed: fileName
"Store the given string in a file of the given name."

| f |
f := VMMaker newFileStream forceNewFileNamed: fileName.
f nextPutAll: s.
f close.

but only class-side methods understand forceNewFileNamed:, and hence one
would have to write



InterpreterPlugin class methods for translation
storeString: s onFileNamed: fileName
"Store the given string in a file of the given name."

| f |
f := MultiByteFileStream forceNewFileNamed: fileName.
f lineEndConvention: #lf.
f nextPutAll: s.
f close

which is a tad tedious and distributed.  I could have a set of convenience
methods in VMMaker class, forceNewFileStream: oldFileStream etc.  There
could be subclasses of MultiByteFileStream purely for instance creation
(MultiByteLfFileStream et al).  Or...?  Any thoughts?

best,
Eliot

>
>
>
>> So, can you clarify, can we deprecate and then completely remove
>> CrLfFileStream from image
>> and fix all references to it with FileStream instead?
>> Because i like putting useless cra.. (err stuff) into trash bin.
>>
>
> See above.
>
>
> Levente
>
>
>
>>  But then again, it means time, which i haven't because i was focused
>>>> on completely different problem.
>>>>
>>>
>>> I'll upload my fix to Squeak Trunk this week. Applying the changes to
>>> Pharo
>>> probably won't work out of the box, because IIRC some issues were solved
>>> differently in the past.
>>>
>>>
>> i'm not sure.
>>
>>
>>> Levente
>>>
>>>
>>
>>
>> --
>> Best regards,
>> Igor Stasenko AKA sig.
>>
>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.squeakfoundation.org/pipermail/vm-dev/attachments/20110517/50e55214/attachment.htm


More information about the Vm-dev mailing list