[squeak-dev] StandardFileStream rant continued [was Problem handling nextLine]

Igor Stasenko siguctua at gmail.com
Tue Nov 17 22:15:29 UTC 2009


2009/11/18 Nicolas Cellier <nicolas.cellier.aka.nice at gmail.com>:
> Beside,
> look at what happens when we execute:
>
> strm := StandardFileStream oldFileNamed: 'foo'.
> [[stream atEnd] whileFalse: [strm upTo: Character cr]] ensure: [strm close].
>
> We
> - first read 2000 characters in the buffer,
> - keep only a few ones (the first line),
> - abandon the rest of the buffer to the garbageCollector,
> - position the file back
> - and restart reading 2000 chars at next call...
> Prodigiously inefficient.
> I did not check in VMMaker what real work take place under the carpet,
> but the primitives looks enough like corresponding standard OS
> functions.
>
the call chain in #upTo: endig with primitive:
  count := self primRead: fileID into: aString
 				startingAt: startIndex count: n.

So, really, reading 2000 bytes using primitive could be faster than
reading 1 byte at a time (as in CrLfFileStream>>upTo:),
unless a file stream having own read-ahead buffer which is used
internally for all sort of things, including in #next and #next: and
#position:

> Why not just
> - reuse at image level the same buffer,
> - avoid spurious garbage collection,
> - keep a pointer on available data (already read in buffer),
> - avoid moving the file pointer back and forth,
> - and avoid reading the same file zone again and again ?
> We have to thank our OS and hardware to buffer disk data for us and be
> glad Squeak does not handle disk heads with such a policy !
>
> Wanna gain a bit of speed ? Don't wait Cog, there is plenty of room here !
>
Indeed :)

> Nicolas
>
> 2009/11/17 Nicolas Cellier <nicolas.cellier.aka.nice at gmail.com>:
>> I get it.
>> I wanted to use #upToAnyOf: crlf.
>> Unfortunately, this one would not tell me whether a cr or a lf was hit
>> (or the end of stream).
>> So I inlined the sole implementor in the hierarchy naively thinking it
>> would just work...
>> MY ERROR!
>> not only is it sub optimal (slow), #upToAnyOf: is broken in
>> StandardFileStream...
>>
>> Recipe:
>> - create an inst var in abstract superclass,
>> - use it in many methods,
>> - define a subclass that do not use this inst var,
>> - define only a subset of superclass messages in the subclass,
>> congratulations you created at least as many bugs as the number of
>> methods you failed to override.
>>
>> When I read all this code, I get the feeling that the whole stream
>> hierarchy is broken...
>> If you trust library consistency, you're in trouble.
>> That is the next big core library deserving clean-up in trunk.
>> I guess Pharo will have an easier task integrating Nile complete rewrite ;)
>>
>> OK, end of rant :)
>>
>> Nicolas
>>
>> 2009/11/17 Nicolas Cellier <nicolas.cellier.aka.nice at gmail.com>:
>>> Hi Casimiro,
>>> it would help if you would provide receiver class information.
>>> Otherwise, I see I probably messed up the horrific Stream hierarchy !
>>> I will revert this change until a better fix - mean more tests .
>>>
>>> Cheers
>>>
>>> 2009/11/17 Casimiro de Almeida Barreto <casimiro.barreto at gmail.com>:
>>>> After today's update, the following code:
>>>>
>>>>    [ fHandle atEnd ] whileFalse: [
>>>>        stringFromFile := fHandle nextLine.
>>>>        draw := Draw newFromString: stringFromFile.
>>>>        draw isNil
>>>>            ifTrue: [
>>>>                Transcript show: 'Failed to load draw.'; cr.
>>>>                fHandle close.
>>>>                ^nil ].
>>>>        model draws addLast: draw ].
>>>>
>>>> Results in the following error:
>>>>
>>>> nextLine
>>>>    "Answer next line (may be empty), or nil if at end.
>>>>    Handle a zoo of line delimiters CR, LF, or CR-LF pair"
>>>>
>>>>    | newStream element crlf |
>>>>    self atEnd ifTrue: [^nil].
>>>>    crlf := CharacterSet crlf.
>>>>    newStream := WriteStream on: (collection species new: 100).
>>>>    [self atEnd ifTrue: [^newStream contents].
>>>>    crlf includes: (element := self next)]
>>>>        whileFalse: [newStream nextPut: element].
>>>>    element = Character cr ifTrue: [self peekFor: Character lf]. "handle
>>>> an eventual CR LF pair"
>>>>    ^newStream contents
>>>>
>>>> =============================================================
>>>>
>>>> new: sizeRequested
>>>>    "Answer an initialized instance of this class with the number of
>>>> indexable
>>>>    variables specified by the argument, sizeRequested."
>>>>
>>>>    ^ (self basicNew: sizeRequested) initialize
>>>>
>>>> =============================================================
>>>>
>>>> basicNew: sizeRequested
>>>>    "Primitive. Answer an instance of this class with the number
>>>>    of indexable variables specified by the argument, sizeRequested.
>>>>    Fail if this class is not indexable or if the argument is not a
>>>>    positive Integer, or if there is not enough memory available.
>>>>    Essential. See Object documentation whatIsAPrimitive."
>>>>
>>>>    <primitive: 71>
>>>>    self isVariable ifFalse:
>>>>        [self error: self printString, ' cannot have variable sized
>>>> instances'].
>>>>    (sizeRequested isInteger and: [sizeRequested >= 0]) ifTrue:
>>>>        ["arg okay; space must be low."
>>>>        self environment signalLowSpace.
>>>>        ^ self basicNew: sizeRequested  "retry if user proceeds"].
>>>>    self primitiveFailed
>>>>
>>>> =============================================================
>>>>
>>>> error: aString
>>>>    "Throw a generic Error exception."
>>>>
>>>>    ^Error new signal: aString
>>>>
>>>> Error: UndefinedObject cannot have variable sized instances
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>
>
>



-- 
Best regards,
Igor Stasenko AKA sig.



More information about the Squeak-dev mailing list