(ReadStream on: #(1 2 3 4 5) from: 2 to: 4) reset next

Richard A. O'Keefe ok at cs.otago.ac.nz
Mon Aug 11 00:55:14 UTC 2003


Let's see what we can find in the ANSI Smalltalk standard
(ANSI INCITS 319-1998).
"ReadStream" implements the <ReadStream factory> protocol.
<ReadStream> is a subprotocol of <gettableStream> and <collectionStream>,
and <collectionStream> is a subprotocol of <sequencedStream>.

From <sequencedStream>

close
    (break link between stream and underlying collection)

contents
    (return a collection containing the past and future sequence values
    in order; when there is an underlying collection it might be the
    same thing or it might not).  <collectionStream> spells out that
    the result either is the same object as the underlying collection
    or belongs to the same class as the result of #select: would have.

isEmpty
    returns true if both the set of past and future sequence values of
    the receiver are empty, otherwise returns false.

*** From this, it is quite clear that (ReadStream on: #(1 2 3)) isEmpty
*** may not be anything other than false.  It doesn't matter *what* the
*** position of the stream is; if it's at the beginning there are three
*** future values; if it's at the end there are three past values; only
*** if past and future values are BOTH empty should true be answered.

position
    Answer number of past sequence values.

position: amount
    Should have (amount between: 0 and: receiver contents size); makes
    the number of past sequence values be amount.

reset
    Seems to be equivalent to position: 0.

setToEnd
    Seems to be equivalent to position: receiver contents size.

From <gettableStream>

atEnd
    For ReadStream, seems to be equivalent to receiver position =
    receiver contents size.

do:
next
    Further specified in <ReadStream>
next:
nextLine
nextMatchFor:
peek
peekFor:
skip:
skipTo:
upTo:
    Further specified in <ReadStream>
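
The positioning messages summarised above can be tried in a workspace.
This is only a sketch: the comments state what the ANSI definitions
imply for a stream created with #on:, and the array #(10 20 30) is just
an illustrative value.

```smalltalk
| s n |
s := ReadStream on: #(10 20 30).
n := s contents size.    "3"
s position: 2.           "two past values: 10 and 20"
s next.                  "30, the only remaining future value"
s atEnd.                 "true, since position = contents size"
s reset.
s position.              "0, the same effect as position: 0"
s setToEnd.
s position = n.          "true, the same effect as position: n"
```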

Ah, but maybe there is something strange hidden in
ReadStream class>>on:.

5.9.9 Protocol: <ReadStream factory>
5.9.9.1 Message: on: aCollection
  Synopsis
    Returns a stream that reads from the given collection.
  Definition:
    Returns an object conforming to <ReadStream> whose future sequence
    values initially consist of the elements of aCollection and which
    initially has no past sequence values.  The ordering of the sequence
    values is the same as the ordering used by #do: when sent to
    aCollection.  The stream backing store of the returned object is
    aCollection.
  Parameters:
    [aCollection must be a sequenced readable collection, and the resulting
    ReadStream object keeps a reference to it.]

OK, so we've sorted out that
    s := ReadStream on: #(1 2 3).
    s position			MUST BE 0			PASS
    s isEmpty			MUST BE false			FAIL!!!

Looking at PositionableStream, we find

    isEmpty
        "Answer whether the receiver's contents has no elements."
        ^position = 0

The comment and the code disagree.  The comment *is* compatible with
ANSI Smalltalk.  The code is *not*.  I think it should be

        ^readLimit = 0
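
As a rough check, the three cases that the standard's definition of
#isEmpty distinguishes can be written as workspace expressions.  The
comments say what ANSI requires, not what any particular image answers.
With ^position = 0 the first expression answers true; with
^readLimit = 0 all three should agree with the comments.

```smalltalk
| s |
s := ReadStream on: #(1 2 3).
s isEmpty.                      "ANSI: false (three future values)"
s setToEnd.
s isEmpty.                      "ANSI: false (three past values)"
(ReadStream on: #()) isEmpty.   "ANSI: true (no past or future values)"
```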


What about
    (ReadStream on: #(1 2 3 4 5) from: 2 to: 4) reset next
There we are in muddy water, because there is no such method in ANSI
Smalltalk as ReadStream class>>on:from:to:.

The least surprising behaviour would be:
    (1) Like ReadStream on: (#(1 2 3 4 5) copyFrom: 2 to: 4)
        or ReadStream on: (#(1 2 3 4 5) collect: [:each | each] from: 2 to: 4)
        its contents should be #(2 3 4).
FAIL	(ReadStream on: #(1 2 3 4 5) from: 2 to: 4) contents
	is #(1 2 3 4): neither the expected #(2 3 4) nor the whole
	of #(1 2 3 4 5).

	This is really surprising, because
	(ReadStream on: ... from: 2 to: 4) next
	is 2, not 1.  

	The immediate problem is that ReadStream just inherits #contents
	from PositionableStream, where we find
	
	contents
	  "Answer with a copy of my collection from 1 to readLimit."
	  ^collection copyFrom: 1 to: readLimit

	This is a perfect example of a comment we'd be better without.
	
	The underlying problem is that ReadStream has no instance variable
	to record the lower bound of the reading area, so not only does it
	have nothing else but 1 to use in #contents, it has nothing else
	to use in #reset.

    (2) Like ReadStream on: #(1 2 3 4 5), the underlying (backing)
        collection should be #(1 2 3 4 5) itself, not some copy of (part of) it.
	PASS.

    (3) Like ReadStream on: (#(1 2 3 4 5) copyFrom: 2 to: 4)
        (ReadStream on: #(1 2 3 4 5) from: 2 to: 4) position
        should be 0.
FAIL	It's 1, because it is measured against the full underlying
	sequence, not the from: 2 to: 4 part of it.

The nasty thing here is that
    rs := ReadStream on: someArray from: here to: there.
    o1 := OrderedCollection new.
    rs do: [:each | o1 add: each].
    rs reset.
    o2 := rs contents asOrderedCollection.
    o1 = o2
will in general fail.
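
For a concrete instance, take the array and bounds from the subject
line.  The comments assume the behaviour described in this message
(#contents copying from 1 to readLimit); an image that implements
#on:from:to: differently may answer something else.

```smalltalk
| rs o1 o2 |
rs := ReadStream on: #(1 2 3 4 5) from: 2 to: 4.
o1 := OrderedCollection new.
rs do: [:each | o1 add: each].           "o1 holds 2 3 4"
rs reset.
o2 := rs contents asOrderedCollection.   "with the inherited #contents: 1 2 3 4"
o1 = o2.                                 "false under that assumption"
```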

I think it's fair to say that #on:from:to: is only half finished, which
may be why I found PositionableStream and ReadStream hard to understand.
I couldn't figure out where the special magic was happening, and was not
at that time sufficiently cynical to suspect that it might not be
happening anywhere.

Oddly enough, most of the senders of #on:from:to: in the image have
a starting point of 1, so wouldn't notice the problem, and most of
the rest are in a context where the stream is traversed just once,
so no-one cares what the result of #contents or the effect of #reset
would be.

I can't see any easy way to fix this short of adding another instance
variable to ReadStream (or quite possibly to PositionableStream, because
WriteStream also offers on:from:to:), but most methods would not be
affected, only a few like position[:], reset, isEmpty, contents, ...
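
To make that concrete, here is a rough sketch of what the affected
methods might look like.  Everything in it is an assumption: readStart
is a hypothetical new instance variable holding the number of elements
of the collection that lie before the reading area, and none of these
method bodies is taken from the image.

```smalltalk
"Hypothetical sketch only; readStart does not exist in the image."

on: aCollection from: firstIndex to: lastIndex
    collection := aCollection.
    "Clamp both bounds so the reading area lies inside the collection."
    readStart := ((firstIndex - 1) max: 0) min: collection size.
    readLimit := (lastIndex min: collection size) max: readStart.
    position := readStart

position
    "Answer the number of past sequence values."
    ^position - readStart

position: anInteger
    "Make the number of past sequence values be anInteger."
    (anInteger between: 0 and: readLimit - readStart)
        ifFalse: [^self error: 'position out of bounds'].
    position := readStart + anInteger

reset
    position := readStart

isEmpty
    "Answer whether there are neither past nor future sequence values."
    ^readLimit = readStart

contents
    ^collection copyFrom: readStart + 1 to: readLimit
```

The position variable itself stays an index into the whole collection
in this sketch, so methods that only step it forward need not change.
Any method that hands the raw variable to #position: would need review.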


