FilterStreams prototype
Andrew C. Greenberg
werdna at gate.net
Sat Apr 3 16:34:40 UTC 1999
>Hello Squeakers,
>
>in an earlier message ( Musings about "ma" ) I mentioned a simple OO
>pipe-and-filter implementation. I have now finally started porting
>this stuff from Objective-C to Squeak. Do you think this is
>worthwhile pursuing further or just another crazy idea? :-)
Look, we're all nuts, so don't consider that the notions of "crazy
idea" and "worthwhile pursuing" are mutually exclusive. :-)
>Motivation and Rationale
>------------------------
>
>The motivation for this package was the lack of reusability of
>printing and other encoding tasks. The problem is that each method
>taking part in an encoding task hard-codes the specific encoding
>task, for example a collection receiving a #printOn: message sends
>#printOn: to its elements.
>
>The problem becomes apparent when trying to implement a #debugOn:
>message that only differs from #printOn: for a couple of objects. At
>first, this seems easy, defining #debugOn: as a rename of #printOn:
>in Object, and implementing #debugOn: for those objects that should
>behave differently, let's say Forms. However, this fails as soon as
>there are nested objects to debug, for example a collection of Forms.
> The collection receives the #debugOn: message, which is translated
>to #printOn: and this in turn sends #printOn: to all the Forms, not
>#debugOn: !
This was similar to the bug I found and fixed in the printOut
mechanism that kept HtmlFileStreams from working properly.
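A minimal sketch of the nested case Marcel describes, written in Python rather than Smalltalk. The names Thing, Form, Bag and debug_string are hypothetical stand-ins for illustration, not Squeak classes:

```python
class Thing:
    """Stand-in for Object: debugging defaults to printing."""

    def print_on(self, out):
        out.append(type(self).__name__)

    def debug_on(self, out):
        # "Rename" of print_on: simply forwards to it.
        self.print_on(out)


class Form(Thing):
    def debug_on(self, out):
        # The one class that wants different debug output.
        out.append("Form<debug details>")


class Bag(Thing):
    """Stand-in for a collection."""

    def __init__(self, items):
        self.items = items

    def print_on(self, out):
        out.append("(")
        for item in self.items:
            # The operation is hard-coded here, so a debug request
            # arriving at the Bag is silently downgraded to printing.
            item.print_on(out)
        out.append(")")


def debug_string(obj):
    out = []
    obj.debug_on(out)
    return " ".join(out)
```

Under these assumptions, debug_string(Form()) uses the override, but debug_string(Bag([Form()])) never reaches it, because the Bag forwards print_on to its elements.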
>On an abstract level, this could be solved by adding 'subclassing'
>of operations that works just like subclassing of objects with a
>similar treatment of self (in this case, the current operation).
I disagree -- I think subclassing doesn't work well at all in
facilitating a reusable filter-and-pipeline model. It was precisely
this concern that led to the HtmlFileStream bug I mentioned earlier!
A filter pipeline implemented by subclassing isn't terribly reusable
because each element is "stuck" in its hierarchy -- it can't easily
be reused by being "inserted" into another pipeline. The primary virtue of
pipes and filters is the ability to develop small hunks of code that
do "one thing well," changing inputs into outputs (or generating
outputs on demand from inputs), and which can be composed into bigger
programs in the archetypal Unix model.
The real problem with subclassing implementations of pipe and filters
is that the hierarchy elements must know about each other to avoid
"restarting" the pipeline. Here's an example from HtmlFileStream.
The idea was to subclass FileStream, but to convert all text coming
in into HTML on the fly, changing HTML command characters into the
corresponding HTML escape codes to generate literal text, and the like. A
separate facility was created to permit "quoting" of input, using a
method called #verbatim:, so HTML commands could be interspersed with
the output. #verbatim: was implemented simply, thus:

    verbatim: aString
        "Put out the string without HTML conversion."
        super verbatim: aString

This relied upon a similar facility in the superclass. Unfortunately for
HtmlFileStream, #verbatim: there was implemented by sending
"self nextPutAll:", which dispatched to
HtmlFileStream>>nextPutAll: and not to the version in the superclass.
Once identified, the bug was trivially easy to fix, but it was a
trick to identify, mostly because you couldn't debug HtmlFileStream
without understanding its place in the hierarchy. This is not the
way a filter should be developed. Filters should do one thing well,
and know nothing about their environment except that they take some
input from an input stream and transform it to produce some output
on an output stream.
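The dispatch problem can be reproduced in a few lines. This is a sketch in Python rather than Smalltalk, and PlainStream, HtmlStream and FixedHtmlStream are hypothetical stand-ins for illustration, not the Squeak classes or the actual fix:

```python
import html


class PlainStream:
    """Stand-in for the superclass stream."""

    def __init__(self):
        self.parts = []

    def next_put_all(self, text):
        self.parts.append(text)

    def verbatim(self, text):
        # Sent to self, so a subclass override of next_put_all wins.
        self.next_put_all(text)

    def contents(self):
        return "".join(self.parts)


class HtmlStream(PlainStream):
    """Escapes everything written through next_put_all."""

    def next_put_all(self, text):
        super().next_put_all(html.escape(text))

    def verbatim(self, text):
        # Looks harmless, but the superclass sends next_put_all to self,
        # which lands back in the escaping override above.
        super().verbatim(text)


class FixedHtmlStream(HtmlStream):
    def verbatim(self, text):
        # One possible repair: name the unescaped write explicitly.
        PlainStream.next_put_all(self, text)
```

With these definitions, HtmlStream escapes the markup passed to verbatim, while FixedHtmlStream passes it through untouched. Note that the repair only works because the subclass knows how its superclass is written, which is exactly the coupling a filter should not need.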
Filters appear much better adapted by means of a Composition.
Thoughts on Marcel's solution.
I recognize that I am looking at a slightly different problem than
Marcel's, but I fear his solution does not scale well for Smalltalk
-- at least to reach the general classes of problems he is
suggesting. In particular, since each new FilterStream must create a
message in Object, this tends to make things highly non-reusable, and
invites substantial name conflicts and entanglements that would make
it wholly unsuitable as a general purpose tool for combining
Ritchie-esque tools.
FilterStream isn't a Stream, as that term is understood in
Squeak: it is not a subclass of Stream and doesn't follow the
protocols of Stream. So it probably shouldn't be called
"FilterStream."
In particular, I am also not thrilled with the idea of a new
operation #write: used in the context of something called
FilterStream. As I saw it, a FilterStream class would have instances
that represent a filter pipeline -- sort of a super-Stream. Each
node of the pipeline would implement a Ritchie-esque
do-one-thing-well filter, knowing only that it would have two
objects, inStream and outStream, which it could treat using
traditional Stream protocols #next and #nextPut:, and their progeny.
The FilterStream would handle the mechanics of coroutine management
and buffering throughout the pipeline, and would permit both
push-based and pull-based pipelining.
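A rough sketch of that compositional idea, in Python rather than Smalltalk. The names escape_html, upcase and pipeline are assumptions made for illustration. Python generators stand in for the coroutine management, and only the pull-based direction is shown:

```python
import html


def escape_html(in_stream):
    """Filter: knows only its input; yields HTML-escaped chunks."""
    for chunk in in_stream:
        yield html.escape(chunk)


def upcase(in_stream):
    """Filter: knows only its input; yields upper-cased chunks."""
    for chunk in in_stream:
        yield chunk.upper()


def pipeline(*filters):
    """Compose filters left to right into a single filter."""

    def run(in_stream):
        stream = iter(in_stream)
        for make_stage in filters:
            # Each stage pulls from the previous one on demand.
            stream = make_stage(stream)
        return stream

    return run
```

Here pipeline(upcase, escape_html) builds one pipeline, and upcase can be dropped unchanged into any other, because no stage refers to its neighbours or to a class hierarchy.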
Thus, I saw the notion of a filter pipeline as an independent
instance of a collection of objects, building on the Stream
protocols, rather than building the pipeline into the Object
hierarchy itself. Marcel's solution seems, to me, to exacerbate
rather than facilitate difficulties of reusing such filters.
I accept, however, that it is likely that I am just not "getting it,"
or that Marcel is solving an entirely different problem (apparently
dealing with reflective properties of Smalltalk) that MUST be built
into an object hierarchy to work. To my mind, however, it would be
dangerous and confusing to call such a thing a "FilterStream."