[squeak-dev] Re: squeak XTream

Thu Dec 3 01:35:40 UTC 2009

2009/12/2 Nicolas Cellier <nicolas.cellier.aka.nice at gmail.com>:
> 2009/12/2 Eliot Miranda <eliot.miranda at gmail.com>:
>>
>>
>> On Wed, Dec 2, 2009 at 10:49 AM, Colin Putney <cputney at wiresong.ca> wrote:
>>>
>>> On 2-Dec-09, at 8:26 AM, Nicolas Cellier wrote:
>>>
>>>> Xtream is not functional yet; it is just a three-evening shot.
>>>
>>> At the rate you're going, you'll have caught up to the functionality in
>>> Filesystem in no time. ;-)
>>>
>>>> Pipelines especially are quite tricky with a forked process... I need
>>>> to rest a bit and think.
>>>> This kind of implementation inherently has good parallelism properties;
>>>> unfortunately it won't exploit multi-core processors any time soon
>>>> in Smalltalk...
>>>
>>> I'm guessing you mean running each stage of the stream in a separate
>>> Smalltalk Process, using the Pipes and Filters pattern?
>>> Stephen Pair did some neat stuff with that a few years ago. It's indeed
>>> tricky. I wonder if it's worth it in this case, though, exactly because
>>> Smalltalk doesn't exploit multiprocessors. Flow of control inside a stream
>>> might be complicated without parallelism, but it's probably easier to debug.
>>>
>>>> A few months ago, I implemented a simple Wrapper-like scheme, but was
>>>> not satisfied with end-of-stream handling. Both EndOfStream exception
>>>> capture and atEnd tests are expensive when processing elements one by one.
>>>> Maybe I'll have to fall back on such a simpler scheme, though.
>>>
>>> I don't understand the issue with EndOfStream exceptions. Throwing and
>>> catching an exception is expensive, yes, but that should happen only once,
>>> right? Unless you're setting up an exception handler inside a loop, the expense
>>> of a single exception shouldn't be a problem.
>>
>> Exception search and delivery is, uh, /expensive/.  The cost of propagating
>> an EndOfStream exception to its defaultAction and returning nil is huge
>> compared to simply answering an end-of-stream value.  So unless one really
>> wants exception handling one should strive to avoid raising EndOfStream
>> exceptions at end of stream.
>> I think in VisualWorks we noticed the extreme expense in the ChangeList
>> scanner where one is creating lots of streams on strings corresponding to
>> each chunk.  The end-of-stream exceptions on all these streams when doing
>> something like scanning a changes file would add up to a significant
>> percentage of the entire parse time.  So believe me, it does add up.
>>
>
> Yes, I arrived at the same conclusion:
> An exception beats an atEnd test quite soon, even for small collections.
> An exception is worse than an == endMark test except for very big
> collections (large files).
>
> A Notification with a default ^nil action is the worst thing possible, both
> for efficiency (whole stack walk) and for not scaling well in
> complexity (an upper stream catching an uncaught notification that
> should have returned nil in a low-level function using streams...)
>
> I arrived at a similar conclusion, but prefer an endOfStreamAction to an
> endOfStreamValue because I like to be able to use a home return
> sometimes
>    stream endOfStreamAction: [^self]
> and I still get the endOfStreamValue at not much higher cost:
>    stream endOfStreamAction: nil->endOfStreamValue
>

Yes. Using some state ivar (be it endOfStreamAction or
endOfStreamValue) and sending #value to it on reaching end of stream is
the most flexible and least expensive thing I can imagine.
A stream user could then put a non-local return in the block, signal an
exception, or simply answer nil... whatever he may desire.
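
To make the idea concrete, here is a rough sketch. It is written in
Python rather than Smalltalk, and every name in it (ReadStream,
end_of_stream_action, raise_end_of_stream, sum_until_end) is made up
for illustration; it is not Xtream's code. Python has no non-local
return from a block, so that variant is only approximated here by an
action that raises.

```python
class EndOfStream(Exception):
    """Raised only when the installed end-of-stream action chooses to raise."""


class ReadStream:
    """Minimal read stream with a pluggable end-of-stream action."""

    def __init__(self, items, end_of_stream_action=lambda: None):
        self._items = list(items)
        self._position = 0
        # Hypothetical counterpart of the endOfStreamAction ivar:
        # any zero-argument callable, invoked when reading past the end.
        self.end_of_stream_action = end_of_stream_action

    def next(self):
        if self._position >= len(self._items):
            # No handler search on the stack: just call what the user plugged in.
            return self.end_of_stream_action()
        item = self._items[self._position]
        self._position += 1
        return item


def raise_end_of_stream():
    """An action for users who really do want an exception at end of stream."""
    raise EndOfStream()


def sum_until_end(stream, end_mark):
    """Consume the stream with one identity test per element against end_mark."""
    total = 0
    while True:
        item = stream.next()
        if item is end_mark:
            return total
        total += item
```

With a unique sentinel such as END = object(), calling
sum_until_end(ReadStream([1, 2, 3], lambda: END), END) answers 6 and
pays only for one identity comparison per element. The default action
answers None, and passing raise_end_of_stream gives the exception
behaviour to those who ask for it.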

> Cheers
>
> Nicolas
>
>>>
>>>> Definitely, we should exchange code/ideas.
>>>
>>> Agreed. We may find that doing parallel development with lots of
>>> cross-pollination is the best way to explore the design space.
>>>
>>> Colin
>>>

-- 
Best regards,
Igor Stasenko AKA sig.