[squeak-dev] Faster FileStream experiments

Igor Stasenko siguctua at gmail.com
Wed Nov 18 11:58:14 UTC 2009


Hello Nicolas,
thanks for taking the time to implement this idea.

Since you are going to introduce something more clever than simple-minded
primitive-based file operations, I think it is worth considering
separate classes for buffering/caching. Let's call them readStrategy,
writeStrategy or cacheStrategy.
The idea is to redirect all read/write/seek operations to a special layer
which, depending on the implementation, chooses whether a given operation
is just a dumb primitive call or something more clever, like read-ahead.
Then all streams (not only file streams) could be created with whichever
strategy the user chooses.
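A rough sketch of such a strategy layer, written in Python rather than Smalltalk so that it can be run standalone. All class and method names here (DirectReadStrategy, ReadAheadStrategy, StrategyStream, read_at) are made up for illustration and are not taken from Squeak or from the attachment:

```python
import io


class DirectReadStrategy:
    """Hypothetical strategy: every read is a plain call on the underlying file."""

    def __init__(self, raw):
        self.raw = raw

    def read_at(self, position, count):
        self.raw.seek(position)
        return self.raw.read(count)


class ReadAheadStrategy:
    """Hypothetical strategy: fetch a whole block and serve later reads from it."""

    def __init__(self, raw, block_size=4096):
        self.raw = raw
        self.block_size = block_size
        self.start = 0    # file offset of the first cached byte
        self.data = b""   # cached bytes

    def read_at(self, position, count):
        end = position + count
        if not (self.start <= position and end <= self.start + len(self.data)):
            # Miss: refill the cache starting at the requested position.
            self.raw.seek(position)
            self.data = self.raw.read(max(count, self.block_size))
            self.start = position
        offset = position - self.start
        return self.data[offset:offset + count]


class StrategyStream:
    """A stream that delegates every read to the strategy it was created with."""

    def __init__(self, strategy):
        self.strategy = strategy
        self.position = 0

    def skip(self, count):
        self.position += count

    def next(self, count=1):
        chunk = self.strategy.read_at(self.position, count)
        self.position += len(chunk)
        return chunk


# The same stream class, created with two different strategies.
payload = bytes(range(256)) * 4
direct = StrategyStream(DirectReadStrategy(io.BytesIO(payload)))
ahead = StrategyStream(ReadAheadStrategy(io.BytesIO(payload), block_size=64))
```

The stream itself never decides how bytes are fetched, so the two instances behave identically from the caller's side and differ only in how often they hit the underlying file.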

About the BufferedFileStream implementation: there is some room for improvement.
The cache should remember its own starting position and size;
then in #skip: you simply do
 self primSetPosition: fileID to: filePosition \\ bufferSize.
without touching the buffer, because you can't predict which
operation follows (it could be another #skip:, a truncate or a close),
any of which makes your read-ahead redundant.

The cache should be refreshed only on a direct read request, when some
of the data that needs to be read
is outside the range covered by the cache.
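To make the rule concrete, here is a minimal sketch in Python (not the actual BufferedFileStream code). CountingFile is a hypothetical stand-in for the file primitives that counts how many bytes are really read, and LazyBufferedReader is an invented name; the buffer size of 16 is arbitrary:

```python
import io


class CountingFile:
    """Hypothetical stand-in for the file primitives; counts bytes actually read."""

    def __init__(self, payload):
        self.raw = io.BytesIO(payload)
        self.bytes_read = 0

    def read_at(self, position, count):
        self.raw.seek(position)
        chunk = self.raw.read(count)
        self.bytes_read += len(chunk)
        return chunk


class LazyBufferedReader:
    def __init__(self, file, buffer_size=16):
        self.file = file
        self.buffer_size = buffer_size
        self.position = 0      # logical stream position
        self.cache_start = 0   # file offset of the first cached byte
        self.cache = b""       # its length is the cached size

    def skip(self, count):
        # Only move the position; the cache is left alone.
        self.position += count

    def next(self):
        cache_end = self.cache_start + len(self.cache)
        if not (self.cache_start <= self.position < cache_end):
            # Refresh only now, when a read really needs uncached data.
            self.cache = self.file.read_at(self.position, self.buffer_size)
            self.cache_start = self.position
            if not self.cache:
                return None  # end of file
        byte = self.cache[self.position - self.cache_start]
        self.position += 1
        return byte


source = CountingFile(bytes(range(100)))
reader = LazyBufferedReader(source, buffer_size=16)
reader.skip(10)
reader.skip(30)                        # two skips in a row
reads_after_skips = source.bytes_read  # nothing has been read yet
first = reader.next()                  # triggers the one and only refill
second = reader.next()                 # served from the cache
```

Neither skip touches the file; the single refill happens on the first read, and the second read is answered from the cache.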
Let me illustrate a case that shows the suboptimal #skip: behavior:

........>........[..........<..........]........

Here, [ ] encloses the cached data,
and > is the file position after the #skip: send.
Then the caller wants to read bytes up to the < marker.
In your case, #skip: refreshes the cache, so part of the data that
was already in the buffer is read again,
while it is possible to reuse the cached data: read only the bytes
between > and [ from the file,
and deliver the rest from the cache.
Also, since after the read request the file pointer points at the < marker,
we are still inside the cache and don't need to refresh it.
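The case in the diagram can be sketched as follows, again in Python and purely as an illustration. The function read_range and its parameters are hypothetical, and the offsets in the example are picked only to resemble the picture above:

```python
import io


def read_range(raw, cache_start, cache, position, count):
    """Return (data, bytes_fetched) for file bytes [position, position + count).

    cache holds the file bytes starting at offset cache_start. Only the part
    of the request that the cache does not cover is fetched from raw.
    """
    end = position + count
    cache_end = cache_start + len(cache)
    if position < cache_start < end <= cache_end:
        # The case in the diagram: fetch the gap in front of the cache,
        # then take the remainder from the cache.
        raw.seek(position)
        gap = raw.read(cache_start - position)
        return gap + cache[:end - cache_start], len(gap)
    if cache_start <= position and end <= cache_end:
        # Fully cached: no file access at all.
        offset = position - cache_start
        return cache[offset:offset + count], 0
    # Anything else: plain read. A real implementation would also decide
    # here whether to refill the cache.
    raw.seek(position)
    data = raw.read(count)
    return data, len(data)


payload = bytes(range(48))
raw = io.BytesIO(payload)
cache_start = 17                 # the [ marker
cache = payload[17:39]           # cached data up to the ] marker
position = 8                     # the > marker, after #skip:
data, fetched = read_range(raw, cache_start, cache, position, 20)  # up to <
```

Only the 9 bytes between > and [ come from the file; the other 11 come from the cache, and the resulting position (28) still lies inside the cached range, so nothing has to be refreshed afterwards.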


2009/11/18 Nicolas Cellier <nicolas.cellier.aka.nice at gmail.com>:
> I just gave a try to the BufferedFileStream.
> As usual, code is MIT.
> Implementation is rough, readOnly, partial (no support for basicNext
> crap & al), untested (certainly has bugs).
> Early timing experiments have shown a 5x to 7x speed up on [stream
> nextLine] and [stream next] micro benchmarks
> See class comment of attachment
>
> Reminder: This bench is versus StandardFileStream.
> StandardFileStream is the "fast" version, CrLf and MultiByte are far worse!
> This still leaves some more room...
>
> Integrating and testing a read/write version is a lot harder than this
> experiment, but we should really do it.
>
> Nicolas
>



-- 
Best regards,
Igor Stasenko AKA sig.


