[squeak-dev] Faster FileStream experiments
siguctua at gmail.com
Wed Nov 18 11:58:14 UTC 2009
Thanks for taking the time to implement this idea.
Since you are going to introduce something more clever than simple-minded
primitive-based file operations, I think it's worth thinking about
creating separate classes
for buffering/caching. Let's call it readStrategy, or writeStrategy or
The idea is to redirect all read/write/seek operations to a special layer
which, depending on the implementation, could choose whether a given
operation is just a dumb primitive call
or something more clever, like read-ahead.
So then all streams (not only file streams) could be created using
whichever strategy the user wants.
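To make the idea of a separate strategy layer concrete, here is a minimal sketch. It is written in Python rather than Squeak purely for illustration, and every name in it (DirectRead, ReadAhead, StrategyStream, block_size) is hypothetical, not part of any existing Squeak class. The stream only tracks its position; the strategy object decides whether a read is a plain primitive call or served from a read-ahead block.

```python
import io


class DirectRead:
    """Hypothetical strategy: every read goes straight to the backing file."""

    def read(self, raw, position, count):
        raw.seek(position)
        return raw.read(count)


class ReadAhead:
    """Hypothetical strategy: fetch a whole block, serve later reads from it."""

    def __init__(self, block_size=4096):
        self.block_size = block_size
        self.start = 0    # file offset of the first cached byte
        self.data = b""   # cached bytes

    def read(self, raw, position, count):
        end = position + count
        covered = self.start <= position and end <= self.start + len(self.data)
        if not covered:
            # Refill the block starting at the requested position.
            raw.seek(position)
            self.data = raw.read(max(count, self.block_size))
            self.start = position
        offset = position - self.start
        return self.data[offset:offset + count]


class StrategyStream:
    """Stream that delegates all reading to whatever strategy it was given."""

    def __init__(self, raw, strategy):
        self.raw = raw
        self.strategy = strategy
        self.position = 0

    def next(self, count=1):
        chunk = self.strategy.read(self.raw, self.position, count)
        self.position += len(chunk)
        return chunk

    def skip(self, count):
        # Moving the position never touches the strategy's cache.
        self.position += count
```

Both strategies return the same bytes; only the number of calls reaching the backing file differs, so the choice can be left to whoever creates the stream.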
About the BufferedFileStream implementation, there is some room for improvement:
the cache should remember its own starting position + size;
then at #skip: you simply do
self primSetPosition: fileID to: filePosition \\ bufferSize.
without touching the buffer, because you can't predict what the next
operation will be (it can be another #skip:, or truncate, or close),
any of which makes your read-ahead redundant.
The cache should be refreshed only on a direct read request, when some
of the data that needs to be read
is outside the range covered by the cache.
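The lazy #skip: behavior described above can be sketched as follows. This is a Python illustration under stated assumptions, not the BufferedFileStream code: CountingFile, LazyBufferedReader, read_at and buffer_size are hypothetical names, and the read counter exists only to show when the backing file is actually touched.

```python
import io


class CountingFile:
    """Hypothetical wrapper that counts raw reads issued to the backing file."""

    def __init__(self, data):
        self.raw = io.BytesIO(data)
        self.reads = 0

    def read_at(self, position, count):
        self.reads += 1
        self.raw.seek(position)
        return self.raw.read(count)


class LazyBufferedReader:
    """Read-only reader whose cache remembers its own start and size."""

    def __init__(self, file, buffer_size=16):
        self.file = file
        self.buffer_size = buffer_size
        self.position = 0       # logical file position
        self.cache_start = 0    # file offset of the first cached byte
        self.cache = b""        # cached bytes; len(cache) is the cache size

    def skip(self, count):
        # Only move the position; the next operation may be another skip,
        # so refilling the buffer here could be wasted work.
        self.position = max(0, self.position + count)

    def next(self, count=1):
        cache_end = self.cache_start + len(self.cache)
        covered = (self.cache_start <= self.position
                   and self.position + count <= cache_end)
        if not covered:
            # Refresh only now, when a read falls outside the cached range.
            self.cache = self.file.read_at(
                self.position, max(count, self.buffer_size))
            self.cache_start = self.position
        offset = self.position - self.cache_start
        chunk = self.cache[offset:offset + count]
        self.position += len(chunk)
        return chunk
```

With this shape, any number of consecutive skips costs no raw reads at all; the file is touched only when a read cannot be served from the remembered range.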
Let me illustrate a case that shows the suboptimal #skip: behavior:
Here, [ ] encloses the cached data,
and > is the file position after the #skip: send.
Then the caller wants to read bytes up to the < marker.
In your case, #skip: will refresh the cache, causing part of the data that
was already in the buffer to be read again,
while it is possible to reuse the already cached data: read only the bytes
between > and [ ,
and deliver the rest from the cache.
Also, since after the read request the file pointer will point at the < marker,
we are still inside the cache and don't need to refresh it.
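As a worked example of the case above, the following Python sketch fetches only the bytes between > and [ and takes the rest from the cache. The function name read_reusing_cache, the read_at callable and the concrete offsets are assumptions chosen for illustration; only the overlap case from the description is optimized, and every other case falls back to a plain read.

```python
def read_reusing_cache(read_at, position, count, cache_start, cache):
    """Return (data, bytes_fetched_from_file) for a read of `count` bytes.

    read_at(position, count) is a hypothetical raw-read callable;
    cache holds the bytes of the file starting at offset cache_start.
    """
    end = position + count
    cache_end = cache_start + len(cache)

    if position < cache_start < end <= cache_end:
        # The request starts before the cache and ends inside it:
        # fetch only the gap, then append the cached prefix.
        gap = read_at(position, cache_start - position)
        return gap + cache[:end - cache_start], len(gap)

    if cache_start <= position and end <= cache_end:
        # Fully covered by the cache: no file access at all.
        offset = position - cache_start
        return cache[offset:offset + count], 0

    # Anything else: plain read (a real stream would also refill its cache).
    data = read_at(position, count)
    return data, len(data)


# Example: the cache covers offsets 20..39, #skip: left the position at 12 (>),
# and the caller reads up to offset 30 (<), i.e. 18 bytes.
file_bytes = bytes(range(64))
calls = []


def read_at(position, count):
    calls.append((position, count))
    return file_bytes[position:position + count]


data, fetched = read_reusing_cache(read_at, 12, 18, 20, file_bytes[20:40])
```

Here only the 8 bytes at offsets 12..19 come from the file, and the position after the read (30) still lies inside the cached range, so the cache does not need to be refreshed.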
2009/11/18 Nicolas Cellier <nicolas.cellier.aka.nice at gmail.com>:
> I just gave a try to the BufferedFileStream.
> As usual, code is MIT.
> Implementation is rough, readOnly, partial (no support for basicNext
> crap & al), untested (certainly has bugs).
> Early timing experiments have shown a 5x to 7x speed up on [stream
> nextLine] and [stream next] micro benchmarks
> See class comment of attachment
> Reminder: This bench is versus StandardFileStream.
> StandardFileStream is the "fast" version, CrLf and MultiByte are far worse!
> This still leaves some more room...
> Integrating and testing a read/write version is a lot harder than this
> experiment, but we should really do it.
Igor Stasenko AKA sig.