Writing a large Collection of integers to a file fast

David T. Lewis lewis at mail.msen.com
Sun Jan 27 17:38:08 UTC 2008


On Sat, Jan 26, 2008 at 11:38:21PM +0100, nicolas cellier wrote:
> Bert Freudenberg wrote:
> >
> >Try printing to a memory buffer in chunks of 10000 integers, and putting 
> >that on the file. Unbuffered I/O is slow.
> >
> 
> Any reason why Squeak should use unbuffered I/O?
> 
> It seems strange that we have to emulate a basic function that every
> underlying OS performs so well.

The Windows VM uses direct I/O to a Windows HANDLE, and all other
VMs are using buffered I/O. I have never seen any measurement data
to show that one is better than the other, or under what circumstances
one might be better than the other. I certainly would not assume
anything without seeing the numbers.

It would be straightforward to implement either approach on any
of the supported platforms, so I assume that these were simply
design choices of the individual VM implementers.

Furthermore, it is not necessarily the case that file I/O is the
performance bottleneck in this case. I did a quick check of this:

  TimeProfileBrowser onBlock:
    [aFilename := 'foo.txt'.
    aLargeCollection := 1 to: 100000.
    aFile := CrLfFileStream fileNamed: aFilename.
    aLargeCollection do: [ :int |
      aFile nextPutAll: int printString, String cr].
    aFile close]

This shows that, for the particular VM and image I was using,
the majority of the processing time was spent in multibyte character
conversion and in converting integers to strings, and less than
seven percent was spent in I/O primitives.
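For anyone who wants to experiment, here is a rough sketch that combines
Bert's chunking suggestion with printing each integer directly onto an
in-memory stream, so no intermediate String is created per integer. It has
not been measured. It assumes that StandardFileStream skips the multibyte
conversion in your image, which may not hold for every version, and the
buffer size of 65536 is an arbitrary choice. Note also that it writes bare
CR line ends rather than going through CrLfFileStream's line-end handling.

```smalltalk
"Sketch only: buffer 10000 integers at a time in memory, then write each chunk."
| file buffer count |
file := StandardFileStream forceNewFileNamed: 'foo.txt'.
buffer := WriteStream on: (String new: 65536).
count := 0.
[(1 to: 100000) do: [:int |
    int printOn: buffer.               "print straight onto the buffer"
    buffer nextPut: Character cr.
    count := count + 1.
    count = 10000 ifTrue: [
        file nextPutAll: buffer contents.   "one write per chunk"
        buffer := WriteStream on: (String new: 65536).
        count := 0]].
 file nextPutAll: buffer contents]     "write whatever is left over"
    ensure: [file close]
```

Wrapping both this and the original loop in TimeProfileBrowser onBlock:
should show whether either change matters on a given VM and image.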

Dave



