Writing a large Collection of integers to a file fast
David T. Lewis
lewis at mail.msen.com
Sun Jan 27 17:38:08 UTC 2008
On Sat, Jan 26, 2008 at 11:38:21PM +0100, nicolas cellier wrote:
> Bert Freudenberg wrote:
> >Try printing to a memory buffer in chunks of 10000 integers, and putting
> >that on the file. Unbuffered I/O is slow.
> Any reason why Squeak should use unbuffered I/O?
> It seems strange that we have to emulate a basic function that every
> underlying OS performs so well.
The Windows VM uses direct I/O to a Windows HANDLE, and all other
VMs are using buffered I/O. I have never seen any measurement data
to show that one is better than the other, or under what circumstances
one might be better than the other. I certainly would not assume
anything without seeing the numbers.
It would be straightforward to implement either approach on any
of the supported platforms, so I assume that these were simply
design choices of the individual VM implementers.
Furthermore, it is not necessarily the case that file I/O is the
performance bottleneck in this case. I did a quick check of this:
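  "Profiler wrapper assumed: the opening bracket implies an enclosing
   block, but the original call is missing from the archived message."
  MessageTally spyOn: [
    aFilename := 'foo.txt'.
    aLargeCollection := 1 to: 100000.
    aFile := CrLfFileStream fileNamed: aFilename.
    aLargeCollection do: [ :int |
      aFile nextPutAll: int printString, String cr].
    aFile close]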
This showed that, for the particular VM and image I was using,
most of the processing time was spent in multibyte character
conversion and in converting integers to strings; less than
seven percent was spent in I/O primitives.
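For comparison, the chunked approach Bert suggested could be sketched roughly as follows. This is only an illustration, not a measured improvement: the chunk size of 10000 comes from his message, while the use of FileStream forceNewFileNamed: (which overwrites any existing file), String streamContents:, and the variable names are my own assumptions. Printing with printOn: straight into the in-memory buffer also avoids building a separate string per integer, which may matter given where the time went in the check above.

```smalltalk
"Sketch only: names and chunk size are illustrative assumptions."
| aFile numbers chunkSize |
numbers := 1 to: 100000.
chunkSize := 10000.
"Assumption: overwrite foo.txt if it already exists."
aFile := FileStream forceNewFileNamed: 'foo.txt'.
[1 to: numbers size by: chunkSize do: [:start | | chunk |
	"Build one chunk of up to chunkSize integers in memory."
	chunk := String streamContents: [:buffer |
		start to: (start + chunkSize - 1 min: numbers size) do: [:i |
			(numbers at: i) printOn: buffer.
			buffer nextPut: Character cr]].
	"One write to the file per chunk instead of one per integer."
	aFile nextPutAll: chunk]]
		ensure: [aFile close].
```

Wrapping both variants in the same profiler or in Time millisecondsToRun: would show whether the chunking makes any difference on a given VM and image.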