Writing a large Collection of integers to a file fast
John M McIntosh
johnmci at smalltalkconsulting.com
Sun Jan 27 22:05:35 UTC 2008
Actually, if you open a FileStream you get a MultiByteFileStream.
If the stream is binary, it invokes methods on the superclass
StandardFileStream to put a character or a collection of characters.
However, if it is text, then it proceeds to read or write one character
at a time, and yes, each character causes a discrete file I/O primitive call.
So say you need a UTF8 stream, you have 1 million characters, and you do
    foo nextPutAll: millionCharacterString
This causes 1 million file I/O operations, which takes a *long* time.
In Sophie I coded a SophieMultiByteMemoryFileStream which fronts the
real stream with a buffer the size of the stream. That way the
Translators get/put bytes to the buffer, and at close time I flush
the entire buffer to disk as a binary file. Obviously this is not a
general-purpose solution, since it relies on the fact that in Sophie we
know the UTF8 files we are working with will only be a few MB in size.
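
For anyone who just wants the same effect without a special stream class, here is a rough workspace sketch of the idea: do the UTF8 conversion in memory, then hand the bytes to a binary stream in one call. This is not the Sophie code. The file name and variable names are made up for illustration, and I am assuming an image where String>>squeakToUtf8 and StandardFileStream class>>forceNewFileNamed: are available, so check the selectors in your own image first.

```smalltalk
"Sketch only: encode the whole string in memory, then write it in one binary call.
 'million.txt' and the variable names are hypothetical."
| text utf8 file |
text := String new: 1000000 withAll: $a.
"Convert to UTF8 in memory (assumes String>>squeakToUtf8 exists in this image)."
utf8 := text squeakToUtf8.
file := StandardFileStream forceNewFileNamed: 'million.txt'.
[file binary.
 "Binary mode goes through the bulk write path, not one call per character."
 file nextPutAll: utf8 asByteArray]
	ensure: [file close].
```

Like the buffer approach above, this only makes sense when the whole encoded text fits comfortably in memory.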
On Jan 27, 2008, at 9:38 AM, David T. Lewis wrote:
> Which shows that for the particular VM and image that I was using,
> the majority of the processing time was spent in multibyte character
> conversion and conversion of integers to strings, and less than
> seven percent was spent in I/O primitives.
John M. McIntosh <johnmci at smalltalkconsulting.com>
Corporate Smalltalk Consulting Ltd. http://www.smalltalkconsulting.com