Writing a large Collection of integers to a file fast

John M McIntosh johnmci at smalltalkconsulting.com
Sun Jan 27 22:05:35 UTC 2008


Actually if you open a FileStream you get a MultiByteFileStream  
instance.

If the stream is binary, it invokes methods on the superclass
StandardFileStream to put a character or a collection of characters.

However, if it is text, it reads or writes one character at a time,
and yes, each character causes a discrete file I/O primitive call.


So say you need a UTF8 stream, you have 1 million characters, and
you say
foo nextPutAll: millionCharacterString

This causes 1 million file I/O operations, which takes a *long* time.

In Sophie I coded a SophieMultiByteMemoryFileStream, which fronts the
real stream with a buffer the size of the stream. That way the
translators get/put bytes to the buffer, and at close time I flush
the entire buffer to disk as a binary file. Obviously this is not a
general-purpose solution, since it relies on the fact that in Sophie
we know the UTF8 files we are working with will only be a few MB in
size.
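
The same idea can be tried from a workspace. This is only a minimal
sketch, not the Sophie code: it assumes a Squeak image where
String>>convertToEncoding: is available, the file name is made up, and
the test string is plain ASCII. The conversion happens in memory and
the result goes to disk through a binary stream in one nextPutAll:.

```smalltalk
| text bytes file |
"Stand-in for the real data: one million ASCII characters."
text := String new: 1000000 withAll: $a.
"Do the UTF-8 conversion entirely in memory (assumes convertToEncoding: exists in this image)."
bytes := (text convertToEncoding: 'utf-8') asByteArray.
"Hypothetical file name. The stream is switched to binary so the
 bytes are handed over as a collection, not character by character."
file := FileStream forceNewFileNamed: 'million.txt'.
[file binary.
 file nextPutAll: bytes]
	ensure: [file close].
```

Like the Sophie stream, this holds the whole file in memory, so it
only makes sense when the data is known to be small.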


On Jan 27, 2008, at 9:38 AM, David T. Lewis wrote:

> Which shows that for the particular VM and image that I was using,
> the majority of the processing time was spent in multibyte character
> conversion and conversion of integers to strings, and less than
> seven percent was spent in I/O primitives.
>
> Dave
>
>

--
========================================================================
John M. McIntosh <johnmci at smalltalkconsulting.com>
Corporate Smalltalk Consulting Ltd.  http://www.smalltalkconsulting.com
========================================================================
