[Newbies] Re: Binary file I/O performance problems
nicolas cellier
ncellier at ifrance.com
Fri Sep 5 21:00:07 UTC 2008
Yoshiki Ohshima wrote:
> At Fri, 5 Sep 2008 10:59:03 -0700,
> David Finlayson wrote:
>> I re-wrote the test application to load the test file entirely into
>> memory before parsing the data. The total time to parse the file
>> decreased by about 50%. Now that I/O is removed from the picture, the
>> new bottleneck is turning bytes into integers (and then integers into
>> Floats).
>>
>> I know that Smalltalk isn't the common language for number crunching,
>> but if I can get acceptable performance out of it, then down the road
>> I would like to tap into the Croquet environment. That is why I am
>> trying to learn a way that will work.
>
> If the integers or floats are in the layout of C's int[] or float[],
> there is a better chance to make it much faster.
>
> Look at the method Bitmap>>asByteArray and
> Bitmap>>copyFromByteArray:. You can convert a big array of non-pointer
> words from/to a byte array.
>
> data := (1 to: 1000000) as: FloatArray.
> words := Bitmap new: data size.
> words replaceFrom: 1 to: data size with: data.
> bytes := words asByteArray.
>
> "and you write out the bytes into a binary file."
>
> "to get them back:"
>
> words copyFromByteArray: bytes.
> data replaceFrom: 1 to: words size with: words.
>
> Obviously, you can recycle some of the intermediate buffer allocation
> and that would speed it up.
>
> FloatArray has some vector arithmetic primitives, and the Kedama
> system in the OLPC Etoys image has more elaborate vector arithmetic
> primitives on integers and floats including operations with masked
> vectors.
>
> -- Yoshiki
Hi David,
your application has piqued my curiosity. Which company or organization
are you working for, if that is not an indiscreet question?
I think you will solve most of the performance problems by following
Yoshiki's good advice.
As an alternative, you might also want to investigate FFI as a way of
handling C-layout ByteArray memory from within Smalltalk.
I made an example of its use in Smallapack-Collections (search for
Smallapack on SqueakSource, http://www.squeaksource.com/Smallapack/).
ExternalArray is an abstract class for handling memory laid out as a
C array of any type from within Smalltalk (only float, double and
complex are implemented in subclasses, but you can extend it), and in
fact FFI can handle any structure (though you might have to resolve
alignment problems yourself).
There's a trade-off between fast reading (no conversion) and slower
access (a conversion at each access). However, with the
ByteArray>>#doubleAt: and #floatAt: primitives (from FFI), and fast
hacks to reverse the endianness of a whole array at once when needed,
ExternalArrays of elementary types or small structures still give
reasonable access times.
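To illustrate the access pattern, here is a minimal workspace sketch. It
assumes the FFI package is loaded (it supplies ByteArray>>#doubleAt: and
#doubleAt:put:) and that byte offsets are 1-based and use the native byte
order. The variable names are only illustrative; this is not code taken
from Smallapack.

```smalltalk
"Sketch only: requires the FFI package for doubleAt: / doubleAt:put:.
 Assumes 1-based byte offsets and native byte order."
| bytes count sum |
count := 4.
"Raw storage in C layout: 8 bytes per double."
bytes := ByteArray new: count * 8.
"Store 1.5, 3.0, 4.5 and 6.0 at consecutive 8-byte slots."
1 to: count do: [:i |
    bytes doubleAt: (i - 1) * 8 + 1 put: i asFloat * 1.5].
"Read them back; each access converts 8 raw bytes into a Float."
sum := 0.0.
1 to: count do: [:i |
    sum := sum + (bytes doubleAt: (i - 1) * 8 + 1)].
sum
```

The file contents can be read straight into such a ByteArray with no
conversion at load time, so the conversion cost is paid only for the
elements that are actually accessed.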
Nicolas