[Newbies] Re: Binary file I/O performance problems
nicolas cellier
ncellier at ifrance.com
Fri Sep 5 21:19:20 UTC 2008
nicolas cellier wrote:
> Yoshiki Ohshima wrote:
>> At Fri, 5 Sep 2008 10:59:03 -0700,
>> David Finlayson wrote:
>>> I re-wrote the test application to load the test file entirely into
>>> memory before parsing the data. The total time to parse the file
>>> decreased by about 50%. Now that I/O is removed from the picture, the
>>> new bottleneck is turning bytes into integers (and then integers into
>>> Floats).
>>>
>>> I know that Smalltalk isn't the common language for number crunching,
>>> but if I can get acceptable performance out of it, then down the road
>>> I would like to tap into the Croquet environment. That is why I am
>>> trying to learn a way that will work.
>>
>> If the integers or floats are in the layout of C's int[] or float[],
>> there is a better chance to make it much faster.
>>
>> Look at the method Bitmap>>asByteArray and
>> Bitmap>>copyFromByteArray:. You can convert a big array of non-pointer
>> words from/to a byte array.
>>
>> data := (1 to: 1000000) as: FloatArray.
>> words := Bitmap new: data size.
>> words replaceFrom: 1 to: data size with: data.
>> bytes := words asByteArray.
>>
>> "and you write out the bytes into a binary file."
>>
>> "to get them back:"
>>
>> words copyFromByteArray: bytes.
>> data replaceFrom: 1 to: words size with: words.
>>
>> Obviously, you can recycle some of the intermediate buffer allocation
>> and that would speed it up.
>>
>> FloatArray has some vector arithmetic primitives, and the Kedama
>> system in the OLPC Etoys image has more elaborate vector arithmetic
>> primitives on integers and floats, including operations with masked
>> vectors.
>>
>> -- Yoshiki
>
> Hi David,
> your application has piqued my curiosity. Which company or organization
> do you work for, if you don't mind my asking?
>
> I think you will solve most of your performance problems by following
> Yoshiki's good advice.
>
> You might also want to investigate FFI as an alternative way of handling
> C-layout memory held in a ByteArray from within Smalltalk.
> I made an example of its use in Smallapack-Collections (search for
> Smallapack on squeaksource, http://www.squeaksource.com/Smallapack/).
> ExternalArray is an abstract class for handling memory laid out as a
> C array of any type from within Smalltalk (only float, double and
> complex are implemented in subclasses, but you can extend it), and in fact
> FFI can handle any structure (though you might have to resolve
> alignment problems yourself).
> There's a trade-off between fast reading (no conversion) and slower
> access (conversion at each access). However, with the
> ByteArray>>#doubleAt: and #floatAt: primitives (from FFI), and fast hacks
> to reverse the endianness of a whole array at once when needed,
> ExternalArrays of elementary types or small structures still provide
> reasonable access times.
>
> Nicolas
I forgot to provide some timings (32-bit Athlon, 1 GHz) for write/read
access, in milliseconds as returned by timeToRun:
| a b c |
{
[a := FloatArray withAll: (1 to: 100000)] timeToRun.
[b := ExternalFloatArray withAll: (1 to: 100000)] timeToRun.
[c := ExternalDoubleArray withAll: (1 to: 100000)] timeToRun.
[a do: [:e | ]] timeToRun.
[b do: [:e | ]] timeToRun.
[c do: [:e | ]] timeToRun.
}.
#(142 312 335 80 181 182)
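
For comparison, the per-element conversion discussed earlier in the thread (bytes to integers, then integers to Floats) can be sketched without FFI. This is only an illustration, assuming a Squeak image that provides ByteArray>>unsignedLongAt:bigEndian: and Float class>>fromIEEE32Bit:; the variable names and the sample bytes are made up for the example, and the data is assumed to be big-endian IEEE 754 single precision.

```smalltalk
| bytes floats |
"Sample data (assumption): four big-endian 32-bit IEEE 754 floats,
 encoding 1.0, 2.0, 0.5 and -1.0."
bytes := #(16r3F 16r80 0 0  16r40 0 0 0  16r3F 0 0 0  16rBF 16r80 0 0) asByteArray.
floats := FloatArray new: bytes size // 4.
1 to: floats size do: [:i |
    | word |
    "Assemble 4 bytes into an unsigned 32-bit integer..."
    word := bytes unsignedLongAt: (i - 1) * 4 + 1 bigEndian: true.
    "...then reinterpret its bit pattern as a single-precision float."
    floats at: i put: (Float fromIEEE32Bit: word)].
floats
```

Each element goes through two message sends and an intermediate integer, which is the kind of per-access cost that the bulk copy (Bitmap>>copyFromByteArray:) or the FFI accessors are meant to avoid.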