[Newbies] Re: Binary file I/O performance problems

nicolas cellier ncellier at ifrance.com
Fri Sep 5 21:19:20 UTC 2008


nicolas cellier wrote:
> Yoshiki Ohshima wrote:
>> At Fri, 5 Sep 2008 10:59:03 -0700,
>> David Finlayson wrote:
>>> I re-wrote the test application to load the test file entirely into
>>> memory before parsing the data. The total time to parse the file
>>> decreased by about 50%. Now that I/O is removed from the picture, the
>>> new bottleneck is turning bytes into integers (and then integers into
>>> Floats).
>>>
>>> I know that Smalltalk isn't the common language for number crunching,
>>> but if I can get acceptable performance out of it, then down the road
>>> I would like to tap into the Croquet environment. That is why I am
>>> trying to learn a way that will work.
>>
>>   If the integers or floats are in the layout of C's int[] or float[],
>> there is a better chance to make it much faster.
>>
>>   Look at the methods Bitmap>>asByteArray and
>> Bitmap>>copyFromByteArray:.  You can convert a big array of non-pointer
>> words from/to a byte array.
>>
>>   data := (1 to: 1000000) as: FloatArray.
>>   words := Bitmap new: data size.
>>   words replaceFrom: 1 to: data size with: data.
>>   bytes := words asByteArray.
>>
>>   "and you write out the bytes into a binary file."
>>
>>   "to get them back:"
>>
>>   words copyFromByteArray: bytes.
>>   data replaceFrom: 1 to: words size with: words.
>>
>> Obviously, you can recycle some of the intermediate buffer allocation
>> and that would speed it up.
>>
>>   FloatArray has some vector arithmetic primitives, and the Kedama
>> system in the OLPC Etoys image has more elaborate vector arithmetic
>> primitives on integers and floats, including operations with masked
>> vectors.
>>
>> -- Yoshiki
> 
> Hi David,
> your application has piqued my curiosity. Which company or organization
> are you working for, if that is not indiscreet?
> 
> I think you will solve most of your performance problems by following
> Yoshiki's good advice.
> 
> You might also want to investigate FFI as an alternative way of handling
> C-layout ByteArray memory from within Smalltalk.
> I made an example of its use in Smallapack-Collections (search for Smallapack
> on squeaksource, http://www.squeaksource.com/Smallapack/).
> ExternalArray is an abstract class for handling memory laid out as a
> C array of any type from within Smalltalk (only float, double and
> complex are implemented in subclasses, but you can extend it), and in fact
> FFI can handle any structure (though you might have to resolve
> alignment problems yourself).
> There is a trade-off between fast reading (no conversion) and slower
> access (conversion at each access). However, with the ByteArray>>#doubleAt:
> and #floatAt: primitives (from FFI), and fast hacks to reverse the
> endianness of a whole array at once if needed, ExternalArrays
> of elementary types or small structures still provide reasonable
> access times.
> 
> Nicolas

I forgot to provide some timings (32-bit Athlon, 1 GHz) for write/read access:

| a b c |
{
   [a := FloatArray withAll: (1 to: 100000)] timeToRun.
   [b := ExternalFloatArray withAll: (1 to: 100000)] timeToRun.
   [c := ExternalDoubleArray withAll: (1 to: 100000)] timeToRun.
   [a do: [:e | ]] timeToRun.
   [b do: [:e | ]] timeToRun.
   [c do: [:e | ]] timeToRun.
}.
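  #(142 312 335 80 181 182)

(Times are in milliseconds, as answered by #timeToRun, in the same order as
the six expressions above.)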
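
For comparison, here is a minimal sketch of the per-element conversion the
thread is about (bytes to integer, then integer to Float). It assumes the
image provides ByteArray>>unsignedLongAt:bigEndian: and
Float class>>fromIEEE32Bit:, as a standard Squeak image does; the sample
bytes are made up for illustration and are not taken from David's file.

```smalltalk
| bytes word value |
"Made-up sample: the IEEE 754 single-precision pattern 16r3F800000,
 stored most significant byte first."
bytes := #[63 128 0 0].
"Assemble a 32-bit unsigned integer from 4 bytes starting at index 1.
 Pass false instead of true for little-endian data."
word := bytes unsignedLongAt: 1 bigEndian: true.
"Reinterpret that bit pattern as a Float."
value := Float fromIEEE32Bit: word.
```

Doing this once per element is the conversion cost mentioned above; the bulk
approaches (Bitmap>>copyFromByteArray:, or leaving the data in an
ExternalArray) avoid paying it up front for every element.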



