At Wed, 14 Apr 2010 23:18:48 +0200 (CEST), Levente Uzonyi wrote:
On Wed, 14 Apr 2010, Juan Vuletich wrote:
Ross Boylan wrote:
On Tue, 2010-04-13 at 21:05 -0700, Andreas Raab wrote:
Hi Ross -
Profiling is your friend. In most cases, 95% of the time is spent in 5% of the code. From what you're saying below, it sounds like you're using one of the 'special' PNG modes (black and white, or gray-scale) that have probably seen less attention for optimization than others. Any chance you can post a sample image for profiling?
I've attached a test image, which takes about 30 seconds for me. It is a 1 bit depth image. Not sure if the attachment will make it through....
Ross
Profiling is indeed your friend. There is some serious inefficiency there. Quickly hacking this (warning: will only work for 1bpp):
copyPixelsGray: y
    "Handle non-interlaced grayscale color mode (colorType = 0)"
    | word base ii |
    base := y * form width // 32 + 1.
    0 to: thisScanline size - 1 // 4 do: [ :i |
        ii := i * 4.
        word := (thisScanline at: ii+1) << 24 bitOr: (
            (thisScanline at: ii+2) << 16 bitOr: (
            (thisScanline at: ii+3) << 8 bitOr: (
            (thisScanline at: ii+4)))).
        form bits at: base + i put: word ].
gives a more than 30x speedup (from 10 seconds down to 310 ms) on my system. This is not a solution, just some food for thought.
You can squeeze out even better performance from this, by using #* and #+ instead of #<< and #bitOr: when the result and arguments are SmallIntegers:
word := ((thisScanline at: ii+1) * 256 + (thisScanline at: ii+2) * 256 + (thisScanline at: ii+3) bitShift: 8) bitOr: (thisScanline at: ii+4).
Levente
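As a minimal workspace sketch of the point above, the two ways of packing four bytes into one 32-bit word (most significant byte first) can be compared directly. The sample bytes are made up for illustration. The speed difference is presumably because, on a 32-bit VM, the 24-bit intermediate built with #* and #+ stays a SmallInteger, whereas shifting the first byte left by 24 can already produce a LargePositiveInteger; that explanation is an assumption, not something measured here.

```smalltalk
"Workspace sketch: pack four bytes into one 32-bit word, high byte first."
| bytes shifted arithmetic |
bytes := #[18 52 86 120].   "hypothetical sample bytes: 16r12 16r34 16r56 16r78"

"Shift/or form, as in the quick hack"
shifted := ((bytes at: 1) bitShift: 24) bitOr: (((bytes at: 2) bitShift: 16) bitOr: (((bytes at: 3) bitShift: 8) bitOr: (bytes at: 4))).

"Multiply/add form: build the top 24 bits first, then shift once and or in the last byte"
arithmetic := (((bytes at: 1) * 256 + (bytes at: 2)) * 256 + (bytes at: 3) bitShift: 8) bitOr: (bytes at: 4).

shifted = arithmetic   "both forms yield 16r12345678"
```

Binary messages evaluate strictly left to right in Smalltalk, so the explicit parentheses in the second form only make the grouping visible; they do not change the result.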
Ah ha. If you define a method in Bitmap that looks like:
copyFromByteArray2: byteArray to: i
    "This method should work with either byte ordering"
    | myHack byteHack |
    myHack := Form new hackBits: self.
    byteHack := Form new hackBits: byteArray.
    SmalltalkImage current isLittleEndian ifTrue: [byteHack swapEndianness].
    byteHack displayOn: myHack at: 0@i
and change #copyPixelsGray: to:
copyPixelsGray: y
    form bits copyFromByteArray2: thisScanline to: y * (form width // 32)
A test like this:
[(PNGReadWriter on: (FileStream readOnlyFileNamed: 'test.png')) nextImage] timeToRun
runs 2.5x faster than Juan's version on my computer.
-- Yoshiki
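For anyone who wants to repeat the measurements in this thread, a rough sketch follows. It assumes a Squeak image with PNGReadWriter loaded and a file named 'test.png' (the name used in the test above) in the default directory. Time millisecondsToRun: gives a wall-clock figure, and MessageTally spyOn: opens a profile showing where the time goes.

```smalltalk
"Wall-clock time for decoding the image, in milliseconds"
Time millisecondsToRun: [
	(PNGReadWriter on: (FileStream readOnlyFileNamed: 'test.png')) nextImage].

"Sampling profile of the same decode; the report shows which methods dominate"
MessageTally spyOn: [
	(PNGReadWriter on: (FileStream readOnlyFileNamed: 'test.png')) nextImage].
```

Running the timing a few times before and after a change helps separate a real speedup from noise such as garbage collection or file caching.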