[squeak-dev] Re: quick handling of graphics files

Andreas Raab andreas.raab at gmx.de
Thu Apr 15 02:54:22 UTC 2010


On 4/14/2010 1:08 PM, Juan Vuletich wrote:
> Profiling is indeed your friend.
> There is some serious inefficiency there. Quickly hacking this (warning:
> will only work for 1bpp):

[... snip ...]

> gives over 30x speed increase (from 10 seconds down to 310 mSec) on my
> system. This is not a solution, just some food for thought.

Heh, heh. Very good. But now I'm gonna get serious ...

<pokerface on>

I see your 30x improvement and raise you another ... 6x for a total of 
roughly 200x speedup (from 10 secs to 50 msecs). There! Take that! :-)

(but if Igor pulls out some asm I may have to fold :-)

<pokerface off>


Cheers,
   - Andreas
-------------- next part --------------
'From Squeak4.1.1 of 11 April 2010 [latest update: #9945] on 14 April 2010 at 7:52:07 pm'!
"Change Set:		PNGSpeedup
Date:			14 April 2010
Author:			Andreas Raab

For fun and education: Speed up PNGReadWriter's reading of black and white images by 200x (yes that's 20,000%)"!


!PNGReadWriter methodsFor: 'pixel copies' stamp: 'ar 4/14/2010 19:50'!
copyPixelsGray1To1: y 
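	"Handle non-interlaced black and white color mode (colorType = 0).
	Treat the scanline and the form's bits as one-pixel-wide 32 bit forms so that
	BitBlt moves a whole word (32 pixels) per step instead of one pixel at a time."
	| source dest cmap |
	source := Form extent: 1 @ (thisScanline size // 4) depth: 32 bits: thisScanline.
	dest := Form extent: 1 @ (form bits size) depth: 32 bits: form bits.
	"On little endian machines swap the bytes of each word; cmap stays nil otherwise"
	cmap := Smalltalk isLittleEndian 
		ifTrue:[ColorMap 
					shifts: #(-24 -8 8 24) 
					masks: #(16rFF000000 16r00FF0000 16r0000FF00 16r000000FF)].
	"Row y starts at word y * wordsPerRow, where wordsPerRow = form width + 31 // 32"
	(BitBlt toForm: dest)
		sourceForm: source;
		destX: 0 destY: y * (form width+31//32) width: 1 height: (form width+31//32);
		colorMap: cmap;
		combinationRule: 3;
		copyBits.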
! !
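
As a rough illustration of what the color map above is meant to do on little-endian machines: each mask picks one byte of a 32-bit word and the matching shift moves it to the mirrored position, so the four bytes end up reversed. The workspace sketch below redoes that arithmetic by hand on a made-up sample word. It does not go through ColorMap or BitBlt, so treat it as a model of the intent rather than of BitBlt's actual code path.

```smalltalk
| word masks shifts swapped |
word := 16r11223344.	"hypothetical sample word, not taken from any image"
masks := #(16rFF000000 16r00FF0000 16r0000FF00 16r000000FF).
shifts := #(-24 -8 8 24).
swapped := 0.
"Isolate each byte with its mask, then move it by the matching shift
 (negative shifts go right, positive shifts go left)"
masks with: shifts do: [:mask :shift |
	swapped := swapped bitOr: ((word bitAnd: mask) bitShift: shift)].
swapped = 16r44332211	"evaluates to true: the bytes are reversed"
```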

!PNGReadWriter methodsFor: 'pixel copies' stamp: 'ar 4/14/2010 19:50'!
copyPixelsGray: y 
	"Handle non-interlaced grayscale color mode (colorType = 0)"
	| blitter pixPerByte mask shifts pixelNumber rawByte pixel transparentIndex |

	"Start this off with optimized versions for particular variants we care about.
	General case code below is fairly slow; optimized versions are MUCH faster"
	(bitsPerChannel = 1 and:[form depth = 1]) 
		ifTrue:[^self copyPixelsGray1To1: y].

	blitter := BitBlt current bitPokerToForm: form.
	transparentIndex := form colors size.
	bitsPerChannel = 16
		ifTrue: [0
				to: width - 1
				do: [:x | blitter pixelAt: x @ y put: 255
							- (thisScanline at: x << 1 + 1)].
			^ self]
		ifFalse: [bitsPerChannel = 8
				ifTrue: [1
						to: width
						do: [:x | blitter
								pixelAt: x - 1 @ y
								put: (thisScanline at: x)].
					^ self].
			bitsPerChannel = 1
				ifTrue: [pixPerByte := 8.
					mask := 1.
					shifts := #(7 6 5 4 3 2 1 0 )].
			bitsPerChannel = 2
				ifTrue: [pixPerByte := 4.
					mask := 3.
					shifts := #(6 4 2 0 )].
			bitsPerChannel = 4
				ifTrue: [pixPerByte := 2.
					mask := 15.
					shifts := #(4 0 )].
			pixelNumber := 0.
			0 to: width - 1 do: [:x | 
				rawByte := thisScanline at: pixelNumber // pixPerByte + 1.
				pixel := rawByte
							>> (shifts at: pixelNumber \\ pixPerByte + 1) bitAnd: mask.
				pixel = transparentPixelValue ifTrue: [pixel := transparentIndex].
				blitter pixelAt: x @ y put: pixel.
				pixelNumber := pixelNumber + 1
			]
		]! !


!ZLibWriteStream class methodsFor: 'crc' stamp: 'ar 4/14/2010 19:50'!
updateAdler32: adler from: start to: stop in: aCollection
	"Update crc using the Adler32 checksum technique from RFC1950.
	BASE in the C reference code below is 65521, the largest prime smaller than 65536."
"
        unsigned long s1 = adler & 0xffff;
        unsigned long s2 = (adler >> 16) & 0xffff;
        int n;

        for (n = 0; n < len; n++) {
          s1 = (s1 + buf[n]) % BASE;
          s2 = (s2 + s1)     % BASE;
        }
        return (s2 << 16) + s1;
"
	| s1 s2 |
	<primitive: 'primitiveUpdateAdler32' module: 'ZipPlugin'>
	s1 := adler bitAnd: 16rFFFF.
	s2 := (adler bitShift: -16) bitAnd: 16rFFFF.
	start to: stop do: [ :n | | b |
		b := aCollection byteAt: n.
		s1 := (s1 + b) \\ 65521.
		s2 := (s2 + s1) \\ 65521. ].
	^(s2 bitShift: 16) + s1! !
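
One way to sanity-check the method above is to run it on a string whose Adler-32 value is widely published: 'Wikipedia' checksums to 16r11E60398. The workspace sketch below assumes the change set is loaded and starts from 1, the initial Adler-32 value given in RFC 1950. Whether the ZipPlugin primitive or the Smalltalk fallback ends up running depends on the VM.

```smalltalk
| data adler |
data := 'Wikipedia' asByteArray.
"Start from the initial Adler-32 value 1 and checksum the whole byte array"
adler := ZLibWriteStream updateAdler32: 1 from: 1 to: data size in: data.
adler = 16r11E60398	"expected to be true"
```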


