[squeak-dev] Re: quick handling of graphics files
Andreas Raab
andreas.raab at gmx.de
Thu Apr 15 02:54:22 UTC 2010
On 4/14/2010 1:08 PM, Juan Vuletich wrote:
> Profiling is indeed your friend.
> There is some serious inefficiency there. Quickly hacking this (warning:
> will only work for 1bpp):
[... snip ...]
> gives over 30x speed increase (from 10 seconds down to 310 mSec) on my
> system. This is not a solution, just some food for thought.
Heh, heh. Very good. But now I'm gonna get serious ...
<pokerface on>
I see your 30x improvement and raise you another ... 6x for a total of
200x speedup (from 10secs to 50 msecs). There! Take that! :-)
(but if Igor pulls out some asm I may have to fold :-)
<pokerface off>
Cheers,
- Andreas
-------------- next part --------------
'From Squeak4.1.1 of 11 April 2010 [latest update: #9945] on 14 April 2010 at 7:52:07 pm'!
"Change Set: PNGSpeedup
Date: 14 April 2010
Author: Andreas Raab
For fun and education: Speed up PNGReadWriter's handling of black and white reading by 200x (yes that's 20,000%)"!
!PNGReadWriter methodsFor: 'pixel copies' stamp: 'ar 4/14/2010 19:50'!
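copyPixelsGray1To1: y
	"Handle non-interlaced black and white color mode (colorType = 0)"
	| source dest cmap |
	"View the scanline and the form bits as 1-pixel-wide, 32-bit-deep forms, so BitBlt moves whole words"
	source := Form extent: 1 @ (thisScanline size // 4) depth: 32 bits: thisScanline.
	dest := Form extent: 1 @ (form bits size) depth: 32 bits: form bits.
	"On little-endian platforms, reverse the four bytes of each word while copying"
	cmap := Smalltalk isLittleEndian
		ifTrue:[ColorMap
			shifts: #(-24 -8 8 24)
			masks: #(16rFF000000 16r00FF0000 16r0000FF00 16r000000FF)].
	"Row y starts at word y * wordsPerRow, where wordsPerRow = (form width + 31) // 32"
	(BitBlt toForm: dest)
		sourceForm: source;
		destX: 0 destY: (y * (form width + 31 // 32)) width: 1 height: (form width + 31 // 32);
		colorMap: cmap;
		combinationRule: 3;
		copyBits.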
! !
!PNGReadWriter methodsFor: 'pixel copies' stamp: 'ar 4/14/2010 19:50'!
copyPixelsGray: y
	"Handle non-interlaced grayscale color mode (colorType = 0)"
	| blitter pixPerByte mask shifts pixelNumber rawByte pixel transparentIndex |
	"Start this off with optimized versions for particular variants we care about.
	General case code below is fairly slow; optimized versions are MUCH faster"
	(bitsPerChannel = 1 and:[form depth = 1])
		ifTrue:[^self copyPixelsGray1To1: y].
	blitter := BitBlt current bitPokerToForm: form.
	transparentIndex := form colors size.
	bitsPerChannel = 16
		ifTrue: [0
				to: width - 1
				do: [:x | blitter pixelAt: x @ y put: 255
							- (thisScanline at: x << 1 + 1)].
			^ self]
		ifFalse: [bitsPerChannel = 8
				ifTrue: [1
						to: width
						do: [:x | blitter
								pixelAt: x - 1 @ y
								put: (thisScanline at: x)].
					^ self].
			bitsPerChannel = 1
				ifTrue: [pixPerByte := 8.
					mask := 1.
					shifts := #(7 6 5 4 3 2 1 0 )].
			bitsPerChannel = 2
				ifTrue: [pixPerByte := 4.
					mask := 3.
					shifts := #(6 4 2 0 )].
			bitsPerChannel = 4
				ifTrue: [pixPerByte := 2.
					mask := 15.
					shifts := #(4 0 )].
			pixelNumber := 0.
			0 to: width - 1 do: [:x |
				rawByte := thisScanline at: pixelNumber // pixPerByte + 1.
				pixel := rawByte
							>> (shifts at: pixelNumber \\ pixPerByte + 1) bitAnd: mask.
				pixel = transparentPixelValue ifTrue: [pixel := transparentIndex].
				blitter pixelAt: x @ y put: pixel.
				pixelNumber := pixelNumber + 1
			]
		]! !
!ZLibWriteStream class methodsFor: 'crc' stamp: 'ar 4/14/2010 19:50'!
updateAdler32: adler from: start to: stop in: aCollection
	"Update crc using the Adler32 checksum technique from RFC1950"
"
	unsigned long s1 = adler & 0xffff;
	unsigned long s2 = (adler >> 16) & 0xffff;
	int n;
	for (n = 0; n < len; n++) {
		s1 = (s1 + buf[n]) % BASE;
		s2 = (s2 + s1) % BASE;
	}
	return (s2 << 16) + s1;
"
	| s1 s2 |
	<primitive: 'primitiveUpdateAdler32' module: 'ZipPlugin'>
	s1 := adler bitAnd: 16rFFFF.
	s2 := (adler bitShift: -16) bitAnd: 16rFFFF.
	start to: stop do: [ :n | | b |
		b := aCollection byteAt: n.
		s1 := (s1 + b) \\ 65521.
		s2 := (s2 + s1) \\ 65521. ].
	^(s2 bitShift: 16) + s1! !
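
For readers who want to sanity-check the Adler32 fallback code, here is a minimal workspace sketch. It is not part of the change set. It assumes the checksum starts at 1 (the initial value RFC 1950 specifies) and uses the string 'Wikipedia' as an arbitrary test input; both the input and the variable names are illustrative only. The loop mirrors the Smalltalk fallback in updateAdler32:from:to:in: and the C reference quoted in its comment.

```smalltalk
"Workspace sketch: Adler-32 computed in plain Smalltalk, mirroring the fallback code.
 Assumptions: initial checksum 1 (per RFC 1950); 'Wikipedia' is just a sample input."
| data s1 s2 adler |
data := 'Wikipedia' asByteArray.
s1 := 1.	"low 16 bits of the initial checksum"
s2 := 0.	"high 16 bits of the initial checksum"
data do: [:b |
	s1 := (s1 + b) \\ 65521.	"running sum of the bytes, modulo BASE"
	s2 := (s2 + s1) \\ 65521].	"running sum of the s1 values, modulo BASE"
adler := (s2 bitShift: 16) + s1.
adler printStringHex.	"'11E60398'"

"With the change set loaded, the class-side method should give the same value:"
(ZLibWriteStream updateAdler32: 1 from: 1 to: data size in: data) = adler.
```

Running the sums by hand gives s1 = 920 (16r398) and s2 = 4582 (16r11E6), hence 16r11E60398. If the last expression answers false, the primitive and the fallback disagree, which would be worth investigating before trusting either.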