Hi Tim,
(below) ...
Next issue was the fact that for some crazy reason, they'd decided to renormalise the colour components after the multiplication by doing a divide by 255, rounding to +infinity. This is neither the "best" approach (arguably round to nearest with rounding half-values to odd or even would be the best), nor is it particularly efficient to implement - divisions rarely are.
There is a reason for this:
255*255 >> 8 = 254
255*255 / 255 = 255
This means that if you divide by 256, you can no longer set a result pixel to 255.
...
Yesterday I stumbled upon this old thread and remembered a trick to avoid the division. I think it would be good to apply it in BitBlt; it should improve performance on the Pi.
The idea is to approximate 256/255 by 257/256. So, instead of doing x/255 (slow) or x>>8 (incorrect), you do (x*257) >> 16, or better yet ((x<<8) + x) >> 16. Two shifts and an add instead of a division. The relative error of the scale factor is less than 1/65535, which is negligible for 8-bit output.
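A quick way to check the trick is to compare it against true division over every possible product of two 8-bit values (0..65025). The Python sketch below is only an illustration, not the BitBlt code, and the function names are made up for this example. One thing it suggests is that the plain truncating form still returns 254 for 255*255 (it comes out one low at exact nonzero multiples of 255), so a small bias added before the multiply seems to be needed: +1 reproduces floor division, and +128 gives round to nearest.

```python
# Hypothetical helper names; x is the product of two 8-bit components (0..65025).

def div255_trunc(x):
    # The expression from the message, with explicit parentheses.
    return ((x << 8) + x) >> 16

def div255_floor(x):
    # Bias of 1 before scaling by 257/65536: matches x // 255 on this range.
    return ((x + 1) * 257) >> 16

def div255_round(x):
    # Bias of 128 before scaling: matches x / 255 rounded to nearest.
    return ((x + 128) * 257) >> 16

MAX_PRODUCT = 255 * 255

# Inputs where the unbiased form disagrees with true floor division.
mismatches = [x for x in range(MAX_PRODUCT + 1) if div255_trunc(x) != x // 255]

floor_ok = all(div255_floor(x) == x // 255 for x in range(MAX_PRODUCT + 1))
# 255 is odd, so there are no ties and (x + 127) // 255 is round to nearest.
round_ok = all(div255_round(x) == (x + 127) // 255 for x in range(MAX_PRODUCT + 1))

if __name__ == "__main__":
    print(div255_trunc(MAX_PRODUCT))   # 254
    print(div255_round(MAX_PRODUCT))   # 255
    print(len(mismatches), floor_ok, round_ok)
```

If this were carried over to C, x would need to be held in at least a 32-bit type, since (x + 128) * 257 no longer fits in 16 bits.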
I hope this is still relevant.
Cheers, Juan Vuletich