On 09-09-2013, at 9:10 AM, "J. Vuletich (mail lists)" juanlists@jvuletich.org wrote:
The idea is to approximate 256/255 by 257/256. So, instead of doing x/255 (slow) or x>>8 (incorrect), you do x*257 >> 16, or better yet ((x<<8) + x) >> 16. Two shifts and an add instead of a division. The relative error is less than 1/65535, and negligible for 8-bit output.
I hope this is still relevant.
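As a rough check of the approximation quoted above, here is a minimal Python sketch that compares the shift-and-add form against true integer division. The function names are made up for illustration, and the input range 0..255*255 (the product of two 8-bit values) is an assumption, not something stated in the thread.

```python
# Hypothetical helper names; the range 0..255*255 is an assumed input range.

def div255_exact(x: int) -> int:
    """Reference: truncating integer division by 255."""
    return x // 255


def div255_mul(x: int) -> int:
    """Approximate x/255 as x*257 >> 16 (one multiply, one shift)."""
    return (x * 257) >> 16


def div255_shift_add(x: int) -> int:
    """Same value as div255_mul, computed with two shifts and an add."""
    return ((x << 8) + x) >> 16


def max_difference(limit: int = 255 * 255) -> int:
    """Largest gap between the exact and approximate results over 0..limit."""
    return max(div255_exact(x) - div255_shift_add(x) for x in range(limit + 1))


if __name__ == "__main__":
    print("max difference:", max_difference())
    print("x = 255*255:", div255_exact(255 * 255), "vs", div255_shift_add(255 * 255))
```

Because the shift truncates, the approximate result never exceeds the exact one, and over this range it is one lower whenever x is a nonzero multiple of 255 (for example, 255*255 gives 254 rather than 255). Whether that matters depends on what the surrounding code does with the value.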
Ooh, nice. Two instruction cycles for an ARM and probably the second shift could be merged into whatever the following operation is.
tim
--
tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim
"Bollocks," said Pooh being more forthright than usual