Nicolas Cellier uploaded a new version of VMMaker to project VM Maker: http://source.squeak.org/VMMaker/VMMaker.oscog-nice.3251.mcz
==================== Summary ====================
Name: VMMaker.oscog-nice.3251
Author: nice
Time: 25 August 2022, 6:09:00.826995 pm
UUID: fad63ac4-33e9-944d-a688-a69c2b8f5b33
Ancestors: VMMaker.oscog-nice.3250

Implement a correctly rounded rgbMul BitBlt, using the ideas developed here:
https://github.com/OpenSmalltalk/opensmalltalk-vm/issues/651
Also provide rgbMul tests for all depths (1, 2, 4, 8, 16, 32). The 32-bit depth case is covered by the 8-bit depth test, since both treat a word as four 8-bit channels.
=============== Diff against VMMaker.oscog-nice.3250 ===============
Item was removed:
- ----- Method: BitBltSimulation>>partitionedMul:with:nBits:nPartitions: (in category 'combination rules') -----
- partitionedMul: word1 with: word2 nBits: nBits nPartitions: nParts
-     "Multiply word1 with word2 as nParts partitions of nBits each.
-     This is useful for packed pixels, or packed colors.
-     Bug in loop version when non-white background"
- 
-     | sMask product result dMask |
-     "In C, integer multiplication might answer a wrong value if the unsigned values are declared as signed.
-     This problem does not affect this method, because the most significant bit (i.e. the sign bit) will
-     always be zero (jmv)"
-     <returnTypeC: 'unsigned int'>
-     <var: 'word1' type: #'unsigned int'>
-     <var: 'word2' type: #'unsigned int'>
-     <var: 'sMask' type: #'unsigned int'>
-     <var: 'dMask' type: #'unsigned int'>
-     <var: 'result' type: #'unsigned int'>
-     <var: 'product' type: #'unsigned int'>
-     sMask := maskTable at: nBits. "partition mask starts at the right"
-     dMask := sMask << nBits.
-     result := (((word1 bitAnd: sMask)+1) * ((word2 bitAnd: sMask)+1) - 1
-         bitAnd: dMask) >> nBits. "optimized first step"
-     nParts = 1
-         ifTrue: [ ^result ].
-     product := (((word1>>nBits bitAnd: sMask)+1) * ((word2>>nBits bitAnd: sMask)+1) - 1 bitAnd: dMask).
-     result := result bitOr: product.
-     nParts = 2
-         ifTrue: [ ^result ].
-     product := (((word1>>(2*nBits) bitAnd: sMask)+1) * ((word2>>(2*nBits) bitAnd: sMask)+1) - 1 bitAnd: dMask).
-     result := result bitOr: product << nBits.
-     nParts = 3
-         ifTrue: [ ^result ].
-     product := (((word1>>(3*nBits) bitAnd: sMask)+1) * ((word2>>(3*nBits) bitAnd: sMask)+1) - 1 bitAnd: dMask).
-     result := result bitOr: product << (2*nBits).
-     ^ result
- 
- "   | sMask product result dMask |
-     sMask := maskTable at: nBits. 'partition mask starts at the right'
-     dMask := sMask << nBits.
-     result := (((word1 bitAnd: sMask)+1) * ((word2 bitAnd: sMask)+1) - 1
-         bitAnd: dMask) >> nBits. 'optimized first step'
-     nBits to: nBits * (nParts-1) by: nBits do: [:ofs |
-         product := (((word1>>ofs bitAnd: sMask)+1) * ((word2>>ofs bitAnd: sMask)+1) - 1 bitAnd: dMask).
-         result := result bitOr: (product bitAnd: dMask) << (ofs-nBits)].
-     ^ result"!
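To see why the removed method is not correctly rounded, the following Python sketch compares its per-channel formula, ((c1+1) * (c2+1) - 1) >> nBits, with the target (c1 * c2 / scale) rounded. The function names are hypothetical and the code only models the arithmetic of a single channel; it is not the VM code.

```python
def old_channel_mul(c1: int, c2: int, n_bits: int) -> int:
    """Per-channel formula of the removed method: ((c1+1)*(c2+1) - 1) >> n_bits."""
    mask = (1 << n_bits) - 1
    return (((c1 + 1) * (c2 + 1) - 1) >> n_bits) & mask


def rounded_channel_mul(c1: int, c2: int, n_bits: int) -> int:
    """Reference value: (c1 * c2 / scale) rounded, with scale = 2**n_bits - 1."""
    scale = (1 << n_bits) - 1
    # scale is odd, so c1*c2/scale never falls exactly halfway between integers
    # and integer round-half-up is unambiguous.
    return (2 * c1 * c2 + scale) // (2 * scale)


if __name__ == "__main__":
    # Two mid-grey 8-bit channels: the exact product is 128*128/255 = 64.25.
    print(old_channel_mul(128, 128, 8), rounded_channel_mul(128, 128, 8))  # 65 64
```

Both formulas agree at the extremes (0 and full scale), but for intermediate values the old one can land one step away from the rounded result, as the 128 * 128 case shows.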
Item was added:
+ ----- Method: BitBltSimulation>>partitionedMul:with:nBits:wordBits: (in category 'combination rules') -----
+ partitionedMul: word1 with: word2 nBits: nBits wordBits: wordBits
+     "Multiply each channel of nBits in word1 and word2.
+     We assume that for each channel of nBits, we multiply ratios in interval [0..1], scaled by (1 << nBits - 1).
+         result := ((channel1/scale) * (channel2/scale) * scale) rounded
+     Or after simplification:
+         result := (channel1 * channel2 / scale) rounded
+     This is implemented by first forming the double precision products (channel1 * channel2) on a double-word.
+     Then dividing each double precision channel by scale, with correctly rounded operation.
+     With proper tricks, some of these operations can be multiplexed
+     (all channels are formed in parallel with a single sequence of operation)."
+ 
+     | channelMask groupMask doubleGroupMask doubleWord1 doubleWord2 doubleWordMul half shift result highWordShift nGroups n2 |
+     <returnTypeC: 'unsigned int'>
+     <var: 'word1' type: #'unsigned int'>
+     <var: 'word2' type: #'unsigned int'>
+     <var: 'channelMask' type: #'unsigned int'>
+     <var: 'groupMask' type: #'unsigned int'>
+     <var: 'half' type: #'unsigned int'>
+     <var: 'doubleGroupMask' type: #'unsigned long long'>
+     <var: 'doubleWord1' type: #'unsigned long long'>
+     <var: 'doubleWord2' type: #'unsigned long long'>
+     <var: 'doubleWordMul' type: #'unsigned long long'>
+     <var: 'result' type: #'unsigned int'>
+     n2 := 2 * nBits. "width of double-precision channel"
+     channelMask := 1 << nBits - 1. "partition mask starts at the right"
+     nGroups := wordBits // nBits + 1 // 2. "number of channels that fit in a word, when alternating with group of zeros"
+     groupMask := channelMask. "form a word mask with alternate nBits 0 and nBits 1, so as to select even channels"
+     2 to: nGroups do: [:i | groupMask := groupMask << n2 + channelMask].
+     highWordShift := nGroups * n2. "shift for putting odd channels in high-word - usually wordBits, except if wordBits \\ nBits ~= 0"
+ 
+     doubleWord1 := word1 >> nBits bitAnd: groupMask. "select odd channel interleaved with groups of nBits zeros, so as to leave room for double-precision multiplication"
+     doubleWord2 := word2 >> nBits bitAnd: groupMask.
+     doubleWord1 := doubleWord1 << highWordShift + (word1 bitAnd: groupMask). "Put odd channels in high word, and even channels in low word"
+     doubleWord2 := doubleWord2 << highWordShift + (word2 bitAnd: groupMask).
+ 
+     half := channelMask >> 1 + 1. "mid-value to add for getting a correctly rounded division"
+     shift := 0.
+     doubleWordMul := 0.
+     1 to: wordBits // nBits do: [:i |
+         doubleWordMul := doubleWordMul + ((doubleWord1 >> shift bitAnd: channelMask) * (doubleWord2 >> shift bitAnd: channelMask) + half << shift). "multiply each channel of the two operands"
+         shift := shift + n2].
+ 
+     doubleGroupMask := groupMask. "form a mask for extracting single-precision channels in the double word"
+     doubleGroupMask := doubleGroupMask << highWordShift + groupMask.
+ 
+     doubleWordMul := (doubleWordMul >> nBits bitAnd: doubleGroupMask) + doubleWordMul >> nBits bitAnd: doubleGroupMask. "divide by scale"
+     result := doubleWordMul >> (highWordShift - nBits) + (doubleWordMul bitAnd: groupMask). "compact channels back into a single word"
+     ^result!
Item was changed:
  ----- Method: BitBltSimulation>>rgbMul:with: (in category 'combination rules') -----
  rgbMul: sourceWord with: destinationWord
      <inline: false>
      <returnTypeC: 'unsigned int'>
      <var: 'sourceWord' type: #'unsigned int'>
      <var: 'destinationWord' type: #'unsigned int'>
      destDepth < 16 ifTrue:
          ["Mul each pixel separately"
+         destDepth = 1 ifTrue: [^self bitAnd: sourceWord with: destinationWord].
+         ^ self partitionedMul: sourceWord with: destinationWord nBits: destDepth wordBits: 32].
-         ^ self partitionedMul: sourceWord with: destinationWord
-             nBits: destDepth nPartitions: destPPW].
      destDepth = 16 ifTrue:
          ["Mul RGB components of each pixel separately"
+         ^ (self partitionedMul: (sourceWord bitAnd: 16rFFFF) with: (destinationWord bitAnd: 16rFFFF) nBits: 5 wordBits: 16)
+         + ((self partitionedMul: sourceWord>>16 with: destinationWord>>16 nBits: 5 wordBits: 16) << 16)]
-         ^ (self partitionedMul: sourceWord with: destinationWord
-             nBits: 5 nPartitions: 3)
-         + ((self partitionedMul: sourceWord>>16 with: destinationWord>>16
-             nBits: 5 nPartitions: 3) << 16)]
      ifFalse:
          ["Mul RGBA components of the pixel separately"
+         ^ self partitionedMul: sourceWord with: destinationWord nBits: 8 wordBits: 32]!
-         ^ self partitionedMul: sourceWord with: destinationWord
-             nBits: 8 nPartitions: 4]
- 
- "   | scanner |
-     Display repaintMorphicDisplay.
-     scanner := DisplayScanner quickPrintOn: Display.
-     MessageTally time: [0 to: 760 by: 4 do: [:y |scanner drawString: 'qwrepoiuasfd=)(/&()=#!!lkjzxv.,mn124+09857907QROIYTOAFDJZXNBNB,M-.,Mqwrepoiuasfd=)(/&()=#!!lkjzxv.,mn124+09857907QROIYTOAFDJZXNBNB,M-.,M1234124356785678' at: 0@y]]. "!
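For readers who want to experiment outside the image, here is a Python sketch of how partitionedMul:with:nBits:wordBits: and the rgbMul:with: dispatch appear to work. It is a model written from my reading of the Slang source above, not the generated C. All Python names (partitioned_mul, rgb_mul, reference_mul, sweep_matches) are hypothetical, and Python integers are unbounded, so the unsigned int / unsigned long long declarations are not reproduced.

```python
import random


def partitioned_mul(word1: int, word2: int, n_bits: int, word_bits: int) -> int:
    """Multiply every n_bits channel of two packed words, correctly rounded."""
    n2 = 2 * n_bits                          # width of a double-precision slot
    channel_mask = (1 << n_bits) - 1         # also the scale, 2**n_bits - 1
    n_channels = word_bits // n_bits
    n_groups = (n_channels + 1) // 2         # even channels that fit in one word
    group_mask = 0
    for i in range(n_groups):                # alternate n_bits ones / n_bits zeros
        group_mask |= channel_mask << (i * n2)
    high_shift = n_groups * n2

    def spread(word: int) -> int:
        # Odd channels go to the high half, even channels stay in the low half;
        # each channel sits in the low n_bits of a 2*n_bits slot.
        odd = (word >> n_bits) & group_mask
        return (odd << high_shift) + (word & group_mask)

    d1, d2 = spread(word1), spread(word2)
    half = (channel_mask >> 1) + 1           # rounding bias, 2**(n_bits - 1)
    acc = 0
    for i in range(n_channels):
        shift = i * n2
        c1 = (d1 >> shift) & channel_mask
        c2 = (d2 >> shift) & channel_mask
        acc += (c1 * c2 + half) << shift     # biased double-precision product

    # Divide all slots by scale at once: (d + (d >> n_bits)) >> n_bits.
    double_mask = (group_mask << high_shift) + group_mask
    acc = ((((acc >> n_bits) & double_mask) + acc) >> n_bits) & double_mask
    # Compact: odd channels come back from the high half, even ones from the low.
    return (acc >> (high_shift - n_bits)) + (acc & group_mask)


def rgb_mul(source_word: int, dest_word: int, dest_depth: int) -> int:
    """Model of the depth dispatch for one 32-bit word."""
    if dest_depth < 16:
        if dest_depth == 1:                  # scale is 1, so the product is AND
            return source_word & dest_word
        return partitioned_mul(source_word, dest_word, dest_depth, 32)
    if dest_depth == 16:                     # two pixels, three 5-bit channels each
        low = partitioned_mul(source_word & 0xFFFF, dest_word & 0xFFFF, 5, 16)
        high = partitioned_mul(source_word >> 16, dest_word >> 16, 5, 16)
        return low + (high << 16)
    return partitioned_mul(source_word, dest_word, 8, 32)


def reference_mul(word1: int, word2: int, n_bits: int, word_bits: int) -> int:
    """Slow per-channel reference: (c1 * c2 / scale) rounded."""
    scale = (1 << n_bits) - 1
    result = 0
    for i in range(word_bits // n_bits):
        shift = i * n_bits
        c1 = (word1 >> shift) & scale
        c2 = (word2 >> shift) & scale
        result |= ((2 * c1 * c2 + scale) // (2 * scale)) << shift
    return result


def sweep_matches(n_bits: int, word_bits: int, samples: int = 500) -> bool:
    """Compare the packed version with the reference on pseudo-random words."""
    rng = random.Random(0)
    for _ in range(samples):
        w1, w2 = rng.getrandbits(word_bits), rng.getrandbits(word_bits)
        if partitioned_mul(w1, w2, n_bits, word_bits) != reference_mul(w1, w2, n_bits, word_bits):
            return False
    return True
```

For example, `rgb_mul(0xFFFFFFFF, 0x80402010, 32)` returns the destination unchanged, since multiplying by full-scale channels is the identity, and `sweep_matches(5, 16)` checks the 16-bit layout against the slow reference.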
Item was added:
+ ----- Method: BitBltSimulationTest>>testRgbMulDepth16 (in category 'tests') -----
+ testRgbMulDepth16
+     | x f1 f2 f3 bb |
+     x := 1 << 5.
+     f1 := Form extent: x@x depth: 16.
+     f2 := Form extent: x@x depth: 16.
+     0 to: x-1 do: [:ix |
+         0 to: x-1 do: [:iy |
+             f1 pixelValueAt: ix@iy put: ((ix bitOr: ix+10\\x<<5) bitOr: ix+20\\x<<10).
+             f2 pixelValueAt: ix@iy put: ((iy bitOr: iy+10\\x<<5) bitOr: iy+20\\x<<10)]].
+     f3 := f2 copy.
+     bb := BitBlt new.
+     bb setDestForm: f3; sourceForm: f1.
+     bb sourceX: 0; sourceY: 0; destX: 0; destY: 0.
+     bb width: x; height: x.
+     bb combinationRule: Form rgbMul.
+     bb copyBits.
+     0 to: x-1 do: [:ix |
+         0 to: x-1 do: [:iy |
+             "Test that each 5 bits rgb channel is correctly rounded multiplication"
+             self assert: ((f3 pixelValueAt: ix@iy) >> 10 bitAnd: 31)
+                 = (((f1 pixelValueAt: ix@iy) >> 10 bitAnd: 31)
+                     * ((f2 pixelValueAt: ix@iy) >>10 bitAnd: 31) / (x - 1)) rounded.
+             self assert: ((f3 pixelValueAt: ix@iy) >> 5 bitAnd: 31)
+                 = (((f1 pixelValueAt: ix@iy) >> 5 bitAnd: 31)
+                     * ((f2 pixelValueAt: ix@iy) >>5 bitAnd: 31) / (x - 1)) rounded.
+             self assert: ((f3 pixelValueAt: ix@iy) bitAnd: 31)
+                 = (((f1 pixelValueAt: ix@iy) bitAnd: 31)
+                     * ((f2 pixelValueAt: ix@iy) bitAnd: 31) / (x - 1)) rounded]]!
Item was added:
+ ----- Method: BitBltSimulationTest>>testRgbMulDepth1to8 (in category 'tests') -----
+ testRgbMulDepth1to8
+     "Note that depth=32 and depth=8 have exactly same effect 32bits-word-wise
+     since we decompose 32 bits depth in four 8-bits channels, ARGB.
+     Only depth 16 is special, with 3 channels of 5 bits, and 1 dead bit."
+     #(1 2 4 8) do: [:d |
+         | x f1 f2 f3 bb |
+         x := 1 << d.
+         f1 := Form extent: x@x depth: d.
+         f2 := Form extent: x@x depth: d.
+         0 to: x-1 do: [:ix |
+             0 to: x-1 do: [:iy |
+                 f1 pixelValueAt: ix@iy put: ix.
+                 f2 pixelValueAt: ix@iy put: iy]].
+         f3 := f2 copy.
+         bb := BitBlt new.
+         bb setDestForm: f3; sourceForm: f1.
+         bb sourceX: 0; sourceY: 0; destX: 0; destY: 0.
+         bb width: x; height: x.
+         bb combinationRule: Form rgbMul.
+         bb copyBits.
+         0 to: x-1 do: [:ix |
+             0 to: x-1 do: [:iy |
+                 self assert: (f3 pixelValueAt: ix@iy) = ((f1 pixelValueAt: ix@iy) * (f2 pixelValueAt: ix@iy) / (x - 1)) rounded]]]!
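As a small worked example of what the depth 1 to 8 test expects, the Python sketch below tabulates (ix * iy / (x - 1)) rounded for every pixel pair at a given depth. The helper name expected_table is hypothetical; it mirrors only the assertion's right-hand side, not the BitBlt call.

```python
def expected_table(depth: int) -> list[list[int]]:
    """Expected rgbMul result for source value ix and destination value iy."""
    scale = (1 << depth) - 1  # x - 1 in the test
    # scale is odd, so no product lands exactly on a .5 boundary and
    # integer round-half-up gives the same answer as 'rounded'.
    return [
        [(2 * ix * iy + scale) // (2 * scale) for iy in range(scale + 1)]
        for ix in range(scale + 1)
    ]


if __name__ == "__main__":
    for row in expected_table(2):
        print(row)
    # [0, 0, 0, 0]
    # [0, 0, 1, 1]
    # [0, 1, 1, 2]
    # [0, 1, 2, 3]
```

At depth 1 the table reduces to a logical AND, which is consistent with rgbMul:with: answering bitAnd:with: for that depth.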
vm-dev@lists.squeakfoundation.org