[Vm-dev] VM Maker: VMMaker.oscog-nice.3251.mcz
commits at source.squeak.org
Thu Aug 25 16:09:20 UTC 2022
Nicolas Cellier uploaded a new version of VMMaker to project VM Maker:
http://source.squeak.org/VMMaker/VMMaker.oscog-nice.3251.mcz
==================== Summary ====================
Name: VMMaker.oscog-nice.3251
Author: nice
Time: 25 August 2022, 6:09:00.826995 pm
UUID: fad63ac4-33e9-944d-a688-a69c2b8f5b33
Ancestors: VMMaker.oscog-nice.3250
Implement a correctly rounded rgbMul BitBlt with the ideas developed here:
https://github.com/OpenSmalltalk/opensmalltalk-vm/issues/651
Also provide rgbMul tests for all depths (1, 2, 4, 8, 16, 32).
The 32-bit depth case is covered by the 8-bit depth test, since both use 8-bit channels.
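To illustrate what "correctly rounded" means per channel, here is a minimal Python sketch (not part of the commit). It compares the per-channel formula of the removed method, ((a+1)*(b+1) - 1) >> nBits, with the add-half-then-shift division by scale used in the new method. The function names mul_removed, mul_rounded and mul_exact are hypothetical and chosen only for this example.

```python
def mul_exact(a, b, n_bits):
    """Reference: (a * b / scale) rounded, computed with integers only."""
    scale = (1 << n_bits) - 1
    # Round half up; scale is odd, so an exact tie cannot occur anyway.
    return (2 * a * b + scale) // (2 * scale)


def mul_removed(a, b, n_bits):
    """Per-channel formula of the removed method (as read from the diff)."""
    return ((a + 1) * (b + 1) - 1) >> n_bits


def mul_rounded(a, b, n_bits):
    """Division by scale = 2**n_bits - 1 using only adds and shifts.

    half is added before dividing so that truncation becomes rounding;
    t + (t >> n_bits) then corrects for dividing by 2**n_bits instead of scale.
    """
    scale = (1 << n_bits) - 1
    half = (scale >> 1) + 1
    t = a * b + half
    return (t + (t >> n_bits)) >> n_bits


if __name__ == "__main__":
    a = b = 0x80
    print(mul_removed(a, b, 8), mul_rounded(a, b, 8), mul_exact(a, b, 8))
```

With 8-bit channels and a = b = 0x80, the exact quotient is 16384/255, about 64.25; the removed formula answers 65 while the rounded one answers 64.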
=============== Diff against VMMaker.oscog-nice.3250 ===============
Item was removed:
- ----- Method: BitBltSimulation>>partitionedMul:with:nBits:nPartitions: (in category 'combination rules') -----
- partitionedMul: word1 with: word2 nBits: nBits nPartitions: nParts
- "Multiply word1 with word2 as nParts partitions of nBits each.
- This is useful for packed pixels, or packed colors.
- Bug in loop version when non-white background"
-
- | sMask product result dMask |
- "In C, integer multiplication might answer a wrong value if the unsigned values are declared as signed.
- This problem does not affect this method, because the most significant bit (i.e. the sign bit) will
- always be zero (jmv)"
- <returnTypeC: 'unsigned int'>
- <var: 'word1' type: #'unsigned int'>
- <var: 'word2' type: #'unsigned int'>
- <var: 'sMask' type: #'unsigned int'>
- <var: 'dMask' type: #'unsigned int'>
- <var: 'result' type: #'unsigned int'>
- <var: 'product' type: #'unsigned int'>
- sMask := maskTable at: nBits. "partition mask starts at the right"
- dMask := sMask << nBits.
- result := (((word1 bitAnd: sMask)+1) * ((word2 bitAnd: sMask)+1) - 1
- bitAnd: dMask) >> nBits. "optimized first step"
- nParts = 1
- ifTrue: [ ^result ].
- product := (((word1>>nBits bitAnd: sMask)+1) * ((word2>>nBits bitAnd: sMask)+1) - 1 bitAnd: dMask).
- result := result bitOr: product.
- nParts = 2
- ifTrue: [ ^result ].
- product := (((word1>>(2*nBits) bitAnd: sMask)+1) * ((word2>>(2*nBits) bitAnd: sMask)+1) - 1 bitAnd: dMask).
- result := result bitOr: product << nBits.
- nParts = 3
- ifTrue: [ ^result ].
- product := (((word1>>(3*nBits) bitAnd: sMask)+1) * ((word2>>(3*nBits) bitAnd: sMask)+1) - 1 bitAnd: dMask).
- result := result bitOr: product << (2*nBits).
- ^ result
-
- " | sMask product result dMask |
- sMask := maskTable at: nBits. 'partition mask starts at the right'
- dMask := sMask << nBits.
- result := (((word1 bitAnd: sMask)+1) * ((word2 bitAnd: sMask)+1) - 1
- bitAnd: dMask) >> nBits. 'optimized first step'
- nBits to: nBits * (nParts-1) by: nBits do: [:ofs |
- product := (((word1>>ofs bitAnd: sMask)+1) * ((word2>>ofs bitAnd: sMask)+1) - 1 bitAnd: dMask).
- result := result bitOr: (product bitAnd: dMask) << (ofs-nBits)].
- ^ result"!
Item was added:
+ ----- Method: BitBltSimulation>>partitionedMul:with:nBits:wordBits: (in category 'combination rules') -----
+ partitionedMul: word1 with: word2 nBits: nBits wordBits: wordBits
+ "Multiply each channel of nBits in word1 and word2.
+ We assume that for each channel of nBits, we multiply ratios in interval [0..1], scaled by (1 << nBits - 1).
+ result := ((channel1/scale) * (channel2/scale) * scale) rounded
+ Or after simplification:
+ result := (channel1 * channel2 / scale) rounded
+ This is implemented by first forming the double precision products (channel1 * channel2) on a double-word.
+ Then dividing each double precision channel by scale, with correctly rounded operation.
+ With proper tricks, some of these operations can be multiplexed
+ (all channels are formed in parallel with a single sequence of operation)."
+
+ | channelMask groupMask doubleGroupMask doubleWord1 doubleWord2 doubleWordMul half shift result highWordShift nGroups n2 |
+ <returnTypeC: 'unsigned int'>
+ <var: 'word1' type: #'unsigned int'>
+ <var: 'word2' type: #'unsigned int'>
+ <var: 'channelMask' type: #'unsigned int'>
+ <var: 'groupMask' type: #'unsigned int'>
+ <var: 'half' type: #'unsigned int'>
+ <var: 'doubleGroupMask' type: #'unsigned long long'>
+ <var: 'doubleWord1' type: #'unsigned long long'>
+ <var: 'doubleWord2' type: #'unsigned long long'>
+ <var: 'doubleWordMul' type: #'unsigned long long'>
+ <var: 'result' type: #'unsigned int'>
+ n2 := 2 * nBits. "width of double-precision channel"
+ channelMask := 1 << nBits - 1. "partition mask starts at the right"
+ nGroups := wordBits // nBits + 1 // 2. "number of channels that fit in a word, when alternating with group of zeros"
+ groupMask := channelMask. "form a word mask with alternate nBits 0 and nBits 1, so as to select even channels"
+ 2 to: nGroups do: [:i | groupMask := groupMask << n2 + channelMask].
+ highWordShift := nGroups * n2. "shift for putting odd channels in high-word - usually wordBits, except if wordBits \\ nBits ~= 0"
+
+ doubleWord1 := word1 >> nBits bitAnd: groupMask. "select odd channel interleaved with groups of nBits zeros, so as to leave room for double-precision multiplication"
+ doubleWord2 := word2 >> nBits bitAnd: groupMask.
+ doubleWord1 := doubleWord1 << highWordShift + (word1 bitAnd: groupMask). "Put odd channels in high word, and even channels in low word"
+ doubleWord2 := doubleWord2 << highWordShift + (word2 bitAnd: groupMask).
+
+ half := channelMask >> 1 + 1. "mid-value to add for getting a correctly rounded division"
+ shift := 0.
+ doubleWordMul := 0.
+ 1 to: wordBits // nBits do: [:i |
+ doubleWordMul := doubleWordMul + ((doubleWord1 >> shift bitAnd: channelMask) * (doubleWord2 >> shift bitAnd: channelMask) + half << shift). "multiply each channel of the two operands"
+ shift := shift + n2].
+
+ doubleGroupMask := groupMask. "form a mask for extracting single-precision channels in the double word"
+ doubleGroupMask := doubleGroupMask << highWordShift + groupMask.
+
+ doubleWordMul := (doubleWordMul >> nBits bitAnd: doubleGroupMask) + doubleWordMul >> nBits bitAnd: doubleGroupMask. "divide by scale"
+ result := doubleWordMul >> (highWordShift - nBits) + (doubleWordMul bitAnd: groupMask). "compact channels back into a single word"
+ ^result!
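For readers who want to experiment outside the VM simulator, the method above can be transliterated into Python roughly as follows. This is a sketch under assumptions, not the generated C code: Python integers are unbounded, so the 'unsigned long long' declarations have no counterpart here, and the name partitioned_mul is hypothetical.

```python
def partitioned_mul(word1, word2, n_bits, word_bits):
    """Multiply every n_bits channel of word1 with the matching channel of word2,
    answering (c1 * c2 / scale) rounded for each, with scale = 2**n_bits - 1."""
    n2 = 2 * n_bits                        # width of a double-precision channel
    channel_mask = (1 << n_bits) - 1
    n_groups = (word_bits // n_bits + 1) // 2
    group_mask = channel_mask              # alternate n_bits of 0 and n_bits of 1
    for _ in range(2, n_groups + 1):
        group_mask = (group_mask << n2) + channel_mask
    high_shift = n_groups * n2             # where the odd channels are parked

    # Even channels stay in the low part, odd channels go to the high part,
    # each followed by n_bits of zeros to leave room for the product.
    dw1 = (((word1 >> n_bits) & group_mask) << high_shift) + (word1 & group_mask)
    dw2 = (((word2 >> n_bits) & group_mask) << high_shift) + (word2 & group_mask)

    half = (channel_mask >> 1) + 1         # added so the division rounds
    product = 0
    shift = 0
    for _ in range(word_bits // n_bits):
        c1 = (dw1 >> shift) & channel_mask
        c2 = (dw2 >> shift) & channel_mask
        product += (c1 * c2 + half) << shift
        shift += n2

    double_group_mask = (group_mask << high_shift) + group_mask
    # Divide all channels by scale at once: (t + (t >> n)) >> n per channel.
    product = ((((product >> n_bits) & double_group_mask) + product)
               >> n_bits) & double_group_mask
    # Compact the channels back into a single word.
    return (product >> (high_shift - n_bits)) + (product & group_mask)


if __name__ == "__main__":
    print(hex(partitioned_mul(0x80808080, 0x80808080, 8, 32)))
```

For depth 16, rgbMul:with: calls the method once per 16-bit pixel with nBits 5 and wordBits 16; in this sketch that corresponds to partitioned_mul(pixel1, pixel2, 5, 16), which ignores the unused top bit of each pixel.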
Item was changed:
----- Method: BitBltSimulation>>rgbMul:with: (in category 'combination rules') -----
rgbMul: sourceWord with: destinationWord
<inline: false>
<returnTypeC: 'unsigned int'>
<var: 'sourceWord' type: #'unsigned int'>
<var: 'destinationWord' type: #'unsigned int'>
destDepth < 16 ifTrue:
["Mul each pixel separately"
+ destDepth = 1 ifTrue: [^self bitAnd: sourceWord with: destinationWord].
+ ^ self partitionedMul: sourceWord with: destinationWord nBits: destDepth wordBits: 32].
- ^ self partitionedMul: sourceWord with: destinationWord
- nBits: destDepth nPartitions: destPPW].
destDepth = 16 ifTrue:
["Mul RGB components of each pixel separately"
+ ^ (self partitionedMul: (sourceWord bitAnd: 16rFFFF) with: (destinationWord bitAnd: 16rFFFF) nBits: 5 wordBits: 16)
+ + ((self partitionedMul: sourceWord>>16 with: destinationWord>>16 nBits: 5 wordBits: 16) << 16)]
- ^ (self partitionedMul: sourceWord with: destinationWord
- nBits: 5 nPartitions: 3)
- + ((self partitionedMul: sourceWord>>16 with: destinationWord>>16
- nBits: 5 nPartitions: 3) << 16)]
ifFalse:
["Mul RGBA components of the pixel separately"
+ ^ self partitionedMul: sourceWord with: destinationWord nBits: 8 wordBits: 32]!
- ^ self partitionedMul: sourceWord with: destinationWord
- nBits: 8 nPartitions: 4]
-
- " | scanner |
- Display repaintMorphicDisplay.
- scanner := DisplayScanner quickPrintOn: Display.
- MessageTally time: [0 to: 760 by: 4 do: [:y |scanner drawString: 'qwrepoiuasfd=)(/&()=#!!lkjzxv.,mn124+09857907QROIYTOAFDJZXNBNB,M-.,Mqwrepoiuasfd=)(/&()=#!!lkjzxv.,mn124+09857907QROIYTOAFDJZXNBNB,M-.,M1234124356785678' at: 0@y]]. "!
Item was added:
+ ----- Method: BitBltSimulationTest>>testRgbMulDepth16 (in category 'tests') -----
+ testRgbMulDepth16
+ | x f1 f2 f3 bb |
+ x := 1 << 5.
+ f1 := Form extent: x@x depth: 16.
+ f2 := Form extent: x@x depth: 16.
+ 0 to: x-1 do: [:ix |
+ 0 to: x-1 do: [:iy |
+ f1 pixelValueAt: ix@iy put: ((ix bitOr: ix+10\\x<<5) bitOr: ix+20\\x<<10).
+ f2 pixelValueAt: ix@iy put: ((iy bitOr: iy+10\\x<<5) bitOr: iy+20\\x<<10)]].
+ f3 := f2 copy.
+ bb := BitBlt new.
+ bb setDestForm: f3; sourceForm: f1.
+ bb sourceX: 0; sourceY: 0; destX: 0; destY: 0.
+ bb width: x; height: x.
+ bb combinationRule: Form rgbMul.
+ bb copyBits.
+ 0 to: x-1 do: [:ix |
+ 0 to: x-1 do: [:iy |
+ "Test that each 5 bits rgb channel is correctly rounded multiplication"
+ self assert: ((f3 pixelValueAt: ix@iy) >> 10 bitAnd: 31)
+ = (((f1 pixelValueAt: ix@iy) >> 10 bitAnd: 31)
+ * ((f2 pixelValueAt: ix@iy) >>10 bitAnd: 31) / (x - 1)) rounded.
+ self assert: ((f3 pixelValueAt: ix@iy) >> 5 bitAnd: 31)
+ = (((f1 pixelValueAt: ix@iy) >> 5 bitAnd: 31)
+ * ((f2 pixelValueAt: ix@iy) >>5 bitAnd: 31) / (x - 1)) rounded.
+ self assert: ((f3 pixelValueAt: ix@iy) bitAnd: 31)
+ = (((f1 pixelValueAt: ix@iy) bitAnd: 31)
+ * ((f2 pixelValueAt: ix@iy) bitAnd: 31) / (x - 1)) rounded]]!
Item was added:
+ ----- Method: BitBltSimulationTest>>testRgbMulDepth1to8 (in category 'tests') -----
+ testRgbMulDepth1to8
+ "Note that depth=32 and depth=8 have exactly same effect 32bits-word-wise
+ since we decompose 32 bits depth in four 8-bits channels, ARGB.
+ Only depth 16 is special, with 3 channels of 5 bits, and 1 dead bit."
+ #(1 2 4 8) do: [:d |
+ | x f1 f2 f3 bb |
+ x := 1 << d.
+ f1 := Form extent: x@x depth: d.
+ f2 := Form extent: x@x depth: d.
+ 0 to: x-1 do: [:ix |
+ 0 to: x-1 do: [:iy |
+ f1 pixelValueAt: ix@iy put: ix.
+ f2 pixelValueAt: ix@iy put: iy]].
+ f3 := f2 copy.
+ bb := BitBlt new.
+ bb setDestForm: f3; sourceForm: f1.
+ bb sourceX: 0; sourceY: 0; destX: 0; destY: 0.
+ bb width: x; height: x.
+ bb combinationRule: Form rgbMul.
+ bb copyBits.
+ 0 to: x-1 do: [:ix |
+ 0 to: x-1 do: [:iy |
+ self assert: (f3 pixelValueAt: ix@iy) = ((f1 pixelValueAt: ix@iy) * (f2 pixelValueAt: ix@iy) / (x - 1)) rounded]]]!