[Vm-dev] VM Maker: VMMaker.oscog-nice.3251.mcz

Thu Aug 25 16:09:20 UTC 2022

Nicolas Cellier uploaded a new version of VMMaker to project VM Maker:
http://source.squeak.org/VMMaker/VMMaker.oscog-nice.3251.mcz

==================== Summary ====================

Name: VMMaker.oscog-nice.3251
Author: nice
Time: 25 August 2022, 6:09:00.826995 pm
UUID: fad63ac4-33e9-944d-a688-a69c2b8f5b33
Ancestors: VMMaker.oscog-nice.3250

Implement a correctly rounded rgbMul BitBlt with the ideas developed here:

https://github.com/OpenSmalltalk/opensmalltalk-vm/issues/651

Also provide rgbMul tests for all depths (1, 2, 4, 8, 16, 32);
the 8-bit depth test also covers 32-bit depth, since both use 8-bit channels.
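
The rounding rule can be illustrated on a single channel. The Python sketch below is not part of the commit, and its function names are made up for illustration. It assumes, as the new method does, that the bias half = 2^(nBits-1) is added to the product and that the division by scale = 2^nBits - 1 is then done with shifts; it is meant to be compared against an exact integer rounding.

```python
def rounded_channel_mul(c1, c2, n_bits):
    """Multiply two n_bits channel values seen as ratios in [0..1]."""
    half = 1 << (n_bits - 1)               # bias that turns truncation into rounding
    t = c1 * c2 + half                     # double-precision product plus bias
    return (t + (t >> n_bits)) >> n_bits   # divide by (2**n_bits - 1) using shifts only


def exact_channel_mul(c1, c2, n_bits):
    """Reference: (c1 * c2 / scale) rounded, in pure integer arithmetic."""
    scale = (1 << n_bits) - 1
    # scale is odd, so c1 * c2 / scale never lands exactly on a tie
    return (2 * c1 * c2 + scale) // (2 * scale)
```

For example, rounded_channel_mul(128, 128, 8) answers 64, the rounded value of 128 * 128 / 255.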

=============== Diff against VMMaker.oscog-nice.3250 ===============

Item was removed:
- ----- Method: BitBltSimulation>>partitionedMul:with:nBits:nPartitions: (in category 'combination rules') -----
- partitionedMul: word1 with: word2 nBits: nBits nPartitions: nParts
- 	"Multiply word1 with word2 as nParts partitions of nBits each.
- 	This is useful for packed pixels, or packed colors.
- 	Bug in loop version when non-white background"
- 
- 	| sMask product result dMask |
- 	"In C, integer multiplication might answer a wrong value if the unsigned values are declared as signed.
- 	This problem does not affect this method, because the most significant bit (i.e. the sign bit) will
- 	always be zero (jmv)"
- 	<returnTypeC: 'unsigned int'>
- 	<var: 'word1' type: #'unsigned int'>
- 	<var: 'word2' type: #'unsigned int'>
- 	<var: 'sMask' type: #'unsigned int'>
- 	<var: 'dMask' type: #'unsigned int'>
- 	<var: 'result' type: #'unsigned int'>
- 	<var: 'product' type: #'unsigned int'>
- 	sMask := maskTable at: nBits.  "partition mask starts at the right"
- 	dMask :=  sMask << nBits.
- 	result := (((word1 bitAnd: sMask)+1) * ((word2 bitAnd: sMask)+1) - 1 
- 				bitAnd: dMask) >> nBits.	"optimized first step"
- 	nParts = 1
- 		ifTrue: [ ^result ].
- 	product := (((word1>>nBits bitAnd: sMask)+1) * ((word2>>nBits bitAnd: sMask)+1) - 1 bitAnd: dMask).
- 	result := result bitOr: product.
- 	nParts = 2
- 		ifTrue: [ ^result ].
- 	product := (((word1>>(2*nBits) bitAnd: sMask)+1) * ((word2>>(2*nBits) bitAnd: sMask)+1) - 1 bitAnd: dMask).
- 	result := result bitOr: product << nBits.
- 	nParts = 3
- 		ifTrue: [ ^result ].
- 	product := (((word1>>(3*nBits) bitAnd: sMask)+1) * ((word2>>(3*nBits) bitAnd: sMask)+1) - 1 bitAnd: dMask).
- 	result := result bitOr: product << (2*nBits).
- 	^ result
- 
- "	| sMask product result dMask |
- 	sMask := maskTable at: nBits.  'partition mask starts at the right'
- 	dMask :=  sMask << nBits.
- 	result := (((word1 bitAnd: sMask)+1) * ((word2 bitAnd: sMask)+1) - 1 
- 				bitAnd: dMask) >> nBits.	'optimized first step'
- 	nBits to: nBits * (nParts-1) by: nBits do: [:ofs |
- 		product := (((word1>>ofs bitAnd: sMask)+1) * ((word2>>ofs bitAnd: sMask)+1) - 1 bitAnd: dMask).
- 		result := result bitOr: (product bitAnd: dMask) << (ofs-nBits)].
- 	^ result"!
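
For comparison, the removed method computes each channel as ((c1 + 1) * (c2 + 1) - 1) >> nBits. The Python sketch below (illustrative names, not part of the commit) reproduces that formula for one channel next to an exact rounding of c1 * c2 / scale, assuming scale = 2^nBits - 1.

```python
def old_channel_mul(c1, c2, n_bits):
    """Per-channel formula used by the removed partitionedMul:with:nBits:nPartitions:."""
    mask = (1 << n_bits) - 1
    return (((c1 + 1) * (c2 + 1) - 1) >> n_bits) & mask


def exact_channel_mul(c1, c2, n_bits):
    """Reference: (c1 * c2 / scale) rounded, in pure integer arithmetic."""
    scale = (1 << n_bits) - 1
    return (2 * c1 * c2 + scale) // (2 * scale)
```

With 8-bit channels, old_channel_mul(128, 128, 8) answers 65 while exact_channel_mul(128, 128, 8) answers 64; both agree at the extremes, for example 255 * 255 gives 255.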

Item was added:
+ ----- Method: BitBltSimulation>>partitionedMul:with:nBits:wordBits: (in category 'combination rules') -----
+ partitionedMul: word1 with: word2 nBits: nBits wordBits: wordBits
+ 	"Multiply each channel of nBits in word1 and word2.
+ 	We assume that for each channel of nBits, we multiply ratios in interval [0..1], scaled by (1 << nBits - 1).
+ 		result := ((channel1/scale) * (channel2/scale) * scale) rounded
+ 	Or after simplification:
+ 		result := (channel1 * channel2 / scale) rounded
+ 	This is implemented by first forming the double precision products (channel1 * channel2) on a double-word,
+ 	then dividing each double precision channel by scale with a correctly rounded operation.
+ 	With proper tricks, some of these operations can be multiplexed
+ 	(all channels are formed in parallel with a single sequence of operations)."
+ 
+ 	| channelMask groupMask doubleGroupMask doubleWord1 doubleWord2 doubleWordMul half shift result highWordShift nGroups n2 |
+ 	<returnTypeC: 'unsigned int'>
+ 	<var: 'word1' type: #'unsigned int'>
+ 	<var: 'word2' type: #'unsigned int'>
+ 	<var: 'channelMask' type: #'unsigned int'>
+ 	<var: 'groupMask' type: #'unsigned int'>
+ 	<var: 'half' type: #'unsigned int'>
+ 	<var: 'doubleGroupMask' type: #'unsigned long long'>
+ 	<var: 'doubleWord1' type: #'unsigned long long'>
+ 	<var: 'doubleWord2' type: #'unsigned long long'>
+ 	<var: 'doubleWordMul' type: #'unsigned long long'>
+ 	<var: 'result' type: #'unsigned int'>
+ 	n2 := 2 * nBits.	"width of double-precision channel"
+ 	channelMask := 1 << nBits - 1.  "partition mask starts at the right"
+ 	nGroups := wordBits // nBits + 1 // 2.	"number of channels that fit in a word, when alternating with group of zeros"
+ 	groupMask := channelMask.	"form a word mask with alternate nBits 0 and nBits 1, so as to select even channels"
+ 	2 to: nGroups do: [:i | groupMask := groupMask << n2 + channelMask].
+ 	highWordShift := nGroups * n2.	"shift for putting odd channels in high-word - usually wordBits, except if wordBits \\ nBits ~= 0"
+ 	
+ 	doubleWord1 := word1 >> nBits bitAnd: groupMask.	"select odd channel interleaved with groups of nBits zeros, so as to leave room for double-precision multiplication"
+ 	doubleWord2 := word2 >> nBits bitAnd: groupMask.
+ 	doubleWord1 := doubleWord1 << highWordShift + (word1 bitAnd: groupMask).	"Put odd channels in high word, and even channels in low word"
+ 	doubleWord2 := doubleWord2 << highWordShift + (word2 bitAnd: groupMask).
+ 
+ 	half := channelMask >> 1 + 1. "mid-value to add for getting a correctly rounded division"
+ 	shift := 0.
+ 	doubleWordMul  := 0.
+ 	1 to: wordBits // nBits do: [:i |
+ 		doubleWordMul := doubleWordMul + ((doubleWord1 >> shift bitAnd: channelMask) * (doubleWord2 >> shift bitAnd: channelMask) + half << shift). "multiply each channel of the two operands"
+ 		shift := shift + n2].
+ 
+ 	doubleGroupMask := groupMask.	"form a mask for extracting single-precision channels in the double word"
+ 	doubleGroupMask := doubleGroupMask << highWordShift + groupMask.
+ 
+ 	doubleWordMul := (doubleWordMul >> nBits bitAnd: doubleGroupMask) + doubleWordMul >> nBits bitAnd: doubleGroupMask.	"divide by scale"
+ 	result := doubleWordMul >> (highWordShift - nBits) + (doubleWordMul bitAnd: groupMask).	"compact channels back into a single word"
+ 	^result!
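
The method above can be followed step by step in the Python sketch below. It is a line-by-line port written for illustration only, not code from the commit; the snake_case names are assumptions mirroring the Slang variables. Python integers are unbounded, so the sketch does not model the 'unsigned int' and 'unsigned long long' truncation of the generated C; it assumes the inputs fit in word_bits bits.

```python
def partitioned_mul(word1, word2, n_bits, word_bits):
    """Multiply each n_bits channel of word1 and word2, correctly rounded."""
    n2 = 2 * n_bits                                   # width of a double-precision channel
    channel_mask = (1 << n_bits) - 1
    n_groups = (word_bits // n_bits + 1) // 2         # channels per word when alternating with zeros
    group_mask = channel_mask
    for _ in range(2, n_groups + 1):                  # mask selecting the even channels
        group_mask = (group_mask << n2) + channel_mask
    high_word_shift = n_groups * n2

    # Odd channels go to the high word, even channels stay in the low word,
    # each followed by n_bits zeros to leave room for the product.
    dw1 = (((word1 >> n_bits) & group_mask) << high_word_shift) + (word1 & group_mask)
    dw2 = (((word2 >> n_bits) & group_mask) << high_word_shift) + (word2 & group_mask)

    half = (channel_mask >> 1) + 1                    # rounding bias, 2**(n_bits - 1)
    mul = 0
    shift = 0
    for _ in range(word_bits // n_bits):              # one product per channel
        c1 = (dw1 >> shift) & channel_mask
        c2 = (dw2 >> shift) & channel_mask
        mul += (c1 * c2 + half) << shift
        shift += n2

    double_group_mask = (group_mask << high_word_shift) + group_mask
    # Divide every channel by scale at once: (t + (t >> n_bits)) >> n_bits
    mul = ((((mul >> n_bits) & double_group_mask) + mul) >> n_bits) & double_group_mask
    # Compact the channels back into a single word
    return (mul >> (high_word_shift - n_bits)) + (mul & group_mask)
```

Multiplying by an all-ones word leaves the other operand unchanged, for example partitioned_mul(0xFFFFFFFF, 0x80402010, 8, 32) answers 0x80402010. With n_bits = 5 and word_bits = 16, as used for depth 16, only the three 5-bit channels are multiplied and the top bit of the result is zero.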

Item was changed:
  ----- Method: BitBltSimulation>>rgbMul:with: (in category 'combination rules') -----
  rgbMul: sourceWord with: destinationWord
  	<inline: false>
  	<returnTypeC: 'unsigned int'>
  	<var: 'sourceWord' type: #'unsigned int'>
  	<var: 'destinationWord' type: #'unsigned int'>
  	destDepth < 16 ifTrue:
  		["Mul each pixel separately"
+ 		destDepth = 1 ifTrue: [^self bitAnd: sourceWord with: destinationWord].
+ 		^ self partitionedMul: sourceWord with: destinationWord nBits: destDepth wordBits: 32].
- 		^ self partitionedMul: sourceWord with: destinationWord
- 						nBits: destDepth nPartitions: destPPW].
  	destDepth = 16 ifTrue:
  		["Mul RGB components of each pixel separately"
+ 		^ (self partitionedMul: (sourceWord bitAnd: 16rFFFF) with: (destinationWord bitAnd: 16rFFFF) nBits: 5 wordBits: 16)
+ 		+ ((self partitionedMul: sourceWord>>16 with: destinationWord>>16 nBits: 5 wordBits: 16) << 16)]
- 		^ (self partitionedMul: sourceWord with: destinationWord
- 						nBits: 5 nPartitions: 3)
- 		+ ((self partitionedMul: sourceWord>>16 with: destinationWord>>16
- 						nBits: 5 nPartitions: 3) << 16)]
  	ifFalse:
  		["Mul RGBA components of the pixel separately"
+ 		^ self partitionedMul: sourceWord with: destinationWord nBits: 8 wordBits: 32]!
- 		^ self partitionedMul: sourceWord with: destinationWord
- 						nBits: 8 nPartitions: 4]
- 
- "	| scanner |
- 	Display repaintMorphicDisplay.
- 	scanner := DisplayScanner quickPrintOn: Display.
- 	MessageTally time: [0 to: 760 by: 4 do:  [:y |scanner drawString: 'qwrepoiuasfd=)(/&()=#!!lkjzxv.,mn124+09857907QROIYTOAFDJZXNBNB,M-.,Mqwrepoiuasfd=)(/&()=#!!lkjzxv.,mn124+09857907QROIYTOAFDJZXNBNB,M-.,M1234124356785678' at: 0@y]]. "!

Item was added:
+ ----- Method: BitBltSimulationTest>>testRgbMulDepth16 (in category 'tests') -----
+ testRgbMulDepth16 
+ 	| x f1 f2 f3 bb |
+ 	x := 1 << 5.
+ 	f1 := Form extent: x@x depth: 16.
+ 	f2 := Form extent: x@x depth: 16.
+ 	0 to: x-1 do: [:ix |
+ 		0 to: x-1 do: [:iy |
+ 			f1 pixelValueAt: ix@iy put: ((ix bitOr: ix+10\\x<<5)  bitOr: ix+20\\x<<10).
+ 			f2 pixelValueAt: ix@iy put: ((iy bitOr: iy+10\\x<<5)  bitOr: iy+20\\x<<10)]].
+ 	f3 := f2 copy.
+ 	bb := BitBlt new.
+ 	bb setDestForm: f3; sourceForm: f1.
+ 	bb sourceX: 0; sourceY: 0; destX: 0; destY: 0.
+ 	bb width: x; height: x.
+ 	bb combinationRule: Form rgbMul.
+ 	bb copyBits.
+ 	0 to: x-1 do: [:ix |
+ 		0 to: x-1 do: [:iy |
+ 			"Test that each 5-bit rgb channel is the correctly rounded product"
+ 			self assert: ((f3 pixelValueAt: ix@iy) >> 10 bitAnd: 31)
+ 				= (((f1 pixelValueAt: ix@iy) >> 10 bitAnd: 31)
+ 				* ((f2 pixelValueAt: ix@iy) >>10 bitAnd: 31) / (x - 1)) rounded.
+ 			self assert: ((f3 pixelValueAt: ix@iy) >> 5 bitAnd: 31)
+ 				= (((f1 pixelValueAt: ix@iy) >> 5 bitAnd: 31)
+ 				* ((f2 pixelValueAt: ix@iy) >>5 bitAnd: 31) / (x - 1)) rounded.
+ 			self assert: ((f3 pixelValueAt: ix@iy) bitAnd: 31)
+ 				= (((f1 pixelValueAt: ix@iy) bitAnd: 31)
+ 				* ((f2 pixelValueAt: ix@iy) bitAnd: 31) / (x - 1)) rounded]]!

Item was added:
+ ----- Method: BitBltSimulationTest>>testRgbMulDepth1to8 (in category 'tests') -----
+ testRgbMulDepth1to8
+ 	"Note that depth=32 and depth=8 have exactly the same effect 32bits-word-wise
+ 	since we decompose 32 bits depth in four 8-bits channels, ARGB.
+ 	Only depth 16 is special, with 3 channels of 5 bits, and 1 dead bit."
+ 	#(1 2 4 8) do: [:d |
+ 			| x f1 f2 f3 bb |
+ 			x := 1 << d.
+ 			f1 := Form extent: x@x depth: d.
+ 			f2 := Form extent: x@x depth: d.
+ 			0 to: x-1 do: [:ix |
+ 				0 to: x-1 do: [:iy |
+ 					f1 pixelValueAt: ix@iy put: ix.
+ 					f2 pixelValueAt: ix@iy put: iy]].
+ 			f3 := f2 copy.
+ 			bb := BitBlt new.
+ 			bb setDestForm: f3; sourceForm: f1.
+ 			bb sourceX: 0; sourceY: 0; destX: 0; destY: 0.
+ 			bb width: x; height: x.
+ 			bb combinationRule: Form rgbMul.
+ 			bb copyBits.
+ 			0 to: x-1 do: [:ix |
+ 				0 to: x-1 do: [:iy |
+ 					self assert: (f3 pixelValueAt: ix@iy) = ((f1 pixelValueAt: ix@iy) * (f2 pixelValueAt: ix@iy) / (x - 1)) rounded]]]!