[Vm-dev] pixelValueAt: is not robust to hibernation [was: [squeak-dev] The Trunk: MorphicExtras-nice.137.mcz]

David T. Lewis lewis at mail.msen.com
Fri Dec 20 20:45:26 UTC 2013


On Fri, Dec 20, 2013 at 06:39:23PM +0100, Nicolas Cellier wrote:
>  
> 2013/12/20 David T. Lewis <lewis at mail.msen.com>
> 
> > On Fri, Dec 20, 2013 at 04:54:39PM +0100, Nicolas Cellier wrote:
> >
> > > 2013/12/20 David T. Lewis <lewis at mail.msen.com>
> > >
> > > > I think there are two things we should test:
> > > >
> > > > 1) Check if the extra computation has an effect on performance.
> > > >
> > >
> > > I don't think the difference will be measurable, that's very few ops
> > > overhead...
> > > Even if it was, it must be compared to in-image protection alternative.
> > > So I don't think that this is relevant in this case.
> > >
> >
> > You are probably right, although I would not mind checking to be sure. Can
> > you suggest a simple test that I can run to exercise the primitive heavily?
> > I will try to follow up later with a timeToRun comparison of a VM before
> > and after the change.
> >
> >
> most heavy use are probably Form>>floodFill2:at: and
> PNMReadWriter>>nextPutRGB:
> 
> For example:
> 
> AndreasSystemProfiler spyOn: [| prw |
>     prw := PNMReadWriter new.
>     prw stream: (ByteArray new: 1e5) writeStream.
>     [prw nextPutImage: PaintBoxMorph paletteImage] bench].
> 

As you predicted, performance does not seem to be negatively affected by the change.

I tested like this:

someForms := Form allSubInstances.
testBlock := [someForms do: [:form |
		10 timesRepeat: [
			1 to: form width do: [:x |
			1 to: form height do: [:y |
				form primPixelValueAtX: x y: y ]]]]].

Results for an interpreter VM prior to making the change:
Time millisecondsToRun: testBlock. ==> 10548 "interpreter before change"
Time millisecondsToRun: testBlock. ==> 10596 "interpreter before change"
Time millisecondsToRun: testBlock. ==> 10741 "interpreter before change"
Time millisecondsToRun: testBlock. ==> 10703 "interpreter before change"
Time millisecondsToRun: testBlock. ==> 10668 "interpreter before change"

Result for a Cog VM prior to making the change, to confirm that most time
is being spent in the primitive in this test:
Time millisecondsToRun: testBlock. ==> 7582 "cog before change"

The same interpreter VM with your change applied:
Time millisecondsToRun: testBlock. ==> 10681 "interpreter after change"
Time millisecondsToRun: testBlock. ==> 9849 "interpreter after change"
Time millisecondsToRun: testBlock. ==> 9383 "interpreter after change"
Time millisecondsToRun: testBlock. ==> 9405 "interpreter after change"
Time millisecondsToRun: testBlock. ==> 9386 "interpreter after change"

Same as above with interpreter VM, and with my rearrangement to move the
check for bitmap isWordsOrBytes ahead of any use of the bitmaps variable:
Time millisecondsToRun: testBlock. ==> 10564 "interpreter after change"
Time millisecondsToRun: testBlock. ==> 10097 "interpreter after change"
Time millisecondsToRun: testBlock. ==> 10316 "interpreter after change"
Time millisecondsToRun: testBlock. ==> 10295 "interpreter after change"
Time millisecondsToRun: testBlock. ==> 9766 "interpreter after change"

Conclusion: The change does not hurt performance. In fact, performance is
inexplicably better with the change (possibly related to some random effect
on C compiler optimizer).

I updated Mantis with the final version of the method that I used for
this test (also attached here). The change is to do the check for bitmap
isWordsOrBytes immediately after taking bitmap from the stack, before any
other use of the variable. The code will actually work fine as you wrote
it (because byteSizeOf: bitmap will answer 0 if it is an integer object),
but it seemed a bit safer on general principles to do the test as early
as possible. On the other hand, the way you originally wrote it seems to
be faster (I cannot say why). What do you think?

Dave

-------------- next part --------------
'From Squeak4.3 of 22 June 2013 [latest update: #12810] on 20 December 2013 at 3:13:32 pm'!

!BitBltSimulation methodsFor: 'primitives' stamp: 'dtl 12/20/2013 15:12'!
primitivePixelValueAtX: xVal y: yVal
	"returns the single pixel at x at y.
	It does not handle LSB bitmaps right now.
	If x or y are < 0, return 0 to indicate transparent (cf BitBlt>bitPeekerFromForm: usage).
	Likewise if x>width or y>depth.
	Fail if the rcvr doesn't seem to be a Form, or x|y seem wrong"
	| rcvr bitmap depth ppW stride bitsSize word mask shift pixel |
	rcvr := self primitive: 'primitivePixelValueAt' parameters: #(SmallInteger SmallInteger) receiver: #Oop.
	
	"possible quick exit if x or y is -ve"
	(xVal < 0 or: [ yVal < 0 ] ) ifTrue:[^interpreterProxy integerObjectOf: 0].
	"check that rcvr is plausibly a Form or subclass"	
	rcvr := interpreterProxy stackValue: interpreterProxy methodArgumentCount.
	((interpreterProxy isPointers: rcvr) and: [(interpreterProxy slotSizeOf: rcvr) >= 4])
		ifFalse: [^interpreterProxy primitiveFail].

	"get the bits oop and width/height/depth"
	bitmap := interpreterProxy fetchPointer: FormBitsIndex ofObject: rcvr.
	(interpreterProxy isWordsOrBytes: bitmap) ifFalse: [^interpreterProxy primitiveFail].
	width := interpreterProxy fetchInteger: FormWidthIndex ofObject: rcvr.
	height := interpreterProxy fetchInteger: FormHeightIndex ofObject: rcvr.
	depth := interpreterProxy fetchInteger: FormDepthIndex ofObject: rcvr.
	"if width/height/depth are not integer, fail"
	interpreterProxy failed ifTrue:[^nil].

	"possible quick exit if x or y is >= extent of form. This also catches cases where the width/height are < 0"
	(xVal >= width or: [ yVal >= height ] ) ifTrue:[^interpreterProxy integerObjectOf: 0].

	"we don't handle LSB Forms yet"
	depth < 0 ifTrue:[^interpreterProxy primitiveFail].
	
	"OK so now we know we have a plausible Form, the width/height/depth/x/y are all reasonable and it's time to plunder the bitmap"
	ppW := 32//depth. "pixels in each word"
	stride := (width + (ppW  -1)) // ppW. "how many words per row of pixels"
	bitsSize := interpreterProxy byteSizeOf: bitmap.
	bitsSize = (stride * height * 4 "bytes per word")
		ifFalse: [^interpreterProxy primitiveFail].

	word := interpreterProxy fetchLong32:(yVal * stride) + (xVal//ppW) ofObject: bitmap. "load the word that contains our target"
	mask := 16rFFFFFFFF >> (32 - depth). "make a mask to isolate the pixel within that word"
	shift := 32 - (((xVal bitAnd: ppW-1) + 1) * depth). "this is the tricky MSB part - we mask the xVal to find how far into the word we need, then add 1 for the pixel we're looking for, then * depth to get the bit shift"
	pixel := (word >> shift) bitAnd: mask. "shift, mask and dim the lights"
	^ pixel asPositiveIntegerObj "pop the incoming and push our answer"
! !


More information about the Vm-dev mailing list