[squeak-dev] Rounding in MatrixTransform2x3
Stephan Lutz
dev at stlutz.net
Wed Aug 12 10:11:49 UTC 2020
Ah, yes. I hadn't even thought about that. Probably because I've only
ever used Squeak on 64 bits with immediate floats ^^.
I followed your suggestion and implemented a Matrix2x3 in pure Smalltalk.
It's actually really fast and surprisingly even beats the existing
implementation in quite a few cases. :D
Most notably, the plugin-supported point transformation
(/localPointToGlobal:/) is actually slower on my machine.
Only the transformation of multiple points at a time, as done in
/localBoundsToGlobal:/, is significantly (~3x) faster using the plugin.
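To make the two cases concrete, here is a rough workspace sketch of what a pure Smalltalk 2x3 transform has to compute. It is not the code from the attached change set: the variable names, the rotation sign convention and the corner handling are assumptions made only for illustration.

```smalltalk
"Hypothetical workspace sketch; names are illustrative and not from the change set."
| angle a11 a12 a13 a21 a22 a23 transform rect corners origin corner |
angle := 70 degreesToRadians.
"Row 1 and row 2 of a 2x3 matrix: a rotation part plus the offset 500 @ 600.
 The sign convention of the rotation is an assumption."
a11 := angle cos.  a12 := angle sin negated.  a13 := 500.0.
a21 := angle sin.  a22 := angle cos.          a23 := 600.0.

"One point: two multiply-add rows, answering a Point with Float coordinates."
transform := [:p |
	((a11 * p x) + (a12 * p y) + a13) @ ((a21 * p x) + (a22 * p y) + a23)].
transform value: -10 @ 10.

"Bounds: transform all four corners, then take the enclosing rectangle."
rect := 100 @ 200 extent: 300 @ 400.
corners := {rect topLeft. rect topRight. rect bottomLeft. rect bottomRight}
	collect: [:each | transform value: each].
origin := corners inject: corners first into: [:acc :each | acc min: each].
corner := corners inject: corners first into: [:acc :each | acc max: each].
origin corner: corner.
```

The bounds case does four point transforms plus the min/max pass and allocates several intermediate Points, which may be part of why the plugin still wins there.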
Below are some of the measurements I have taken in a 64-bit trunk image
using squeak.cog.spur_linux64x64:
"--------------------------------"
" localPointToGlobal: "
"--------------------------------"
[mat2x3Old localPointToGlobal: -10 @ 10] bench.
" '11,700,000 per second. 85.6 nanoseconds per run. 1.43971 % GC time.'"
[mat2x3Old transformPoint: -10 @ 10] bench.
" '2,330,000 per second. 429 nanoseconds per run. 0.37985 % GC time.'"
[mat2x3New localPointToGlobal: -10 @ 10] bench.
" '12,500,000 per second. 80.3 nanoseconds per run. 1.89962 % GC time.'"
[morphic localPointToGlobal: -10 @ 10] bench.
" '2,710,000 per second. 370 nanoseconds per run. 1.16 % GC time.'"
"--------------------------------"
" localBoundsToGlobal: "
"--------------------------------"
[mat2x3Old localBoundsToGlobal: rect] bench.
" '6,770,000 per second. 148 nanoseconds per run. 1.67966 % GC time.'"
[mat2x3New localBoundsToGlobal: rect] bench.
" '2,090,000 per second. 480 nanoseconds per run. 1.55969 % GC time.'"
[morphic localBoundsToGlobal: rect] bench.
" '505,000 per second. 1.98 microseconds per run. 1.95922 % GC time.'"
"--------------------------------"
" localBoundsToGlobal: (pure translation) "
"--------------------------------"
[mat2x3OldTranslation localBoundsToGlobal: rect] bench.
" '6,780,000 per second. 147 nanoseconds per run. 1.7 % GC time.'"
[mat2x3NewTranslation localBoundsToGlobal: rect] bench.
" '5,860,000 per second. 171 nanoseconds per run. 1.48 % GC time.'"
[morphicTranslation localBoundsToGlobal: rect] bench.
" '1,580,000 per second. 631 nanoseconds per run. 4.12 % GC time.'"
"--------------------------------"
"composedWithLocal:"
"--------------------------------"
[mat2x3Old composedWithLocal: mat2x3OldRotation] bench.
" '9,670,000 per second. 103 nanoseconds per run. 1.19976 % GC time.'"
[mat2x3New composedWithLocal: mat2x3NewRotation] bench.
" '6,920,000 per second. 144 nanoseconds per run. 1.4997 % GC time.'"
[morphic composedWithLocal: morphicRotation] bench.
" '11,100,000 per second. 89.8 nanoseconds per run. 1.09978 % GC time.'"
"--------------------------------"
" instance creation "
"--------------------------------"
[MatrixTransform2x3 withOffset: offset] bench.
" '3,320,000 per second. 301 nanoseconds per run. 1.91962 % GC time.'"
[Matrix2x3 withOffset: offset] bench.
" '24,800,000 per second. 40.3 nanoseconds per run. 10.63787 % GC time.'"
[MorphicTransform offset: offset] bench.
" '43,600,000 per second. 22.9 nanoseconds per run. 7.63847 % GC time.'"
There are quite a few more benchmarks in the attached file.
I have also attached a change set of the implementation I used, so you
can try it out for yourselves if you'd like :)
Cheers
Stephan
On 28.07.20 19:33, Vanessa Freudenberg wrote:
> On Tue, Jul 28, 2020 at 4:34 AM Stephan Lutz <dev at stlutz.net> wrote:
>
> While transforming points using MatrixTransform2x3 we noticed some
> strange rounding behavior:
>
> "with plugin"
> (MatrixTransform2x3 withOffset: 5 @ 10) localPointToGlobal: 0@0. "5@10"
> (MatrixTransform2x3 withOffset: -5 @ -10) localPointToGlobal: 0@0. "-4@ -9"
>
> "without plugin"
> ((MatrixTransform2x3 withOffset: 5 @ 10) transformPoint: 0@0) rounded. "5@10"
> ((MatrixTransform2x3 withOffset: -5 @ -10) transformPoint: 0@0) rounded. "-5@ -10"
>
> It appears the code used to round in the plugin simply adds 0.5
> and truncates the result, which does not work correctly for
> negative numbers.
> This code can be found in Matrix2x3Plugin >>
> #roundAndStoreResultPoint: and Matrix2x3Plugin >>
> #roundAndStoreResultRect:x0:y0:x1:y1: .
>
> ----
>
> On a kind of related note: Is there even a reason to round the
> resulting floats?
>
> While the class comment of MatrixTransform2x3 notes that this
> behavior is intentional, glancing quickly over its uses we could
> not find anything taking advantage or benefiting from it. It's
> also not a limitation of the DisplayTransform interface, since
> MorphicTransform does produce floating point values. Wouldn't it
> be much more versatile and easier to leave rounding to users if
> they actually need it?
>
> No. Having a float result means that the primitive would need to
> allocate two Float objects. Any allocation can fail due to memory
> exhaustion. So the primitive would have to be made to retry the
> allocation after running a garbage collection.
>
> Secondly, its results are primarily used to set up a WarpBlt IIRC, for
> drawing rotated user objects in Etoys. WarpBlt fails if the coords are
> not integers. The failure code rounds the numbers and retries. Doing
> the rounding in the matrix primitives ensured a fast path to rendering
> - that's why it was done that way.
>
> So, there are very good reasons why the plugin returns integers. And
> there are Squeak VMs where this still is a very reasonable behavior.
> It also would be a good idea to document the reasoning in the class
> comment of MatrixTransform2x3.
>
> That being said, there is virtually no reason to use it when running
> on Cog, much less Sista, especially on 64 bits where we have immediate
> floats. An interesting thing would be to compare a pure Smalltalk
> implementation to the performance of the plugin. If you need floating
> point transform results, just write it in Smalltalk, would be my
> suggestion.
>
> - Vanessa -
>
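The rounding issue quoted above (add 0.5, then truncate) can be reproduced in a workspace without the plugin. This is only a minimal sketch of the arithmetic, not the plugin source, and using floor is just one possible fix:

```smalltalk
"Workspace sketch of the arithmetic only; the plugin code itself is not shown here."
| x |
x := -5.0.
(x + 0.5) truncated.    "-4, because truncation moves toward zero"
(x + 0.5) floor.        "-5, floor moves toward negative infinity"
x rounded.              "-5"
(5.0 + 0.5) truncated.  "5, the positive case happens to work"
```

The same arithmetic on the y coordinate (-10.0 + 0.5, truncated) gives -9, which matches the "-4@ -9" result reported with the plugin.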
-------------- next part --------------
rect := 100@200 extent: 300@400.
offset := 500@600.
angle := 42.
mat2x3Old := (MatrixTransform2x3 withRotation: 70) offset: offset.
mat2x3New := (Matrix2x3 withRotation: 70) setOffset: offset.
morphic := (MorphicTransform offset: offset negated angle: 70 scale: 1).
mat2x3OldTranslation := MatrixTransform2x3 withOffset: offset.
mat2x3NewTranslation := Matrix2x3 withOffset: offset.
morphicTranslation := MorphicTransform offset: offset negated.
mat2x3OldRotation := MatrixTransform2x3 withRotation: angle.
mat2x3NewRotation := Matrix2x3 withRotation: angle.
morphicRotation := MorphicTransform offset: 0@0 angle: angle negated degreesToRadians scale: 1.0.
mat2x3OldIdentity := MatrixTransform2x3 identity.
mat2x3NewIdentity := Matrix2x3 identity.
morphicIdentity := MorphicTransform identity.
"--------------------------------"
" localPointToGlobal: "
"--------------------------------"
[mat2x3Old localPointToGlobal: -10 @ 10] bench.
" '11,700,000 per second. 85.6 nanoseconds per run. 1.43971 % GC time.'"
[mat2x3Old transformPoint: -10 @ 10] bench.
" '2,330,000 per second. 429 nanoseconds per run. 0.37985 % GC time.'"
[mat2x3New localPointToGlobal: -10 @ 10] bench.
" '12,500,000 per second. 80.3 nanoseconds per run. 1.89962 % GC time.'"
[morphic localPointToGlobal: -10 @ 10] bench.
" '2,710,000 per second. 370 nanoseconds per run. 1.16 % GC time.'"
"--------------------------------"
" localBoundsToGlobal: "
"--------------------------------"
[mat2x3Old localBoundsToGlobal: rect] bench.
" '6,770,000 per second. 148 nanoseconds per run. 1.67966 % GC time.'"
[mat2x3New localBoundsToGlobal: rect] bench.
" '2,090,000 per second. 480 nanoseconds per run. 1.55969 % GC time.'"
[morphic localBoundsToGlobal: rect] bench.
" '505,000 per second. 1.98 microseconds per run. 1.95922 % GC time.'"
"--------------------------------"
" localBoundsToGlobal: (pure translation) "
"--------------------------------"
[mat2x3OldTranslation localBoundsToGlobal: rect] bench.
" '6,780,000 per second. 147 nanoseconds per run. 1.7 % GC time.'"
[mat2x3NewTranslation localBoundsToGlobal: rect] bench.
" '5,860,000 per second. 171 nanoseconds per run. 1.48 % GC time.'"
[morphicTranslation localBoundsToGlobal: rect] bench.
" '1,580,000 per second. 631 nanoseconds per run. 4.12 % GC time.'"
"--------------------------------"
" globalPointToLocal: "
"--------------------------------"
[mat2x3Old globalPointToLocal: -10 @ 10] bench.
" '11,300,000 per second. 88.3 nanoseconds per run. 1.79964 % GC time.'"
[mat2x3Old invertPoint: -10 @ 10] bench.
" '2,470,000 per second. 405 nanoseconds per run. 0.45991 % GC time.'"
[mat2x3New globalPointToLocal: -10 @ 10] bench.
" '8,950,000 per second. 112 nanoseconds per run. 1.5197 % GC time.'"
[morphic globalPointToLocal: -10 @ 10] bench.
" '2,720,000 per second. 368 nanoseconds per run. 1.19976 % GC time.'"
"--------------------------------"
"globalBoundsToLocal:"
"--------------------------------"
[mat2x3Old globalBoundsToLocal: rect] bench.
" '6,210,000 per second. 161 nanoseconds per run. 1.69966 % GC time.'"
[mat2x3New globalBoundsToLocal: rect] bench.
" '1,720,000 per second. 581 nanoseconds per run. 1.41972 % GC time.'"
[morphic globalBoundsToLocal: rect] bench.
" '465,000 per second. 2.15 microseconds per run. 2.08 % GC time.'"
"--------------------------------"
"globalBoundsToLocal: (pure translation)"
"--------------------------------"
[mat2x3OldTranslation globalBoundsToLocal: rect] bench.
" '6,280,000 per second. 159 nanoseconds per run. 1.73965 % GC time.'"
[mat2x3NewTranslation globalBoundsToLocal: rect] bench.
" '4,060,000 per second. 246 nanoseconds per run. 1.13977 % GC time.'"
[morphicTranslation globalBoundsToLocal: rect] bench.
" '1,500,000 per second. 668 nanoseconds per run. 4.36 % GC time.'"
"--------------------------------"
"composedWithLocal:"
"--------------------------------"
[mat2x3Old composedWithLocal: mat2x3OldRotation] bench.
" '9,670,000 per second. 103 nanoseconds per run. 1.19976 % GC time.'"
[mat2x3New composedWithLocal: mat2x3NewRotation] bench.
" '6,920,000 per second. 144 nanoseconds per run. 1.4997 % GC time.'"
[morphic composedWithLocal: morphicRotation] bench.
" '11,100,000 per second. 89.8 nanoseconds per run. 1.09978 % GC time.'"
"--------------------------------"
" inverseTransformation "
"--------------------------------"
[mat2x3Old inverseTransformation] bench.
" '584,000 per second. 1.71 microseconds per run. 0.55989 % GC time.'"
[mat2x3New inverseTransformation] bench.
" '2,450,000 per second. 408 nanoseconds per run. 2.27954 % GC time.'"
[morphic inverseTransformation] bench.
" '1,230,000 per second. 813 nanoseconds per run. 1.19976 % GC time.'"
"--------------------------------"
" = "
"--------------------------------"
mx := mat2x3Old copy.
[mat2x3Old = mx] bench.
" '16,900,000 per second. 59.1 nanoseconds per run. 0 % GC time.'"
mx := mat2x3New copy.
[mat2x3New = mx] bench.
" '17,500,000 per second. 57.1 nanoseconds per run. 0 % GC time.'"
mx := morphic copy.
[morphic = mx] bench. "identity check"
" '97,600,000 per second. 10.3 nanoseconds per run. 0 % GC time.'"
"--------------------------------"
" instance creation "
"--------------------------------"
[MatrixTransform2x3 withOffset: offset] bench.
" '3,320,000 per second. 301 nanoseconds per run. 1.91962 % GC time.'"
[Matrix2x3 withOffset: offset] bench.
" '24,800,000 per second. 40.3 nanoseconds per run. 10.63787 % GC time.'"
[MorphicTransform offset: offset] bench.
" '43,600,000 per second. 22.9 nanoseconds per run. 7.63847 % GC time.'"
"--------------------------------"
" isPureTranslation "
"--------------------------------"
[mat2x3Old isPureTranslation] bench.
" '22,000,000 per second. 45.4 nanoseconds per run. 0 % GC time.'"
[mat2x3New isPureTranslation] bench.
" '77,000,000 per second. 13 nanoseconds per run. 0 % GC time.'"
[morphic isPureTranslation] bench.
" '76,600,000 per second. 13.1 nanoseconds per run. 0 % GC time.'"
"--------------------------------"
" isPureTranslation (pure translation)"
"--------------------------------"
[mat2x3OldTranslation isPureTranslation] bench.
" '21,000,000 per second. 47.7 nanoseconds per run. 0 % GC time.'"
[mat2x3NewTranslation isPureTranslation] bench.
" '38,700,000 per second. 25.8 nanoseconds per run. 0 % GC time.'"
[morphicTranslation isPureTranslation] bench.
" '60,400,000 per second. 16.6 nanoseconds per run. 0 % GC time.'"
"--------------------------------"
" isIdentity "
"--------------------------------"
[mat2x3Old isIdentity] bench.
" '23,700,000 per second. 42.1 nanoseconds per run. 0 % GC time.'"
[mat2x3New isIdentity] bench.
" '76,500,000 per second. 13.1 nanoseconds per run. 0 % GC time.'"
[morphic isIdentity] bench.
" '58,400,000 per second. 17.1 nanoseconds per run. 0 % GC time.'"
"--------------------------------"
" isIdentity (identity) "
"--------------------------------"
[mat2x3OldIdentity isIdentity] bench.
" '22,100,000 per second. 45.1 nanoseconds per run. 0 % GC time.'"
[mat2x3NewIdentity isIdentity] bench.
" '28,000,000 per second. 35.8 nanoseconds per run. 0 % GC time.'"
[morphicIdentity isIdentity] bench.
" '19,500,000 per second. 51.2 nanoseconds per run. 0.89982 % GC time.'"
"--------------------------------"
"--------------------------------"
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Matrix2x3.1.cs
Type: text/x-csharp
Size: 8534 bytes
Desc: not available
URL: <http://lists.squeakfoundation.org/pipermail/squeak-dev/attachments/20200812/4b5ea283/attachment-0001.bin>