[squeak-dev] Re: [ANN] Number comparison, hash, NaN, Point, and other partially ordered sets
Tim Olson
tim_olson at att.net
Fri Jan 9 15:10:41 UTC 2009
On Jan 9, 2009, at 2:29 AM, Hans-Martin Mosner wrote:
> First of all, does it matter? If I understand correctly, this behavior
> is only present for denormalized numbers.
If you set the internal rounding-precision mode in the x86 control
register to double precision, then yes, the double-rounding issue
goes away for most computations. It remains only when generating
denormal results: the exponent field still has the extended-precision
range during the computation, so the denormalized result is produced
only during the conversion back to double-precision format, and that
conversion rounds a second time.
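The hazard can be seen in miniature with plain integer arithmetic. The sketch below is a toy model (not the FPU itself): 4 and 7 significant bits stand in for double and extended precision, and a small helper applies round-to-nearest with ties-to-even. Rounding the exact significand once directly and once through the wider format gives two different answers:

```python
def round_sig(v: int, p: int) -> int:
    """Round integer significand v to p significant bits,
    round-to-nearest, ties-to-even."""
    shift = v.bit_length() - p
    if shift <= 0:
        return v
    head, tail = v >> shift, v & ((1 << shift) - 1)
    half = 1 << (shift - 1)
    if tail > half or (tail == half and head & 1):
        head += 1
        if head.bit_length() > p:   # carry out of the top bit
            head >>= 1
    return head

# Exact result: just above the halfway point for 4-bit rounding.
v = 0b1000100000000001

direct  = round_sig(v, 4)                # one rounding
via_ext = round_sig(round_sig(v, 7), 4)  # wide format first, then narrow

print(bin(direct), bin(via_ext))   # 0b1001 0b1000
```

The first rounding to 7 bits discards the distant 1 bit, leaving an exact tie at the 4-bit position, and ties-to-even then picks the wrong neighbor.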
> Do these appear in real-world
> cases?
That's hard to say. It might be interesting to instrument the VM to
check for denormal operands or results on the float operations to get a
feel for how often (if ever) they are occurring.
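As a sketch of what such instrumentation would test for: a nonzero double is denormal exactly when its magnitude is below the smallest positive normal double, 2^-1022. In Python that check looks like this (the VM itself would do the analogous test in C on the exponent field):

```python
import sys

def is_subnormal(x: float) -> bool:
    # Nonzero and below the smallest positive normal double
    # (sys.float_info.min == 2.0 ** -1022).
    return x != 0.0 and abs(x) < sys.float_info.min

print(is_subnormal(5e-324))        # True: smallest positive subnormal
print(is_subnormal(2.0 ** -1022))  # False: smallest normal
```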
> I've tried to analyze the case in question and came to the following
> results:
> The exact mantissa after multiplication is 3B16EF930A76E.80002C69F96C2
> (the hex digits after the point are those that should be rounded off
> when going to a 52-bit mantissa). The result with "A76F" as the last
> hex
> digits would therefore be the correct value for an IEEE-754 double
> precision multiplication (rounding to the nearest representable
> number),
> so the PPC implementation does it right.
> When doing an extended double precision multiplication, there are some
> more bits in the mantissa, and the mantissa of the intermediate result
> looks like ...A76E.800 which is exactly in the middle between two
> representable numbers. Converting to double precision involves rounding
> the mantissa again, and the rounding rule for this case (exact middle)
> says to round to the nearest even number, which is ...A76E.
That's sort of what is going on, but it is complicated here due to the
way denorms are handled. What is actually happening is:
1st operand (binary):
1.0011001000001000101000100101111000000100111010000111
(note "hidden" 1 bit added to the left of the radix point because it
is a normalized value)
2nd operand (binary):
0.0011000101101101110100011101000000101101000110101110
(note no "hidden" 1 bit because it is a denormalized value)
Before multiplication, the 2nd operand is renormalized, adjusting the
exponent accordingly:
1.1000101101101110100011101000000101101000110101110000
The exact product is:
1.1101100010110111011111001001100001010011101101110100,0000000000000001...
The comma (,) is at the double-precision rounding position.
PPC operation:
The product exponent is smaller than can be represented in a normalized
double-precision format, so the result is first denormalized:
0.0011101100010110111011111001001100001010011101101110,1000000000000000001...
Then round-to-nearest rounding adds an ULP because the bits to the
right of the rounding position are greater than halfway:
0.0011101100010110111011111001001100001010011101101111
x86 (extended with double-rounding mode):
Because the exponent field is still sized for 80-bit extended floats,
the exact product is still representable as a normalized number:
1.1101100010110111011111001001100001010011101101110100,0000000000000001...
Then round-to-nearest drops the bits to the right of the
double-precision rounding point without adding an ULP, because the bits
to the right are less than halfway:
1.1101100010110111011111001001100001010011101101110100
Then the result is converted to a double-precision representation when
storing to memory, which causes it to become denormalized:
0.0011101100010110111011111001001100001010011101101110,100
Round-to-nearest rounding does not add an ULP because the bits to the
right of the rounding position are exactly halfway, and in that case
the nearest even result is selected:
0.0011101100010110111011111001001100001010011101101110
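The whole divergence can be replayed in exact integer arithmetic. The Python sketch below (the round_sig helper is my own, a plain ties-to-even rounder) takes the two operand significands from above, forms the exact product, and rounds it once (the PPC path, straight to the 50 significant bits the denormal result can hold) versus twice (the x86 path: to 53 bits first, then to 50 on the store). The two results differ by one ULP, matching the A76F/A76E pair in Hans-Martin's analysis:

```python
def round_sig(v: int, p: int) -> int:
    """Round integer significand v to p significant bits,
    round-to-nearest, ties-to-even."""
    shift = v.bit_length() - p
    if shift <= 0:
        return v
    head, tail = v >> shift, v & ((1 << shift) - 1)
    half = 1 << (shift - 1)
    if tail > half or (tail == half and head & 1):
        head += 1
        if head.bit_length() > p:   # carry out of the top bit
            head >>= 1
    return head

# The two operand significands, binary digits as listed above.
a = int("10011001000001000101000100101111000000100111010000111", 2)
b = int("0011000101101101110100011101000000101101000110101110", 2)

# Exact product; its hex mantissa is 3B16EF930A76E.80002C69F96C2...
product = a * b

# The denormal result has room for only 50 significant bits
# (two leading zeros in its 52-bit fraction field).
ppc = round_sig(product, 50)                 # one rounding, directly
x86 = round_sig(round_sig(product, 53), 50)  # 53 bits first, then 50

print(hex(ppc))  # ends ...a76f  (correctly rounded)
print(hex(x86))  # ends ...a76e  (double-rounded, one ULP low)
```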
> Is it at all possible to get the x86 FPU to produce the correct result
> for this situation?
Not with the extended-precision FPU, short of an extreme performance
loss: you would basically have to take an exception on every imprecise
result (lots of them!) and patch the results up in software. If you
perform the operations in the SSE unit instead, that will work,
provided all the x86 platforms you run on support SSE2 with
double-precision scalars.
> Interestingly, this article
> (http://www.vinc17.org/research/extended.en.html) claims that the FPU
> is
> set up for incorrect rounding under Linux only, but I could not
> reproduce the test case given there with Squeak (which probably means
> that I mistranslated the Java example).
That example shows the effect for normalized intermediate results: the
intermediate result is rounded to extended precision, then rounded a
second time when converted to double-precision format. That case can
be fixed by setting the rounding-precision mode to double in the FPU
control register, but, as shown in the example above, the fix does not
help when the intermediate result is denormalized.
I suspect that your Squeak VM has the rounding-precision mode set to
double, which fixes most of the cases.
-- tim