Float equality? (was: [BUG] Float NaN's)

Bruce O'Neel edoneel at sdf.lonestar.org
Wed Sep 15 10:37:20 UTC 2004

What Every Computer Scientist Should Know About Floating-Point Arithmetic

http://docs.sun.com/source/806-3568/ncg_goldberg.html

On Wed, Sep 15, 2004 at 03:22:25PM +1200, Richard A. O'Keefe wrote:
> "Jarvis, Robert P. (Bob) (Contingent)" <bob.jarvis at timken.com> wrote:
> 	The best practice I'm aware of for handling equality
> 	calculations with Floats is avoid them completely.
>
> I do wish that more people would read "What Every Computer Scientist
> Should Know About Floating-Point Arithmetic".  There may be a mistake
> or two in that title, but you'll certainly find the paper around on
> the Web, and Sun used to make a habit of shipping it with their
> compilers.  This is not directed at Robert Jarvis, but at his audience.
>
> Floating-point equality tests are in fact perfectly well behaved
> (in the absence of NaN, sigh).  More than that, when the operands
> and result are integers in the range -(2**53 - 1) .. +(2**53 - 1)
> held as IEEE 754 double precision numbers, addition, subtraction,
> multiplication, remainder() -- hence also division via
> rint((x - remainder(x,y))/y) -- and comparison are EXACT.
>
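
A quick sketch of the exactness claim, in Python only because it is easy
to run and its float is an IEEE 754 double.  math.remainder is the IEEE
remainder; round() stands in for C's rint(); the name nearest_quotient and
the sample values are mine, not from the thread.

```python
import math

LIMIT = 2.0**53 - 1  # largest magnitude in the range quoted above

def nearest_quotient(x: float, y: float) -> float:
    """Integer quotient of x by y, rounded to nearest, via IEEE remainder."""
    # x - remainder(x, y) is an exact multiple of y, so the division is exact
    # as long as everything stays inside the +/-(2**53 - 1) range.
    return float(round((x - math.remainder(x, y)) / y))

a = 94906265.0                 # a * a is still below 2**53
product = a * a                # exact: matches integer arithmetic
quotient = nearest_quotient(LIMIT, 7.0)
beyond = 2.0**53 + 1.0         # true result is out of range, so it gets rounded
```

Note that the quotient is rounded to nearest, not truncated:
nearest_quotient(18.0, 5.0) is 4.0, because the IEEE remainder of 18 by 5
is -2, not 3.
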
> If you want to test whether a number x is a positive infinity,
> then x == Inf is the best way to do it.
>
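
For what it's worth, the infinity test and the NaN exception look like this
in Python (a sketch; is_positive_infinity is just an illustrative name):

```python
import math

def is_positive_infinity(x: float) -> bool:
    # Plain equality is all that is needed; no tolerance, no special call.
    return x == math.inf

overflowed = 1e308 * 10.0   # overflows a double and yields +infinity
nan = math.nan              # the one value that is not equal to itself
nan_equals_itself = (nan == nan)
```

NaN is the case where equality misbehaves: nan == nan is false, so NaN has
to be detected with math.isnan(x) (or x != x) rather than by comparison.
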
> There are plenty of examples where floating-point equality is exactly
> the right thing to do.  Any blanket ban on floating point equality is
> too strict.
>
> The problem is not equality.  The problem is that floating-point
> arithmetic is BINARY, not decimal, and it's APPROXIMATE, not exact.
> It just plain doesn't do what people expect.  With the possible
> exception of absolute value and unary minus, there is NO floating-point
> operation which does what a naive user would expect.  And while IEEE
> floating-point is bizarre, it isn't outright broken like many of the
> hardware floating-point systems that preceded it.
>
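
The binary-versus-decimal surprise in a few lines of Python (a sketch;
Fraction(0.1) recovers the exact value of the double that the literal 0.1
actually denotes):

```python
from fractions import Fraction

stored = Fraction(0.1)        # exact value of the double nearest to 1/10
one_tenth = Fraction(1, 10)   # the value the user thought they wrote
total = 0.1 + 0.2             # sum of two approximations, then rounded again

sum_matches = (total == 0.3)
abs_is_exact = (abs(-0.1) == 0.1)    # abs only clears the sign bit
neg_is_exact = (-(-0.1) == 0.1)      # unary minus only flips it
```

The stored value is a fraction whose denominator is a power of two, so it
cannot equal 1/10; the comparison itself is fine, the operands are not what
the user had in mind.
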
> What I'd really like to get my hands on is the decimal floating-point
> arithmetic in the revised IEEE standard.  *That's* the arithmetic you
> want for a spreadsheet.  That's the arithmetic you want if people are
> not to be tripped up by base 2 -vs- base 10.  In fact you *can* get
> your hands on a software implementation if you know where to look, but
> wouldn't it be nice to have it going at full hardware speed?
>
> 	You should
> 	establish what you consider to be an acceptable epsilon value
> 	based on your understanding of your data and use it as follows:
>
> It's not just your data you have to understand; more generally it is
> your algorithm.  Anyone who understands them well enough to choose a
> good epsilon already knows how to do the fuzzy comparisons.
>
> 		maxEpsilon := 0.000001.
> 			.
> 			.
> 			.
> 		(f1 - f2) abs < maxEpsilon
> 			ifTrue: ["f1 and f2 are approximately equal"]
> 			ifFalse: ["f1 and f2 are not approximately equal"]
>
> Urk.  Absolute tolerances seldom work very well.
> See Knuth, The Art of Computer Programming, Volume 2, "Seminumerical
> Algorithms" for a thorough discussion of "fuzzy" floating-point comparison.
>
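
A sketch of why a fixed absolute epsilon disappoints, next to a relative
test that scales with the operands.  This is in the spirit of what Knuth
discusses, not his exact definitions; both function names and the
tolerances are made up for illustration.

```python
def abs_equal(a: float, b: float, eps: float = 1e-6) -> bool:
    # The test from the quoted Smalltalk: a fixed absolute tolerance.
    return abs(a - b) < eps

def rel_equal(a: float, b: float, rel: float = 1e-9) -> bool:
    # Tolerance proportional to the larger magnitude of the two operands.
    return abs(a - b) <= rel * max(abs(a), abs(b))

tiny_a, tiny_b = 1e-9, 2e-9          # differ by a factor of two
big_a, big_b = 1e10, 1e10 + 1e-5     # agree to about 15 significant digits
```

With these values abs_equal calls the tiny pair equal and the big pair
different, and rel_equal says the opposite.  A purely relative test has its
own weak spot near zero; Python's math.isclose takes both a rel_tol and an
abs_tol for that reason.
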
> The really nasty thing about fuzzy comparisons is that they aren't
> transitive:  (x fuzzyEquals: y) and: [y fuzzyEquals: z] does NOT
> imply x fuzzyEquals: z.  And yes, I *have* known programs (in APL and
> in IBM Prolog) go wrong because their programmers didn't really understand
> that they were getting fuzzy comparison and/or didn't appreciate the
> consequences.  (What Robert Jarvis is recommending is *explicitly* doing
> fuzzy comparison with a *specifically* chosen *local* tolerance, not
> implicit fuzzy comparison with a *global* tolerance.  So it should be
> less risky.)
>
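
The non-transitivity is easy to reproduce.  A sketch using the same
absolute-tolerance test as the quoted Smalltalk; fuzzy_equals and the three
sample values are hypothetical:

```python
def fuzzy_equals(a: float, b: float, eps: float = 1e-6) -> bool:
    # Same shape as (f1 - f2) abs < maxEpsilon.
    return abs(a - b) < eps

x, y, z = 0.0, 0.6e-6, 1.2e-6

x_y = fuzzy_equals(x, y)   # within tolerance
y_z = fuzzy_equals(y, z)   # within tolerance
x_z = fuzzy_equals(x, z)   # outside tolerance: the chain breaks
```

Anything that assumes equality is an equivalence relation (sorting,
de-duplication, hashing, unification) can go wrong on inputs like these.
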
> 	Do not under any circumstances use floating point numbers in
> 	financial calculations.  Floats are imprecise, often only
> 	approximate, and utterly inappropriate for any calculation where
> 	all the fiddly little decimal places really count.
>
> Except decimal floats.  Addition, subtraction, and multiplication of
> in-range numbers stated in decimal with in-range results are *exact*.
> We *really* want the new IEEE standard, don't we?
>
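
Python's decimal module is one software decimal floating-point arithmetic
you can try today; I am using it here only as a stand-in for the idea, not
as the implementation Richard has in mind.  The amounts are invented.

```python
from decimal import Decimal

# Ten payments of 0.10, first in binary doubles, then in decimal.
dimes_float = sum([0.10] * 10)
dimes_decimal = sum([Decimal("0.10")] * 10)

# In-range decimal operands with an in-range result: the product is exact.
price_total = Decimal("19.99") * 3
```

The binary sum drifts away from 1.0 while the decimal sum is exactly 1.00.
Note that the Decimal values are built from strings; Decimal(0.10) would
faithfully copy the binary approximation instead.
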
> I wish I understood the ANSI Smalltalk 'ScaledDecimal' interface a bit
> better.  I'm not sure I believe all of what I think I do understand.
>
> And of course *some* financial calculations are *supposed* to be
> approximate.  The thing is, as always, 'know what you are doing'.
>
> 	>This does not require the use of a Float.  In Smalltalk I'd use either a
> 	>Fraction or a ScaledDecimal.
>
> Unfortunately, there is no ScaledDecimal class in Squeak.
> I do have an implementation of ScaledDecimal I wrote for another
> Smalltalk, but it would require some work to fit with the traditional
> double dispatch, and I am wary about changing the compiler to recognise
> ScaledDecimals.  Above all, I'm not sure I've interpreted the standard
> correctly.
>
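
The Fraction route mentioned above can be sketched the same way: keep every
intermediate value as an exact rational and round once, at the end.  Python
again, with a made-up price and tax rate.

```python
from fractions import Fraction

price = Fraction(1999, 100)        # 19.99 (hypothetical)
tax_rate = Fraction(825, 10000)    # 8.25 percent (hypothetical)

tax = price * tax_rate             # exactly 1.649175, no rounding yet
tax_in_cents = round(tax * 100)    # the only rounding step

tenths = sum([Fraction(1, 10)] * 10)   # exactly 1, unlike the float sum
```

The cost is that numerators and denominators can grow without bound in a
long calculation, which is part of why a fixed-scale decimal type is
attractive for money.
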

--
edoneel at sdf.lonestar.org
SDF Public Access UNIX System - http://sdf.lonestar.org