ANSI, =, hash, Integer, Float

Wed Dec 18 00:19:37 UTC 2002

Stephan Rudlof <sr at evolgo.de> asked:
	What about
	- defining #= true and #hash equal
	  applied to numerical equal rational numbers
	  (SmallInteger, LargePositiveInteger, LargeNegativeInteger, Fraction)
	and
	- defining #= as undefined (raising an Exception)
	  for comparisons of all these rationals with Floats
	  (also if numerical equal)?

If I understand this proposal, it is
 - = should do what it does now when the receiver and argument are
     both rational numbers or are both floating point numbers;
 - hash should do what it does now for rational numbers and floating point
   numbers;
 - <rational> = <float> and <float> = <rational>
   should both raise an exception whether the receiver and argument
   are numerically equal or not.

That is certainly a self-consistent proposal.
Never mind the details of ANSI, it clearly violates the INTENT of
ANSI Smalltalk.

*BUT* it creates a library that is significantly harder to understand.
In general,
    if x < y is defined,
    then x <= y, x > y, x >= y, x = y, and x ~= y are also defined
    and their values are mutually consistent.
In particular, we expect
    x < y   iff  y > x
    x <= y  iff  y >= x
    x ~= y  iff  (x = y) not
    x <= y  iff  x < y or: [x = y]
    x = y   iff  (x < y or: [y < x]) not     {provided RHS is defined}
(Hope I got those right.)
Any definition of comparison operations which violates these laws is
a definition which will seduce programmers into making mistakes, because
the symbols won't _really_ mean what they _look_ as though they mean.
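
The laws above can be spot-checked in a workspace.  What follows is only
a sketch: the sample pairs are arbitrary, the names pairs and violations
are made up for the illustration, and the brace-array syntax assumes
Squeak.  Each law is written as an equality between two Booleans, so a
pair that breaks any law ends up in violations.

```smalltalk
"Sketch only: spot-check the comparison laws on a few sample pairs.
 The pairs are arbitrary; brace arrays assume Squeak."
| pairs violations |
pairs := { 1 -> 2.  2 -> 1.  3 -> 3.  (1/2) -> (3/4).  1.5 -> 1.5.  1 -> 1.0 }.
"Keep every pair for which at least one law fails."
violations := pairs reject: [:p | | x y |
    x := p key.
    y := p value.
    ((x < y) = (y > x))
        and: [((x <= y) = (y >= x))
        and: [((x ~= y) = (x = y) not)
        and: [((x <= y) = (x < y or: [x = y]))
        and: [(x = y) = (x < y or: [y < x]) not]]]]].
"Empty in an image whose comparisons obey the laws."
violations isEmpty.
```

Under the proposal, the pair 1 -> 1.0 would raise an exception on = and ~=
while <, <=, > and >= still answered, which is exactly the inconsistency
described above.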

Ada consistently forbids _all_ mixed mode comparisons and no confusion
arises.  Pascal consistently allows all mixed mode comparisons and no
confusion arises.  Allowing _some_ mixed mode comparisons while
forbidding others (which could clearly be derived from the allowed ones)
does not provide a design that is easy to grasp or use correctly.

In short, I suggest that 1.0 = 1 should be forbidden ONLY if
1.0 <= 1 (and all the others) are _also_ forbidden.

	This would avoid the trap of making loops with bad end
	conditions like e.g.

	| anInteger |
	anInteger := 0.
	[self doSomething. anInteger := anInteger + 0.1]
	  whileTrue: [anInteger <= 1.0]

I don't understand what this example is supposed to demonstrate.
Calling something anInteger when its value is almost always a Float
is certainly a bad idea.
Writing a block whose value is always a Float in a context
where a block that always returns a Boolean is required
is also a bad idea.

I fail to see how an example that uses <= tells us anything about
what we should do to ensure that #hash and #= are mutually consistent.
I especially fail to see how an example which always compares a
Float with another Float tells us anything about mixed mode comparison.
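
For reference, here is a guess at what the quoted loop was meant to do,
with the Boolean test moved to the end of the receiver block.  This is a
sketch under assumptions: x and count are placeholder names (count stands
in for self doSomething), and the unary whileTrue is assumed to be
available, as it is in Squeak.

```smalltalk
"Sketch only: one reading of the quoted loop.
 The body runs first; the last expression of the block is the Boolean test."
| x count |
x := 0.
count := 0.
[count := count + 1.      "stands in for: self doSomething"
 x := x + 0.1.            "x is a Float from here on"
 x <= 1.0] whileTrue.     "Float compared with Float, never mixed mode"
count.
```

Whatever the loop was intended to show about accumulated rounding in x,
every comparison it performs has a Float on both sides.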

What's more, undefine <= and people will still be able to write
    0 to: 1.0 by: 0.1 do: [:anInteger | self doSomething]

	Would such a change break something?

Yes.

I am extremely puzzled here.

I pointed out a problem in Squeak's #hash implementations,
and people suddenly pounce like starving wolves on a lamb
on #= as if there was a problem with #=.

From a _user's_ perspective, The Simplest Thing That Could Possibly Work
is simply to fix #hash.  I not only know that it can be done, I know how
to do it.  With the proviso that the code is for specification purposes
only (I do NOT propose actually allocating a fresh Float each time):

    LargePositiveInteger>>hash   ^self asFloat hash
    LargeNegativeInteger>>hash   ^self asFloat hash
    Fraction>>hash               ^self asFloat hash
    Float>>hash
      ^((self between: SmallInteger minVal and: SmallInteger maxVal)
        and: [self fractionPart = 0.0])
        ifTrue: [self truncated hash]
        ifFalse: ["the current basicAt: bitAnd: ... stuff goes here"]

This ensures consistency by
 - forcing Float to use SmallInteger hash whenever the float
   could be equal to a SmallInteger
 - forcing rationals other than SmallInteger to use Float hash

The change would be hidden entirely inside the Number classes.
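
The property that matters to users can be spot-checked from a workspace.
The following is a sketch only: the sample values are arbitrary, the names
samples and offenders are made up for the illustration, and the
brace-array syntax assumes Squeak.  It collects every pair whose members
compare equal but hash differently.

```smalltalk
"Sketch only: look for numerically equal numbers with different hashes.
 The sample values are arbitrary; brace arrays assume Squeak."
| samples offenders |
samples := { 1 -> 1.0.
             (1/2) -> 0.5.
             (2 raisedTo: 64) -> (2 raisedTo: 64) asFloat }.
"Keep each pair that is = but whose hashes disagree."
offenders := samples select: [:p |
    p key = p value and: [p key hash ~= p value hash]].
offenders.
```

With #hash defined as specified above, offenders should come out empty for
these samples.  What it contains in any particular existing image depends
on that image's #hash methods.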