At 04:26 PM 11/29/99 -0500, David N. Smith wrote:
At 17:33 -0600 11/28/99, R. A. Harmon wrote:
I found the following:
2 = 2.0 -> true
2 hash = 2.0 hash -> false
I think this is a bug. Is it?
Good question. The blue book says:
[snip]
(I cannot find any reference in the blue book to another hashing rule: The hash value of an object must be constant over time. There
The ANSI doc in <Object> hash says:
"The hash value of an object need not be temporally invariant. Two independent invocations of #hash with the same receiver may not always yield the same results. Note that collections that use #= to discriminate."
So, by the Blue Book definition, what you observe is a bug.
[snip]
Should it be? Does it really mean that two instances of wildly differing classes must answer equal hashes when they compare equal?
[snip]
All of the discussion in the Blue Book about hashing is in the context of looking up values in a hashed collection. Did they intend that someone might put a float 2.0 into a hashed collection and then look it up with an integer 2? Do people do it?
I was also concerned with the ANSI definition, as I'm trying to produce all the ANSI-conforming messages for Squeak, and I thought there might be an unpleasant surprise in the collections you mentioned. I would think there are good reasons why folks might do it.
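The lookup in question is easy to try in a workspace. This is only a sketch: the variable name is mine, and I make no claim about the answer, since it depends on how a given dialect implements #hash for Integer and Float.

```smalltalk
"Put a Float into a hashed collection, then look it up with an Integer.
 Evaluate with 'print it'. The answer is dialect-dependent: where
 2 hash and 2.0 hash differ, the lookup can miss even though 2 = 2.0."
| set |
set := Set new.
set add: 2.0.
set includes: 2
```

If the two hashes differ, the Set generally probes a different slot for 2 than the one where 2.0 was stored, so #= never gets a chance to compare them.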
Maybe there is some other reason for this rule, but I think it might just be incompletely stated. Consider these cases:
[snip]
So, maybe the rule should state something about closely related objects answering the same hash, but I don't see a good and simple way to say it.
I like the ANSI standard way as it relieves me of having to be creative, and I get to rely on the committee's greater experience.
I also wondered if there was a specific reason Squeak does it this way, and I thought I'd leverage off this list's greater experience.
Are subclasses of Number closely related enough that they should follow the rule? Most Smalltalk systems do what you expect.
                       Squeak   Dolphin   VWNC
2 = 2.0                true     true      true
2 hash = 2.0 hash      false    true      true
How about VAST?
[Good analysis snip]
There are similar problems with scaled decimal values.
I can't recall how I handled this in my Scaled Decimal implementation for Squeak.
                           Squeak   Dolphin   VWNC
2 = 2.0s2                  ?        true      true
2 hash = 2.0s2 hash        ?        false     true
2.0 = 2.0s2                ?        true      true
2.0 hash = 2.0s2 hash      ?        false     true
This may not be Dolphin's fault, as I contributed my Scaled Decimal implementation, which they then modified (I haven't yet checked how much changed). It may be that my error slipped by them.
How about VAST?
So, should the rule be followed even if it's more expensive to do so?
[snip]
In my view, and assuming I'm not missing something obvious, the answer is no.
I'd argue that having (2.0 = 2) answer false is better than forcing the hashes to be equal. Besides, comparing floats is a sin one should not encourage. I'd rather see #= answer false and that some other method be used for 'has equivalent value'.
That is the wisdom I've always heard, and my first impulse was that (2.0 = 2) should answer false.
As a general rule, I use the ANSI standard messages, and where the standard doesn't specify, I defer to other dialects where they agree; otherwise I have to think (not a happy occurrence). I use my general rule where possible so my programs are at least nominally portable.
After checking on what Dolphin and VWNC do, I would vote for (2.0 = 2) answering true and (2 hash = 2.0 hash) answering true in Squeak. I haven't dug into how this is achieved.
At 05:41 PM 11/29/99 -0500, agree@carltonfields.com wrote:
This is an excellent discussion. It suggests to me that the Blue Book notions of "="/hash and "==" do not properly contemplate the more general notion of equality used in computing 2 = 2.0 under the present system. The latter computation is much more than the notion of "represents the same number," which is the Smalltalk/Blue Book notion of "="; it is rather the more general notion of "after coercing both arguments to the same level of generality in a specified hierarchy, represents the same number."
[Good analysis snip]
The ANSI doc in <Object> = says:
"The meaning of "equivalent" cannot be precisely defined but the intent is that two objects are considered equivalent if they can be used interchangeably. Conforming protocols may choose to more precisely define the meaning of "equivalent".
The value of (receiver = comparand) is true if and only if the value of (comparand = receiver) would also be true. If the value of (receiver = comparand) is true then the receiver and comparand must have equivalent hash values. Or more formally:
receiver = comparand => receiver hash = comparand hash
The equivalence of objects need not be temporally invariant. Two independent invocations of #= with the same receiver and operand objects may not always yield the same results. Note that a collection that uses #= to discriminate objects may only reliably store objects whose hash values do not change while the objects are contained in the collection."
I find this a bit more slippery than I can get my mind around. First I think 2 and 2.0 are interchangeable and then I think they are not. Arrrrrrgh!
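The implication quoted above (receiver = comparand => receiver hash = comparand hash) can at least be checked mechanically for a few number pairs. A minimal workspace sketch follows. The pairs and variable names are my own picks for illustration, and what it reports will differ between dialects.

```smalltalk
"Report each pair that compares #= but answers unequal hashes,
 i.e. each pair that violates the ANSI implication."
| pairs a b |
pairs := Array
    with: (Array with: 2 with: 2.0)
    with: (Array with: (1/2) with: 0.5).
pairs do: [:pair |
    a := pair first.
    b := pair last.
    (a = b and: [a hash ~= b hash]) ifTrue: [
        Transcript
            show: a printString, ' = ', b printString, ' but the hashes differ';
            cr]]
```

Nothing on the Transcript means the rule holds for those pairs in that image. It says nothing about scaled decimals, which would need their own pairs.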
Thanks to all others for good points.
--
Richard A. Harmon          "The only good zombie is a dead zombie"
harmonra@webname.com           E. G. McCarthy
Spencer, Iowa