[squeak-dev] #= ==> #hash issues

Eliot Miranda eliot.miranda at gmail.com
Thu Nov 15 15:41:28 UTC 2018


> On Nov 15, 2018, at 5:44 AM, Tim Olson <tim.olson.mail at gmail.com> wrote:
>
> Interval >> size does the correct thing with the stop value, so maybe Interval >> = could use:
>
>    isInterval and:
>        [start = anInterval start and:
>        [step = anInterval step and:
>        [self size = anInterval size]]]

The current implementation is correct; it is effectively the same as
yours but has some obvious optimizations.  Two intervals are equal if
they have the same sequence of elements, no matter how they are
written.  Here it is:

= anObject

    ^ self == anObject
        ifTrue: [true]
        ifFalse: [anObject isInterval
            ifTrue: [start = anObject first
                    and: [step = anObject increment
                    and: [self last = anObject last]]]
            ifFalse: [super = anObject]]

which I would have written as

= anObject
    ^self == anObject
        or: [anObject isInterval
            ifFalse: [super = anObject]
            ifTrue:
                [start = anObject first
                    and: [step = anObject increment
                    and: [self last = anObject last]]]]

The issue is with hash, which accesses stop directly instead of last.
If hash read

hash
    "Hash is reimplemented because = is implemented."

    ^(((start hash bitShift: 2)
        bitOr: self last hash)
        bitShift: 1)
        bitOr: self size

(i.e. "bitOr: stop hash)" => "bitOr: self last hash)")
then things will be fine.  And the hash of most common intervals will
not change, since their stop is already their last element.
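
As a minimal workspace sketch of why reading last instead of stop is
enough (assuming Squeak's Interval>>last and Interval>>size, which both
normalize the stop value; the temporaries a and b are only for
illustration):

    | a b |
    a := 0 to: 1.
    b := 0 to: 5/3.
    a = b.              "true: same start, step and last element"
    a last = b last.    "true: both sequences end at 1"
    a size = b size.    "true: both have two elements"

The revised hash is computed only from start, last and size, so two
intervals that compare equal through the isInterval branch of = feed
identical inputs into it and should therefore hash alike.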



>
>    — tim
>
>> On Nov 2, 2018, at 12:53 PM, Louis LaBrunda <Lou at Keystone-Software.com> wrote:
>>
>> Hi Chris,
>>
>> I know about and understand everything you have said about the spec and the way #= and #hash should work and relate to
>> each other.  My problem with VA Smalltalk's implementation of #= is that it doesn't consider what an interval is and how
>> it is used and therefore what equals should mean.  I would interpret two intervals being equal if they span the same
>> range and not worry about how they got there.  In the case where the increment (by) is an integer the start and end
>> values map down to integers and if those integers are the same in two intervals then the intervals span the same range
>> and should be considered equal.  Any program using those intervals would expect them to work the same.  In VA Smalltalk
>> they would work the same but you can't tell that with #=.
>>
>> I have no problem with VA's hash based collections, they work the way they should given the way #hash and #= work.  But
>> from a higher level, if I put a bunch of intervals in a Set and only wanted one entry for each range spanned, I would be
>> out of luck and probably confused as to why.  Sure, this is Smalltalk and there are ways around this, if you know you
>> need to work around it.  One could always add a method to intervals to "fix" the start and end values if the increment
>> is an integer.
>>
>> I have heard from Instantiations and they plan to leave well enough alone at this point, since the #= method has been
>> this way since 1996.  They are concerned that changing #= may break existing user code.  I doubt it but I understand the
>> concern.
>>
>> Lou
>>
>>
>>> On Fri, 2 Nov 2018 10:01:38 -0700, Chris Cunningham <cunningham.cb at gmail.com> wrote:
>>>
>>> All of that said, I too find the VA troubling a bit in this case.  I rely
>>> on this (0 to: 1) = (0 to: 5/3) being true.  VA not supporting this limits
>>> cross-dialect portability, although I don't (personally) use VA, other
>>> folks at work do and we do occasionally share code.
>>>
>>> However, this implementation is internally consistent and obeys the = and
>>> hash rule in this case.  It's just not what I would want.
>>>
>>> -cbc
>>>
>>> On Fri, Nov 2, 2018 at 9:52 AM Chris Cunningham <cunningham.cb at gmail.com>
>>> wrote:
>>>
>>>> Hi Louis,
>>>> On Fri, Nov 2, 2018 at 9:12 AM Louis LaBrunda <Lou at keystone-software.com>
>>>> wrote:
>>>>
>>>>> Hi Chris,
>>>>>
>>>>> On Fri, 2 Nov 2018 07:44:00 -0700, Chris Cunningham <
>>>>> cunningham.cb at gmail.com> wrote:
>>>>>
>>>>>> ParcPlace-Digitalk VSE 3.1 (roughly 1999):
>>>>>>
>>>>>> (0 to: 1) = (0 to: 5/3). "true"
>>>>>> (0 to: 1) hash = (0 to: 5/3) hash. "true"
>>>>>>
>>>>>> So, ancient VSE and current VisualWorks are consistent, and agree on where
>>>>>> they want to be.  This is also the direction I want to take Squeak.
>>>>>> VA is also consistent, but #= doesn't match any other Smalltalk variant
>>>>>> that we've looked at.
>>>>>> Squeak, Pharo, Dolphin all currently have the same answer, but are not
>>>>>> consistent.
>>>>>
>>>>>> Interesting indeed.
>>>>>
>>>>> I have been talking to the VA Smalltalk guys about this and they are
>>>>> thinking about it but haven't decided what to do yet.  It turns out that,
>>>>> given the way collections (like Set) that use #hash work in VA Smalltalk,
>>>>> the failing #= test for intervals that cover the same range and have the
>>>>> same hash overrides the equal hash value, and the interval is added to
>>>>> the collection.  I find this troubling.
>>>>>
>>>>> Lou
>>>>>
>>>>
>>>> The rules for = and hash are that if two objects are #=, then their hash
>>>> values have to be equal as well.
>>>>
>>>> There is no statement about what it means for equality if two objects'
>>>> hashes are the same.  This, I believe, is intentional.
>>>>
>>>> The collection objects in (most?all?) Smalltalks behave similarly to VA's:
>>>> if objects have the same hash but are not equal, then they will both be
>>>> in the hashed collection (such as Set).  The Squeak implementation is
>>>> described in Set>>scanFor: .  This method also shows why having objects
>>>> equal but their hash not equal is so dangerous - if you had two objects
>>>> that are supposed to be one and the same and are in fact #= but don't have
>>>> the same hash, they can both show up in a Set together, or as keys in a
>>>> Dictionary together, which breaks what we would expect.
>>>>
>>>> But getting back to the VA collection behavior that you have issues with - they
>>>> are undoubtedly doing something similar in their collections that we do in
>>>> Squeak, which is what is expected (although not necessarily obvious).
>>>>
>>>> A long time ago, I took advantage of this and just hard-coded the hash for
>>>> some of my classes to 1. This actually did work, but is a horrible (I mean
>>>> HORRIBLE) idea - it really, really slows down the system when you have more
>>>> than a couple instances of an object, but it does work.
>>>>
>>>> -cbc
>>>>
>>>>
>>>>>
>>>>>
>>>>>> thanks,
>>>>>> cbc
>>>>>>
>>>>>> On Thu, Nov 1, 2018 at 5:40 AM Louis LaBrunda <Lou at keystone-software.com> wrote:
>>>>>>
>>>>>>> Hi Benoit,
>>>>>>>
>>>>>>> On the latest version of VA Smalltalk:
>>>>>>>
>>>>>>> VA Smalltalk V9.1 (32-bit); Image: 9.1 [413]
>>>>>>> VM Timestamp: 4.0, 10/01/18 (100)
>>>>>>>
>>>>>>> I see:
>>>>>>>
>>>>>>> (0 to: 1) = (0 to: 5/3). "false"
>>>>>>> (0 to: 1) hash = (0 to: 5/3) hash. "true"
>>>>>>>
>>>>>>> Very interesting.
>>>>>>>
>>>>>>> Lou
>>>>>>>
>>>>>>>
>>>>>>> On Thu, 1 Nov 2018 02:40:00 +0000 (UTC), Benoit St-Jean via Squeak-dev
>>>>>>> <squeak-dev at lists.squeakfoundation.org> wrote:
>>>>>>>
>>>>>>>> Interesting!
>>>>>>>>
>>>>>>>> As a comparison:
>>>>>>>>
>>>>>>>> Squeak 5.2
>>>>>>>> (0 to: 1) = (0 to: 5/3). "true"
>>>>>>>> (0 to: 1) hash = (0 to: 5/3) hash. "false"
>>>>>>>>
>>>>>>>> Dolphin 7
>>>>>>>> (0 to: 1) = (0 to: 5/3). "true"
>>>>>>>> (0 to: 1) hash = (0 to: 5/3) hash. "false"
>>>>>>>>
>>>>>>>> VisualWorks 8.1.1
>>>>>>>> (0 to: 1) = (0 to: 5/3). "true"
>>>>>>>> (0 to: 1) hash = (0 to: 5/3) hash. "true"
>>>>>>>>
>>>>>>>> Pharo 5.0
>>>>>>>> (0 to: 1) = (0 to: 5/3). "true"
>>>>>>>> (0 to: 1) hash = (0 to: 5/3) hash. "false"
>>>>>>>>
>>>>>>>> I don't have VAST installed on the PC I'm using right now.  I'd be
>>>>>>>> curious to see how other Smalltalks and/or GemStone handle this?  So far
>>>>>>>> (according to what I could test), only VW is right (according to the ANSI
>>>>>>>> standard and just plain logic!)
>>>>>>>>
>>>>>>>> I wonder how much code relies on this "behavior" out there!
>>>>>>>> But the ANSI Smalltalk draft is very clear on this (revision 1.9, page 53,
>>>>>>>> http://wiki.squeak.org/squeak/uploads/172/standard_v1_9-indexed.pdf):
>>>>>>>> "If the value of receiver = comparand is true then the receiver and
>>>>>>>> comparand *must* have equivalent hash values."
>>>>>>>> That's what I always thought (or was taught or even read in the Blue
>>>>>>>> Book).  Was this something that was changed at some point???
>>>>>>>>
>>>>>>>> ----------------
>>>>>>>> Benoît St-Jean
>>>>>>>> Yahoo! Messenger: bstjean
>>>>>>>> Twitter: @BenLeChialeux
>>>>>>>> Pinterest: benoitstjean
>>>>>>>> Instagram: Chef_Benito
>>>>>>>> IRC: lamneth
>>>>>>>> Blogue: endormitoire.wordpress.com
>>>>>>>> "A standpoint is an intellectual horizon of radius zero".  (A. Einstein)
>>>>>>> --
>>>>>>> Louis LaBrunda
>>>>>>> Keystone Software Corp.
>>>>>>> SkypeMe callto://PhotonDemon
>>>>>>>
>>>>>>>
>>>>>>>
>>>>> --
>>>>> Louis LaBrunda
>>>>> Keystone Software Corp.
>>>>> SkypeMe callto://PhotonDemon
>>>>>
>>>>>
>>>>>
>> --
>> Louis LaBrunda
>> Keystone Software Corp.
>> SkypeMe callto://PhotonDemon
>>
>>
>
>

