Hi.
I'm slowly (very) working towards creating a usable test to validate that, for classes where #= answers true, the #hash values are also equal.
Last week, the Date issue showed up. This week?
Intervals:
(0 to: 1) = (0 to: 5/3). "true"
(0 to: 1) hash = (0 to: 5/3) hash. "false"
CharacterBlock:
| cb1 cb2 |
cb1 := (CharacterBlock new stringIndex: 5 text: 'StandardText' asText topLeft: (100@100) extent: (20@20)).
cb2 := (CharacterBlock new stringIndex: 5 text: 'StandardText' asText topLeft: (200@200) extent: (20@20)).
cb1 = cb2. "true"
cb1 hash = cb2 hash. "false"
These were found by comparing a random sampling of instances of classes that implement #= or #hash (or both), and finding which have these deviant properties. The hard part is figuring out instances that are going to have issues - Date didn't show up in my prototype scanning. Also most classes don't have instances floating around to compare.
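A rough workspace sketch of that kind of scan, as an illustration only and not the actual scanner used here: it takes the first 20 instances of each candidate class (rather than a true random sample) and records pairs that are #= but answer different hashes. The cap of 20 and the variable names are assumptions; comparison errors are treated as "no violation".

```smalltalk
"Sketch under stated assumptions; may be slow on a large image."
| candidates violations |
candidates := Smalltalk allClasses select: [:cls |
	(cls includesSelector: #=) or: [cls includesSelector: #hash]].
violations := OrderedCollection new.
candidates do: [:cls | | sample |
	sample := cls allInstances.
	"Assumption: first 20 instances stand in for a random sample."
	sample := sample first: (sample size min: 20).
	sample doWithIndex: [:a :i |
		sample from: i + 1 to: sample size do: [:b | | differ |
			"Some objects may not tolerate being compared; ignore those."
			differ := [a = b and: [a hash ~= b hash]]
				on: Error do: [:ex | false].
			differ ifTrue: [violations add: a -> b]]]].
violations
```

Inspecting `violations` afterwards lists each offending pair as an Association. Classes with no live instances contribute nothing, which matches the limitation mentioned above.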
Thanks, -cbc
Oh my, it sounds like you are tracking down a likely source of very obscure intermittent bugs. Bravo.
Dave
Yes.
Looking at the latest 5.2b, I find that 0 = 0, but 0 hash does not = 0 hash (for 2 existing instances of LargePositiveInteger). That is odd. Unfortunately, I can't seem to actually capture the ones that are causing the issue to investigate - they get normalized (or something) to regular 0 integers.
However, while looking at this, I noticed that the fall back code in Integer>>digitCompare: is buggy.
If you evaluate
1 digitCompare: -1249. "1"
but, if you then comment out "<primitive: 'primDigitCompare' module:'LargeInteger'>" in that method and run it again, you get:
1 digitCompare: -1249. "-1"
-cbc
http://bugs.squeak.org/view.php?id=3380
On Sun, Oct 28, 2018 at 2:53 PM Nicolas Cellier < nicolas.cellier.aka.nice@gmail.com> wrote:
and
Intervals:
(0 to: 1) = (0 to: 5/3). "true"
(0 to: 1) hash = (0 to: 5/3) hash. "false"
In the inbox is collections-cbc.810.mcz, which fixes both of these bugs.
You can test them out - #hash still replicates the bugs, while #hashBetterFastArrayCompatible (on Interval) and #hashBetterFastIntervalCompatible (on Array) make both work. The latter also implements your idea of only testing some of the elements - the first and last 16.
It slows down hash speed of Interval roughly an order of magnitude, though.
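To make the "first and last 16 elements" idea concrete, here is a minimal workspace sketch. It is not the code in the inbox package; the block name `sampledHash`, the multiplier 31 and the 28-bit mask are illustrative assumptions. Because it only looks at size and elements, an Interval and an Array with the same elements get the same value.

```smalltalk
"Hypothetical helper: hash a sequenceable collection from its size plus
 at most its first 16 and last 16 elements."
| sampledHash |
sampledHash := [:seq | | h n |
	n := seq size.
	h := n.
	"First (up to) 16 elements."
	1 to: (n min: 16) do: [:i |
		h := ((h * 31) + (seq at: i) hash) bitAnd: 16rFFFFFFF].
	"Last (up to) 16 elements, skipping any already visited."
	(n - 15 max: 17) to: n do: [:i |
		h := ((h * 31) + (seq at: i) hash) bitAnd: 16rFFFFFFF].
	h].
(sampledHash value: (1 to: 3)) = (sampledHash value: #(1 2 3)).
(sampledHash value: (0 to: 1)) = (sampledHash value: (0 to: 5/3)).
```

Both comparisons should agree, since each pair has the same size and the same elements. The cost is one #at: per sampled element, which is consistent with an Interval hash getting slower than one computed from start, stop and step alone.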
If anyone has ideas I'd be interested. Failing that, I'll ruminate on them for the next several days, and eventually push something in that fixes this (meanwhile moving that package to treated).
-----
Here is the series of 'tests' that I did while working on these with various timings.
{
[(1 to: 100 by: 1) hash] bench.
[(1 to: 100 by: 1) hashBetter] bench.
[(1 to: 100 by: 1) hashBetterAlsoFixBug3380] bench.
[(1 to: 100 by: 1) hashSlowerBetterAlsoFixBug3380] bench.
[(1 to: 100 by: 1) hashFastArrayCompatible] bench.
[(1 to: 100 by: 1) hashBetterFastArrayCompatible] bench.
'---'.
[(1 to: 100.3 by: 1) hash] bench.
[(1 to: 100.3 by: 1) hashBetter] bench.
[(1 to: 100.3 by: 1) hashBetterAlsoFixBug3380] bench.
[(1 to: 100.3 by: 1) hashSlowerBetterAlsoFixBug3380] bench.
[(1 to: 100.3 by: 1) hashFastArrayCompatible] bench.
[(1 to: 100.3 by: 1) hashBetterFastArrayCompatible] bench.
}
{
(0 to: 1) = (0 to: 5/3).
(0 to: 1) hash = (0 to: 5/3) hash.
(0 to: 1) hashBetter = (0 to: 5/3) hashBetter.
(0 to: 1) hashBetterAlsoFixBug3380 = (0 to: 5/3) hashBetterAlsoFixBug3380.
(0 to: 1) hashSlowerBetterAlsoFixBug3380 = (0 to: 5/3) hashSlowerBetterAlsoFixBug3380.
(0 to: 1) hashFastArrayCompatible = (0 to: 5/3) hashFastArrayCompatible.
(0 to: 1) hashBetterFastArrayCompatible = (0 to: 5/3) hashBetterFastArrayCompatible.
}
{
(1 to: 3) = #(1 2 3).
(1 to: 3) hash = #(1 2 3) hash.
(1 to: 3) hashBetter = #(1 2 3) hash.
(1 to: 3) hashBetterAlsoFixBug3380 = #(1 2 3) hash.
(1 to: 3) hashSlowerBetterAlsoFixBug3380 = #(1 2 3) hash.
(1 to: 3) hashFastArrayCompatible = #(1 2 3) hashFastIntervalCompatible.
(1 to: 3) hashBetterFastArrayCompatible = #(1 2 3) hashBetterFastIntervalCompatible.
}
-cbc
Reminder: this is because Interval is used for text selection and/or cursor position. From that POV, 3 to: 2 is not equal to 4 to: 3. From the collection POV, they are both an empty collection. I think that I once proposed to distinguish the two usages and introduce a TextInterval for that purpose.
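The two points of view can be seen in a workspace. The first three expressions reflect the collection view; the last one is the comparison whose answer depends on how the dialect defines Interval >> =, so no result is claimed for it here.

```smalltalk
"Collection view: both intervals are empty."
(3 to: 2) isEmpty. "true"
(4 to: 3) isEmpty. "true"
(3 to: 2) asArray = (4 to: 3) asArray. "true"

"Selection/cursor view: the start positions differ.
 The answer here depends on the dialect's Interval >> =."
(3 to: 2) = (4 to: 3).
```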
Interesting!
As a comparison:
Squeak 5.2
(0 to: 1) = (0 to: 5/3). "true"
(0 to: 1) hash = (0 to: 5/3) hash. "false"
Dolphin 7
(0 to: 1) = (0 to: 5/3). "true"
(0 to: 1) hash = (0 to: 5/3) hash. "false"
VisualWorks 8.1.1
(0 to: 1) = (0 to: 5/3). "true"
(0 to: 1) hash = (0 to: 5/3) hash. "true"
Pharo 5.0
(0 to: 1) = (0 to: 5/3). "true"
(0 to: 1) hash = (0 to: 5/3) hash. "false"
I don't have VAST installed on the PC I'm using right now. I'd be curious to see how other Smalltalks and/or GemStone handle this. So far (according to what I could test), only VW is right (according to the ANSI standard and just plain logic!).
I wonder how much code relies on this "behavior" out there! But the ANSI Smalltalk draft is very clear on this (revision 1.9, page 53, http://wiki.squeak.org/squeak/uploads/172/standard_v1_9-indexed.pdf): "If the value of receiver = comparand is true then the receiver and comparand *must* have equivalent hash values." That's what I always thought (or was taught or even read in the Blue Book). Was this something that was changed at some point???
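The quoted rule is an implication (equal objects must have equal hashes, not the other way round), and it can be written as a small workspace predicate. The block name `obeysRule` is only an illustrative choice.

```smalltalk
"a = b implies a hash = b hash; unequal objects are unconstrained."
| obeysRule |
obeysRule := [:a :b | (a = b) not or: [a hash = b hash]].
obeysRule value: 'abc' value: 'abc' copy.
obeysRule value: (0 to: 1) value: (0 to: 5/3).
```

Going by the results listed above, the second expression would be expected to answer false in the dialects that report "true" for #= and "false" for the hash comparison.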
---------------- Benoît St-Jean Yahoo! Messenger: bstjean Twitter: @BenLeChialeux Pinterest: benoitstjean Instagram: Chef_Benito IRC: lamneth Blogue: endormitoire.wordpress.com "A standpoint is an intellectual horizon of radius zero". (A. Einstein)
Hi Benoit,
On the latest version of VA Smalltalk:
VA Smalltalk V9.1 (32-bit); Image: 9.1 [413] VM Timestamp: 4.0, 10/01/18 (100)
I see:
(0 to: 1) = (0 to: 5/3). "false"
(0 to: 1) hash = (0 to: 5/3) hash. "true"
Very interesting.
Lou
ParcPlace-Digitalk VSE 3.1 (roughly 1999):
(0 to: 1) = (0 to: 5/3). "true"
(0 to: 1) hash = (0 to: 5/3) hash. "true"
So, ancient VSE and current VisualWorks are consistent, and agree on where they want to be. This is also the direction I want to take Squeak. VA is also consistent, but its #= doesn't match any other Smalltalk variant that we've looked at. Squeak, Pharo, Dolphin all currently have the same answer, but are not consistent.
Interesting indeed.
thanks, cbc
Hi Chris,
I have been talking to the VA Smalltalk guys about this and they are thinking about it but haven't decided what to do yet. It turns out that, because the #= test fails for intervals that cover the same range and have the same hash, collections that use #hash in VA Smalltalk (like Set) override the equal hash value and add the interval to the collection anyway. I find this troubling.
Lou
-- Louis LaBrunda Keystone Software Corp. SkypeMe callto://PhotonDemon
Hi Louis,
The rule for = and hash is that if two objects are #=, then their hash values have to be equal as well.
There is no statement about what it means for equality when two objects' hashes are the same. This, I believe, is intentional.
The collection objects in (most? all?) Smalltalks behave similarly to VA's - if objects have the same hash but are not equal, then they will both be in the hashed collection (such as Set). The Squeak implementation is described in Set>>scanFor: . This method also shows why having objects equal but their hash not equal is so dangerous - if you had two objects that are supposed to be one and the same and are in fact #= but don't have the same hash, they can both show up in a Set together, or as keys in a Dictionary together, which breaks what we would expect.
But getting back to the VA collection behavior that troubles you - they are undoubtedly doing something similar in their collections to what we do in Squeak, which is what is expected (although not necessarily obvious).
A long time ago, I took advantage of this and just hard-coded the hash for some of my classes to 1. This actually did work, but is a horrible (I mean HORRIBLE) idea - it really, really slows down the system when you have more than a couple instances of an object, but it does work.
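Both situations can be tried in a workspace. This is a sketch that assumes Squeak's PluggableSet and its #hashBlock: setter; the constant hash of 1 mirrors the hard-coded hash described above.

```smalltalk
"Case 1: equal hashes, unequal objects. Every element hashes to 1,
 so lookups degrade to a linear scan, but #= still keeps elements apart."
| constantHashSet riskySet |
constantHashSet := PluggableSet new.
constantHashSet hashBlock: [:each | 1].
constantHashSet add: 'a'; add: 'b'; add: 'a' copy.
constantHashSet size.

"Case 2: equal objects, unequal hashes. Whether the duplicate is caught
 depends on where the two hashes happen to land in the table."
riskySet := Set new.
riskySet add: (0 to: 1); add: (0 to: 5/3).
riskySet size.
```

The first size should be 2, because 'a' and its copy are #=. The second is the dangerous case: in an image where the two intervals are #= but hash differently, it may answer 2 even though the elements compare equal.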
-cbc
All of that said, I too find the VA behavior a bit troubling in this case. I rely on (0 to: 1) = (0 to: 5/3) being true. VA not supporting this limits cross-dialect portability; although I don't (personally) use VA, other folks at work do and we do occasionally share code.
However, this implementation is internally consistent and obeys the = and hash rule in this case. It's just not what I would want.
-cbc
Hi Chris,
I know about and understand everything you have said about the spec and the way #= and #hash should work and relate to each other. My problem with VA Smalltalk's implementation of #= is that it doesn't consider what an interval is and how it is used and therefore what equals should mean. I would interpret two intervals as being equal if they span the same range, and not worry about how they got there. In the case where the increment (by) is an integer, the start and end values map down to integers, and if those integers are the same in two intervals then the intervals span the same range and should be considered equal. Any program using those intervals would expect them to work the same. In VA Smalltalk they would work the same but you can't tell that with #=.
I have no problem with VA's hash based collections, they work the way they should given the way #hash and #= work. But from a higher level, if I put a bunch of intervals in a Set and only wanted one entry for each range spanned, I would be out of luck and probably confused as to why. Sure, this is Smalltalk and there are ways around this, if you know you need to work around it. One could always add a method to intervals to "fix" the start and end values if the increment is an integer.
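One way such a "fix" method could look, sketched here as a workspace block rather than a real Interval method. It assumes the accessors #first, #last and #increment as found in Squeak; the names in VA Smalltalk may differ, and the block name `normalize` is hypothetical.

```smalltalk
"Rebuild an interval so that its stop is the last element actually reached."
| normalize a b |
normalize := [:interval |
	interval isEmpty
		ifTrue: [interval]
		ifFalse: [interval first to: interval last by: interval increment]].
a := normalize value: (0 to: 1).
b := normalize value: (0 to: 5/3).
a last = b last.
```

After normalizing, both intervals would be built from the same start, stop and step, so even an #= that compares those three values directly should treat them as equal.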
I have heard from Instantiations and they plan to leave well enough alone at this point, since the #= method has been this way since 1996. They are concerned that changing #= may break existing user code. I doubt it but I understand the concern.
Lou
Interval >> size does the correct thing with the stop value, so maybe Interval >> = could use:
anInterval isInterval and: [start = anInterval start and: [step = anInterval step and: [self size = anInterval size]]]
— tim
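For readers following along, Tim's suggestion can be tried without touching Interval itself. The workspace sketch below is only an illustration, assuming a Squeak-style Interval that answers #first, #increment and #size. The block name sameElements is made up here, and the receiver of isInterval in the expression above is assumed to be the argument.

```smalltalk
"Workspace sketch; sameElements is a hypothetical helper, not a system method."
| a b sameElements |
a := 0 to: 1.
b := 0 to: 5/3.
sameElements := [:x :y |
	y isInterval
		and: [x first = y first
		and: [x increment = y increment
		and: [x size = y size]]]].
sameElements value: a value: b.	"both intervals enumerate 0 and 1"
```

Comparing sizes sidesteps the raw stop value, so two intervals that stop at 1 and at 5/3 are judged by the elements they actually produce.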
On Nov 2, 2018, at 12:53 PM, Louis LaBrunda Lou@Keystone-Software.com wrote:
Hi Chris,
I know about and understand everything you have said about the spec and the way #= and #hash should work and relate to each other. My problem with VA Smalltalk's implementation of #= is that it doesn't consider what an interval is and how it is used, and therefore what equals should mean. I would interpret two intervals as being equal if they span the same range, and not worry about how they got there. In the case where the increment (by) is an integer, the start and end values map down to integers, and if those integers are the same in two intervals then the intervals span the same range and should be considered equal. Any program using those intervals would expect them to work the same. In VA Smalltalk they would work the same, but you can't tell that with #=.
I have no problem with VA's hash based collections, they work the way they should given the way #hash and #= work. But from a higher level, if I put a bunch of intervals in a Set and only wanted one entry for each range spanned, I would be out of luck and probably confused as to why. Sure, this is Smalltalk and there are ways around this, if you know you need to work around it. One could always add a method to intervals to "fix" the start and end values if the increment is an integer.
I have heard from Instantiations and they plan to leave well enough alone at this point, since the #= method has been this way since 1996. They are concerned that changing #= may break existing user code. I doubt it but I understand the concern.
Lou
On Fri, 2 Nov 2018 10:01:38 -0700, Chris Cunningham cunningham.cb@gmail.com wrote:
All of that said, I too find the VA behavior a bit troubling in this case. I rely on (0 to: 1) = (0 to: 5/3) being true. VA not supporting this limits cross-dialect portability; although I don't (personally) use VA, other folks at work do, and we do occasionally share code.
However, this implementation is internally consistent and obeys the = and hash rule in this case. It's just not what I would want.
-cbc
On Fri, Nov 2, 2018 at 9:52 AM Chris Cunningham cunningham.cb@gmail.com wrote:
Hi Louis,
On Fri, Nov 2, 2018 at 9:12 AM Louis LaBrunda Lou@keystone-software.com wrote:
Hi Chris,
On Fri, 2 Nov 2018 07:44:00 -0700, Chris Cunningham < cunningham.cb@gmail.com> wrote:
ParcPlace-Digitalk VSE 3.1 (roughly 1999):
(0 to: 1) = (0 to: 5/3). "true"
(0 to: 1) hash = (0 to: 5/3) hash. "true"
So, ancient VSE and current VisualWorks are consistent, and agree on where they want to be. This is also the direction I want to take Squeak.
VA is also consistent, but #= doesn't match any other Smalltalk variant that we've looked at.
Squeak, Pharo, Dolphin all currently have the same answer, but are not consistent.
Interesting indeed.
I have been talking to the VA Smalltalk guys about this and they are thinking about it but haven't decided what to do yet. It turns out that, in VA Smalltalk, collections that use #hash (like Set) treat intervals that cover the same range and have the same hash as distinct elements, because the #= test fails for them: the failed #= overrides the equal hash value and the interval is added to the collection. I find this troubling.
Lou
The rule for = and hash is that if two objects are #=, then their hash values have to be equal as well.
There is no statement about what it means for equality if two objects' hashes are the same. This, I believe, is intentional.
The collection objects in (most? all?) Smalltalks behave similarly to VA's: if objects have the same hash but are not equal, then they will both be in the hashed collection (such as Set). The Squeak implementation is described in Set>>scanFor:. This method also shows why having objects equal but their hashes not equal is so dangerous: if you had two objects that are supposed to be one and the same and are in fact #= but don't have the same hash, they can both show up in a Set together, or as keys in a Dictionary together, which breaks what we would expect.
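The danger described above can be observed directly in a workspace. This is a sketch only, and the answers depend on the image: the thread reports that Squeak 5.2 answers true for #= and false for the hash comparison, and whether the Set ends up holding one element or two depends on where the two hashes land.

```smalltalk
| a b s |
a := 0 to: 1.
b := 0 to: 5/3.
s := Set new.
s add: a; add: b.
{ a = b. a hash = b hash. s size }	"inspect or print this Array"
```

If the last element is 2 while the first is true, the Set holds two objects that are #= to each other.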
But getting back to VA's collection issue that you have issues with - they are undoubtedly doing something similar in their collections that we do in Squeak, which is what is expected (although not necessarily obvious).
A long time ago, I took advantage of this and just hard-coded the hash for some of my classes to 1. This actually did work, but is a horrible (I mean HORRIBLE) idea - it really, really slows down the system when you have more than a couple instances of an object, but it does work.
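A rough way to see the slowdown, assuming Squeak's PluggableSet and its #hashBlock: setter (check your image before relying on it). The element count is arbitrary and no timings are claimed here.

```smalltalk
| normal constant fill |
normal := Set new.
constant := PluggableSet new.
constant hashBlock: [:each | 1].	"every element collides"
fill := [:aSet |
	Time millisecondsToRun: [
		1 to: 2000 do: [:i | aSet add: i printString]]].
{ fill value: normal. fill value: constant. normal size = constant size }
```

Both sets end up with the same elements, so the constant hash is still correct; only the time to fill the second set grows, because each add has to scan past the earlier entries.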
-cbc
thanks, cbc
Hi Tim,
After thoroughly discussing this with the VA Smalltalk guys, I have concluded that the developer should be responsible for creating intervals so that those with equal ranges compare equal. For example, intervals with mixed integer and real values can have the same range but don't compare equal. If comparing equal is desired, the values should be chosen to make that work.
Lou
On Thu, 15 Nov 2018 07:44:01 -0600, Tim Olson tim.olson.mail@gmail.com wrote:
Interval >> size does the correct thing with the stop value, so maybe Interval >> = could use:
isInterval and: [start = anInterval start and: [step = anInterval step and: [self size = anInterval size]]]
tim
On Nov 15, 2018, at 5:55 AM, Louis LaBrunda Lou@Keystone-Software.com wrote:
Hi Tim,
After thoroughly discussing this with the VA Smalltalk guys I have concluded that the developer should be responsible for creating intervals that have equal ranges that compare equal. For example intervals with mixed integers and real values can have the same range but don't compare equal. If comparing equal is desired, the values should be made to make that work.
IMO that’s a cop out. An implementation which compares two intervals as equal if their elements are equal makes perfect sense and is easy to implement. All that’s needed is that the implementation access “self last” instead of “stop”.
Implementing newHash as one that uses self last in place of stop, then in my image:
| insts s |
insts := Interval allInstances.
{ insts size. s := (insts select: [:i| i hash ~= i newHash]) size. s * 100.0 / insts size }
#(3267 0 0.0)
So there's minimal risk of breaking anything by simply redefining hash (I would also reformat #= as per my suggestion ;-) ).
Hi Eliot,
I really shouldn't speak for Instantiations but since I brought them into this conversation I will say this:
(0 to: 1) = (0 to: 5/3). "false"
(0 to: 1) hash = (0 to: 5/3) hash. "true"
The comparison of the intervals answers false. I argued strenuously that two intervals that cover the same range should compare as equal. Unfortunately the ANSI standard is, I think, ambiguous on this point. It says that if two things compare equal their hashes should be equal, but here the two intervals don't compare equal. The VA Smalltalk code has been this way for over 20 years. Changing it could impact an unknown amount of customer code. I eventually concluded that even though the ranges were equal the objects were not, and that their definition of equal was as valid as any other. If this had come up 20+ years ago, maybe they could have been convinced to change their definition. Now I agree with them: it is too late and too dangerous.
Since this is Smalltalk, if one is really interested in intervals that cover the same range comparing equal, there are simple ways to make that work. Yes, moving code from Squeak to VA Smalltalk would need a little love, but probably not much.
Lou
On Thu, 15 Nov 2018 07:50:31 -0800, Eliot Miranda eliot.miranda@gmail.com wrote:
On Nov 15, 2018, at 5:55 AM, Louis LaBrunda Lou@Keystone-Software.com wrote:
Hi Tim,
After thoroughly discussing this with the VA Smalltalk guys I have concluded that the developer should be responsible for creating intervals that have equal ranges that compare equal. For example intervals with mixed integers and real values can have the same range but don't compare equal. If comparing equal is desired, the values should be made to make that work.
IMO that's a cop out. An implementation which compares two intervals as equal if their elements are equal makes perfect sense and is easy to implement. All that's needed is that the implementation access "self last" instead of "stop".
Implementing newHash as one that uses self last in place of stop then in my image
| insts s |
insts := Interval allInstances.
{ insts size. s := (insts select: [:i| i hash ~= i newHash]) size. s * 100.0 / insts size }
#(3267 0 0.0)
So there's minimal risk in breaking anything simply redefining hash (I would also reformat #= as per my suggestion ;-) ).
Isn't this extremely simple to fix?
#= is implemented in terms of start, step, and last.
So why not implement #hash as
^(start hash bitXor: step hash) bitXor: self last hash
Then in the postscript do a
HashedCollection rehashAll
and that should be it, right?
I did a quick check using
Dictionary allInstances select: [ :dict | dict keys anySatisfy: #isInterval]
and the system seems fine after that.
- Bert -
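Bert's proposed hash can be checked against the example from the start of the thread without redefining Interval >> hash. A minimal sketch, assuming Squeak's #first, #increment and #last accessors; proposedHash is a name invented for this illustration.

```smalltalk
| proposedHash a b |
proposedHash := [:i |
	(i first hash bitXor: i increment hash) bitXor: i last hash].
a := 0 to: 1.
b := 0 to: 5/3.
(proposedHash value: a) = (proposedHash value: b).	"true: both intervals end at 1"
```

Because #last answers the final element actually reached rather than the stored stop value, the two intervals hash alike, matching the fact that they are #=.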
You forgot about the universe that lies beyond the fringes of the image. There be persistent data files out there. And users. Possibly with systems that rely on Interval>>#hash.
But since most applications don't use non-Integer based Intervals (since Interval doesn't really support them, by design, I guess), there's no reason whatsoever to decimate the universe when Eliot's version simply fixes the corner case that no one is using anyway.
+1 to Eliot's suggestion to fix Interval>>#hash.
- Chris
Hi Guys,
I don't work for Instantiations, so this decision isn't mine to make. That said, I have to agree with their desire to be cautious. There is no upside for them in changing this, and even though the downside should be small, there is no real way of knowing how big or small it is.
Lou
Nice. It seems like we have consensus on what to change.
I'll push these changes (with the tests) to trunk soon.
The fix I have for #hash is exactly what Eliot suggested. I'll make sure to include the rehash as well (thanks for the code snippet, Bert!). If no one objects strenuously, I'll also include Eliot's slight rewrite of #= - it is marginally cleaner and equally fast, so now is a reasonable time to include it.
I'll delay working on bug #3380 for now - to fix this, we'd have to also add in a check on class in #= to make sure we aren't comparing an interval to an array. Unless someone has been bitten by this recently, I'd rather wait.
-cbc
> Nice. It seems like we have consensus on what to change.
> I'll push these changes (with the tests) to trunk soon.
Hey, "seems" carries enough uncertainty to give us one final look before trunk. By "these changes" are you referring to just the Interval>>#hash or some Array changes, too? All we've seen so far is Collections-cbc.810.mcz; could we get one look at your final draft proposal before trunk?
On a less important note, I personally find a pure conditional nomenclature more attractive than the embedded ifTrue:ifFalse:, like:
= anObject
    ^ self == anObject or: [
        (anObject isInterval and: [
            start = anObject first and: [
                step = anObject increment and: [
                    self last = anObject last ] ] ])
        or: [ super = anObject ] ]
For whatever and whenever you push, I'm sure you already were but, just in case, I would be grateful if you would please base it solely off the current top trunk version with no intermediate versions in the ancestry. :-)
Thanks a lot for finding this and helping get it fixed!
Best Regards, Chris
- Chris
Hi Chris,
The changes will be limited to Interval, and will be changes to #= and #hash (and the Interval test, so this doesn't show up again).
I'll push the changes to inbox soon; and to trunk tomorrow/early next week. The test will go to Trunk with the changes to inbox (the test will be what I've pushed to the inbox minus the 3380 part).
And, yes, I'll rebase it off of the current trunk version - there have been significant changes since my last proposal.
Interestingly:
= anObject
    ^ self == anObject or: [
        (anObject isInterval and: [
            start = anObject first and: [
                step = anObject increment and: [
                    self last = anObject last ] ] ])
        or: [ super = anObject ] ]
This is actually wrong - if the two items to compare are intervals but they don't match based on interval hash (first/last/increment), then it will check if super #= returns true - that is not desirable. But, I understand the desire you mention here - which I believe is what Eliot was driving for as well.
-cbc
Somehow I missed Eliot's version, but unsurprisingly he had exactly the same idea (use "last" not "stop" for hash). I'd still think bitXor: is preferable to bitOr:, since that is the standard way in almost all hash methods. But ...
BUT: I forgot about the super fallback in #=. That makes this discussion pretty much moot, because since
#(1 2 3) = (1 to: 3) "true"
is true, this must also be true:
#(1 2 3) hash = (1 to: 3) hash "must be true"
So the only proper fix IMHO is to remove #hash from Interval (or replace it with ^super hash and a proper comment)
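The constraint described above can be checked directly in a workspace. This is only a sketch with made-up variable names; the first answer is the one quoted in this message, and the others are left for the reader's own image to report.

```smalltalk
| array interval |
array := #(1 2 3).
interval := 1 to: 3.

"Equality in both directions; a well-behaved #= should give the same answer twice."
array = interval.      "true, as quoted above"
interval = array.

"Whenever either comparison above answers true, this has to answer true as well."
array hash = interval hash.
```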
- Bert -
Hi Bert,
We discussed this a couple of weeks ago. There is no need for #(1 2 3) = (1 to: 3) to be true. #(1 2 3) = #[1 2 3] isn’t true. And we have hasEqualElements:. So a more coherent approach is for the hack that makes intervals equal to arrays be discarded, and the hashes kept distinct.
On Thu, Nov 15, 2018 at 5:36 PM Eliot Miranda eliot.miranda@gmail.com wrote:
> We discussed this a couple of weeks ago. There is no need for #(1 2 3) = (1 to: 3) to be true. #(1 2 3) = #[1 2 3] isn’t true. And we have hasEqualElements:. So a more coherent approach is for the hack that makes intervals equal to arrays be discarded, and the hashes kept distinct.
Makes sense. The version you posted ("I would have written...") still delegated to super>>= so I thought we wanted to keep that. But I agree that it's of little utility.
- Bert -
On Thu, Nov 15, 2018 at 5:36 PM Eliot Miranda eliot.miranda@gmail.com wrote:
> We discussed this a couple of weeks ago. There is no need for #(1 2 3) = (1 to: 3) to be true. #(1 2 3) = #[1 2 3] isn’t true. And we have hasEqualElements:. So a more coherent approach is for the hack that makes intervals equal to arrays be discarded, and the hashes kept distinct.
And I agreed with you weeks ago, but looking at it closer, the code specifically says Interval is a species of Array. Interestingly, ByteArray, which is a subclass of ArrayedCollection, doesn't set its species, so its species is ByteArray. Which is desirable.
If we change the Interval #species to not be Array, then many things break with Interval - most notably #select: and #collect:, so a major overhaul would be in store for that part of the code.
In line with Bert's allusion, if we removed the super = call, then #= is no longer symmetric between Intervals and Arrays:
    (1 to: 3) = #(1 2 3) "false"
    #(1 2 3) = (1 to: 3) "true"
So, I'm just fixing the Interval only part and punting on the issue between Interval and Array for now.
-cbc
Sorry for the excessive delay in responding to these threads.
On Thu, Nov 15, 2018 at 5:36 PM Eliot Miranda eliot.miranda@gmail.com wrote:
> We discussed this a couple of weeks ago. There is no need for #(1 2 3) = (1 to: 3) to be true. #(1 2 3) = #[1 2 3] isn’t true. And we have hasEqualElements:. So a more coherent approach is for the hack that makes intervals equal to arrays be discarded, and the hashes kept distinct.
Actually, the hack is that interval is a subclass of SequenceableCollection with species defined as Array. This makes lots of things very nice - like #collect: and #select: just work. If we removed #species (which would be necessary to make interval and array not be equal), that would require re-implementing these two methods - and many, many more - from the superclasses.
Basically, that hack is a fundamental part of how the class is built today.
Are we OK with taking on that much of a change?
-cbc
Hi Chris,
On Nov 20, 2018, at 11:11 AM, Chris Cunningham cunningham.cb@gmail.com wrote:
> Actually, the hack is that interval is a subclass of SequenceableCollection with species defined as Array. This makes lots of things very nice - like #collect: and #select: just work. If we removed #species (which would be necessary to make interval and array not be equal), that would require re-implementing these two methods - and many, many more - from the superclasses.
> Basically, that hack is a fundamental part of how the class is built today.
IMO it is not a hack. But it has nothing to do with whether an Interval with equal elements to an Array is equal to it. A ByteArray is also a SequenceableCollection and is not equal to an Array if it has equal elements. It has a different species to Array, but species exists, as you’ve noted, for the convenience of select: & collect: so that immutable collections can answer a suitable mutable class to be used to construct the result.
> Are we ok with us taking on that much of a change?
No one is suggesting changing the species of Interval.
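The role of species described above can be seen in a workspace. The expressions below are a small sketch; the answers in the comments are what Squeak is expected to give based on the behavior described in this thread, and other dialects may differ.

```smalltalk
"An Interval answers Array as its species, so derived collections are Arrays."
(1 to: 3) species.                       "Array"
(1 to: 3) collect: [:each | each * 2].   "#(2 4 6)"

"A ByteArray keeps its own species and is not #= to an Array with equal elements."
#[1 2 3] species.                        "ByteArray"
#(1 2 3) = #[1 2 3].                     "false, as noted above"

"Element-wise comparison across classes is available separately."
(1 to: 3) hasEqualElements: #(1 2 3).    "true"
```

This is why species can stay as it is while #= is tightened: species decides what class a result collection has, not which objects count as equal.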
On Nov 15, 2018, at 5:44 AM, Tim Olson tim.olson.mail@gmail.com wrote:
> Interval >> size does the correct thing with the stop value, so maybe Interval >> = could use:
>     isInterval and: [start = anInterval start and: [step = anInterval step and: [self size = anInterval size]]]
The current implementation is correct; it is effectively the same as yours but has some obvious optimizations. Two intervals are equal if they have the same sequence of elements, no matter how they are written. Here it is:
= anObject
    ^ self == anObject
        ifTrue: [true]
        ifFalse: [anObject isInterval
            ifTrue: [start = anObject first
                and: [step = anObject increment
                and: [self last = anObject last]]]
            ifFalse: [super = anObject]]
which I would have written
= anObject
    ^self == anObject
        or: [anObject isInterval
            ifFalse: [super = anObject]
            ifTrue: [start = anObject first
                and: [step = anObject increment
                and: [self last = anObject last]]]]
The issue is with hash which accesses stop directly instead of last. If hash read
hash
    "Hash is reimplemented because = is implemented."
    ^(((start hash bitShift: 2) bitOr: self last hash) bitShift: 1) bitOr: self size
(i.e. "bitOr: stop hash)" => "bitOr: self last hash)"), then things will be fine. And the hash of most common intervals will not change.
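The difference between hashing on stop and hashing on last can be tried in a workspace without modifying Interval. This is only a sketch: the block hashUsing is a made-up helper that mirrors the formula above, and the stop values are written out by hand rather than read from the intervals.

```smalltalk
| start stopA stopB lastA lastB size hashUsing |
start := 0.
stopA := 1.                   "stop as written in (0 to: 1)"
stopB := 5/3.                 "stop as written in (0 to: 5/3)"
lastA := (0 to: 1) last.
lastB := (0 to: 5/3) last.    "the last element actually reached"
size := (0 to: 1) size.

"Mirror of the hash formula above, parameterised on which end value is hashed."
hashUsing := [:end |
    (((start hash bitShift: 2) bitOr: end hash) bitShift: 1) bitOr: size].

"Hashing the written stop: the two equal intervals are expected to disagree."
(hashUsing value: stopA) = (hashUsing value: stopB).

"Hashing the last element: both intervals end on the same element, so they agree."
(hashUsing value: lastA) = (hashUsing value: lastB).
```

For intervals whose stop is already an element (the common integer case), stop and last coincide, which is why most existing hashes would stay the same.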
— tim
On Nov 2, 2018, at 12:53 PM, Louis LaBrunda Lou@Keystone-Software.com wrote:
Hi Chris,
I know about and understand everything you have said about the spec and the way #= and #hash should work and relate to each other. My problem with VA Smalltalk's implementation of #= is that it doesn't consider what an interval is and how it is used and therefore what equals should mean. I would interpret two intervals being equal if they span the same range and not worry about how they got there. In the case where the increment (by) is an integer the start and end values map down to integers and if those integers are the same in two intervals then the intervals span the same range and should be considered equal. Any program using those intervals would expect them to work the same. In VA Smalltalk they would work the same but you can't tell that with #=.
I have no problem with VA's hash based collections, they work the way they should given the way #hash and #= work. But from a higher level, if I put a bunch of intervals in a Set and only wanted one entry for each range spanned, I would be out of luck and probably confused as to why. Sure, this is Smalltalk and there are ways around this, if you know you need to work around it. One could always add a method to intervals to "fix" the start and end values if the increment is an integer.
I have heard from Instantiations and they plan to leave well enough alone at this point, since the #= method has been this way since 1996. They are concerned that changing #= may break existing user code. I doubt it but I understand the concern.
Lou
On Fri, 2 Nov 2018 10:01:38 -0700, Chris Cunningham cunningham.cb@gmail.com wrote:
All of that said, I too find the VA behavior a bit troubling in this case. I rely on (0 to: 1) = (0 to: 5/3) being true. VA not supporting this limits cross-dialect portability; although I don't (personally) use VA, other folks at work do and we do occasionally share code.
However, this implementation is internally consistent and obeys the = and hash rule in this case. It's just not what I would want.
-cbc
On Fri, Nov 2, 2018 at 9:52 AM Chris Cunningham cunningham.cb@gmail.com wrote:
Hi Louis, On Fri, Nov 2, 2018 at 9:12 AM Louis LaBrunda Lou@keystone-software.com wrote:
Hi Chris,
On Fri, 2 Nov 2018 07:44:00 -0700, Chris Cunningham < cunningham.cb@gmail.com> wrote:
ParcPlace-Digitalk VSE 3.1 (roughly 1999):
(0 to: 1) = (0 to: 5/3). "true"
(0 to: 1) hash = (0 to: 5/3) hash. "true"
So, ancient VSE and current VisualWorks are consistent, and agree on where they want to be. This is also the direction I want to take Squeak.
VA is also consistent, but #= doesn't match any other Smalltalk variant that we've looked at.
Squeak, Pharo, Dolphin all currently have the same answer, but are not consistent.
Interesting indeed.
I have been talking to the VA Smalltalk guys about this and they are thinking about it but haven't decided what to do yet. It turns out that the way collections (like Set) that use #hash in VA Smalltalk work, because of the #= test failing for intervals that cover the same range and have the same hash, that it overrides the equal hash value and adds the interval to the collection. I find this troubling.
Lou
The rules for = and hash are that if two objects are #=, then their hash values have to be equal as well.
There is no statement about if two objects hashes are the same, what this means for equality. This, I believe, is intentional.
The collection objects in (most?all?) smalltalks behave similarly to VA's - if objects have the same hash but are not equal, then they will both be in the hashed collection (such as Set). The squeak implementation is described in Set>>scanFor: . This method also shows why having objects equal but their hash not equal is so dangerous - if you had two objects that are supposed to be one and the same and are in fact #= but don't have the same hash, they can both show up in a Set together, or as keys in a Dictionary together, which breaks what we would expect.
But getting back to VA's collection issue that you have issues with - they are undoubtedly doing something similar in their collections that we do in Squeak, which is what is expected (although not necessarily obvious).
A long time ago, I took advantage of this and just hard-coded the hash for some of my classes to 1. This actually did work, but is a horrible (I mean HORRIBLE) idea - it really, really slows down the system when you have more than a couple instances of an object, but it does work.
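The dangerous case described above (objects that are #= but hash differently) can be probed with the two intervals from this thread. This is a workspace sketch only; the answer depends on the image and on where the two hashes happen to land, so no fixed result is claimed.

```smalltalk
| set |
set := Set new.
set add: (0 to: 1).
set add: (0 to: 5/3).

"The two intervals are #= in Squeak 5.2 (see the results quoted in this thread),
 so a Set would be expected to hold only one of them."
set size.

"If this answers 2, the Set holds two equal elements, because their differing
 hashes sent the second lookup to a different slot."
```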
-cbc
thanks, cbc
On Thu, Nov 1, 2018 at 5:40 AM Louis LaBrunda <Lou@keystone-software.com> wrote:
> Hi Benoit,
> On the latest version of VA Smalltalk:
> VA Smalltalk V9.1 (32-bit); Image: 9.1 [413]
> VM Timestamp: 4.0, 10/01/18 (100)
> I see:
> (0 to: 1) = (0 to: 5/3). "false"
> (0 to: 1) hash = (0 to: 5/3) hash. "true"
> Very interesting.
> Lou
> On Thu, 1 Nov 2018 02:40:00 +0000 (UTC), Benoit St-Jean via Squeak-dev <squeak-dev@lists.squeakfoundation.org> wrote:
>> Interesting!
>> As a comparison:
>> Squeak 5.2
>> (0 to: 1) = (0 to: 5/3). "true"
>> (0 to: 1) hash = (0 to: 5/3) hash. "false"
>> Dolphin 7
>> (0 to: 1) = (0 to: 5/3). "true"
>> (0 to: 1) hash = (0 to: 5/3) hash. "false"
>> VisualWorks 8.1.1
>> (0 to: 1) = (0 to: 5/3). "true"
>> (0 to: 1) hash = (0 to: 5/3) hash. "true"
>> Pharo 5.0
>> (0 to: 1) = (0 to: 5/3). "true"
>> (0 to: 1) hash = (0 to: 5/3) hash. "false"
>> I don't have VAST installed on the PC I'm using right now. I'd be curious to see how other Smalltalk and/or GemStone handle this? So far (according to what I could test), only VW is right (according to the ANSI standard and just plain logic!)
>> I wonder how much code relies on this "behavior" out there!
>> But the ANSI Smalltalk draft is very clear on this (revision 1.9, page 53, http://wiki.squeak.org/squeak/uploads/172/standard_v1_9-indexed.pdf):
>> "If the value of receiver = comparand is true then the receiver and comparand *must* have equivalent hash values."
>> That's what I always thought (or was taught or even read in the Blue Book). Was this something that was changed at some point???
>> ----------------
>> Benoît St-Jean
>> Yahoo! Messenger: bstjean
>> Twitter: @BenLeChialeux
>> Pinterest: benoitstjean
>> Instagram: Chef_Benito
>> IRC: lamneth
>> Blogue: endormitoire.wordpress.com
>> "A standpoint is an intellectual horizon of radius zero". (A. Einstein)
> --
> Louis LaBrunda
> Keystone Software Corp.
> SkypeMe callto://PhotonDemon
Hi Benoît,
On Oct 31, 2018, at 7:40 PM, Benoit St-Jean via Squeak-dev squeak-dev@lists.squeakfoundation.org wrote:
Interesting!
As a comparison:
Squeak 5.2
    (0 to: 1) = (0 to: 5/3). "true"
    (0 to: 1) hash = (0 to: 5/3) hash. "false"
Dolphin 7
    (0 to: 1) = (0 to: 5/3). "true"
    (0 to: 1) hash = (0 to: 5/3) hash. "false"
VisualWorks 8.1.1
    (0 to: 1) = (0 to: 5/3). "true"
    (0 to: 1) hash = (0 to: 5/3) hash. "true"
Pharo 5.0
    (0 to: 1) = (0 to: 5/3). "true"
    (0 to: 1) hash = (0 to: 5/3) hash. "false"
I don't have VAST installed on the PC I'm using right now. I'd be curious to see how other Smalltalk and/or GemStone handle this? So far (according to what I could test), only VW is right (according to the ANSI standard and just plain logic!)
I wonder how much code relies on this "behavior" out there!
But the ANSI Smalltalk draft is very clear on this (revision 1.9, page 53, http://wiki.squeak.org/squeak/uploads/172/standard_v1_9-indexed.pdf):
"If the value of receiver = comparand is true then the receiver and comparand *must* have equivalent hash values."
That's what I always thought (or was taught or even read in the Blue Book). Was this something that was changed at some point???
Nothing was changed. It’s simply people not realizing there is a bug there. Hence the value of Chris’ tests.
Benoît St-Jean Yahoo! Messenger: bstjean Twitter: @BenLeChialeux Pinterest: benoitstjean Instagram: Chef_Benito IRC: lamneth Blogue: endormitoire.wordpress.com "A standpoint is an intellectual horizon of radius zero". (A. Einstein)