Hi All,
In VisualWorks Message implements #= & #hash naturally; two messages whose selectors and arguments are #= are also equal. But in Cuis, Squeak and Pharo Message inherits #= and #hash from Object, i.e. uses identity comparison. This is, to say the least, annoying. Any objections to implementing comparing in Message to match VisualWorks?
_,,,^..^,,,_ best, Eliot
On Mon, Nov 19, 2018 at 09:32:17AM -0800, Eliot Miranda wrote:
Hi All,
In VisualWorks Message implements #= & #hash naturally; two messages
whose selectors and arguments are #= are also equal. But in Cuis, Squeak and Pharo Message inherits #= and #hash from Object, i.e. uses identity comparison. This is, to say the least, annoying. Any objections to implementing comparing in Message to match VisualWorks?
That sounds like an obviously good thing to do :-)
Is the lookkupClass instance variable relevant for comparisons? I am guessing not, since we already have #analogousCodeTo: for that type of comparison.
Dave
Hi David,
On Mon, Nov 19, 2018 at 9:52 AM David T. Lewis lewis@mail.msen.com wrote:
On Mon, Nov 19, 2018 at 09:32:17AM -0800, Eliot Miranda wrote:
Hi All,
In VisualWorks Message implements #= & #hash naturally; two messages
whose selectors and arguments are #= are also equal. But in Cuis, Squeak and Pharo Message inherits #= and #hash from Object, i.e. uses identity comparison. This is, to say the least, annoying. Any objections to implementing comparing in Message to match VisualWorks?
That sounds like an obviously good thing to do :-)
Is the lookupClass instance variable relevant for comparisons? I am guessing not, since we already have #analogousCodeTo: for that type of comparison.
For me it is relevant. Two messages with different lookupClasses, e.g. one with nil and one with a specific class, represent different messages, one a normal send one a super send. So my changes in waiting include lookupClass in both hash and =. I don't think it makes much difference, but the incompatibility with VisualWorks, while regrettable, feels correct to me.
Dave
_,,,^..^,,,_ best, Eliot
On Mon, Nov 19, 2018 at 04:16:51PM -0800, Eliot Miranda via Cuis-dev wrote:
Hi David,
On Mon, Nov 19, 2018 at 9:52 AM David T. Lewis lewis@mail.msen.com wrote:
On Mon, Nov 19, 2018 at 09:32:17AM -0800, Eliot Miranda wrote:
Hi All,
In VisualWorks Message implements #= & #hash naturally; two messages
whose selectors and arguments are #= are also equal. But in Cuis, Squeak and Pharo Message inherits #= and #hash from Object, i.e. uses identity comparison. This is, to say the least, annoying. Any objections to implementing comparing in Message to match VisualWorks?
That sounds like an obviously good thing to do :-)
Is the lookupClass instance variable relevant for comparisons? I am guessing not, since we already have #analogousCodeTo: for that type of comparison.
For me it is relevant. Two messages with different lookupClasses, e.g. one with nil and one with a specific class, represent different messages, one a normal send one a super send. So my changes in waiting include lookupClass in both hash and =. I don't think it makes much difference, but the incompatibility with VisualWorks, while regrettable, feels correct to me.
That feels correct to me also. But implementation details and bikeshedding aside, I would be happy if "Message allInstances asSet asArray" could answer a reasonably small collection of different-looking things. So +1 to the change, with or without consideration of lookupClass.
Dave
On 20 Nov 2018, at 01:16, Eliot Miranda eliot.miranda@gmail.com wrote:
Hi David,
On Mon, Nov 19, 2018 at 9:52 AM David T. Lewis <lewis@mail.msen.com mailto:lewis@mail.msen.com> wrote: On Mon, Nov 19, 2018 at 09:32:17AM -0800, Eliot Miranda wrote:
Hi All,
In VisualWorks Message implements #= & #hash naturally; two messages
whose selectors and arguments are #= are also equal. But in Cuis, Squeak and Pharo Message inherits #= and #hash from Object, i.e. uses identity comparison. This is, to say the least, annoying. Any objections to implementing comparing in Message to match VisualWorks?
That sounds like an obviously good thing to do :-)
Is the lookupClass instance variable relevant for comparisons? I am guessing not, since we already have #analogousCodeTo: for that type of comparison.
For me it is relevant. Two messages with different lookupClasses, e.g. one with nil and one with a specific class, represent different messages, one a normal send one a super send. So my changes in waiting include lookupClass in both hash and =. I don't think it makes much difference, but the incompatibility with VisualWorks, while regrettable, feels correct to me.
To me this looks like a good change, yes.
Marcus
On Mon, 19 Nov 2018, Eliot Miranda wrote:
Hi David,
On Mon, Nov 19, 2018 at 9:52 AM David T. Lewis lewis@mail.msen.com wrote: On Mon, Nov 19, 2018 at 09:32:17AM -0800, Eliot Miranda wrote: > Hi All, > > In VisualWorks Message implements #= & #hash naturally; two messages > whose selectors and arguments are #= are also equal. But in Cuis, Squeak > and Pharo Message inherits #= and #hash from Object, i.e. uses identity > comparison. This is, to say the least, annoying. Any objections to > implementing comparing in Message to match VisualWorks? >
That sounds like an obviously good thing to do :-) Is the lookupClass instance variable relevant for comparisons? I am guessing not, since we already have #analogousCodeTo: for that type of comparison.
For me it is relevant. Two messages with different lookupClasses, e.g. one with nil and one with a specific class, represent different messages, one a normal send one a super send. So my changes in waiting include lookupClass in both hash and =. I don't think it makes much difference, but the incompatibility with VisualWorks, while regrettable, feels correct to me.
What about Behavior >> #hash? It uses the name of the class, which is fine as long as you don't rename your classes while you store them in hashed collections.
Levente
Dave
_,,,^..^,,,_ best, Eliot
It sounds good, but an Inbox submission would go a long way to understanding exactly what you mean esp. re: understanding this special condition with the lookupClass being nil... and whether it deserves a first-class name or not... On Tue, Nov 20, 2018 at 4:10 AM Levente Uzonyi leves@caesar.elte.hu wrote:
On Mon, 19 Nov 2018, Eliot Miranda wrote:
Hi David,
On Mon, Nov 19, 2018 at 9:52 AM David T. Lewis lewis@mail.msen.com wrote: On Mon, Nov 19, 2018 at 09:32:17AM -0800, Eliot Miranda wrote: > Hi All, > > In VisualWorks Message implements #= & #hash naturally; two messages > whose selectors and arguments are #= are also equal. But in Cuis, Squeak > and Pharo Message inherits #= and #hash from Object, i.e. uses identity > comparison. This is, to say the least, annoying. Any objections to > implementing comparing in Message to match VisualWorks? >
That sounds like an obviously good thing to do :-) Is the lookupClass instance variable relevant for comparisons? I am guessing not, since we already have #analogousCodeTo: for that type of comparison.
For me it is relevant. Two messages with different lookupClasses, e.g. one with nil and one with a specific class, represent different messages, one a normal send one a super send. So my changes in waiting include lookupClass in both hash and =. I don't think it makes much difference, but the incompatibility with VisualWorks, while regrettable, feels correct to me.
What about Behavior >> #hash? It uses the name of the class, which is fine as long as you don't rename your classes while you store them in hashed collections.
Levente
Dave
_,,,^..^,,,_ best, Eliot
It is interesting. In base Squeak, the only time that lookupClass is set is, well, never(!). There is a setter method, but it isn't called is base Squeak. It is checked a couple of times, but unless there is hidden machinery in the VM that sets it, it isn't actually ever used.
Unless someone decided to use it in their code.
-cbc
On Tue, Nov 20, 2018 at 9:06 AM Chris Muller asqueaker@gmail.com wrote:
It sounds good, but an Inbox submission would go a long way to understanding exactly what you mean esp. re: understanding this special condition with the lookupClass being nil... and whether it deserves a first-class name or not... On Tue, Nov 20, 2018 at 4:10 AM Levente Uzonyi leves@caesar.elte.hu wrote:
On Mon, 19 Nov 2018, Eliot Miranda wrote:
Hi David,
On Mon, Nov 19, 2018 at 9:52 AM David T. Lewis lewis@mail.msen.com
wrote:
On Mon, Nov 19, 2018 at 09:32:17AM -0800, Eliot Miranda wrote: > Hi All, > > In VisualWorks Message implements #= & #hash naturally;
two messages
> whose selectors and arguments are #= are also equal. But in
Cuis, Squeak
> and Pharo Message inherits #= and #hash from Object, i.e. uses
identity
> comparison. This is, to say the least, annoying. Any
objections to
> implementing comparing in Message to match VisualWorks? > That sounds like an obviously good thing to do :-) Is the lookupClass instance variable relevant for comparisons? I
am
guessing not, since we already have #analogousCodeTo: for that
type of
comparison.
For me it is relevant. Two messages with different lookupClasses,
e.g. one with nil and one with a specific class, represent different messages, one a normal send one a super send. So my changes in
waiting include lookupClass in both hash and =. I don't think it
makes much difference, but the incompatibility with VisualWorks, while regrettable, feels correct to me.
What about Behavior >> #hash? It uses the name of the class, which is
fine
as long as you don't rename your classes while you store them in hashed collections.
Levente
Dave
_,,,^..^,,,_ best, Eliot
To make things more clear, the current implementation of Behavior >> #hash has two negative side effects: - behaviors stored in collections relying on the hash value (e.g. Set, Dictionary) will have to be rehashed whenever a behavior is renamed - objects using Behavior >> #hash to implement their own #hash, like what Eliot just did to Message will suffer from the same issue. So Sets and Dictionaries holding those kind of objects will have to be rehashed as well upon the rename of the behavior.
My questions related to this: - why does Behavior >> #hash rely on the name instead of identity? - do we want to fix those issues mentioned above or do we just say that one should not rename classes and expect things to work?
Levente
On Tue, 20 Nov 2018, Levente Uzonyi wrote:
On Mon, 19 Nov 2018, Eliot Miranda wrote:
Hi David,
On Mon, Nov 19, 2018 at 9:52 AM David T. Lewis lewis@mail.msen.com wrote: On Mon, Nov 19, 2018 at 09:32:17AM -0800, Eliot Miranda wrote: > Hi All, > > In VisualWorks Message implements #= & #hash naturally; two messages > whose selectors and arguments are #= are also equal. But in Cuis, Squeak > and Pharo Message inherits #= and #hash from Object, i.e. uses identity > comparison. This is, to say the least, annoying. Any objections to > implementing comparing in Message to match VisualWorks? >
That sounds like an obviously good thing to do :-) Is the lookupClass instance variable relevant for comparisons? I am guessing not, since we already have #analogousCodeTo: for that type
of comparison.
For me it is relevant. Two messages with different lookupClasses, e.g. one with nil and one with a specific class, represent different messages, one a normal send one a super send. So my changes in waiting include lookupClass in both hash and =. I don't think it makes much difference, but the incompatibility with VisualWorks, while regrettable, feels correct to me.
What about Behavior >> #hash? It uses the name of the class, which is fine as long as you don't rename your classes while you store them in hashed collections.
Levente
Dave
_,,,^..^,,,_ best, Eliot
On 11/20/18 5:33 PM, Levente Uzonyi wrote:
- why does Behavior >> #hash rely on the name instead of identity?
FWIW, I found this...
'From Squeak3.9alpha of 4 July 2005 [latest update: #6726] on 16 February 2006 at 7:30:27 pm' "Change Set: 6727updateFromMC Date: 16 February 2006 Author: Marcus Denker
-Put back in the simplified Decompiler>>#decompileBlock: - deprecated #who - CompiledMethod>>#methodNode now sets Class and Selector for non-installed methods to Object/ CompiledMethod>>#defaultSelector - refactored ContextPart to not call #who - refactored ProcessBrowserPlus to not use #who - refactored ProcessBrowser to not use #who - simplified CompiledMethod>>#defaultSelector - String>>#hash now uses identityHasch as initial hash (needed forBehavior>>#hash) - moved numArgs: to Symbol, faster - addBehavior hash - simplify CompiledMethod: #defaultSelector, #equivalentTo:, #methodNode - add CompiledMethod>>#selector: - add iVarselector to MethodProperties, accessors - make MethodProperties compact again."
To make things more clear, the current implementation of Behavior >> #hash has two negative side effects:
- behaviors stored in collections relying on the hash value (e.g. Set,
Dictionary) will have to be rehashed whenever a behavior is renamed
- objects using Behavior >> #hash to implement their own #hash, like what
Eliot just did to Message will suffer from the same issue. So Sets and Dictionaries holding those kind of objects will have to be rehashed as well upon the rename of the behavior.
My questions related to this:
- why does Behavior >> #hash rely on the name instead of identity?
If you mean #identityHash, then its because involving an unstable value in a #hash calculation is never a good idea. #identityHash can be different for the same class between two different images, or if the class was ever becomed or reloaded into a new image, etc.
- do we want to fix those issues mentioned above or do we just say that
one should not rename classes and expect things to work?
Neither. We just say that when one renames a class to rehash all relevant HashedCollections.
- Chris
On Tue, 20 Nov 2018, Chris Muller wrote:
To make things more clear, the current implementation of Behavior >> #hash has two negative side effects:
- behaviors stored in collections relying on the hash value (e.g. Set,
Dictionary) will have to be rehashed whenever a behavior is renamed
- objects using Behavior >> #hash to implement their own #hash, like what
Eliot just did to Message will suffer from the same issue. So Sets and Dictionaries holding those kind of objects will have to be rehashed as well upon the rename of the behavior.
My questions related to this:
- why does Behavior >> #hash rely on the name instead of identity?
If you mean #identityHash, then its because involving an unstable value in a #hash calculation is never a good idea. #identityHash can be different for the same class between two different images, or if the class was ever becomed or reloaded into a new image, etc.
Is there an actual user of that feature?
Bob found out that #hash had been changed during the developement of Squeak 3.9. Therefore this issue is not present in Cuis (forked from 3.7). And I just checked Pharo and found that Behavior >> #hash had been removed from there. So, I suggest we remove it as well unless there's a really good reason to keep it.
- do we want to fix those issues mentioned above or do we just say that
one should not rename classes and expect things to work?
Neither. We just say that when one renames a class to rehash all relevant HashedCollections.
That's "one should not rename classes and expect things to work", isn't it?
Levente
- Chris
I'm neutral on this, but here is a bit more (and from earlier it would seem)
'From Squeak3.7beta of ''1 April 2004'' [latest update: #5963] on 22 June 2004 at 8:51:57 pm' "Change Set: BehaviorHashEnh v1.2 Date: 22 June 2004, 16.02.2006 Author: Stephan Rudlof, md md: added a line to the poscript to uncompactify the MethodProperties class. We want to add an instVar for the selector. Improves the default Object>>hash forBehaviors by installingBehavior>>hash. String>>hash has been changed a little to avoid infinite recursion (without changing its semantics). All is done in the postscript.
Important ----------- This is a special changeset: Do not export and import this changeset again after importing it the first time! Then the methods are not installed alone in the postscript anymore, leading to serious problems! -----------
Rationale: Object>>hash calling ProtoObject>>identityHash gives poor results forBehaviors. Therefore a newBehavior>>hash using Symbol>>hash or String>>hash (the latter slightly changed to avoide infinite recursion) will be installed.
Consequences: - It speeds up Set/Dictionary operations withBehaviors a lot (see below). - The main consequence for other objects asBehaviors seems to be a changed hash if they use self species hash as a start value for computing their hash. But AFAICS this doesn't hurt, since in most cases (non meta classes as species) it maps to Symbol>>hash, which is fast. On 11/21/18 7:31 AM, Levente Uzonyi wrote:
On Tue, 20 Nov 2018, Chris Muller wrote:
To make things more clear, the current implementation of Behavior >> #hash has two negative side effects:
- behaviors stored in collections relying on the hash value (e.g. Set,
Dictionary) will have to be rehashed whenever a behavior is renamed
- objects using Behavior >> #hash to implement their own #hash, like
what Eliot just did to Message will suffer from the same issue. So Sets and Dictionaries holding those kind of objects will have to be rehashed as well upon the rename of the behavior.
My questions related to this:
- why does Behavior >> #hash rely on the name instead of identity?
If you mean #identityHash, then its because involving an unstable value in a #hash calculation is never a good idea. #identityHash can be different for the same class between two different images, or if the class was ever becomed or reloaded into a new image, etc.
Is there an actual user of that feature?
Bob found out that #hash had been changed during the developement of Squeak 3.9. Therefore this issue is not present in Cuis (forked from 3.7). And I just checked Pharo and found that Behavior >> #hash had been removed from there. So, I suggest we remove it as well unless there's a really good reason to keep it.
- do we want to fix those issues mentioned above or do we just say that
one should not rename classes and expect things to work?
Neither. We just say that when one renames a class to rehash all relevant HashedCollections.
That's "one should not rename classes and expect things to work", isn't it?
Levente
- Chris
Nice find. So the reason for the change was to improve performance, but now that change actually makes things slower. We have 22-bit identity hashes and #hashMultiply is quicker too, so a new hash function based on those could be 6x-10x quicker.
Levente
On Wed, 21 Nov 2018, Bob Arning wrote:
I'm neutral on this, but here is a bit more (and from earlier it would seem)
'From Squeak3.7beta of ''1 April 2004'' [latest update: #5963] on 22 June 2004 at 8:51:57 pm' "Change Set: BehaviorHashEnh v1.2 Date: 22 June 2004, 16.02.2006 Author: Stephan Rudlof, md md: added a line to the poscript to uncompactify the MethodProperties class. We want to add an instVar for the selector. Improves the default Object>>hash for Behaviors by installing Behavior>>hash. String>>hash has been changed a little to avoid infinite recursion (without changing its semantics). All is done in the postscript.
Important
This is a special changeset: Do not export and import this changeset again after importing it the first time! Then the methods are not installed alone in the postscript anymore, leading to serious problems!
Rationale: Object>>hash calling ProtoObject>>identityHash gives poor results for Behaviors. Therefore a new Behavior>>hash using Symbol>>hash or String>>hash (the latter slightly changed to avoide infinite recursion) will be installed.
Consequences:
- It speeds up Set/Dictionary operations with Behaviors a lot (see below).
- The main consequence for other objects as Behaviors seems to be a changed hash if they use
self species hash as a start value for computing their hash. But AFAICS this doesn't hurt, since in most cases (non meta classes as species) it maps to Symbol>>hash, which is fast. On 11/21/18 7:31 AM, Levente Uzonyi wrote: On Tue, 20 Nov 2018, Chris Muller wrote:
To make things more clear, the current implementation of Behavior >> #hash has two negative side effects: - behaviors stored in collections relying on the hash value (e.g. Set, Dictionary) will have to be rehashed whenever a behavior is renamed - objects using Behavior >> #hash to implement their own #hash, like what Eliot just did to Message will suffer from the same issue. So Sets and Dictionaries holding those kind of objects will have to be rehashed as well upon the rename of the behavior. My questions related to this: - why does Behavior >> #hash rely on the name instead of identity? If you mean #identityHash, then its because involving an unstable value in a #hash calculation is never a good idea. #identityHash can be different for the same class between two different images, or if the class was ever becomed or reloaded into a new image, etc. Is there an actual user of that feature? Bob found out that #hash had been changed during the developement of Squeak 3.9. Therefore this issue is not present in Cuis (forked from 3.7). And I just checked Pharo and found that Behavior >> #hash had been removed from there. So, I suggest we remove it as well unless there's a really good reason to keep it. - do we want to fix those issues mentioned above or do we just say that one should not rename classes and expect things to work? Neither. We just say that when one renames a class to rehash all relevant HashedCollections. That's "one should not rename classes and expect things to work", isn't it? Levente - Chris
Hi Levente,
My questions related to this:
- why does Behavior >> #hash rely on the name instead of identity?
If you mean #identityHash, then its because involving an unstable value in a #hash calculation is never a good idea. #identityHash can be different for the same class between two different images, or if the class was ever becomed or reloaded into a new image, etc.
Is there an actual user of that feature?
Several actually.
If your primary motivation is improving performance then we could cache the hash value, possibly even straight in the CompiledMethod itself (hard-coded).
But we _must not_ destroy stable and deterministic hashes. It would be a huge blow to existing users and even future potential possibilities of future implementations of things. Anything that would try to share #hash calculations between images... Possibly Croquet. It's easy to imagine some things!
MagmaDictionary is just a Dictionary except the internal array grows on disk providing size scalability while sparing RAM. The #hash of the objects put into it are calculated by the clients -- it must be consistent between users.
source.squeak.org uses a MagmaDictionary for the revision history. It was the best way to introduce the capability without invading Monticello's code at all.
Bob found out that #hash had been changed during the developement of Squeak 3.9. Therefore this issue is not present in Cuis (forked from 3.7). And I just checked Pharo and found that Behavior >> #hash had been removed from there.
I think that's a big mistake. There's no need for us to cross the line of a stable hash to a volatile one. We can cache it.
Hi Eliot,
(below) On 19/11/2018 02:32 p.m., Eliot Miranda via Cuis-dev wrote:
Hi All,
In VisualWorks Message implements #= & #hash naturally; two
messages whose selectors and arguments are #= are also equal. But in Cuis, Squeak and Pharo Message inherits #= and #hash from Object, i.e. uses identity comparison. This is, to say the least, annoying. Any objections to implementing comparing in Message to match VisualWorks?
_,,,^..^,,,_ best, Eliot
Just added these implementations to Cuis:
= aMessage "Any object is equal to itself" self == aMessage ifTrue: [ ^ true ]. self class == aMessage class ifFalse: [ ^false ]. selector = aMessage selector ifFalse: [ ^false ]. lookupClass = aMessage lookupClass ifFalse: [ ^false ]. ^args = aMessage arguments
hash "Hash is reimplemented because = is implemented." ^selector hash bitXor: args hash
Do they look ok? (I considered lookupClass for #=, but not for hash, as collisions because of that seem unlikely, and in any case, grouping them that way could be useful).
Cheers,
Hi Juan,
On Tue, Nov 20, 2018 at 8:00 AM Juan Vuletich juan@jvuletich.org wrote:
Hi Eliot,
(below) On 19/11/2018 02:32 p.m., Eliot Miranda via Cuis-dev wrote:
Hi All,
In VisualWorks Message implements #= & #hash naturally; two messages
whose selectors and arguments are #= are also equal. But in Cuis, Squeak and Pharo Message inherits #= and #hash from Object, i.e. uses identity comparison. This is, to say the least, annoying. Any objections to implementing comparing in Message to match VisualWorks?
_,,,^..^,,,_ best, Eliot
Just added these implementations to Cuis:
= aMessage "Any object is equal to itself" self == aMessage ifTrue: [ ^ true ]. self class == aMessage class ifFalse: [ ^false ]. selector = aMessage selector ifFalse: [ ^false ]. lookupClass = aMessage lookupClass ifFalse: [ ^false ]. ^args = aMessage arguments
this last line should be
^args literalEqual: aMessage arguments
and this is very important :-)
hash "Hash is reimplemented because = is implemented." ^selector hash bitXor: args hash
Do they look ok? (I considered lookupClass for #=, but not for hash, as collisions because of that seem unlikely, and in any case, grouping them that way could be useful).
Cheers,
-- Juan Vuletichwww.cuis-smalltalk.orghttps://github.com/Cuis-Smalltalk/Cuis-Smalltalk-Devhttps://github.com/jvule... @JuanVuletich
On 11/20/2018 6:59 PM, Eliot Miranda wrote:
Hi Juan,
this last line should be
^args literalEqual: aMessage arguments
and this is very important :-)
-- _,,,^..^,,,_ best, Eliot
Thanks Eliot. Fix is now at Cuis GitHub repo.
Cheers,
Wait, Message is the one used internally by the system. It doesn't have #= and #hash because it didn't need it.
*MessageSend* is the one for external use, and it already defines #= and #hash.
On Mon, Nov 19, 2018 at 11:32 AM Eliot Miranda eliot.miranda@gmail.com wrote:
Hi All,
In VisualWorks Message implements #= & #hash naturally; two messages whose selectors and arguments are #= are also equal. But in Cuis, Squeak and Pharo Message inherits #= and #hash from Object, i.e. uses identity comparison. This is, to say the least, annoying. Any objections to implementing comparing in Message to match VisualWorks?
_,,,^..^,,,_ best, Eliot
squeak-dev@lists.squeakfoundation.org