[BUG][FIX] WeakKeyDictionary>>keysAndValuesDo:

Mon Jun 21 15:07:43 UTC 2004

Richard,
here is my final addendum:
>
>Your code snippet did NOT satisfy (b) at all.  I doubt whether any 
>code snippet could.  It requires the hard work of human analysis.  
>You did not even BEGIN that hard work.  I did, and I found that many 
>of the classes that use the default implementation (==) of #= are, by 
>my criteria, RIGHT to do so.  I specifically pointed to nearly 400 
>Morphic classes and a large number of MVD and Stream classes as 
>classes reported by your code snippet (under (a)) which do NOT fit 
>test (b). 
>
>Let me emphasise this, since the point didn't seem to sink in the 
>first time: according to the principle "#== compares identity, #= 
>compares state" a quick check showed that at least 500 of the classes 
>your snippet reports are *RIGHT* to not reimplement #=, and so are 
>wildly irrelevant to any attempted refutation of me.
>
First, and again, enough classes remain to show that your premise is 
not universally applied. So the premise is not the answer; it is the 
question. Second, it is not at all clear that you can simply subtract 
Morph and its subclasses. Think a bit further and imagine a Morphic 
implementation in which morphs are not guaranteed to be unique by 
their state. For performance reasons, if for no others, #= might well 
still have been implemented as #==. 
>
>I am still waiting for an example where there is a set whose elements
>include Sets and things that are not sets, where the non-set elements
>should be compared using #= (NOT #==) but the Sets should be compared
>as if using #==.
>
That is a special version of the key question: are there two different 
kinds of classes (in general and/or for the implementation of Set), 
one using state equality and the other identity for #=? After writing 
my last posts, it dawned on me that I may have used the word "same" 
wrongly. I am still unsure about its exact meaning. Here is a 
clarification of what I meant by "intuitively" choosing state or 
identity comparison for #=. Let's consider the sequence: 

String Array | Set | Bag DataBase

DataBase is a fictitious class that either a) has a complex state 
held entirely in the image, or b) wraps something external. Either 
way it would be very slow to test for state equality: two DataBases 
with completely equal state are allowed to exist, so the identity test 
can't be used as a shortcut. 

I am going to formulate the cardinal intention of a #= check like 
this: I put the object in question into a variable, and elsewhere I 
want to check whether some unknown object IS THE ONE I put in the 
variable. I consider the main intention of Set to be reflected by this 
formulation. 

A Set should be able to hold objects of arbitrary classes. Choosing 
between an IdentitySet and an "EqualStateSet" (not implemented in 
Squeak right now) is not an option if a Set is about whether an object 
"IS THE ONE" in it and the answer to that question has to be 
implemented differently for different classes. 

Let's start with String. There are String methods that return a copy 
of the receiver, the receiver itself unaltered, or the receiver itself 
altered "in place". Very often the program only intends to get the 
string's bytes written to a file or to the screen. Programming with 
Strings is mostly about their state; seen from the point of view of 
the result, their pointer is rarely of interest. The question "IS IT 
THE ONE" is mostly answered yes if the two objects have equal state. 
They may be identical or not; mostly it doesn't matter. What is more, 
given the String protocol of Squeak, or of Smalltalk in general, I 
often can't be sure whether two state-equal strings are identical. 
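
To make the String case concrete, here is a minimal workspace sketch. 
The variable names are only illustrative; the behaviour relied on is 
that #copy answers a new String and that Set compares with #= and 
#hash while IdentitySet compares with #==. 

```smalltalk
| original duplicate stateSet identitySet |
original := 'squeak'.
duplicate := original copy.     "a distinct object with the same characters"

Transcript show: (original = duplicate) printString; cr.    "true: equal state"
Transcript show: (original == duplicate) printString; cr.   "false: two objects"

stateSet := Set new.
stateSet add: original; add: duplicate.
Transcript show: stateSet size printString; cr.      "1: the copy IS THE ONE"

identitySet := IdentitySet new.
identitySet add: original; add: duplicate.
Transcript show: identitySet size printString; cr.   "2: compared by identity"
```

For Strings the Set gives the answer most programs want: the copy is 
treated as already present. 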

Now think of DataBase. Its protocol for the intended standard 
operations should contain no method that copies the whole database 
simply to change, add or remove something. There should be only one 
instance of a specific database; duplicates, other than for special 
purposes, are nonsense. If I add something to the DataBase for 
thingsOfTypeA, I expect it to still be the DataBase for thingsOfTypeA. 

Now I create two DataBases: one is for thingsOfTypeA and the other 
shall contain thingsOfTypeB. At the beginning they are empty and 
cannot be distinguished by their state, so the question "IS IT THE 
ONE" can only be answered correctly by testing for identity. 
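
A rough sketch of the two empty DataBases, for illustration only. 
DataBase is the fictitious class from above; the instance variable 
'records' and the category name are assumptions made up for this 
example. The class does not define #=, so it inherits Object>>=, which 
is identity. 

```smalltalk
| dbClass forTypeA forTypeB registry |
"Hypothetical class, defined here only for the experiment."
dbClass := Object subclass: #DataBase
    instanceVariableNames: 'records'
    classVariableNames: ''
    poolDictionaries: ''
    category: 'Sketch-Equality'.

forTypeA := dbClass new.    "still empty"
forTypeB := dbClass new.    "still empty, same state as forTypeA"

"No #= defined, so the inherited identity comparison is used."
Transcript show: (forTypeA = forTypeB) printString; cr.     "false"

registry := Set new.
registry add: forTypeA; add: forTypeB.
Transcript show: registry size printString; cr.                  "2"
Transcript show: (registry includes: forTypeA) printString; cr.  "true"
```

Had DataBase compared state instead, the second empty instance would 
have been swallowed by the Set, which is exactly the wrong answer to 
"IS IT THE ONE" here. 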

Even if this were for some reason not so clear, the implementor would 
possibly be forced by performance reasons to test for identity. 

My sequence starts with a class which suggests a state comparison for 
answering the question "IS IT THE ONE" and ends with a class which 
needs identity comparison. There has to be a borderline where the 
switch is made. In other Smalltalks it is drawn between Array and Set, 
and in Squeak between Set and Bag. I think the mere existence of this 
borderline in major Smalltalks shows that your premise can only be the 
question and not the answer. 
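
Where a given image draws this borderline can be probed with a small 
workspace sketch like the following. It deliberately states no 
expected results, since they depend on the dialect and version; the 
variable name is illustrative only. 

```smalltalk
| probes |
probes := OrderedCollection new.
"For each class, compare two separately built collections with equal contents."
probes add: 'Array' -> ((Array with: 1 with: 2) = (Array with: 1 with: 2)).
probes add: 'Set' -> ((Set with: 1 with: 2) = (Set with: 1 with: 2)).
probes add: 'Bag' -> ((Bag with: 1 with: 2) = (Bag with: 1 with: 2)).

probes do: [:each |
    Transcript
        show: each key;
        show: ' with equal contents answers #= as ';
        show: each value printString;
        cr]
```

The first class in the sequence that answers false marks the point 
where that image switches from state comparison to identity. 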

Perhaps it is debatable where the borderline should be drawn. I was 
used to seeing a Set more like a DataBase, an opaque container; you 
are more mathematical and think of it as a pattern in a transparent 
cover, like a String. As I said, both positions have their merits. 
Aside from that, compatibility and performance are important issues, 
and they favor the version of the other Smalltalks. 

Regards
Martin