help!! - now "Hash the set"

sqrmax at cvtci.com.ar sqrmax at cvtci.com.ar
Thu Apr 16 11:41:01 UTC 1998


Hi.

>The number of Sets in a system and its medium size can vary from system to 
system...
>Probably we need a more secure hash for Sets.

Well yes, what do we do with its hash? Certainly it's not a good thing to 
depend on the hash of its elements, because its unordered. This also renders 
kicking the problem to the implementation of the sets into a bad thing.

>Or a NOT so clever #=...

Like detecting an object not in the other... size squared complexity, fast 
for small sets, terribly slow for big sets.

>How many times do we compare collections for equal contents?
>How many times do we compare by identity tests?

How does "MethodDictionary>>do:" sound to you? ;-)... but ok, few times.

>Is it necessary to have the #= overwritten?

If you plan waiting until eternity comes when sometime you finally send 
#=... no.

>(muerto el perro se acabo la rabia)
>Trans: no dog, no illnesses transmitted

I'm not convinced in this case. I think I now have it:

1. Add an instance variable to sets, namely hash.
2. Initialize to zero on instance creation.
3. Upon every add: anObject, bitXor it with anObject hash.
4. Upon every remove: anObject, bitXor it with anObject hash.

The steps 3 and 4 only update when something happens. Hmmmm... this kills 
all the problems, I guess. In fact, it's a fast way to check if aSet ~= 
anotherSet, by checking their sizes first, then its hashes and then bad luck.

Could this work for any other collection?... well... ;-)... it's fast 
enough so that it doesn't turn add:, remove: and so into number crunchers.

Andres.





More information about the Squeak-dev mailing list