Equality of Recursive Structures [Was: Closure Compiler in 3.9 ?]

stephane ducasse stephane.ducasse at free.fr
Thu May 10 11:50:03 UTC 2007


thanks for the email. It was cool.
I learned something today.

Stef

On 10 mai 07, at 08:46, Andrew P. Black wrote:

> This started as a story about what I realized in the shower this  
> morning, and ended up as a story about what I don't know about  
> Squeak (again :-( )
>
> On 8 May 2007, at 17:12, Andreas Raab wrote:
>
>> A while ago Association>>= was changed to compare the values in  
>> addition to its keys which has bitten me many, many times since it  
>> was introduced. Most likely it is the culprit here, too.
>
> No, Andreas, this was a different problem but an interesting one.   
> This is the third time in my life that i has bitten me.  It's worth  
> writing a pattern about: comparing recursive structures.
>
> Suppose that you have two potentially recursive data structures,  
> and you need to compare them — perhaps for equality, or perhaps for  
> dominance of one over the other.  (I met this problem first when  
> figuring out the type conformance algorithm for Emerald in about  
> 1984.)
>
> To be concrete, suppose that the structures are dictionaries, which  
> contain simple data as keys (like symbols), and more complex data  
> as values — including, perhaps, other dictionaries.  The obvious  
> thing is to walk both structures in parallel, checking that they  
> have the same keys, and that the values for each occurrence of a  
> key are also the same.   It's possible that those values are also  
> dictionaries, in which case you execute the same = code that you  
> were in the middle of running ...
>
> This is not a problem, unless: the dictionary stored in the  
> dictionary is the very same dictionary that you started with.  In  
> which case, the naive recursive search will go on for ever.
>
> I first met this in Squeak when I executed Smalltalk = Smalltalk.   
> Of course, Smalltalk, as the dictionary of all globals, includes  
> Smalltalk as a value, and the equality computation never  
> terminated.  I filed a bug fix for that one, and it's in the image  
> now.  My fix was to simply compare the two objects for identity  
> first, and only then do the recursive equality check.  This, or  
> course, answers true ot Smalltalk = Smalltalk quite quickly.
>
> The situation is the same with my recursive block example.
>
> 	b := [(i:= i-1) > 0 ifTrue: [j := j+1 . b value.] ifFalse:[j]] .
>
> It's compiled as a method, which has (presumably, I haven't looked)  
> a literal that points to itself ... and the equality comparision  
> goes on forever.  Why Squeak was doing an equality comparision when  
> all I asked for was to "DoIt" is another question.
>
> I'm deducing all this from Mathieu's fix.   The identity test isn't  
> a complete fix, however.  If you have two isomorphic recursive  
> structures that don't share any elements, the test will still go on  
> for ever.
>
> Here is a small example: two isomorphic one-element dictionaries.   
> Type this in a workspace (BUT DON"T TRY THIS IN AN IMAGE THAT YOU  
> WANT TO KEEP USING!)
>
> a := Dictionary new.
> a at: #foo put: a.
> b := Dictionary new.
> b at: #foo put: b.
>
> I expected that if I then did a printIt on a==b, I would get an  
> immediate "false", but if I did a printit on a=b, the system would  
> go off into never-land.
>
> But actually, it's more interesting than that.  The system goes off  
> into never-land almost regardless of what you do!  Debug it a == b  
> will do it.  So will a simple printIt of b or inspectIt of b.  A  
> doIt of "self halt. a == b" goes into a loop /before/ executing the  
> self halt.  If you interrupt the computation, you can look at b in  
> the debugger, and even launch an inspector from there:  
> printStringLimtedTo: does its job, as it's supposed to.  The system  
> is trying to evaluate hash on the cyclic dictionary; hash is also  
> broken for the same reason.  Asking for a fullStack in the debugger  
> to see _why_ it's trying to compute hash puts the system into  
> another recursive loop!
>
> This explains why I saw the same behavior with my recursive block,  
> but not the root cause.  Anyone have an explanation?
>
> By the way, the real solution to the recursive equality comparison  
> is to maintain a cache.
> A pair of elements goes into the cache when you start to comapre  
> them for equality.  If, in the process of doing the comparision,  
> you find that you are asking the same equaility question again,  
> then the second time around, you can assume that the answer is  
> true.  This is OK, evne though you haven't finished doing the  
> comparison yet.  If it turns out that there is indeed a difference,  
> the "outer" equality will eventually find it, and will answer  
> false.   The same trick can be employed for computing the hash.   
> Assume that the hash of the structure that you are computing the  
> has over is some constant, say 0.  Start computing its hash, but if  
> you find in the process that you need to compute the has of the  
> same structure again, then just assume that it is 0.  The "real"  
> hash might be thought of as the fixed-point of this process, but  
> since hashes are just approximations anyway, once around the loop  
> is enough.
>
> The queation remains: why does the mere existence of such recursive  
> structures throw Squeak into a tizzy?
>
> 	Andrew
>
>
>
>




More information about the Squeak-dev mailing list