Equality of Recursive Structures [Was: Closure Compiler in 3.9 ?]

Bergel, Alexandre bergel at iam.unibe.ch
Sat May 12 11:34:52 UTC 2007


Hi Andrew,

Thanks for this instructive email.

Alexandre


On 10 May 2007, at 13:50, stephane ducasse wrote:

> thanks for the email. It was cool.
> I learned something today.
>
> Stef
>
> On 10 mai 07, at 08:46, Andrew P. Black wrote:
>
>> This started as a story about what I realized in the shower this  
>> morning, and ended up as a story about what I don't know about  
>> Squeak (again :-( )
>>
>> On 8 May 2007, at 17:12, Andreas Raab wrote:
>>
>>> A while ago Association>>= was changed to compare the values in  
>>> addition to its keys which has bitten me many, many times since  
>>> it was introduced. Most likely it is the culprit here, too.
>>
>> No, Andreas, this was a different problem but an interesting one.   
>> This is the third time in my life that i has bitten me.  It's  
>> worth writing a pattern about: comparing recursive structures.
>>
>> Suppose that you have two potentially recursive data structures,  
>> and you need to compare them — perhaps for equality, or perhaps  
>> for dominance of one over the other.  (I met this problem first  
>> when figuring out the type conformance algorithm for Emerald in  
>> about 1984.)
>>
>> To be concrete, suppose that the structures are dictionaries,  
>> which contain simple data as keys (like symbols), and more complex  
>> data as values — including, perhaps, other dictionaries.  The  
>> obvious thing is to walk both structures in parallel, checking  
>> that they have the same keys, and that the values for each  
>> occurrence of a key are also the same.   It's possible that those  
>> values are also dictionaries, in which case you execute the same =  
>> code that you were in the middle of running ...
>>
>> This is not a problem, unless: the dictionary stored in the  
>> dictionary is the very same dictionary that you started with.  In  
>> which case, the naive recursive search will go on for ever.
>>
>> I first met this in Squeak when I executed Smalltalk = Smalltalk.   
>> Of course, Smalltalk, as the dictionary of all globals, includes  
>> Smalltalk as a value, and the equality computation never  
>> terminated.  I filed a bug fix for that one, and it's in the image  
>> now.  My fix was to simply compare the two objects for identity  
>> first, and only then do the recursive equality check.  This, or  
>> course, answers true ot Smalltalk = Smalltalk quite quickly.
>>
>> The situation is the same with my recursive block example.
>>
>> 	b := [(i:= i-1) > 0 ifTrue: [j := j+1 . b value.] ifFalse:[j]] .
>>
>> It's compiled as a method, which has (presumably, I haven't  
>> looked) a literal that points to itself ... and the equality  
>> comparision goes on forever.  Why Squeak was doing an equality  
>> comparision when all I asked for was to "DoIt" is another question.
>>
>> I'm deducing all this from Mathieu's fix.   The identity test  
>> isn't a complete fix, however.  If you have two isomorphic  
>> recursive structures that don't share any elements, the test will  
>> still go on for ever.
>>
>> Here is a small example: two isomorphic one-element dictionaries.   
>> Type this in a workspace (BUT DON"T TRY THIS IN AN IMAGE THAT YOU  
>> WANT TO KEEP USING!)
>>
>> a := Dictionary new.
>> a at: #foo put: a.
>> b := Dictionary new.
>> b at: #foo put: b.
>>
>> I expected that if I then did a printIt on a==b, I would get an  
>> immediate "false", but if I did a printit on a=b, the system would  
>> go off into never-land.
>>
>> But actually, it's more interesting than that.  The system goes  
>> off into never-land almost regardless of what you do!  Debug it a  
>> == b will do it.  So will a simple printIt of b or inspectIt of  
>> b.  A doIt of "self halt. a == b" goes into a loop /before/  
>> executing the self halt.  If you interrupt the computation, you  
>> can look at b in the debugger, and even launch an inspector from  
>> there: printStringLimtedTo: does its job, as it's supposed to.   
>> The system is trying to evaluate hash on the cyclic dictionary;  
>> hash is also broken for the same reason.  Asking for a fullStack  
>> in the debugger to see _why_ it's trying to compute hash puts the  
>> system into another recursive loop!
>>
>> This explains why I saw the same behavior with my recursive block,  
>> but not the root cause.  Anyone have an explanation?
>>
>> By the way, the real solution to the recursive equality comparison  
>> is to maintain a cache.
>> A pair of elements goes into the cache when you start to comapre  
>> them for equality.  If, in the process of doing the comparision,  
>> you find that you are asking the same equaility question again,  
>> then the second time around, you can assume that the answer is  
>> true.  This is OK, evne though you haven't finished doing the  
>> comparison yet.  If it turns out that there is indeed a  
>> difference, the "outer" equality will eventually find it, and will  
>> answer false.   The same trick can be employed for computing the  
>> hash.  Assume that the hash of the structure that you are  
>> computing the has over is some constant, say 0.  Start computing  
>> its hash, but if you find in the process that you need to compute  
>> the has of the same structure again, then just assume that it is  
>> 0.  The "real" hash might be thought of as the fixed-point of this  
>> process, but since hashes are just approximations anyway, once  
>> around the loop is enough.
>>
>> The queation remains: why does the mere existence of such  
>> recursive structures throw Squeak into a tizzy?
>>
>> 	Andrew
>>
>>
>>
>>
>
>
>

-- 
_,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:
Alexandre Bergel  http://www.software-artist.eu
^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;.






More information about the Squeak-dev mailing list