Equality of Recursive Structures [Was: Closure Compiler in 3.9
?]
Bergel, Alexandre
bergel at iam.unibe.ch
Sat May 12 11:34:52 UTC 2007
Hi Andrew,
Thanks for this instructive email.
Alexandre
On 10 May 2007, at 13:50, stephane ducasse wrote:
> thanks for the email. It was cool.
> I learned something today.
>
> Stef
>
> On 10 mai 07, at 08:46, Andrew P. Black wrote:
>
>> This started as a story about what I realized in the shower this
>> morning, and ended up as a story about what I don't know about
>> Squeak (again :-( )
>>
>> On 8 May 2007, at 17:12, Andreas Raab wrote:
>>
>>> A while ago Association>>= was changed to compare the values in
>>> addition to its keys which has bitten me many, many times since
>>> it was introduced. Most likely it is the culprit here, too.
>>
>> No, Andreas, this was a different problem but an interesting one.
>> This is the third time in my life that i has bitten me. It's
>> worth writing a pattern about: comparing recursive structures.
>>
>> Suppose that you have two potentially recursive data structures,
>> and you need to compare them — perhaps for equality, or perhaps
>> for dominance of one over the other. (I met this problem first
>> when figuring out the type conformance algorithm for Emerald in
>> about 1984.)
>>
>> To be concrete, suppose that the structures are dictionaries,
>> which contain simple data as keys (like symbols), and more complex
>> data as values — including, perhaps, other dictionaries. The
>> obvious thing is to walk both structures in parallel, checking
>> that they have the same keys, and that the values for each
>> occurrence of a key are also the same. It's possible that those
>> values are also dictionaries, in which case you execute the same =
>> code that you were in the middle of running ...
>>
>> This is not a problem, unless: the dictionary stored in the
>> dictionary is the very same dictionary that you started with. In
>> which case, the naive recursive search will go on for ever.
>>
>> I first met this in Squeak when I executed Smalltalk = Smalltalk.
>> Of course, Smalltalk, as the dictionary of all globals, includes
>> Smalltalk as a value, and the equality computation never
>> terminated. I filed a bug fix for that one, and it's in the image
>> now. My fix was to simply compare the two objects for identity
>> first, and only then do the recursive equality check. This, or
>> course, answers true ot Smalltalk = Smalltalk quite quickly.
>>
>> The situation is the same with my recursive block example.
>>
>> b := [(i:= i-1) > 0 ifTrue: [j := j+1 . b value.] ifFalse:[j]] .
>>
>> It's compiled as a method, which has (presumably, I haven't
>> looked) a literal that points to itself ... and the equality
>> comparision goes on forever. Why Squeak was doing an equality
>> comparision when all I asked for was to "DoIt" is another question.
>>
>> I'm deducing all this from Mathieu's fix. The identity test
>> isn't a complete fix, however. If you have two isomorphic
>> recursive structures that don't share any elements, the test will
>> still go on for ever.
>>
>> Here is a small example: two isomorphic one-element dictionaries.
>> Type this in a workspace (BUT DON"T TRY THIS IN AN IMAGE THAT YOU
>> WANT TO KEEP USING!)
>>
>> a := Dictionary new.
>> a at: #foo put: a.
>> b := Dictionary new.
>> b at: #foo put: b.
>>
>> I expected that if I then did a printIt on a==b, I would get an
>> immediate "false", but if I did a printit on a=b, the system would
>> go off into never-land.
>>
>> But actually, it's more interesting than that. The system goes
>> off into never-land almost regardless of what you do! Debug it a
>> == b will do it. So will a simple printIt of b or inspectIt of
>> b. A doIt of "self halt. a == b" goes into a loop /before/
>> executing the self halt. If you interrupt the computation, you
>> can look at b in the debugger, and even launch an inspector from
>> there: printStringLimtedTo: does its job, as it's supposed to.
>> The system is trying to evaluate hash on the cyclic dictionary;
>> hash is also broken for the same reason. Asking for a fullStack
>> in the debugger to see _why_ it's trying to compute hash puts the
>> system into another recursive loop!
>>
>> This explains why I saw the same behavior with my recursive block,
>> but not the root cause. Anyone have an explanation?
>>
>> By the way, the real solution to the recursive equality comparison
>> is to maintain a cache.
>> A pair of elements goes into the cache when you start to comapre
>> them for equality. If, in the process of doing the comparision,
>> you find that you are asking the same equaility question again,
>> then the second time around, you can assume that the answer is
>> true. This is OK, evne though you haven't finished doing the
>> comparison yet. If it turns out that there is indeed a
>> difference, the "outer" equality will eventually find it, and will
>> answer false. The same trick can be employed for computing the
>> hash. Assume that the hash of the structure that you are
>> computing the has over is some constant, say 0. Start computing
>> its hash, but if you find in the process that you need to compute
>> the has of the same structure again, then just assume that it is
>> 0. The "real" hash might be thought of as the fixed-point of this
>> process, but since hashes are just approximations anyway, once
>> around the loop is enough.
>>
>> The queation remains: why does the mere existence of such
>> recursive structures throw Squeak into a tizzy?
>>
>> Andrew
>>
>>
>>
>>
>
>
>
--
_,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:
Alexandre Bergel http://www.software-artist.eu
^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;.
More information about the Squeak-dev
mailing list
|