Equality of Recursive Structures [Was: Closure Compiler in 3.9
?]
stephane ducasse
stephane.ducasse at free.fr
Thu May 10 11:50:03 UTC 2007
thanks for the email. It was cool.
I learned something today.
Stef
On 10 mai 07, at 08:46, Andrew P. Black wrote:
> This started as a story about what I realized in the shower this
> morning, and ended up as a story about what I don't know about
> Squeak (again :-( )
>
> On 8 May 2007, at 17:12, Andreas Raab wrote:
>
>> A while ago Association>>= was changed to compare the values in
>> addition to its keys which has bitten me many, many times since it
>> was introduced. Most likely it is the culprit here, too.
>
> No, Andreas, this was a different problem but an interesting one.
> This is the third time in my life that i has bitten me. It's worth
> writing a pattern about: comparing recursive structures.
>
> Suppose that you have two potentially recursive data structures,
> and you need to compare them — perhaps for equality, or perhaps for
> dominance of one over the other. (I met this problem first when
> figuring out the type conformance algorithm for Emerald in about
> 1984.)
>
> To be concrete, suppose that the structures are dictionaries, which
> contain simple data as keys (like symbols), and more complex data
> as values — including, perhaps, other dictionaries. The obvious
> thing is to walk both structures in parallel, checking that they
> have the same keys, and that the values for each occurrence of a
> key are also the same. It's possible that those values are also
> dictionaries, in which case you execute the same = code that you
> were in the middle of running ...
>
> This is not a problem, unless: the dictionary stored in the
> dictionary is the very same dictionary that you started with. In
> which case, the naive recursive search will go on for ever.
>
> I first met this in Squeak when I executed Smalltalk = Smalltalk.
> Of course, Smalltalk, as the dictionary of all globals, includes
> Smalltalk as a value, and the equality computation never
> terminated. I filed a bug fix for that one, and it's in the image
> now. My fix was to simply compare the two objects for identity
> first, and only then do the recursive equality check. This, or
> course, answers true ot Smalltalk = Smalltalk quite quickly.
>
> The situation is the same with my recursive block example.
>
> b := [(i:= i-1) > 0 ifTrue: [j := j+1 . b value.] ifFalse:[j]] .
>
> It's compiled as a method, which has (presumably, I haven't looked)
> a literal that points to itself ... and the equality comparision
> goes on forever. Why Squeak was doing an equality comparision when
> all I asked for was to "DoIt" is another question.
>
> I'm deducing all this from Mathieu's fix. The identity test isn't
> a complete fix, however. If you have two isomorphic recursive
> structures that don't share any elements, the test will still go on
> for ever.
>
> Here is a small example: two isomorphic one-element dictionaries.
> Type this in a workspace (BUT DON"T TRY THIS IN AN IMAGE THAT YOU
> WANT TO KEEP USING!)
>
> a := Dictionary new.
> a at: #foo put: a.
> b := Dictionary new.
> b at: #foo put: b.
>
> I expected that if I then did a printIt on a==b, I would get an
> immediate "false", but if I did a printit on a=b, the system would
> go off into never-land.
>
> But actually, it's more interesting than that. The system goes off
> into never-land almost regardless of what you do! Debug it a == b
> will do it. So will a simple printIt of b or inspectIt of b. A
> doIt of "self halt. a == b" goes into a loop /before/ executing the
> self halt. If you interrupt the computation, you can look at b in
> the debugger, and even launch an inspector from there:
> printStringLimtedTo: does its job, as it's supposed to. The system
> is trying to evaluate hash on the cyclic dictionary; hash is also
> broken for the same reason. Asking for a fullStack in the debugger
> to see _why_ it's trying to compute hash puts the system into
> another recursive loop!
>
> This explains why I saw the same behavior with my recursive block,
> but not the root cause. Anyone have an explanation?
>
> By the way, the real solution to the recursive equality comparison
> is to maintain a cache.
> A pair of elements goes into the cache when you start to comapre
> them for equality. If, in the process of doing the comparision,
> you find that you are asking the same equaility question again,
> then the second time around, you can assume that the answer is
> true. This is OK, evne though you haven't finished doing the
> comparison yet. If it turns out that there is indeed a difference,
> the "outer" equality will eventually find it, and will answer
> false. The same trick can be employed for computing the hash.
> Assume that the hash of the structure that you are computing the
> has over is some constant, say 0. Start computing its hash, but if
> you find in the process that you need to compute the has of the
> same structure again, then just assume that it is 0. The "real"
> hash might be thought of as the fixed-point of this process, but
> since hashes are just approximations anyway, once around the loop
> is enough.
>
> The queation remains: why does the mere existence of such recursive
> structures throw Squeak into a tizzy?
>
> Andrew
>
>
>
>
More information about the Squeak-dev
mailing list
|