[ENH][BUG][FIX] Dictionary>>addAll: (was: Enhancements to Collection from RB)

Richard A. O'Keefe ok at cs.otago.ac.nz
Wed Dec 17 02:58:21 UTC 2003


"Adrian Lienhard" <adi at netstyle.ch> wrote:
	The #, doesn't work for dictionaries

When I read that, I was very surprised.  Why would anyone _expect_ #,
to work on dictionaries?  Looking at the ANSI Smalltalk standard, I
see that #, is not defined for collections in general, and not defined
for dictionaries.  In fact, in ANSI Smalltalk, #, is *only* defined
for classes supporting the <sequencedReadableCollection> protocol.

One normally expects #, to satisfy the law
    (x , y) size = (x size + y size)
when both sides are defined.  I don't see any general way to make this
true for dictionaries if x and y have overlapping key sets.

	because #, uses Collection>>addAll:  which is broken for
	Dictionaries:  It iterates with #do:  over the other collection.
	Now, Dictionary>>do:  doesn't iterate over it's associations (as
	I would expect) but over its values!
	
OK, let's look at #,
Ah.  Squeak defines it for all collections.

    Collection>>
    , aCollection
        ^self copy addAll: aCollection; yourself

This would make sense for sets and bags, although the law I referred to
wouldn't hold for sets.  (It would for bags).  So let's look at #addAll:.

Again looking at the ANSI Smalltalk standard, we see that #addAll: is
specially defined for dictionaries:

    5.7.2.1 Message:  addAll: dictionary
      Synopsis
        Store the elements of dictionary in the receiver at the
        corresponding keys from dictionary.
      Definition: <abstractDictionary>
        This message is equivalent to repeatedly sending the #at:put:
        message to the receiver with each of the keys and elements in
        dictionary in turn.  If a key in dictionary is key equivalent
        to a key in the receiver, the associated element in dictionary
        replaces the element in the receiver.

This is arguably sufficiently different from other uses of #addAll: in
the collection protocols that a different name should have been used;
there's a rather unpleasant equivocation that pervades the Dictionary
design: is it a collection of Associations or a keyed collection of
elements?  Here it's being viewed as a collection of Associations,
*sort*of*.  Almost.  Rather like, but not quite.  (I suppose we could
have a nice long argument about whether (adic addAll: adic) should work
and who is to blame if it doesn't.)

So we expect the following definition:

	Dictionary>>
	addAll: aDictionary
	    aDictionary == self ifFalse: [
		aDictionary keysAndValuesDo: [:key :value |
		    self at: key put: value]].
	    ^aDictionary

But there isn't any Dictionary>>addAll: in Squeak 3.5 or Squeak 3.6g2.
OOPS.

	Since we probably do not want to change (high impact...) the
	behavior of Dictionary>>do:  (which is in my eyes not properly
	implemented because it changes the semantics),

No, the version of #do you get for Dictionary is EXACTLY RIGHT because
it gets the semantics exactly right.  #do is supposed to iterate over
the *elements* of a collection, not the keys.  Again, this is an
unfortunate consequence of the design botch of trying to make a Dictionary
look like BOTH a set of Associations AND a keyed collection of elements.

	I've implemented Dictionary>>addAll:  which overrides the method
	from Collection and iterates with #associationDo:  over its
	elements.
	
I don't think you should do that.  Why not?  Simple.  Consider this
example:

    a := Dictionary new.
    b := Dictionary new.
    b at: #x put: true.
    b associationsDo: [:asn | a add: asn].
    x := y := nil.
    a associationsDo: [:asn | x := asn].
    b associationsDo: [:asn | y := asn].
    x == y

==> true

That's right:  after doing (a associationsDo: [:asn | a add: asn])
the apparently separate Dictionary objects a and b *share* the same
Association object.

This is a REALLY bad idea, because now (a at: #x put: false) will change
(b at: #x).  Well, you could live with that if it was _always_ true.
But Dictionary>>add: doesn't *always* install the new Association.  If
it already has an Association with an equivalent key, it changes the
one it has.  So in general, you never really know whether #add: will
share the Association or not, and after your version of #, you don't
know which keys will have shared associations between the two dictionaries
and which won't.

Defining Dictionary>>addAll: this way is also a bit over-specific;
it requires the argument to support #associationsDo:.  But when you
are implementing your own dictionary-like objects, it is a real
!P!A!I!N! to have to drag associations into it at all; I for one am
happy to implement #keysAndValuesDo: but not to implement
#associationsDo: when I don't _have_ any Associations in my data structure.
Note for example that SequenceableCollection DOES implement
#keysAndValuesDo: but DOESN'T implement #associationsDo:.

The definition

	Dictionary>>
	addAll: aKeyedCollection
	    aKeyedCollection == self ifFalse: [
		aKeyedCollection keysAndValuesDo: [:key :value |
		    self at: key put: value]].
	    ^aKeyedCollection

(1) Works sanely when the argument and receiver are the same object;
(2) does NOT result in unexpected sharing between the argument and
    the receiver;
(3) works with any argument that implements #keysAndValuesDo:, whether
    it contains associations or not (for example, Array).

I have tried sending this change from Squeak, from two different images,
but ran into trouble.  I've just tried again from a third image; hope
that works.



More information about the Squeak-dev mailing list