AW: [squeak-dev] IdentitySet>>collect:

Levente Uzonyi leves at elte.hu
Sat Nov 29 02:05:13 UTC 2014


The problem with changing #species is that it's overused. There are at 
least three different things it's used for in the Collection hierarchy:
- the class of the object returned by #collect: (Set's subclasses are the 
exception since 2000 or so, except for WeakSet, but that's really 
strange, and should revisited)
- the class of the object returned by #select: and friends
- the class used for comparison of collections

HashedCollection >> #select: uses #species. Changing IdentitySet 
>> #species to return a Set would break #select:, because a ~~ b doesn't 
imply a ~= b. IdentitySet is the optimal class for the return value of 
IdentitySet >> #select:, because it's guaranteed to be able to hold all 
selected values. I think the same applies to all other kind of sets (and 
collections, because #select: is simply reducing the number of elements).

This is not the first time this discussion comes up. There was one in 
2003[1], but the thread is hard to follow. And there was one in 2009[2].
I think the conclusion was that it's better to leave the method as it is, 
because there's no better way to do it.
Sure, we could use #species in #collect:, but it would break quite a lot 
of stuff. For example:

| k |
k := KeyedSet keyBlock: [ :each | each first ].
k add: #[1]; add: #[2].
k collect: [ :each | each size ]

Currently this snippet returns "a Set(1)". Using #species in #collect: 
(via #copyEmpty) one would get an error, because SmallInteger does not 
understand #first. Changing the #species of KeyedSet would break #select:.

So IMHO when you #collect: from a set, you should always use #collect:as:. 
Here is a (partial) list of methods which send #collect: to a Set:

Categorizer >> #changeFromCategorySpecs:
ClassFactoryForTestCase >> #createdClassNames
Dictionary >> #unreferencedKeys
MCDependencySorterTest >> #assertItems:orderAs:withRequired:toLoad:extraProvisions:
MessageNode >> #analyseTempsWithin:rootNode:assignmentPools:
MethodNode >> #referencedValuesWithinBlockExtent:
SetTest >> #testCollect
SetWithNilTest >> #runSetWithNilTestOf:

There are probably many more, but we should fix all of them. A static 
analyzer could help a lot.

Levente

[1] http://lists.squeakfoundation.org/pipermail/squeak-dev/2003-February/052659.html
[2] http://lists.squeakfoundation.org/pipermail/squeak-dev/2009-November/141016.html



On Wed, 26 Nov 2014, David T. Lewis wrote:

> It seems to me that Object>>species is intended to handle this sort of issue.
>
> For IdentitySet, it answers what Eliot was expecting:
>
>   IdentitySet new species ==> IdentitySet
>   Set new species ==> Set
>
> However, IdentitySet>>collect: does not make use of this, and it answers
> a Set for the reasons that Levente explained.
>
> Answering an Array or and OrderedCollection would not really make sense,
> because sets are unordered collections (but Bag might be better).
>
> Shouldn't we have an implementation of IdentitySet>>species that answers
> Set (or Bag), with a method comment explaining why this is the case, and
> with all of the collection methods using #species to answer the right
> kind of result?
>
> I note that IdentitySet>>collect: answers a Set, but IdentitySet>select:
> sends #species and therefore answers an IdentitySet.
>
> So I think that if we want the #species of an IdentitySet to be a Set,
> then we should make it so, and give it a method comment to explain the
> rationale. And the #collect: and #select: methods should both answer a
> result of that #species.
>
> Dave
>
>
> On Thu, Nov 27, 2014 at 01:14:30AM +0100, Levente Uzonyi wrote:
>> Your example hides the problem of ordering - what Tobias is asking about -
>> so here's another:
>>
>> (IdentitySet withAll: #(1 1.0)) collect: [ :each | each class ]
>>
>> If IdentitySet >> #collect: were returning an Array, then what would be the
>> answer?
>>
>> { SmallInteger. Float } or { Float. SmallInteger } ?
>>
>> If you really want to have the resulting collection have the same size,
>> but avoid the problem with ordering, then what you really need is a Bag.
>> Luckily you can have whatever you want (as long as it makes sense):
>>
>> (IdentitySet withAll: #(1 1.0)) collect: [ :each | each asInteger ] as: Set.
>> "==> a Set(1)"
>> (IdentitySet withAll: #(1 1.0)) collect: [ :each | each asInteger ] as:
>> IdentitySet.
>> "==> an IdentitySet(1)"
>> (IdentitySet withAll: #(1 1.0)) collect: [ :each | each asInteger ] as:
>> Array.
>> "==> #(1 1)"
>> (IdentitySet withAll: #(1 1.0)) collect: [ :each | each asInteger ] as:
>> OrderedCollection.
>> "==> an OrderedCollection(1 1)."
>> (IdentitySet withAll: #(1 1.0)) collect: [ :each | each asInteger ] as: Bag.
>> "==> a Bag(1 1)"
>> (IdentitySet withAll: #(1 1.0)) collect: [ :each | each asInteger ] as:
>> IdentityBag.
>> "==> an IdentityBag(1 1)"
>> (IdentitySet withAll: #(1 1.0)) collect: [ :each | each asInteger ] as:
>> Heap.
>> "==> a Heap(1 1)"
>>
>> Levente
>>
>> On Thu, 27 Nov 2014, Frank Lesser wrote:
>>
>>> Hi Tobias,
>>> agree, a problem of "OrderedCollection"
>>> not to break a lot of other things we could return an Array.
>>> but for me collecting has priority.
>>> Frank
>>>
>>> -----Urspr?ngliche Nachricht-----
>>> Von: squeak-dev-bounces at lists.squeakfoundation.org
>>> [mailto:squeak-dev-bounces at lists.squeakfoundation.org] Im Auftrag von
>>> Tobias
>>> Pape
>>> Gesendet: Donnerstag, 27. November 2014 00:48
>>> An: The general-purpose Squeak developers list
>>> Betreff: Re: [squeak-dev] IdentitySet>>collect:
>>>
>>>
>>> On 27.11.2014, at 00:34, Frank Lesser <frank-lesser at lesser-software.com>
>>> wrote:
>>>
>>>> hmm, not convinced
>>>>
>>>> (IdentitySet withAll: #(1 1.0)) collect: [:e| e asInteger ]
>>>> OrderedCollection(1 1 )
>>>>
>>>> in LSWVST ( one-to-one ), you collect results of evaluating a block on
>>>> objects.
>>>>
>>>> Frank
>>>> maybe I am wrong ...
>>>
>>> Where would the order come from for that _Ordered_Collection?
>>>
>>> Best
>>> 	-Tobias
>>>
>
>


More information about the Squeak-dev mailing list