AW: [squeak-dev] IdentitySet>>collect:
Ben Coman
btc at openInWorld.com
Sat Nov 29 03:42:00 UTC 2014
Levente Uzonyi wrote:
> The problem with changing #species is that it's overused. There are at
> least three different things it's used for in the Collection hierarchy:
> - the class of the object returned by #collect: (Set's subclasses are
> the exception since 2000 or so, except for WeakSet, but that's really
> strange, and should be revisited)
> - the class of the object returned by #select: and friends
> - the class used for comparison of collections
So should there be #collectSpecies?
It would default to "^self species" and be overridden as required.
In the case of Sets, you start with an "unordered collection without
duplicates", but since #collect: can introduce duplicates, you end up
with "an unordered collection with duplicates" --> Bag (per Dave's
suggestion)
So I think it's reasonable that the following two snippets should produce
the same result.
oc := OrderedCollection newFrom: #( 1 2 3 4 5 ).
(oc collect: [ :e | e even ]) occurrencesOf: true. "--> 2"
s := Set newFrom: #( 1 2 3 4 5 ).
(s collect: [ :e | e even ]) occurrencesOf: true. "--> 2"
which you get with the following change...
Set>>collectSpecies
    ^ Bag

Set>>collect: aBlock
    | result |
    result := self collectSpecies new: self size.
    array do: [:each |
        each ifNotNil: [result add: (aBlock value: each enclosedSetElement)]].
    ^ result
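
The default mentioned above ("^self species") could be sketched roughly as
follows. Putting it on Collection is an assumption on my part, and
#collectSpecies is only a proposed selector, not something in the image today.

```smalltalk
"Hypothetical default: every collection answers its own species
 unless a subclass overrides it, as Set does in the change above."
Collection>>collectSpecies
    ^ self species
```

With that default plus the Set override, a workspace check would look like
this:

```smalltalk
"Assumes both the hypothetical default and the Set override are installed."
(Set newFrom: #(1 2 3)) collectSpecies.   "Bag, via the Set override"
OrderedCollection new collectSpecies.     "OrderedCollection, via the default"
```

Only #collect: would consult #collectSpecies. #select: and collection
comparison would keep using #species, so their behaviour is untouched.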
>
> HashedCollection >> #select: uses #species. Changing
> IdentitySet >> #species to return a Set would break #select:, because a ~~ b doesn't
> imply a ~= b. IdentitySet is the optimal class for the return value of
> IdentitySet >> #select:, because it's guaranteed to be able to hold all
> selected values. I think the same applies to all other kinds of sets (and
> collections, because #select: is simply reducing the number of elements).
>
> This is not the first time this discussion has come up. There was one in
> 2003[1], but the thread is hard to follow. And there was one in 2009[2].
> I think the conclusion was that it's better to leave the method as it
> is, because there's no better way to do it.
> Sure, we could use #species in #collect:, but it would break quite a lot
> of stuff. For example:
>
> | k |
> k := KeyedSet keyBlock: [ :each | each first ].
> k add: #[1]; add: #[2].
> k collect: [ :each | each size ]
That works with the above change to produce --> a Bag(1 1)
>
> Currently this snippet returns "a Set(1)". Using #species in #collect:
> (via #copyEmpty) one would get an error, because SmallInteger does not
> understand #first. Changing the #species of KeyedSet would break #select:.
>
> So IMHO when you #collect: from a set, you should always use
> #collect:as:. Here is a (partial) list of methods which send #collect:
> to a Set:
>
> Categorizer >> #changeFromCategorySpecs:
> ClassFactoryForTestCase >> #createdClassNames
> Dictionary >> #unreferencedKeys
> MCDependencySorterTest >> #assertItems:orderAs:withRequired:toLoad:extraProvisions:
> MessageNode >> #analyseTempsWithin:rootNode:assignmentPools:
> MethodNode >> #referencedValuesWithinBlockExtent:
> SetTest >> #testCollect
> SetWithNilTest >> #runSetWithNilTestOf:
>
> There are probably many more, but we should fix all of them. A static
> analyzer could help a lot.
>
> Levente
>
> [1]
> http://lists.squeakfoundation.org/pipermail/squeak-dev/2003-February/052659.html
>
> [2]
> http://lists.squeakfoundation.org/pipermail/squeak-dev/2009-November/141016.html
>
>
>
>
> On Wed, 26 Nov 2014, David T. Lewis wrote:
>
>> It seems to me that Object>>species is intended to handle this sort of
>> issue.
>>
>> For IdentitySet, it answers what Eliot was expecting:
>>
>> IdentitySet new species ==> IdentitySet
>> Set new species ==> Set
>>
>> However, IdentitySet>>collect: does not make use of this, and it answers
>> a Set for the reasons that Levente explained.
>>
>> Answering an Array or an OrderedCollection would not really make sense,
>> because sets are unordered collections (but Bag might be better).
Bag is better.
>>
>> Shouldn't we have an implementation of IdentitySet>>species that answers
>> Set (or Bag), with a method comment explaining why this is the case, and
>> with all of the collection methods using #species to answer the right
>> kind of result?
>>
>> I note that IdentitySet>>collect: answers a Set, but IdentitySet>>select:
>> sends #species and therefore answers an IdentitySet.
>>
>> So I think that if we want the #species of an IdentitySet to be a Set,
>> then we should make it so, and give it a method comment to explain the
>> rationale. And the #collect: and #select: methods should both answer a
>> result of that #species.
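I think the species of #collect: and #select: are intrinsically
different. #collect: is a transformation, so its results need not fit the
receiver's class; #select: only filters elements the receiver already holds.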
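
A small workspace sketch of that difference, using the same five-element Set
as earlier in this mail. #collect:as: is the selector Levente recommends in
the quoted text. I am assuming it accepts Bag as the target class.

```smalltalk
| s |
s := Set newFrom: #(1 2 3 4 5).

"Filtering: the result is a subset of s, so the receiver's class always fits."
s select: [:e | e even].

"Transforming: five results but only two distinct values (true and false),
 so a duplicate-free result class cannot keep them all."
s collect: [:e | e even].

"Naming the result class explicitly keeps all five results."
s collect: [:e | e even] as: Bag.
```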
>>
>> Dave
>>
>>
>> On Thu, Nov 27, 2014 at 01:14:30AM +0100, Levente Uzonyi wrote:
>>> Your example hides the problem of ordering - what Tobias is asking about -
>>> so here's another:
>>>
>>> (IdentitySet withAll: #(1 1.0)) collect: [ :each | each class ]
>>>
>>> If IdentitySet >> #collect: were returning an Array, then what would be
>>> the answer?
>>>
>>> { SmallInteger. Float } or { Float. SmallInteger } ?
--> a Bag(SmallInteger Float)
>>>
>>> If you really want to have the resulting collection have the same size,
>>> but avoid the problem with ordering, then what you really need is a Bag.
To me that makes the most sense.
>>>
>>> On Thu, 27 Nov 2014, Frank Lesser wrote:
>>>
>>>> Hi Tobias,
>>>> agree, a problem of "OrderedCollection"
>>>> not to break a lot of other things we could return an Array.
>>>> but for me collecting has priority.
>>>> Frank
>>>>
>>>> -----Original Message-----
>>>> From: squeak-dev-bounces at lists.squeakfoundation.org
>>>> [mailto:squeak-dev-bounces at lists.squeakfoundation.org] On Behalf Of
>>>> Tobias Pape
>>>> Sent: Thursday, 27 November 2014 00:48
>>>> To: The general-purpose Squeak developers list
>>>> Subject: Re: [squeak-dev] IdentitySet>>collect:
>>>>
>>>>
>>>> On 27.11.2014, at 00:34, Frank Lesser <frank-lesser at lesser-software.com> wrote:
>>>>
>>>>> hmm, not convinced
>>>>>
>>>>> (IdentitySet withAll: #(1 1.0)) collect: [:e| e asInteger ]
>>>>> OrderedCollection(1 1 )
>>>>>
>>>>> in LSWVST ( one-to-one ), you collect results of evaluating a block on
>>>>> objects.
>>>>>
>>>>> Frank
>>>>> maybe I am wrong ...
>>>>
>>>> Where would the order come from for that _Ordered_Collection?
An unordered OrderedCollection --> Bag
cheers -ben