IdentitySet>>collect:

List overview All Threads
Download

newer

older

The Trunk: Kernel-eem.888.mcz

Daily Commit Log

Eliot Miranda

26 Nov 2014 26 Nov '14

7:19 p.m.

Hi All,

IdentitySet>>collect: answers a Set, not an IdentitySet. Anyone else agree this is a serious bug? Anyone else disagree?

WTF??

(IdentitySet withAll: #(1 2 3 1.0 2.0 3.0)) collect: [:e| e] a Set(1.0 2 3)

-- best, Eliot

Attachments:

attachment.html (text/html — 390 bytes)

Show replies by date

tim Rowledge

26 Nov 26 Nov

7:49 p.m.

On 26-11-2014, at 10:19 AM, Eliot Miranda eliot.miranda@gmail.com wrote:

...

Hi All,
IdentitySet>>collect: answers a Set, not an IdentitySet.  Anyone else agree this is a serious bug?  Anyone else disagree?
WTF??

(IdentitySet withAll: #(1 2 3 1.0 2.0 3.0)) collect: [:e| e] a Set(1.0 2 3)

I think I agree. That looks pretty wrong.

tim -- tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim Strange OpCodes: HEM: Hide Evidence of Malfunction

Levente Uzonyi

8:01 p.m.

On Wed, 26 Nov 2014, Eliot Miranda wrote:

...

Hi All, IdentitySet>>collect: answers a Set, not an IdentitySet. Anyone else agree this is a serious bug? Anyone else disagree?

Sometimes it's feasable to return an IdentitySet, other times it's not, so there's no optimal solution - #collect: can't cover all cases.

Set >> #collect: explicitly returns a Set, because this is the least bad solution to handle all of its subclasses reasonably well. In case of WeakSet, returning a WeakSet makes no sense, because some of your objects will disappear immediately. In case of IdentitySet, PluggableSet, KeyedSet and KeyedIdentitySet the returned values may not behave well in the original collection.

The best is to always be explicit:

(IdentitySet withAll: #(1 2 3 1.0 2.0 3.0)) collect: [ :e | e ] as: IdentitySet "==> an IdentitySet(1 2 3 1.0 2.0 3.0)"

Levente

...

WTF??

(IdentitySet withAll: #(1 2 3 1.0 2.0 3.0)) collect: [:e| e] a Set(1.0 2 3)

best,Eliot

David T. Lewis

10:10 p.m.

On Wed, Nov 26, 2014 at 08:01:49PM +0100, Levente Uzonyi wrote:

...

On Wed, 26 Nov 2014, Eliot Miranda wrote:

...
Hi All, ? ? IdentitySet>>collect: answers a Set, not an IdentitySet.? Anyone else agree this is a serious bug?? Anyone else disagree?

WTF??

(IdentitySet withAll: #(1 2 3 1.0 2.0 3.0)) collect: [:e| e] a Set(1.0 2 3)

best,Eliot

Sometimes it's feasable to return an IdentitySet, other times it's not, so there's no optimal solution - #collect: can't cover all cases.

Set >> #collect: explicitly returns a Set, because this is the least bad solution to handle all of its subclasses reasonably well. In case of WeakSet, returning a WeakSet makes no sense, because some of your objects will disappear immediately. In case of IdentitySet, PluggableSet, KeyedSet and KeyedIdentitySet the returned values may not behave well in the original collection.

The best is to always be explicit:

(IdentitySet withAll: #(1 2 3 1.0 2.0 3.0)) collect: [ :e | e ] as: IdentitySet "==> an IdentitySet(1 2 3 1.0 2.0 3.0)"

Levente

In Squeak 2.8 (checked on http://bertfreudenberg.github.io/SqueakJS/run/):

(IdentitySet withAll: #(1 2 3 1.0 2.0 3.0)) collect: [:e| e] ==> a Set(2.0 1 2 3 3.0 1.0)

But in trunk I get this:

(IdentitySet withAll: #(1 2 3 1.0 2.0 3.0)) collect: [:e| e] ==> a Set(1 2 3)

So answering a Set makes sense for the reasons that Levente explains, but trunk is definitely broken WRT the contents of that set.

This bug needs a unit test to go along with whatever fix we agree on.

Dave

Levente Uzonyi

10:26 p.m.

On Wed, 26 Nov 2014, David T. Lewis wrote:

...

On Wed, Nov 26, 2014 at 08:01:49PM +0100, Levente Uzonyi wrote:

...
On Wed, 26 Nov 2014, Eliot Miranda wrote:

...
Hi All, ? ? IdentitySet>>collect: answers a Set, not an IdentitySet.? Anyone else agree this is a serious bug?? Anyone else disagree?

WTF??

(IdentitySet withAll: #(1 2 3 1.0 2.0 3.0)) collect: [:e| e] a Set(1.0 2 3)

best,Eliot

Sometimes it's feasable to return an IdentitySet, other times it's not, so there's no optimal solution - #collect: can't cover all cases.

Set >> #collect: explicitly returns a Set, because this is the least bad solution to handle all of its subclasses reasonably well. In case of WeakSet, returning a WeakSet makes no sense, because some of your objects will disappear immediately. In case of IdentitySet, PluggableSet, KeyedSet and KeyedIdentitySet the returned values may not behave well in the original collection.

The best is to always be explicit:

(IdentitySet withAll: #(1 2 3 1.0 2.0 3.0)) collect: [ :e | e ] as: IdentitySet "==> an IdentitySet(1 2 3 1.0 2.0 3.0)"

Levente

In Squeak 2.8 (checked on http://bertfreudenberg.github.io/SqueakJS/run/):

(IdentitySet withAll: #(1 2 3 1.0 2.0 3.0)) collect: [:e| e] ==> a Set(2.0 1 2 3 3.0 1.0)

But in trunk I get this:

(IdentitySet withAll: #(1 2 3 1.0 2.0 3.0)) collect: [:e| e] ==> a Set(1 2 3)

So answering a Set makes sense for the reasons that Levente explains, but trunk is definitely broken WRT the contents of that set.

This bug needs a unit test to go along with whatever fix we agree on.

It's not a bug either. The Floats are not there, because they are equal to the integers:

1 = 1.0 "==> true" 1.0 = 1 "==> true"

Therefore they have the same hash:

1 hash "==> 1" 1.0 hash "==> 1"

The bug was in Squeak 2.8: 1.0 = 1 "==> true" 1 = 1.0 "==> true" 1 hash "==> 1" 1.0 hash "==> 61440"

Here's the comment of Float >> hash from The Trunk: "Hash is reimplemented because = is implemented. Both words of the float are used. (The bitShift:'s ensure that the intermediate results do not become a large integer.) Care is taken to answer same hash as an equal Integer."

Levente

...

Dave

Eliot Miranda

10:56 p.m.

On Wed, Nov 26, 2014 at 1:26 PM, Levente Uzonyi leves@elte.hu wrote:

...

On Wed, 26 Nov 2014, David T. Lewis wrote:

On Wed, Nov 26, 2014 at 08:01:49PM +0100, Levente Uzonyi wrote:

...
...
On Wed, 26 Nov 2014, Eliot Miranda wrote:

Hi All,

...
? ? IdentitySet>>collect: answers a Set, not an IdentitySet.? Anyone else agree this is a serious bug?? Anyone else disagree?

WTF??

(IdentitySet withAll: #(1 2 3 1.0 2.0 3.0)) collect: [:e| e] a Set(1.0 2 3) -- best,Eliot

Sometimes it's feasable to return an IdentitySet, other times it's not, so there's no optimal solution - #collect: can't cover all cases.

Set >> #collect: explicitly returns a Set, because this is the least bad solution to handle all of its subclasses reasonably well. In case of WeakSet, returning a WeakSet makes no sense, because some of your objects will disappear immediately. In case of IdentitySet, PluggableSet, KeyedSet and KeyedIdentitySet the returned values may not behave well in the original collection.

The best is to always be explicit:

(IdentitySet withAll: #(1 2 3 1.0 2.0 3.0)) collect: [ :e | e ] as: IdentitySet "==> an IdentitySet(1 2 3 1.0 2.0 3.0)"

Levente

In Squeak 2.8 (checked on http://bertfreudenberg.github.io/SqueakJS/run/ ):

(IdentitySet withAll: #(1 2 3 1.0 2.0 3.0)) collect: [:e| e] ==> a Set(2.0 1 2 3 3.0 1.0)

But in trunk I get this:

(IdentitySet withAll: #(1 2 3 1.0 2.0 3.0)) collect: [:e| e] ==> a Set(1 2 3)

So answering a Set makes sense for the reasons that Levente explains, but trunk is definitely broken WRT the contents of that set.

This bug needs a unit test to go along with whatever fix we agree on.

It's not a bug either. The Floats are not there, because they are equal to the integers:

1 = 1.0 "==> true" 1.0 = 1 "==> true"

Therefore they have the same hash:

1 hash "==> 1" 1.0 hash "==> 1"

The bug was in Squeak 2.8: 1.0 = 1 "==> true" 1 = 1.0 "==> true" 1 hash "==> 1" 1.0 hash "==> 61440"

Here's the comment of Float >> hash from The Trunk: "Hash is reimplemented because = is implemented. Both words of the float are used. (The bitShift:'s ensure that the intermediate results do not become a large integer.) Care is taken to answer same hash as an equal Integer."

Exactly.

My hope would be something like this (Set order is arbitrary)

((IdentitySet withAll: #(1 2 3 1.0 2.0 3.0)) collect: [:e| e]) => IdentitySet (1 2 3 1.0 2.0 3.0)

but that (Set withAll: #(1 2 3 1.0 2.0 3.0)) => Set (1 2 3)

(and it always contains just the integers since these are added first).

But Levente's collect:as: solution is perfectly acceptable and as he points out has the advantage of being explicit.

...

Levente

...
Dave

-- best, Eliot

Igor Stasenko

11:30 p.m.

i would make an #identityCollect: for sets. so it is more explicit, and also can be used with any other sets as input.

On 26 November 2014 at 22:56, Eliot Miranda eliot.miranda@gmail.com wrote:

...

On Wed, Nov 26, 2014 at 1:26 PM, Levente Uzonyi leves@elte.hu wrote:

...
On Wed, 26 Nov 2014, David T. Lewis wrote:

On Wed, Nov 26, 2014 at 08:01:49PM +0100, Levente Uzonyi wrote:

...
...
On Wed, 26 Nov 2014, Eliot Miranda wrote:

Hi All,

...
? ? IdentitySet>>collect: answers a Set, not an IdentitySet.? Anyone else agree this is a serious bug?? Anyone else disagree?

WTF??

(IdentitySet withAll: #(1 2 3 1.0 2.0 3.0)) collect: [:e| e] a Set(1.0 2 3) -- best,Eliot

Sometimes it's feasable to return an IdentitySet, other times it's not, so there's no optimal solution - #collect: can't cover all cases.

Set >> #collect: explicitly returns a Set, because this is the least bad solution to handle all of its subclasses reasonably well. In case of WeakSet, returning a WeakSet makes no sense, because some of your objects will disappear immediately. In case of IdentitySet, PluggableSet, KeyedSet and KeyedIdentitySet the returned values may not behave well in the original collection.

The best is to always be explicit:

(IdentitySet withAll: #(1 2 3 1.0 2.0 3.0)) collect: [ :e | e ] as: IdentitySet "==> an IdentitySet(1 2 3 1.0 2.0 3.0)"

Levente

In Squeak 2.8 (checked on http://bertfreudenberg.github.io/SqueakJS/run/ ):

(IdentitySet withAll: #(1 2 3 1.0 2.0 3.0)) collect: [:e| e] ==> a Set(2.0 1 2 3 3.0 1.0)

But in trunk I get this:

(IdentitySet withAll: #(1 2 3 1.0 2.0 3.0)) collect: [:e| e] ==> a Set(1 2 3)

So answering a Set makes sense for the reasons that Levente explains, but trunk is definitely broken WRT the contents of that set.

This bug needs a unit test to go along with whatever fix we agree on.

It's not a bug either. The Floats are not there, because they are equal to the integers:

1 = 1.0 "==> true" 1.0 = 1 "==> true"

Therefore they have the same hash:

1 hash "==> 1" 1.0 hash "==> 1"

The bug was in Squeak 2.8: 1.0 = 1 "==> true" 1 = 1.0 "==> true" 1 hash "==> 1" 1.0 hash "==> 61440"

Here's the comment of Float >> hash from The Trunk: "Hash is reimplemented because = is implemented. Both words of the float are used. (The bitShift:'s ensure that the intermediate results do not become a large integer.) Care is taken to answer same hash as an equal Integer."

Exactly.

My hope would be something like this (Set order is arbitrary)
    ((IdentitySet withAll: #(1 2 3 1.0 2.0 3.0)) collect: [:e| e]) =>
IdentitySet (1 2 3 1.0 2.0 3.0)

but that (Set withAll: #(1 2 3 1.0 2.0 3.0)) => Set (1 2 3)

(and it always contains just the integers since these are added first).

But Levente's collect:as: solution is perfectly acceptable and as he points out has the advantage of being explicit.

...
Levente

...
Dave

-- best, Eliot

-- Best regards, Igor Stasenko.

Frank Lesser

11:37 p.m.

New subject: AW: IdentitySet>>collect:

hmm, the problem is different IMO,

**corrected, but still wrong**

(IdentitySet withAll: #(1 1.0)) collect: [:e| e asInteger ] IdentitySet(1 )

collect: should always preserve the size of the original collection so it must return an OrderedCollection !

_____

Von: squeak-dev-bounces@lists.squeakfoundation.org [mailto:squeak-dev-bounces@lists.squeakfoundation.org] Im Auftrag von Eliot Miranda Gesendet: Mittwoch, 26. November 2014 22:57 An: The general-purpose Squeak developers list Betreff: Re: [squeak-dev] IdentitySet>>collect:

On Wed, Nov 26, 2014 at 1:26 PM, Levente Uzonyi leves@elte.hu wrote:

On Wed, 26 Nov 2014, David T. Lewis wrote:

On Wed, Nov 26, 2014 at 08:01:49PM +0100, Levente Uzonyi wrote:

On Wed, 26 Nov 2014, Eliot Miranda wrote:

Hi All, ? ? IdentitySet>>collect: answers a Set, not an IdentitySet.? Anyone else agree this is a serious bug?? Anyone else disagree?

WTF??

(IdentitySet withAll: #(1 2 3 1.0 2.0 3.0)) collect: [:e| e] a Set(1.0 2 3) -- best,Eliot

Sometimes it's feasable to return an IdentitySet, other times it's not, so there's no optimal solution - #collect: can't cover all cases.

The best is to always be explicit:

(IdentitySet withAll: #(1 2 3 1.0 2.0 3.0)) collect: [ :e | e ] as: IdentitySet "==> an IdentitySet(1 2 3 1.0 2.0 3.0)"

Levente

In Squeak 2.8 (checked on http://bertfreudenberg.github.io/SqueakJS/run/):

(IdentitySet withAll: #(1 2 3 1.0 2.0 3.0)) collect: [:e| e] ==> a Set(2.0 1 2 3 3.0 1.0)

But in trunk I get this:

(IdentitySet withAll: #(1 2 3 1.0 2.0 3.0)) collect: [:e| e] ==> a Set(1 2 3)

So answering a Set makes sense for the reasons that Levente explains, but trunk is definitely broken WRT the contents of that set.

This bug needs a unit test to go along with whatever fix we agree on.

It's not a bug either. The Floats are not there, because they are equal to the integers:

1 = 1.0 "==> true" 1.0 = 1 "==> true"

Therefore they have the same hash:

1 hash "==> 1" 1.0 hash "==> 1"

The bug was in Squeak 2.8: 1.0 = 1 "==> true" 1 = 1.0 "==> true" 1 hash "==> 1" 1.0 hash "==> 61440"

Exactly.

My hope would be something like this (Set order is arbitrary)

((IdentitySet withAll: #(1 2 3 1.0 2.0 3.0)) collect: [:e| e]) => IdentitySet (1 2 3 1.0 2.0 3.0)

but that

(Set withAll: #(1 2 3 1.0 2.0 3.0)) => Set (1 2 3)

(and it always contains just the integers since these are added first).

But Levente's collect:as: solution is perfectly acceptable and as he points out has the advantage of being explicit.

Levente

Dave

-- best, Eliot

Eliot Miranda

11:43 p.m.

On Wed, Nov 26, 2014 at 2:37 PM, Frank Lesser < frank-lesser@lesser-software.com> wrote:

...

hmm, the problem is different IMO,

**corrected, but still wrong**

(IdentitySet withAll: #(1 1.0)) collect: [:e| e asInteger ] IdentitySet(1 )

collect: should always preserve the size of the original collection so it must return an OrderedCollection !

Since when? I'm not sure I accept that collect: is one-to-one. For Sets I wouldn't expect that.

...

*Von:* squeak-dev-bounces@lists.squeakfoundation.org [mailto: squeak-dev-bounces@lists.squeakfoundation.org] *Im Auftrag von *Eliot Miranda *Gesendet:* Mittwoch, 26. November 2014 22:57 *An:* The general-purpose Squeak developers list *Betreff:* Re: [squeak-dev] IdentitySet>>collect:

On Wed, Nov 26, 2014 at 1:26 PM, Levente Uzonyi leves@elte.hu wrote:

On Wed, 26 Nov 2014, David T. Lewis wrote:

On Wed, Nov 26, 2014 at 08:01:49PM +0100, Levente Uzonyi wrote:

On Wed, 26 Nov 2014, Eliot Miranda wrote:

Hi All, ? ? IdentitySet>>collect: answers a Set, not an IdentitySet.? Anyone else agree this is a serious bug?? Anyone else disagree?

WTF??

(IdentitySet withAll: #(1 2 3 1.0 2.0 3.0)) collect: [:e| e] a Set(1.0 2 3)

best,Eliot

Sometimes it's feasable to return an IdentitySet, other times it's not, so there's no optimal solution - #collect: can't cover all cases.

Set >> #collect: explicitly returns a Set, because this is the least bad solution to handle all of its subclasses reasonably well. In case of WeakSet, returning a WeakSet makes no sense, because some of your objects will disappear immediately. In case of IdentitySet, PluggableSet, KeyedSet and KeyedIdentitySet the returned values may not behave well in the original collection.

The best is to always be explicit:

(IdentitySet withAll: #(1 2 3 1.0 2.0 3.0)) collect: [ :e | e ] as: IdentitySet "==> an IdentitySet(1 2 3 1.0 2.0 3.0)"

Levente

In Squeak 2.8 (checked on http://bertfreudenberg.github.io/SqueakJS/run/):

(IdentitySet withAll: #(1 2 3 1.0 2.0 3.0)) collect: [:e| e] ==> a Set(2.0 1 2 3 3.0 1.0)

But in trunk I get this:

(IdentitySet withAll: #(1 2 3 1.0 2.0 3.0)) collect: [:e| e] ==> a Set(1 2 3)

So answering a Set makes sense for the reasons that Levente explains, but trunk is definitely broken WRT the contents of that set.

This bug needs a unit test to go along with whatever fix we agree on.

It's not a bug either. The Floats are not there, because they are equal to the integers:

1 = 1.0 "==> true" 1.0 = 1 "==> true"

Therefore they have the same hash:

1 hash "==> 1" 1.0 hash "==> 1"

The bug was in Squeak 2.8: 1.0 = 1 "==> true" 1 = 1.0 "==> true" 1 hash "==> 1" 1.0 hash "==> 61440"

Here's the comment of Float >> hash from The Trunk: "Hash is reimplemented because = is implemented. Both words of the float are used. (The bitShift:'s ensure that the intermediate results do not become a large integer.) Care is taken to answer same hash as an equal Integer."

Exactly.

My hope would be something like this (Set order is arbitrary)
    ((IdentitySet withAll: #(1 2 3 1.0 2.0 3.0)) collect: [:e| e]) =>
IdentitySet (1 2 3 1.0 2.0 3.0)

but that
   (Set withAll: #(1 2 3 1.0 2.0 3.0)) => Set (1 2 3)
(and it always contains just the integers since these are added first).

But Levente's collect:as: solution is perfectly acceptable and as he points out has the advantage of being explicit.

Levente

Dave

--

best,

Eliot

-- best, Eliot

Frank Lesser

11:52 p.m.

New subject: AW: ** SPAM ** Re: IdentitySet>>collect:

Hi Eliiot.

I am not a native English - but collect: means collect IMO.

so for every element deliver the result of evaluating the block. DOT

interesting issue.

Frank

_____

Von: squeak-dev-bounces@lists.squeakfoundation.org [mailto:squeak-dev-bounces@lists.squeakfoundation.org] Im Auftrag von Eliot Miranda Gesendet: Mittwoch, 26. November 2014 23:44 An: The general-purpose Squeak developers list Betreff: ** SPAM ** Re: [squeak-dev] IdentitySet>>collect:

On Wed, Nov 26, 2014 at 2:37 PM, Frank Lesser frank-lesser@lesser-software.com wrote:

hmm, the problem is different IMO,

**corrected, but still wrong**

(IdentitySet withAll: #(1 1.0)) collect: [:e| e asInteger ] IdentitySet(1 )

collect: should always preserve the size of the original collection so it must return an OrderedCollection !

Since when? I'm not sure I accept that collect: is one-to-one. For Sets I wouldn't expect that.

_____

On Wed, Nov 26, 2014 at 1:26 PM, Levente Uzonyi leves@elte.hu wrote:

On Wed, 26 Nov 2014, David T. Lewis wrote:

On Wed, Nov 26, 2014 at 08:01:49PM +0100, Levente Uzonyi wrote:

On Wed, 26 Nov 2014, Eliot Miranda wrote:

Hi All, ? ? IdentitySet>>collect: answers a Set, not an IdentitySet.? Anyone else agree this is a serious bug?? Anyone else disagree?

WTF??

(IdentitySet withAll: #(1 2 3 1.0 2.0 3.0)) collect: [:e| e] a Set(1.0 2 3) -- best,Eliot

Sometimes it's feasable to return an IdentitySet, other times it's not, so there's no optimal solution - #collect: can't cover all cases.

The best is to always be explicit:

(IdentitySet withAll: #(1 2 3 1.0 2.0 3.0)) collect: [ :e | e ] as: IdentitySet "==> an IdentitySet(1 2 3 1.0 2.0 3.0)"

Levente

In Squeak 2.8 (checked on http://bertfreudenberg.github.io/SqueakJS/run/):

(IdentitySet withAll: #(1 2 3 1.0 2.0 3.0)) collect: [:e| e] ==> a Set(2.0 1 2 3 3.0 1.0)

But in trunk I get this:

(IdentitySet withAll: #(1 2 3 1.0 2.0 3.0)) collect: [:e| e] ==> a Set(1 2 3)

So answering a Set makes sense for the reasons that Levente explains, but trunk is definitely broken WRT the contents of that set.

This bug needs a unit test to go along with whatever fix we agree on.

It's not a bug either. The Floats are not there, because they are equal to the integers:

1 = 1.0 "==> true" 1.0 = 1 "==> true"

Therefore they have the same hash:

1 hash "==> 1" 1.0 hash "==> 1"

The bug was in Squeak 2.8: 1.0 = 1 "==> true" 1 = 1.0 "==> true" 1 hash "==> 1" 1.0 hash "==> 61440"

Exactly.

My hope would be something like this (Set order is arbitrary)

((IdentitySet withAll: #(1 2 3 1.0 2.0 3.0)) collect: [:e| e]) => IdentitySet (1 2 3 1.0 2.0 3.0)

but that

(Set withAll: #(1 2 3 1.0 2.0 3.0)) => Set (1 2 3)

(and it always contains just the integers since these are added first).

But Levente's collect:as: solution is perfectly acceptable and as he points out has the advantage of being explicit.

Levente

Dave

-- best, Eliot -- best, Eliot

David T. Lewis

27 Nov 27 Nov

12:11 a.m.

On Wed, Nov 26, 2014 at 11:52:43PM +0100, Frank Lesser wrote:

...

Hi Eliiot.

I am not a native English - but collect: means collect IMO.

so for every element deliver the result of evaluating the block. DOT

interesting issue.

Frank

I am a native American speaker, so your English is probably better than mine ;-)

I also was surprised by the result. But IdentitySet>>collect: is doing the right thing in iterating over all the elements of the identity set:

| r | r := Random new. ((IdentitySet withAll: #(1 2 3 1.0 2.0 3.0)) collect: [:e | r next]) size ==> 6

The confusing behavior is the result of answering a Set rather than IdentitySet.

Eliot argues for answering an IdentitySet, Levente explains why this may not be a good idea, and Eliot agrees that Levente's suggestion of using the explicit #collect:as: is a satisfactory solution.

Dave

Frank Lesser

12:34 a.m.

New subject: AW: IdentitySet>>collect:

hmm, not convinced

(IdentitySet withAll: #(1 1.0)) collect: [:e| e asInteger ] OrderedCollection(1 1 )

in LSWVST ( one-to-one ), you collect results of evaluating a block on objects.

Frank maybe I am wrong ...

-----Ursprüngliche Nachricht----- Von: squeak-dev-bounces@lists.squeakfoundation.org [mailto:squeak-dev-bounces@lists.squeakfoundation.org] Im Auftrag von David T. Lewis Gesendet: Donnerstag, 27. November 2014 00:11 An: The general-purpose Squeak developers list Betreff: Re: [squeak-dev] IdentitySet>>collect:

On Wed, Nov 26, 2014 at 11:52:43PM +0100, Frank Lesser wrote:

...

Hi Eliiot.

I am not a native English - but collect: means collect IMO.

so for every element deliver the result of evaluating the block. DOT

interesting issue.

Frank

I am a native American speaker, so your English is probably better than mine ;-)

I also was surprised by the result. But IdentitySet>>collect: is doing the right thing in iterating over all the elements of the identity set:

| r | r := Random new. ((IdentitySet withAll: #(1 2 3 1.0 2.0 3.0)) collect: [:e | r next]) size ==> 6

The confusing behavior is the result of answering a Set rather than IdentitySet.

Eliot argues for answering an IdentitySet, Levente explains why this may not be a good idea, and Eliot agrees that Levente's suggestion of using the explicit #collect:as: is a satisfactory solution.

Dave

Tobias Pape

12:47 a.m.

On 27.11.2014, at 00:34, Frank Lesser frank-lesser@lesser-software.com wrote:

...

hmm, not convinced

(IdentitySet withAll: #(1 1.0)) collect: [:e| e asInteger ] OrderedCollection(1 1 )

in LSWVST ( one-to-one ), you collect results of evaluating a block on objects.

Frank maybe I am wrong ...

Where would the order come from for that _Ordered_Collection?

Best -Tobias

Frank Lesser

12:52 a.m.

New subject: AW: IdentitySet>>collect:

Hi Tobias, agree, a problem of "OrderedCollection" not to break a lot of other things we could return an Array. but for me collecting has priority. Frank

-----Ursprüngliche Nachricht----- Von: squeak-dev-bounces@lists.squeakfoundation.org [mailto:squeak-dev-bounces@lists.squeakfoundation.org] Im Auftrag von Tobias Pape Gesendet: Donnerstag, 27. November 2014 00:48 An: The general-purpose Squeak developers list Betreff: Re: [squeak-dev] IdentitySet>>collect:

On 27.11.2014, at 00:34, Frank Lesser frank-lesser@lesser-software.com wrote:

...

hmm, not convinced

(IdentitySet withAll: #(1 1.0)) collect: [:e| e asInteger ] OrderedCollection(1 1 )

in LSWVST ( one-to-one ), you collect results of evaluating a block on objects.

Frank maybe I am wrong ...

Where would the order come from for that _Ordered_Collection?

Best -Tobias

Levente Uzonyi

1:14 a.m.

New subject: AW: IdentitySet>>collect:

Your example hides the problem of ordering - what Tobias is asking about - so here's another:

(IdentitySet withAll: #(1 1.0)) collect: [ :each | each class ]

If IdentitySet >> #collect: were returning an Array, then what would be the answer?

{ SmallInteger. Float } or { Float. SmallInteger } ?

If you really want to have the resulting collection have the same size, but avoid the problem with ordering, then what you really need is a Bag. Luckily you can have whatever you want (as long as it makes sense):

(IdentitySet withAll: #(1 1.0)) collect: [ :each | each asInteger ] as: Set. "==> a Set(1)" (IdentitySet withAll: #(1 1.0)) collect: [ :each | each asInteger ] as: IdentitySet. "==> an IdentitySet(1)" (IdentitySet withAll: #(1 1.0)) collect: [ :each | each asInteger ] as: Array. "==> #(1 1)" (IdentitySet withAll: #(1 1.0)) collect: [ :each | each asInteger ] as: OrderedCollection. "==> an OrderedCollection(1 1)." (IdentitySet withAll: #(1 1.0)) collect: [ :each | each asInteger ] as: Bag. "==> a Bag(1 1)" (IdentitySet withAll: #(1 1.0)) collect: [ :each | each asInteger ] as: IdentityBag. "==> an IdentityBag(1 1)" (IdentitySet withAll: #(1 1.0)) collect: [ :each | each asInteger ] as: Heap. "==> a Heap(1 1)"

Levente

On Thu, 27 Nov 2014, Frank Lesser wrote:

...

Hi Tobias, agree, a problem of "OrderedCollection" not to break a lot of other things we could return an Array. but for me collecting has priority. Frank

-----Ursprüngliche Nachricht----- Von: squeak-dev-bounces@lists.squeakfoundation.org [mailto:squeak-dev-bounces@lists.squeakfoundation.org] Im Auftrag von Tobias Pape Gesendet: Donnerstag, 27. November 2014 00:48 An: The general-purpose Squeak developers list Betreff: Re: [squeak-dev] IdentitySet>>collect:

On 27.11.2014, at 00:34, Frank Lesser frank-lesser@lesser-software.com wrote:

...
hmm, not convinced

(IdentitySet withAll: #(1 1.0)) collect: [:e| e asInteger ] OrderedCollection(1 1 )

in LSWVST ( one-to-one ), you collect results of evaluating a block on objects.

Frank maybe I am wrong ...

Where would the order come from for that _Ordered_Collection?

Best -Tobias

David T. Lewis

1:31 a.m.

New subject: AW: IdentitySet>>collect:

It seems to me that Object>>species is intended to handle this sort of issue.

For IdentitySet, it answers what Eliot was expecting:

IdentitySet new species ==> IdentitySet Set new species ==> Set

However, IdentitySet>>collect: does not make use of this, and it answers a Set for the reasons that Levente explained.

Answering an Array or and OrderedCollection would not really make sense, because sets are unordered collections (but Bag might be better).

Shouldn't we have an implementation of IdentitySet>>species that answers Set (or Bag), with a method comment explaining why this is the case, and with all of the collection methods using #species to answer the right kind of result?

I note that IdentitySet>>collect: answers a Set, but IdentitySet>select: sends #species and therefore answers an IdentitySet.

So I think that if we want the #species of an IdentitySet to be a Set, then we should make it so, and give it a method comment to explain the rationale. And the #collect: and #select: methods should both answer a result of that #species.

Dave

On Thu, Nov 27, 2014 at 01:14:30AM +0100, Levente Uzonyi wrote:

...

Your example hides the problem of ordering - what Tobias is asking about - so here's another:

(IdentitySet withAll: #(1 1.0)) collect: [ :each | each class ]

If IdentitySet >> #collect: were returning an Array, then what would be the answer?

{ SmallInteger. Float } or { Float. SmallInteger } ?

If you really want to have the resulting collection have the same size, but avoid the problem with ordering, then what you really need is a Bag. Luckily you can have whatever you want (as long as it makes sense):

(IdentitySet withAll: #(1 1.0)) collect: [ :each | each asInteger ] as: Set. "==> a Set(1)" (IdentitySet withAll: #(1 1.0)) collect: [ :each | each asInteger ] as: IdentitySet. "==> an IdentitySet(1)" (IdentitySet withAll: #(1 1.0)) collect: [ :each | each asInteger ] as: Array. "==> #(1 1)" (IdentitySet withAll: #(1 1.0)) collect: [ :each | each asInteger ] as: OrderedCollection. "==> an OrderedCollection(1 1)." (IdentitySet withAll: #(1 1.0)) collect: [ :each | each asInteger ] as: Bag. "==> a Bag(1 1)" (IdentitySet withAll: #(1 1.0)) collect: [ :each | each asInteger ] as: IdentityBag. "==> an IdentityBag(1 1)" (IdentitySet withAll: #(1 1.0)) collect: [ :each | each asInteger ] as: Heap. "==> a Heap(1 1)"

Levente

On Thu, 27 Nov 2014, Frank Lesser wrote:

...
Hi Tobias, agree, a problem of "OrderedCollection" not to break a lot of other things we could return an Array. but for me collecting has priority. Frank

-----Urspr?ngliche Nachricht----- Von: squeak-dev-bounces@lists.squeakfoundation.org [mailto:squeak-dev-bounces@lists.squeakfoundation.org] Im Auftrag von Tobias Pape Gesendet: Donnerstag, 27. November 2014 00:48 An: The general-purpose Squeak developers list Betreff: Re: [squeak-dev] IdentitySet>>collect:

On 27.11.2014, at 00:34, Frank Lesser frank-lesser@lesser-software.com wrote:

...
hmm, not convinced

(IdentitySet withAll: #(1 1.0)) collect: [:e| e asInteger ] OrderedCollection(1 1 )

in LSWVST ( one-to-one ), you collect results of evaluating a block on objects.

Frank maybe I am wrong ...

Where would the order come from for that _Ordered_Collection?

Best -Tobias

Eliot Miranda

1:51 a.m.

New subject: AW: IdentitySet>>collect:

On Wed, Nov 26, 2014 at 4:31 PM, David T. Lewis lewis@mail.msen.com wrote:

...

It seems to me that Object>>species is intended to handle this sort of issue.

You're right. That's what it's there for.

For IdentitySet, it answers what Eliot was expecting:

...

IdentitySet new species ==> IdentitySet Set new species ==> Set

However, IdentitySet>>collect: does not make use of this, and it answers a Set for the reasons that Levente explained.

Answering an Array or and OrderedCollection would not really make sense, because sets are unordered collections (but Bag might be better).

Shouldn't we have an implementation of IdentitySet>>species that answers Set (or Bag), with a method comment explaining why this is the case, and with all of the collection methods using #species to answer the right kind of result?

I note that IdentitySet>>collect: answers a Set, but IdentitySet>select: sends #species and therefore answers an IdentitySet.

So I think that if we want the #species of an IdentitySet to be a Set, then we should make it so, and give it a method comment to explain the rationale. And the #collect: and #select: methods should both answer a result of that #species.

...

Dave

On Thu, Nov 27, 2014 at 01:14:30AM +0100, Levente Uzonyi wrote:

...
Your example hides the problem of ordering - what Tobias is asking about

...
so here's another:

(IdentitySet withAll: #(1 1.0)) collect: [ :each | each class ]

If IdentitySet >> #collect: were returning an Array, then what would be

the

...
answer?

{ SmallInteger. Float } or { Float. SmallInteger } ?

If you really want to have the resulting collection have the same size, but avoid the problem with ordering, then what you really need is a Bag. Luckily you can have whatever you want (as long as it makes sense):

(IdentitySet withAll: #(1 1.0)) collect: [ :each | each asInteger ] as:

Set.

...
"==> a Set(1)" (IdentitySet withAll: #(1 1.0)) collect: [ :each | each asInteger ] as: IdentitySet. "==> an IdentitySet(1)" (IdentitySet withAll: #(1 1.0)) collect: [ :each | each asInteger ] as: Array. "==> #(1 1)" (IdentitySet withAll: #(1 1.0)) collect: [ :each | each asInteger ] as: OrderedCollection. "==> an OrderedCollection(1 1)." (IdentitySet withAll: #(1 1.0)) collect: [ :each | each asInteger ] as:

Bag.

...
"==> a Bag(1 1)" (IdentitySet withAll: #(1 1.0)) collect: [ :each | each asInteger ] as: IdentityBag. "==> an IdentityBag(1 1)" (IdentitySet withAll: #(1 1.0)) collect: [ :each | each asInteger ] as: Heap. "==> a Heap(1 1)"

Levente

On Thu, 27 Nov 2014, Frank Lesser wrote:

...
Hi Tobias, agree, a problem of "OrderedCollection" not to break a lot of other things we could return an Array. but for me collecting has priority. Frank

-----Urspr?ngliche Nachricht----- Von: squeak-dev-bounces@lists.squeakfoundation.org [mailto:squeak-dev-bounces@lists.squeakfoundation.org] Im Auftrag von Tobias Pape Gesendet: Donnerstag, 27. November 2014 00:48 An: The general-purpose Squeak developers list Betreff: Re: [squeak-dev] IdentitySet>>collect:

On 27.11.2014, at 00:34, Frank Lesser <frank-lesser@lesser-software.com

...
wrote:

...
hmm, not convinced

(IdentitySet withAll: #(1 1.0)) collect: [:e| e asInteger ] OrderedCollection(1 1 )

in LSWVST ( one-to-one ), you collect results of evaluating a block on objects.

Frank maybe I am wrong ...

Where would the order come from for that _Ordered_Collection?

Best -Tobias

-- best, Eliot

Levente Uzonyi

29 Nov 29 Nov

3:05 a.m.

New subject: AW: IdentitySet>>collect:

The problem with changing #species is that it's overused. There are at least three different things it's used for in the Collection hierarchy: - the class of the object returned by #collect: (Set's subclasses are the exception since 2000 or so, except for WeakSet, but that's really strange, and should revisited) - the class of the object returned by #select: and friends - the class used for comparison of collections

HashedCollection >> #select: uses #species. Changing IdentitySet

...

...
#species to return a Set would break #select:, because a ~~ b doesn't

imply a ~= b. IdentitySet is the optimal class for the return value of IdentitySet >> #select:, because it's guaranteed to be able to hold all selected values. I think the same applies to all other kind of sets (and collections, because #select: is simply reducing the number of elements).

This is not the first time this discussion comes up. There was one in 2003[1], but the thread is hard to follow. And there was one in 2009[2]. I think the conclusion was that it's better to leave the method as it is, because there's no better way to do it. Sure, we could use #species in #collect:, but it would break quite a lot of stuff. For example:

| k | k := KeyedSet keyBlock: [ :each | each first ]. k add: #[1]; add: #[2]. k collect: [ :each | each size ]

Currently this snippet returns "a Set(1)". Using #species in #collect: (via #copyEmpty) one would get an error, because SmallInteger does not understand #first. Changing the #species of KeyedSet would break #select:.

So IMHO when you #collect: from a set, you should always use #collect:as:. Here is a (partial) list of methods which send #collect: to a Set:

Categorizer >> #changeFromCategorySpecs: ClassFactoryForTestCase >> #createdClassNames Dictionary >> #unreferencedKeys MCDependencySorterTest >> #assertItems:orderAs:withRequired:toLoad:extraProvisions: MessageNode >> #analyseTempsWithin:rootNode:assignmentPools: MethodNode >> #referencedValuesWithinBlockExtent: SetTest >> #testCollect SetWithNilTest >> #runSetWithNilTestOf:

There are probably many more, but we should fix all of them. A static analyzer could help a lot.

Levente

[1] http://lists.squeakfoundation.org/pipermail/squeak-dev/2003-February/052659.... [2] http://lists.squeakfoundation.org/pipermail/squeak-dev/2009-November/141016....

On Wed, 26 Nov 2014, David T. Lewis wrote:

...

It seems to me that Object>>species is intended to handle this sort of issue.

For IdentitySet, it answers what Eliot was expecting:

IdentitySet new species ==> IdentitySet Set new species ==> Set

However, IdentitySet>>collect: does not make use of this, and it answers a Set for the reasons that Levente explained.

Answering an Array or and OrderedCollection would not really make sense, because sets are unordered collections (but Bag might be better).

Shouldn't we have an implementation of IdentitySet>>species that answers Set (or Bag), with a method comment explaining why this is the case, and with all of the collection methods using #species to answer the right kind of result?

I note that IdentitySet>>collect: answers a Set, but IdentitySet>select: sends #species and therefore answers an IdentitySet.

So I think that if we want the #species of an IdentitySet to be a Set, then we should make it so, and give it a method comment to explain the rationale. And the #collect: and #select: methods should both answer a result of that #species.

Dave

On Thu, Nov 27, 2014 at 01:14:30AM +0100, Levente Uzonyi wrote:

...
Your example hides the problem of ordering - what Tobias is asking about - so here's another:

(IdentitySet withAll: #(1 1.0)) collect: [ :each | each class ]

If IdentitySet >> #collect: were returning an Array, then what would be the answer?

{ SmallInteger. Float } or { Float. SmallInteger } ?

If you really want to have the resulting collection have the same size, but avoid the problem with ordering, then what you really need is a Bag. Luckily you can have whatever you want (as long as it makes sense):

(IdentitySet withAll: #(1 1.0)) collect: [ :each | each asInteger ] as: Set. "==> a Set(1)" (IdentitySet withAll: #(1 1.0)) collect: [ :each | each asInteger ] as: IdentitySet. "==> an IdentitySet(1)" (IdentitySet withAll: #(1 1.0)) collect: [ :each | each asInteger ] as: Array. "==> #(1 1)" (IdentitySet withAll: #(1 1.0)) collect: [ :each | each asInteger ] as: OrderedCollection. "==> an OrderedCollection(1 1)." (IdentitySet withAll: #(1 1.0)) collect: [ :each | each asInteger ] as: Bag. "==> a Bag(1 1)" (IdentitySet withAll: #(1 1.0)) collect: [ :each | each asInteger ] as: IdentityBag. "==> an IdentityBag(1 1)" (IdentitySet withAll: #(1 1.0)) collect: [ :each | each asInteger ] as: Heap. "==> a Heap(1 1)"

Levente

On Thu, 27 Nov 2014, Frank Lesser wrote:

...
Hi Tobias, agree, a problem of "OrderedCollection" not to break a lot of other things we could return an Array. but for me collecting has priority. Frank

-----Urspr?ngliche Nachricht----- Von: squeak-dev-bounces@lists.squeakfoundation.org [mailto:squeak-dev-bounces@lists.squeakfoundation.org] Im Auftrag von Tobias Pape Gesendet: Donnerstag, 27. November 2014 00:48 An: The general-purpose Squeak developers list Betreff: Re: [squeak-dev] IdentitySet>>collect:

On 27.11.2014, at 00:34, Frank Lesser frank-lesser@lesser-software.com wrote:

...
hmm, not convinced

(IdentitySet withAll: #(1 1.0)) collect: [:e| e asInteger ] OrderedCollection(1 1 )

in LSWVST ( one-to-one ), you collect results of evaluating a block on objects.

Frank maybe I am wrong ...

Where would the order come from for that _Ordered_Collection?

Best -Tobias

Ben Coman

4:42 a.m.

New subject: AW: IdentitySet>>collect:

Levente Uzonyi wrote:

...

The problem with changing #species is that it's overused. There are at least three different things it's used for in the Collection hierarchy:

the class of the object returned by #collect: (Set's subclasses are

the exception since 2000 or so, except for WeakSet, but that's really strange, and should revisited)

the class of the object returned by #select: and friends

the class used for comparison of collections

So should there be #collectSpecies ? Which defaults to "^self species" and is overidden as required.

In the case of Sets, you start with an "unordered collection without duplicates", but since #collect can introduce duplicates, you end up with "an unordered collection with duplicates" --> Bag (per Dave's suggestion)

So I think its reasonable that the following two snippets should produce the same result.

oc := OrderedCollection newFrom: #( 1 2 3 4 5 ). (oc collect: [ :e | e even ]) occurrencesOf: true. "--> 2"

s := Set newFrom: #( 1 2 3 4 5 ). (s collect: [ :e | e even ]) occurrencesOf: true. "--> 2"

which you get with the following change...

Set>>collectSpecies ^Bag

Set>>collect: aBlock | result | result := self collectSpecies new: self size. array do: [:each | each ifNotNil: [result add: (aBlock value: each enclosedSetElement)]]. ^ result

...

HashedCollection >> #select: uses #species. Changing IdentitySet

...
...
#species to return a Set would break #select:, because a ~~ b doesn't

imply a ~= b. IdentitySet is the optimal class for the return value of IdentitySet >> #select:, because it's guaranteed to be able to hold all selected values. I think the same applies to all other kind of sets (and collections, because #select: is simply reducing the number of elements).

This is not the first time this discussion comes up. There was one in 2003[1], but the thread is hard to follow. And there was one in 2009[2]. I think the conclusion was that it's better to leave the method as it is, because there's no better way to do it. Sure, we could use #species in #collect:, but it would break quite a lot of stuff. For example:

| k | k := KeyedSet keyBlock: [ :each | each first ]. k add: #[1]; add: #[2]. k collect: [ :each | each size ]

That works with the above change to produce --> a Bag(1 1)

...

Currently this snippet returns "a Set(1)". Using #species in #collect: (via #copyEmpty) one would get an error, because SmallInteger does not understand #first. Changing the #species of KeyedSet would break #select:.

So IMHO when you #collect: from a set, you should always use #collect:as:. Here is a (partial) list of methods which send #collect: to a Set:

Categorizer >> #changeFromCategorySpecs: ClassFactoryForTestCase >> #createdClassNames Dictionary >> #unreferencedKeys MCDependencySorterTest >> #assertItems:orderAs:withRequired:toLoad:extraProvisions: MessageNode >> #analyseTempsWithin:rootNode:assignmentPools: MethodNode >> #referencedValuesWithinBlockExtent: SetTest >> #testCollect SetWithNilTest >> #runSetWithNilTestOf:

There are probably many more, but we should fix all of them. A static analyzer could help a lot.

Levente

[1] http://lists.squeakfoundation.org/pipermail/squeak-dev/2003-February/052659....

[2] http://lists.squeakfoundation.org/pipermail/squeak-dev/2009-November/141016....

On Wed, 26 Nov 2014, David T. Lewis wrote:

...
It seems to me that Object>>species is intended to handle this sort of issue.

For IdentitySet, it answers what Eliot was expecting:

IdentitySet new species ==> IdentitySet Set new species ==> Set

However, IdentitySet>>collect: does not make use of this, and it answers a Set for the reasons that Levente explained.

Answering an Array or and OrderedCollection would not really make sense, because sets are unordered collections (but Bag might be better).

Bag is better.

...

...
Shouldn't we have an implementation of IdentitySet>>species that answers Set (or Bag), with a method comment explaining why this is the case, and with all of the collection methods using #species to answer the right kind of result?

I note that IdentitySet>>collect: answers a Set, but IdentitySet>select: sends #species and therefore answers an IdentitySet.

So I think that if we want the #species of an IdentitySet to be a Set, then we should make it so, and give it a method comment to explain the rationale. And the #collect: and #select: methods should both answer a result of that #species.

I think the species of #collect: and #select: are intrinsically different. #collect: is a transformation.

...

...
Dave

On Thu, Nov 27, 2014 at 01:14:30AM +0100, Levente Uzonyi wrote:

...
Your example hides the problem of ordering - what Tobias is asking about - so here's another:

(IdentitySet withAll: #(1 1.0)) collect: [ :each | each class ]

If IdentitySet >> #collect: were returning an Array, then what would be the answer?

{ SmallInteger. Float } or { Float. SmallInteger } ?

--> a Bag(SmallInteger Float)

...

...
...
If you really want to have the resulting collection have the same size, but avoid the problem with ordering, then what you really need is a Bag.

To me that makes the most sense.

...

...
...
On Thu, 27 Nov 2014, Frank Lesser wrote:

...
Hi Tobias, agree, a problem of "OrderedCollection" not to break a lot of other things we could return an Array. but for me collecting has priority. Frank

-----Urspr?ngliche Nachricht----- Von: squeak-dev-bounces@lists.squeakfoundation.org [mailto:squeak-dev-bounces@lists.squeakfoundation.org] Im Auftrag von Tobias Pape Gesendet: Donnerstag, 27. November 2014 00:48 An: The general-purpose Squeak developers list Betreff: Re: [squeak-dev] IdentitySet>>collect:

On 27.11.2014, at 00:34, Frank Lesser frank-lesser@lesser-software.com wrote:

...
hmm, not convinced

(IdentitySet withAll: #(1 1.0)) collect: [:e| e asInteger ] OrderedCollection(1 1 )

in LSWVST ( one-to-one ), you collect results of evaluating a block on objects.

Frank maybe I am wrong ...

Where would the order come from for that _Ordered_Collection?

An unordered OrderedCollection --> Bag

cheers -ben

Eliot Miranda

1 Dec 1 Dec

5:26 p.m.

New subject: AW: IdentitySet>>collect:

On Fri, Nov 28, 2014 at 7:42 PM, Ben Coman btc@openinworld.com wrote:

...

Levente Uzonyi wrote:

...
The problem with changing #species is that it's overused. There are at least three different things it's used for in the Collection hierarchy:

the class of the object returned by #collect: (Set's subclasses are the

exception since 2000 or so, except for WeakSet, but that's really strange, and should revisited)

the class of the object returned by #select: and friends

the class used for comparison of collections

So should there be #collectSpecies ?

No. That's what species was invented for in the first place.

...

Which defaults to "^self species" and is overidden as required.

In the case of Sets, you start with an "unordered collection without duplicates", but since #collect can introduce duplicates, you end up with "an unordered collection with duplicates" --> Bag (per Dave's suggestion)

So I think its reasonable that the following two snippets should produce the same result.

oc := OrderedCollection newFrom: #( 1 2 3 4 5 ). (oc collect: [ :e | e even ]) occurrencesOf: true. "--> 2"

s := Set newFrom: #( 1 2 3 4 5 ). (s collect: [ :e | e even ]) occurrencesOf: true. "--> 2"

which you get with the following change...

Set>>collectSpecies ^Bag

Set>>collect: aBlock | result | result := self collectSpecies new: self size. array do: [:each | each ifNotNil: [result add: (aBlock value: each enclosedSetElement)]]. ^ result

...
HashedCollection >> #select: uses #species. Changing IdentitySet

...
#species to return a Set would break #select:, because a ~~ b doesn't

...
imply a ~= b. IdentitySet is the optimal class for the return value of

IdentitySet >> #select:, because it's guaranteed to be able to hold all selected values. I think the same applies to all other kind of sets (and collections, because #select: is simply reducing the number of elements).

This is not the first time this discussion comes up. There was one in 2003[1], but the thread is hard to follow. And there was one in 2009[2]. I think the conclusion was that it's better to leave the method as it is, because there's no better way to do it. Sure, we could use #species in #collect:, but it would break quite a lot of stuff. For example:

| k | k := KeyedSet keyBlock: [ :each | each first ]. k add: #[1]; add: #[2]. k collect: [ :each | each size ]

That works with the above change to produce --> a Bag(1 1)

...
Currently this snippet returns "a Set(1)". Using #species in #collect: (via #copyEmpty) one would get an error, because SmallInteger does not understand #first. Changing the #species of KeyedSet would break #select:.

So IMHO when you #collect: from a set, you should always use #collect:as:. Here is a (partial) list of methods which send #collect: to a Set:

Categorizer >> #changeFromCategorySpecs: ClassFactoryForTestCase >> #createdClassNames Dictionary >> #unreferencedKeys MCDependencySorterTest >> #assertItems:orderAs:withRequired:toLoad: extraProvisions: MessageNode >> #analyseTempsWithin:rootNode:assignmentPools: MethodNode >> #referencedValuesWithinBlockExtent: SetTest >> #testCollect SetWithNilTest >> #runSetWithNilTestOf:

There are probably many more, but we should fix all of them. A static analyzer could help a lot.

Levente

[1] http://lists.squeakfoundation.org/pipermail/squeak-dev/2003- February/052659.html [2] http://lists.squeakfoundation.org/pipermail/squeak-dev/2009- November/141016.html

On Wed, 26 Nov 2014, David T. Lewis wrote:

It seems to me that Object>>species is intended to handle this sort of

...
issue.

For IdentitySet, it answers what Eliot was expecting:

IdentitySet new species ==> IdentitySet Set new species ==> Set

However, IdentitySet>>collect: does not make use of this, and it answers a Set for the reasons that Levente explained.

Answering an Array or and OrderedCollection would not really make sense, because sets are unordered collections (but Bag might be better).

Bag is better.

...
...
Shouldn't we have an implementation of IdentitySet>>species that answers Set (or Bag), with a method comment explaining why this is the case, and with all of the collection methods using #species to answer the right kind of result?

I note that IdentitySet>>collect: answers a Set, but IdentitySet>select: sends #species and therefore answers an IdentitySet.

So I think that if we want the #species of an IdentitySet to be a Set, then we should make it so, and give it a method comment to explain the rationale. And the #collect: and #select: methods should both answer a result of that #species.

I think the species of #collect: and #select: are intrinsically different. #collect: is a transformation.

...
...
Dave

On Thu, Nov 27, 2014 at 01:14:30AM +0100, Levente Uzonyi wrote:

...
Your example hides the problem of ordering - what Tobias is asking about - so here's another:

(IdentitySet withAll: #(1 1.0)) collect: [ :each | each class ]

If IdentitySet >> #collect: were returning an Array, then what would be the answer?

{ SmallInteger. Float } or { Float. SmallInteger } ?

--> a Bag(SmallInteger Float)

...
...
...
If you really want to have the resulting collection have the same size, but avoid the problem with ordering, then what you really need is a Bag.

To me that makes the most sense.

...
...
...
On Thu, 27 Nov 2014, Frank Lesser wrote:

Hi Tobias,

...
agree, a problem of "OrderedCollection" not to break a lot of other things we could return an Array. but for me collecting has priority. Frank

-----Urspr?ngliche Nachricht----- Von: squeak-dev-bounces@lists.squeakfoundation.org [mailto:squeak-dev-bounces@lists.squeakfoundation.org] Im Auftrag von Tobias Pape Gesendet: Donnerstag, 27. November 2014 00:48 An: The general-purpose Squeak developers list Betreff: Re: [squeak-dev] IdentitySet>>collect:

On 27.11.2014, at 00:34, Frank Lesser <frank-lesser@lesser-software. com> wrote:

hmm, not convinced

...
(IdentitySet withAll: #(1 1.0)) collect: [:e| e asInteger ] OrderedCollection(1 1 )

in LSWVST ( one-to-one ), you collect results of evaluating a block on objects.

Frank maybe I am wrong ...

Where would the order come from for that _Ordered_Collection?

An unordered OrderedCollection --> Bag

cheers -ben

-- best, Eliot

tim Rowledge

29 Nov 29 Nov

4:42 a.m.

On 28-11-2014, at 6:05 PM, Levente Uzonyi leves@elte.hu wrote:

...

I think the conclusion was that it's better to leave the method as it is, because there's no better way to do it.

There’s *always* a better way to do it. Somebody just has to think about it the right way.

tim -- tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim It is easier to change the specification to fit the program than vice versa.

Levente Uzonyi

3:55 p.m.

On Fri, 28 Nov 2014, tim Rowledge wrote:

...

On 28-11-2014, at 6:05 PM, Levente Uzonyi leves@elte.hu wrote:

...
I think the conclusion was that it's better to leave the method as it is, because there's no better way to do it.

There’s *always* a better way to do it. Somebody just has to think about it the right way.

There are too many factors for a better solution to exist. Let's say we change the method to return a Bag. Some will say it's better that way, because the resulting collection has the same size as the original. And some people will say that you broke their code, because the following will return false instead of true:

| s t | s := #(1 2 3) asSet. t := s collect: [ :each | each ]. s = t

IMHO the best solution is still to replace sends of #collect: with #collect:as:, but it's a lot of work.

Levente

...

tim

tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim It is easier to change the specification to fit the program than vice versa.

Florin Mateoc

7:26 p.m.

On 11/29/2014 9:55 AM, Levente Uzonyi wrote:

...

On Fri, 28 Nov 2014, tim Rowledge wrote:

...
On 28-11-2014, at 6:05 PM, Levente Uzonyi leves@elte.hu wrote:

...
I think the conclusion was that it's better to leave the method as it is, because there's no better way to do it.

There’s *always* a better way to do it. Somebody just has to think about it the right way.

There are too many factors for a better solution to exist. Let's say we change the method to return a Bag. Some will say it's better that way, because the resulting collection has the same size as the original. And some people will say that you broke their code, because the following will return false instead of true:

| s t | s := #(1 2 3) asSet. t := s collect: [ :each | each ]. s = t

IMHO the best solution is still to replace sends of #collect: with #collect:as:, but it's a lot of work.

Levente

I don't understand why anybody would think s should be equal to t in this example. collect:, or map as it is called in other languages, maps the elements in the receiver collection, to elements in the result, based on the mapping function provided as a block argument.

Conceptually, collect: is closer to returning a keyedCollection (which is also called a map), where the keys are the elements and the values are their mappings. If we see it this way, it is obvious that the order of the original elements is irrelevant, it is also obvious that the number of mapped values is/has to be the same as the number of elements (keys), and it is also obvious that the kind (or species) of collection holding the original elements (keys) is also irrelevant to the kind of collection holding the values. That's why always returning say an orderedCollection is fine, as long as we can map the collected values back to the original elements - even using a (long-term unstable) iteration order. When the original collection is a sequenceableCollection, we can shortcut all this by just relying on the indexes, which index both the original elements and their mappings. But that is not because the original order is itself important for the collect operation.

Florin

Chris Muller

8:04 p.m.

On Sat, Nov 29, 2014 at 12:26 PM, Florin Mateoc florin.mateoc@gmail.com wrote:

...

On 11/29/2014 9:55 AM, Levente Uzonyi wrote:

...
On Fri, 28 Nov 2014, tim Rowledge wrote:

...
On 28-11-2014, at 6:05 PM, Levente Uzonyi leves@elte.hu wrote:

...
I think the conclusion was that it's better to leave the method as it is, because there's no better way to do it.

There’s *always* a better way to do it. Somebody just has to think about it the right way.

There are too many factors for a better solution to exist. Let's say we change the method to return a Bag. Some will say it's better that way, because the resulting collection has the same size as the original. And some people will say that you broke their code, because the following will return false instead of true:

| s t | s := #(1 2 3) asSet. t := s collect: [ :each | each ]. s = t

IMHO the best solution is still to replace sends of #collect: with #collect:as:, but it's a lot of work.

Levente

I don't understand why anybody would think s should be equal to t in this example. collect:, or map as it is called in other languages, maps the elements in the receiver collection, to elements in the result, based on the mapping function provided as a block argument.

I've never regarded collect as a mapping function, only a transformation function. If I want a map of the original to the transformed, it doesn't need to be built into the collect: method itself to do that:

#(1 2 3) collect: [ : each | each -> each asWords ].

There, you have a mapping without needing to change collect:.

...

Conceptually, collect: is closer to returning a keyedCollection (which is also called a map), where the keys are the elements and the values are their mappings. If we see it this way, it is obvious that the order of the original elements is irrelevant, it is also obvious that the number of mapped values is/has to be the same as the number of elements (keys), and it is also obvious that the kind (or species) of collection holding the original elements (keys) is also irrelevant to the kind of collection holding the values. That's why always returning say an orderedCollection is fine, as long as we can map the collected values back to the original elements - even using a (long-term unstable) iteration order. When the original collection is a sequenceableCollection, we can shortcut all this by just relying on the indexes, which index both the original elements and their mappings. But that is not because the original order is itself important for the collect operation.

Perhaps we need an #asDictionary method to implement the "mapping" use-case. As far as #collect:, however, I find myself agreeing with Levente. #collect: is simply too entrenched to change on a whim. It appears that #collect:as: can address a variety of additional use-cases..

Colin Putney

9:45 p.m.

On Sat, Nov 29, 2014 at 11:04 AM, Chris Muller asqueaker@gmail.com wrote:

...

I've never regarded collect as a mapping function, only a transformation function.

Yeah, I agree. I think the important features of #collect: are that (a) the result is the same type of collection as the original, and (b) the result contains all the the objects returned from the block. If the semantics of that particular collection lead to the result being a different size from the original, that's fine. It's also fine if there's no implicit mapping from the original to the result; if you need such a mapping, build a Dictionary, or at least a collection of Associations, as Chris suggests.

Keeping it simple will make it easy to predict what the particular behaviour of any #collect: implementation will be.

Colin

Bert Freudenberg

30 Nov 30 Nov

3:20 p.m.

On 29.11.2014, at 21:45, Colin Putney colin@wiresong.com wrote:

...

On Sat, Nov 29, 2014 at 11:04 AM, Chris Muller <asqueaker@gmail.com mailto:asqueaker@gmail.com> wrote:

I've never regarded collect as a mapping function, only a transformation function.

Yeah, I agree. I think the important features of #collect: are that (a) the result is the same type of collection as the original, and (b) the result contains all the the objects returned from the block.

How would you define “type” here? Same class?

...

If the semantics of that particular collection lead to the result being a different size from the original, that's fine. It's also fine if there's no implicit mapping from the original to the result; if you need such a mapping, build a Dictionary, or at least a collection of Associations, as Chris suggests.

Keeping it simple will make it easy to predict what the particular behaviour of any #collect: implementation will be.

Colin

Maybe that’s the best we can strive for: predictability. We should be able to describe the result in simple words, for all collection classes.

- Bert -

Colin Putney

1 Dec 1 Dec

7:35 a.m.

On Sun, Nov 30, 2014 at 6:20 AM, Bert Freudenberg bert@freudenbergs.de wrote:

...

How would you define “type” here? Same class?

Yes, same class. It might be reasonable to answer instances of analogous classes for weak collections, and expect the caller to use #collect:as: if they want to retain weakness.

Colin

Bert Freudenberg

11:35 a.m.

On 01.12.2014, at 07:35, Colin Putney colin@wiresong.com wrote:

...

On Sun, Nov 30, 2014 at 6:20 AM, Bert Freudenberg <bert@freudenbergs.de mailto:bert@freudenbergs.de> wrote:

How would you define “type” here? Same class?

Yes, same class. It might be reasonable to answer instances of analogous classes for weak collections, and expect the caller to use #collect:as: if they want to retain weakness.

Colin

That’s certainly simple:

“collect: returns a collection of the same class as the receiver, except for weak collections, which answer non-weak equivalents. To return a collection of a different class, use collect:as:"

- Bert -

Eliot Miranda

5:29 p.m.

On Mon, Dec 1, 2014 at 2:35 AM, Bert Freudenberg bert@freudenbergs.de wrote:

...

On 01.12.2014, at 07:35, Colin Putney colin@wiresong.com wrote:

On Sun, Nov 30, 2014 at 6:20 AM, Bert Freudenberg bert@freudenbergs.de wrote:

...
How would you define “type” here? Same class?

Yes, same class. It might be reasonable to answer instances of analogous classes for weak collections, and expect the caller to use #collect:as: if they want to retain weakness.

Colin

That’s certainly simple:

“collect: returns a collection of the same class as the receiver, except for weak collections, which answer non-weak equivalents. To return a collection of a different class, use collect:as:"

except collect: can't possibly do this for Interval. IMO, reify the idea of species in the comments for collect: and species and say that collect: answers something of the same species. The idea being that species is the same as class when the receiver is a mutable collection, but some other suitable class when it doesn't.

-- best, Eliot

Chris Muller

5:38 p.m.

On Mon, Dec 1, 2014 at 10:29 AM, Eliot Miranda eliot.miranda@gmail.com wrote:

...

On Mon, Dec 1, 2014 at 2:35 AM, Bert Freudenberg bert@freudenbergs.de wrote:

...
On 01.12.2014, at 07:35, Colin Putney colin@wiresong.com wrote:

On Sun, Nov 30, 2014 at 6:20 AM, Bert Freudenberg bert@freudenbergs.de wrote:

...
How would you define “type” here? Same class?

Yes, same class. It might be reasonable to answer instances of analogous classes for weak collections, and expect the caller to use #collect:as: if they want to retain weakness.

Colin

That’s certainly simple:

“collect: returns a collection of the same class as the receiver, except for weak collections, which answer non-weak equivalents. To return a collection of a different class, use collect:as:"

except collect: can't possibly do this for Interval. IMO, reify the idea of

Not only that, but the "except for weak collections" is a totally arbitrary exception, because the collected objects may or may not be referenced from elsewhere.

...

species in the comments for collect: and species and say that collect: answers something of the same species. The idea being that species is the same as class when the receiver is a mutable collection, but some other suitable class when it doesn't.

David T. Lewis

29 Nov 29 Nov

9:27 p.m.

On Sat, Nov 29, 2014 at 03:55:57PM +0100, Levente Uzonyi wrote:

...

On Fri, 28 Nov 2014, tim Rowledge wrote:

...
On 28-11-2014, at 6:05 PM, Levente Uzonyi leves@elte.hu wrote:

...
I think the conclusion was that it's better to leave the method as it is, because there's no better way to do it.

There?s *always* a better way to do it. Somebody just has to think about it the right way.

There are too many factors for a better solution to exist. Let's say we change the method to return a Bag. Some will say it's better that way, because the resulting collection has the same size as the original. And some people will say that you broke their code, because the following will return false instead of true:

| s t | s := #(1 2 3) asSet. t := s collect: [ :each | each ]. s = t

IMHO the best solution is still to replace sends of #collect: with #collect:as:, but it's a lot of work.

I like your argument for making it explicit. How about if we make it explicit like this:

anIdentitySet collect: [:e | ... ] as: Set

This would avoid the surprising behavior of having #collect: answer a collection of size different from that of the original collection.

Dave

Nicolas Cellier

2:04 p.m.

New subject: AW: IdentitySet>>collect:

2014-11-29 3:05 GMT+01:00 Levente Uzonyi leves@elte.hu:

...

The problem with changing #species is that it's overused. There are at least three different things it's used for in the Collection hierarchy:

the class of the object returned by #collect: (Set's subclasses are the

exception since 2000 or so, except for WeakSet, but that's really strange, and should revisited)

the class of the object returned by #select: and friends

the class used for comparison of collections

HashedCollection >> #select: uses #species. Changing IdentitySet

...
#species to return a Set would break #select:, because a ~~ b doesn't

...
imply a ~= b. IdentitySet is the optimal class for the return value of

IdentitySet >> #select:, because it's guaranteed to be able to hold all selected values. I think the same applies to all other kind of sets (and collections, because #select: is simply reducing the number of elements).

This is not the first time this discussion comes up. There was one in 2003[1], but the thread is hard to follow. And there was one in 2009[2]. I think the conclusion was that it's better to leave the method as it is, because there's no better way to do it. Sure, we could use #species in #collect:, but it would break quite a lot of stuff. For example:

| k | k := KeyedSet keyBlock: [ :each | each first ]. k add: #[1]; add: #[2]. k collect: [ :each | each size ]

Currently this snippet returns "a Set(1)". Using #species in #collect: (via #copyEmpty) one would get an error, because SmallInteger does not understand #first. Changing the #species of KeyedSet would break #select:.

IMO select: should NOT use species, because most of the time we can select: in same class (but maybe in case of Interval).

...

So IMHO when you #collect: from a set, you should always use #collect:as:. Here is a (partial) list of methods which send #collect: to a Set:

Categorizer >> #changeFromCategorySpecs: ClassFactoryForTestCase >> #createdClassNames Dictionary >> #unreferencedKeys MCDependencySorterTest >> #assertItems:orderAs:withRequired:toLoad: extraProvisions: MessageNode >> #analyseTempsWithin:rootNode:assignmentPools: MethodNode >> #referencedValuesWithinBlockExtent: SetTest >> #testCollect SetWithNilTest >> #runSetWithNilTestOf:

There are probably many more, but we should fix all of them. A static analyzer could help a lot.

Levente

[1] http://lists.squeakfoundation.org/pipermail/squeak-dev/2003- February/052659.html [2] http://lists.squeakfoundation.org/pipermail/squeak-dev/2009- November/141016.html

On Wed, 26 Nov 2014, David T. Lewis wrote:

It seems to me that Object>>species is intended to handle this sort of

...
issue.

For IdentitySet, it answers what Eliot was expecting:

IdentitySet new species ==> IdentitySet Set new species ==> Set

However, IdentitySet>>collect: does not make use of this, and it answers a Set for the reasons that Levente explained.

Answering an Array or and OrderedCollection would not really make sense, because sets are unordered collections (but Bag might be better).

Shouldn't we have an implementation of IdentitySet>>species that answers Set (or Bag), with a method comment explaining why this is the case, and with all of the collection methods using #species to answer the right kind of result?

I note that IdentitySet>>collect: answers a Set, but IdentitySet>select: sends #species and therefore answers an IdentitySet.

So I think that if we want the #species of an IdentitySet to be a Set, then we should make it so, and give it a method comment to explain the rationale. And the #collect: and #select: methods should both answer a result of that #species.

Dave

On Thu, Nov 27, 2014 at 01:14:30AM +0100, Levente Uzonyi wrote:

...
Your example hides the problem of ordering - what Tobias is asking about

so here's another:

(IdentitySet withAll: #(1 1.0)) collect: [ :each | each class ]

If IdentitySet >> #collect: were returning an Array, then what would be the answer?

{ SmallInteger. Float } or { Float. SmallInteger } ?

If you really want to have the resulting collection have the same size, but avoid the problem with ordering, then what you really need is a Bag. Luckily you can have whatever you want (as long as it makes sense):

(IdentitySet withAll: #(1 1.0)) collect: [ :each | each asInteger ] as: Set. "==> a Set(1)" (IdentitySet withAll: #(1 1.0)) collect: [ :each | each asInteger ] as: IdentitySet. "==> an IdentitySet(1)" (IdentitySet withAll: #(1 1.0)) collect: [ :each | each asInteger ] as: Array. "==> #(1 1)" (IdentitySet withAll: #(1 1.0)) collect: [ :each | each asInteger ] as: OrderedCollection. "==> an OrderedCollection(1 1)." (IdentitySet withAll: #(1 1.0)) collect: [ :each | each asInteger ] as: Bag. "==> a Bag(1 1)" (IdentitySet withAll: #(1 1.0)) collect: [ :each | each asInteger ] as: IdentityBag. "==> an IdentityBag(1 1)" (IdentitySet withAll: #(1 1.0)) collect: [ :each | each asInteger ] as: Heap. "==> a Heap(1 1)"

Levente

On Thu, 27 Nov 2014, Frank Lesser wrote:

Hi Tobias,

...
agree, a problem of "OrderedCollection" not to break a lot of other things we could return an Array. but for me collecting has priority. Frank

-----Urspr?ngliche Nachricht----- Von: squeak-dev-bounces@lists.squeakfoundation.org [mailto:squeak-dev-bounces@lists.squeakfoundation.org] Im Auftrag von Tobias Pape Gesendet: Donnerstag, 27. November 2014 00:48 An: The general-purpose Squeak developers list Betreff: Re: [squeak-dev] IdentitySet>>collect:

On 27.11.2014, at 00:34, Frank Lesser <frank-lesser@lesser-software.com

...
wrote:

hmm, not convinced

...
(IdentitySet withAll: #(1 1.0)) collect: [:e| e asInteger ] OrderedCollection(1 1 )

in LSWVST ( one-to-one ), you collect results of evaluating a block on objects.

Frank maybe I am wrong ...

Where would the order come from for that _Ordered_Collection?

Best -Tobias

Eliot Miranda

30 Nov 30 Nov

6:39 p.m.

New subject: AW: IdentitySet>>collect:

Hi Levente,

On Nov 28, 2014, at 6:05 PM, Levente Uzonyi leves@elte.hu wrote:

...

The problem with changing #species is that it's overused. There are at least three different things it's used for in the Collection hierarchy:

the class of the object returned by #collect: (Set's subclasses are the exception since 2000 or so, except for WeakSet, but that's really strange, and should revisited)

the class of the object returned by #select: and friends

the class used for comparison of collections

The first two are conceptually the same, and are the rationale for species. I base that assessment on my experience with Smalltalk-80's collection library and derivatives. species served to answer a mutable collection to hold the results of collect:, select: and reject:. For example Interval, being a synthetic collection, cannot hold arbitrary results. So it's species must be different (Array).

I don't see that collect:, select: and reject: are different use cases.

As far as comparison, that's johnny come lately. In Smalltalk-80 for example Sets were incomparable, #= & hash being inherited from Object (I.e. #== & identityHash). So species uses based on collect:, select: and reject: should have priority.

...

HashedCollection >> #select: uses #species. Changing IdentitySet

...
...
#species to return a Set would break #select:, because a ~~ b doesn't

imply a ~= b. IdentitySet is the optimal class for the return value of IdentitySet >> #select:, because it's guaranteed to be able to hold all selected values. I think the same applies to all other kind of sets (and collections, because #select: is simply reducing the number of elements).

This is not the first time this discussion comes up. There was one in 2003[1], but the thread is hard to follow. And there was one in 2009[2]. I think the conclusion was that it's better to leave the method as it is, because there's no better way to do it. Sure, we could use #species in #collect:, but it would break quite a lot of stuff. For example:

| k | k := KeyedSet keyBlock: [ :each | each first ]. k add: #[1]; add: #[2]. k collect: [ :each | each size ]

Currently this snippet returns "a Set(1)". Using #species in #collect: (via #copyEmpty) one would get an error, because SmallInteger does not understand #first. Changing the #species of KeyedSet would break #select:.

So IMHO when you #collect: from a set, you should always use #collect:as:. Here is a (partial) list of methods which send #collect: to a Set:

Categorizer >> #changeFromCategorySpecs: ClassFactoryForTestCase >> #createdClassNames Dictionary >> #unreferencedKeys MCDependencySorterTest >> #assertItems:orderAs:withRequired:toLoad:extraProvisions: MessageNode >> #analyseTempsWithin:rootNode:assignmentPools: MethodNode >> #referencedValuesWithinBlockExtent: SetTest >> #testCollect SetWithNilTest >> #runSetWithNilTestOf:

There are probably many more, but we should fix all of them. A static analyzer could help a lot.

Levente

[1] http://lists.squeakfoundation.org/pipermail/squeak-dev/2003-February/052659.... [2] http://lists.squeakfoundation.org/pipermail/squeak-dev/2009-November/141016....

On Wed, 26 Nov 2014, David T. Lewis wrote:

...
It seems to me that Object>>species is intended to handle this sort of issue.

For IdentitySet, it answers what Eliot was expecting:

IdentitySet new species ==> IdentitySet Set new species ==> Set

However, IdentitySet>>collect: does not make use of this, and it answers a Set for the reasons that Levente explained.

Answering an Array or and OrderedCollection would not really make sense, because sets are unordered collections (but Bag might be better).

Shouldn't we have an implementation of IdentitySet>>species that answers Set (or Bag), with a method comment explaining why this is the case, and with all of the collection methods using #species to answer the right kind of result?

I note that IdentitySet>>collect: answers a Set, but IdentitySet>select: sends #species and therefore answers an IdentitySet.

So I think that if we want the #species of an IdentitySet to be a Set, then we should make it so, and give it a method comment to explain the rationale. And the #collect: and #select: methods should both answer a result of that #species.

Dave

On Thu, Nov 27, 2014 at 01:14:30AM +0100, Levente Uzonyi wrote:

...
Your example hides the problem of ordering - what Tobias is asking about - so here's another:

(IdentitySet withAll: #(1 1.0)) collect: [ :each | each class ]

If IdentitySet >> #collect: were returning an Array, then what would be the answer?

{ SmallInteger. Float } or { Float. SmallInteger } ?

If you really want to have the resulting collection have the same size, but avoid the problem with ordering, then what you really need is a Bag. Luckily you can have whatever you want (as long as it makes sense):

(IdentitySet withAll: #(1 1.0)) collect: [ :each | each asInteger ] as: Set. "==> a Set(1)" (IdentitySet withAll: #(1 1.0)) collect: [ :each | each asInteger ] as: IdentitySet. "==> an IdentitySet(1)" (IdentitySet withAll: #(1 1.0)) collect: [ :each | each asInteger ] as: Array. "==> #(1 1)" (IdentitySet withAll: #(1 1.0)) collect: [ :each | each asInteger ] as: OrderedCollection. "==> an OrderedCollection(1 1)." (IdentitySet withAll: #(1 1.0)) collect: [ :each | each asInteger ] as: Bag. "==> a Bag(1 1)" (IdentitySet withAll: #(1 1.0)) collect: [ :each | each asInteger ] as: IdentityBag. "==> an IdentityBag(1 1)" (IdentitySet withAll: #(1 1.0)) collect: [ :each | each asInteger ] as: Heap. "==> a Heap(1 1)"

Levente

On Thu, 27 Nov 2014, Frank Lesser wrote:

...
Hi Tobias, agree, a problem of "OrderedCollection" not to break a lot of other things we could return an Array. but for me collecting has priority. Frank

-----Urspr?ngliche Nachricht----- Von: squeak-dev-bounces@lists.squeakfoundation.org [mailto:squeak-dev-bounces@lists.squeakfoundation.org] Im Auftrag von Tobias Pape Gesendet: Donnerstag, 27. November 2014 00:48 An: The general-purpose Squeak developers list Betreff: Re: [squeak-dev] IdentitySet>>collect:

On 27.11.2014, at 00:34, Frank Lesser frank-lesser@lesser-software.com wrote:

...
hmm, not convinced

(IdentitySet withAll: #(1 1.0)) collect: [:e| e asInteger ] OrderedCollection(1 1 )

in LSWVST ( one-to-one ), you collect results of evaluating a block on objects.

Frank maybe I am wrong ...

Where would the order come from for that _Ordered_Collection?

Best -Tobias

Florin Mateoc

27 Nov 27 Nov

2:23 a.m.

New subject: AW: IdentitySet>>collect:

On 11/26/2014 7:14 PM, Levente Uzonyi wrote:

...

Your example hides the problem of ordering - what Tobias is asking about - so here's another:

(IdentitySet withAll: #(1 1.0)) collect: [ :each | each class ]

If IdentitySet >> #collect: were returning an Array, then what would be the answer?

{ SmallInteger. Float } or { Float. SmallInteger } ?

snip

...

Levente

Then why does Dictionary>>keys return an Array? Where does the order come from in that example? Similarly, where does the order come from when you invoke collect:as: on a set with Array as an argument? The answer is quite simple: it is the iteration order. collect: is part of the _iteration_ protocol.

I agree with Frank here, for me a more important aspect of collect: is preserving the _mapping_ between the original elements and the collected values. There is no obvious mapping if there are fewer collected values than elements.

Florin

Bert Freudenberg

8:57 a.m.

On 27.11.2014, at 02:23, Florin Mateoc florin.mateoc@gmail.com wrote:

...

On 11/26/2014 7:14 PM, Levente Uzonyi wrote:

...
Your example hides the problem of ordering - what Tobias is asking about - so here's another:

(IdentitySet withAll: #(1 1.0)) collect: [ :each | each class ]

If IdentitySet >> #collect: were returning an Array, then what would be the answer?

{ SmallInteger. Float } or { Float. SmallInteger } ?

snip

...
Levente

Then why does Dictionary>>keys return an Array? Where does the order come from in that example? Similarly, where does the order come from when you invoke collect:as: on a set with Array as an argument? The answer is quite simple: it is the iteration order. collect: is part of the _iteration_ protocol.

I agree with Frank here, for me a more important aspect of collect: is preserving the _mapping_ between the original elements and the collected values. There is no obvious mapping if there are fewer collected values than elements.

Florin

I agree. The order is unimportant, it's a Set after all. What *is* important is getting a collection of the same size back - by default. For other needs we have collect:as:.

- Bert -

Frank Lesser

9:21 a.m.

New subject: AW: IdentitySet>>collect:

hmm ( Set withAll: #( 1 -1 )) collect: [ : e | e abs ] a Set(1)

-----Ursprüngliche Nachricht----- Von: squeak-dev-bounces@lists.squeakfoundation.org [mailto:squeak-dev-bounces@lists.squeakfoundation.org] Im Auftrag von Bert Freudenberg Gesendet: Donnerstag, 27. November 2014 08:57 An: The general-purpose Squeak developers list Betreff: Re: [squeak-dev] IdentitySet>>collect:

On 27.11.2014, at 02:23, Florin Mateoc florin.mateoc@gmail.com wrote:

...

On 11/26/2014 7:14 PM, Levente Uzonyi wrote:

...
Your example hides the problem of ordering - what Tobias is asking about

- so here's another:

...

...
(IdentitySet withAll: #(1 1.0)) collect: [ :each | each class ]

If IdentitySet >> #collect: were returning an Array, then what would be

the answer?

...

...
{ SmallInteger. Float } or { Float. SmallInteger } ?

snip

...
Levente

Then why does Dictionary>>keys return an Array? Where does the order come

from in that example?

...

Similarly, where does the order come from when you invoke collect:as: on a

set with Array as an argument?

...

The answer is quite simple: it is the iteration order. collect: is part of

the _iteration_ protocol.

...

I agree with Frank here, for me a more important aspect of collect: is

preserving the _mapping_ between the original

...

elements and the collected values. There is no obvious mapping if there are fewer collected values than

elements.

...

Florin

I agree. The order is unimportant, it's a Set after all. What *is* important is getting a collection of the same size back - by default. For other needs we have collect:as:.

- Bert -

Nicolas Cellier

1:07 p.m.

2014-11-27 9:21 GMT+01:00 Frank Lesser frank-lesser@lesser-software.com:

...

hmm ( Set withAll: #( 1 -1 )) collect: [ : e | e abs ] a Set(1)

IMO, these snippets have no sense... In somes cases, we will want to preserve uniqueness, in some cases we will want to preserve the size... #collect:as: answers to most use cases, explicitely.

Answering an Array is a bit against the current trend which is to preserve as far as possible the collection properties when selecting/collecting. For example, Dictionary collect: now answer a Dictionary, what it did not in the past. In most cases (except maybe Interval) this is possible when selecting, but not necessarily when collecting due to restrictions on element type (ByteArray WordArray String IdentitySet...), or due to Identity or non weak-references properties that have absolutely no guaranty to be preserved once transformed. Note that #species was originally used for both selecting and collecting, but we started to distinguish these two, that might be the reason for the current non-uniformity of its usage. Some collection change of species when collecting, some other just fail, obviously because we expect aString collect: #asUppercase to answer a String, not an Array... In all these cases, collect:as: is a generic answer.

Anyway, a 1-to-1 mapping with something unordered is quite useless... How are we going to map elements of the transformed collection with the original? We cannot iterate like this: (aSet with: anOrderedCollection do: [:a :b | ]), nor with a Stream or any sequenceable protocol on an unordered collection.

The only guaranty you have is that the block will be evaluated once with each item (if there is no block return or exception raised...). Even this expectation might be questionable, I once suggested to iterate only once per different value in a Bag, or once per run in a RunArray when collecting/selecting... i did not, because too many code rely on the side effects like | i | i := 0. aSet collect: [:each | i := i + 1. each transformWithRank: i]. IMO, this is bad style.

Nicolas

...

-----Ursprüngliche Nachricht----- Von: squeak-dev-bounces@lists.squeakfoundation.org [mailto:squeak-dev-bounces@lists.squeakfoundation.org] Im Auftrag von Bert Freudenberg Gesendet: Donnerstag, 27. November 2014 08:57 An: The general-purpose Squeak developers list Betreff: Re: [squeak-dev] IdentitySet>>collect:

On 27.11.2014, at 02:23, Florin Mateoc florin.mateoc@gmail.com wrote:

...
On 11/26/2014 7:14 PM, Levente Uzonyi wrote:

...
Your example hides the problem of ordering - what Tobias is asking about

so here's another:

...
...
(IdentitySet withAll: #(1 1.0)) collect: [ :each | each class ]

If IdentitySet >> #collect: were returning an Array, then what would be

the answer?

...
...
{ SmallInteger. Float } or { Float. SmallInteger } ?

snip

...
Levente

Then why does Dictionary>>keys return an Array? Where does the order come

from in that example?

...
Similarly, where does the order come from when you invoke collect:as: on

a set with Array as an argument?

...
The answer is quite simple: it is the iteration order. collect: is part

of the _iteration_ protocol.

...
I agree with Frank here, for me a more important aspect of collect: is

preserving the _mapping_ between the original

...
elements and the collected values. There is no obvious mapping if there are fewer collected values than

elements.

...
Florin

I agree. The order is unimportant, it's a Set after all. What *is* important is getting a collection of the same size back - by default. For other needs we have collect:as:.

Bert -

Nicolas Cellier

12:37 p.m.

New subject: AW: IdentitySet>>collect:

In st-80, the keys did answer a Set and the values a Bag, with no obvious mapping between these two order-wise. In Squeak, there were historically a fasterKeys that did answer an Array in order to avoid the overhead of testing uniqueness of keys and to avoid rehashing the keys... Then we decided to simplify everything by answering an Array for both keys and values with the guaranty that (self at: (self keys at: i)) == (self values at: i) for any i from 1 to keys size... But it's just a conveniency, it was not intended to be generalized... Also, keep in mind that the order may change at any time with the mutations of the Dictionary...

2014-11-27 2:23 GMT+01:00 Florin Mateoc florin.mateoc@gmail.com:

...

On 11/26/2014 7:14 PM, Levente Uzonyi wrote:

...
Your example hides the problem of ordering - what Tobias is asking about

so here's another:

...
(IdentitySet withAll: #(1 1.0)) collect: [ :each | each class ]

If IdentitySet >> #collect: were returning an Array, then what would be

the answer?

...
{ SmallInteger. Float } or { Float. SmallInteger } ?

snip

...
Levente

Then why does Dictionary>>keys return an Array? Where does the order come from in that example? Similarly, where does the order come from when you invoke collect:as: on a set with Array as an argument? The answer is quite simple: it is the iteration order. collect: is part of the _iteration_ protocol.

I agree with Frank here, for me a more important aspect of collect: is preserving the _mapping_ between the original elements and the collected values. There is no obvious mapping if there are fewer collected values than elements.

Florin

David T. Lewis

26 Nov 26 Nov

11:43 p.m.

On Wed, Nov 26, 2014 at 01:56:32PM -0800, Eliot Miranda wrote:

...

On Wed, Nov 26, 2014 at 1:26 PM, Levente Uzonyi leves@elte.hu wrote:

...
On Wed, 26 Nov 2014, David T. Lewis wrote:

On Wed, Nov 26, 2014 at 08:01:49PM +0100, Levente Uzonyi wrote:

...
...
On Wed, 26 Nov 2014, Eliot Miranda wrote:

Hi All,

...
? ? IdentitySet>>collect: answers a Set, not an IdentitySet.? Anyone else agree this is a serious bug?? Anyone else disagree?

WTF??

(IdentitySet withAll: #(1 2 3 1.0 2.0 3.0)) collect: [:e| e] a Set(1.0 2 3) -- best,Eliot

Sometimes it's feasable to return an IdentitySet, other times it's not, so there's no optimal solution - #collect: can't cover all cases.

Set >> #collect: explicitly returns a Set, because this is the least bad solution to handle all of its subclasses reasonably well. In case of WeakSet, returning a WeakSet makes no sense, because some of your objects will disappear immediately. In case of IdentitySet, PluggableSet, KeyedSet and KeyedIdentitySet the returned values may not behave well in the original collection.

The best is to always be explicit:

(IdentitySet withAll: #(1 2 3 1.0 2.0 3.0)) collect: [ :e | e ] as: IdentitySet "==> an IdentitySet(1 2 3 1.0 2.0 3.0)"

Levente

In Squeak 2.8 (checked on http://bertfreudenberg.github.io/SqueakJS/run/ ):

(IdentitySet withAll: #(1 2 3 1.0 2.0 3.0)) collect: [:e| e] ==> a Set(2.0 1 2 3 3.0 1.0)

But in trunk I get this:

(IdentitySet withAll: #(1 2 3 1.0 2.0 3.0)) collect: [:e| e] ==> a Set(1 2 3)

So answering a Set makes sense for the reasons that Levente explains, but trunk is definitely broken WRT the contents of that set.

This bug needs a unit test to go along with whatever fix we agree on.

It's not a bug either. The Floats are not there, because they are equal to the integers:

1 = 1.0 "==> true" 1.0 = 1 "==> true"

Therefore they have the same hash:

1 hash "==> 1" 1.0 hash "==> 1"

The bug was in Squeak 2.8: 1.0 = 1 "==> true" 1 = 1.0 "==> true" 1 hash "==> 1" 1.0 hash "==> 61440"

Here's the comment of Float >> hash from The Trunk: "Hash is reimplemented because = is implemented. Both words of the float are used. (The bitShift:'s ensure that the intermediate results do not become a large integer.) Care is taken to answer same hash as an equal Integer."

Exactly.

Yes. I confused myself by going too far back in the wayback machine. It seems that in Squeak 2.8 we had this:

Set withAll: #(1 2 3 1.0 2.0 3.0) ==> Set(2.0 1 2 3 3.0 1.0)

Which is clearly wrong, as Levente explains above.

Dave

...

My hope would be something like this (Set order is arbitrary)
    ((IdentitySet withAll: #(1 2 3 1.0 2.0 3.0)) collect: [:e| e]) =>
IdentitySet (1 2 3 1.0 2.0 3.0)

but that (Set withAll: #(1 2 3 1.0 2.0 3.0)) => Set (1 2 3)

(and it always contains just the integers since these are added first).

But Levente's collect:as: solution is perfectly acceptable and as he points out has the advantage of being explicit.

Marcus Denker

27 Nov 27 Nov

8:21 a.m.

New subject: [Pharo-dev] IdentitySet>>collect:

...

On 26 Nov 2014, at 19:19, Eliot Miranda eliot.miranda@gmail.com wrote:

Hi All,
IdentitySet>>collect: answers a Set, not an IdentitySet.  Anyone else agree this is a serious bug?  Anyone else disagree?
WTF??

(IdentitySet withAll: #(1 2 3 1.0 2.0 3.0)) collect: [:e| e] a Set(1.0 2 3)

Yes, I would say so. I think we fixed some others like that already (not 100% sure)

I added an issue tracker entry

14535 IdentitySet>>collect: answers a Set, not an IdentitySet. https://pharo.fogbugz.com/f/cases/14535

Marcus

Marcus Denker

28 Nov 28 Nov

3:35 p.m.

New subject: [Pharo-dev] IdentitySet>>collect:

...

On 27 Nov 2014, at 08:21, Marcus Denker marcus.denker@inria.fr wrote:

...
On 26 Nov 2014, at 19:19, Eliot Miranda eliot.miranda@gmail.com wrote:

Hi All,

IdentitySet>>collect: answers a Set, not an IdentitySet. Anyone else agree this is a serious bug? Anyone else disagree?

WTF??

(IdentitySet withAll: #(1 2 3 1.0 2.0 3.0)) collect: [:e| e] a Set(1.0 2 3)

Yes, I would say so. I think we fixed some others like that already (not 100% sure)

I added an issue tracker entry

14535 IdentitySet>>collect: answers a Set, not an IdentitySet. https://pharo.fogbugz.com/f/cases/14535

#collect: directly uses Set in Set>>#collect, I committed a fix (to the issue tracker, to be reviewed) to use “self species” instead (plus the example from above as a test in IdentitySetTest).

Marcus

3456

Age (days ago)

3461

Last active (days ago)

squeak-dev@lists.squeakfoundation.org

40 comments

14 participants

tags (0)

participants (14)

Ben Coman
Bert Freudenberg
Chris Muller
Colin Putney
David T. Lewis
Eliot Miranda
Florin Mateoc
Frank Lesser
Igor Stasenko
Levente Uzonyi
Marcus Denker
Nicolas Cellier
tim Rowledge
Tobias Pape