[BUG] Set>>collect:

Andrew P. Black black at cse.ogi.edu
Fri Feb 14 09:02:04 UTC 2003


The problem that 'aSubclassOfSet collect:' answers a Set was 
something that I "fixed" at Camp Smalltalk about three years ago, and 
fixed again in November.  In both cases my fix was the same as Bill 
Spight's.

Richard O'Keefe's argument that this fix is wrong is convincing -- 
but I hadn't seen the discussion previously.  It just appeared that 
the fix had found its way into a black hole, that Harvesting did not 
work, etc.

While working on the trait refactoring of collections, we realized 
that 'self species' plays two incompatible roles.  One is as part of 
the logic for the equality test: two collections can be equal only if 
they are of the same species.  The other role is as a way of making a 
new "collecting collection" for the collect:, select: and reject: 
methods.  We reserved species for the first purpose, and introduced 
"emptyCopyOfSameSize" for the second purpose.  We also used a private 
"add an element" method that works both for indexable collections 
like Arrays, and for extensible collections like Sets, as well as for 
SortedCollections.  We called this private method "unsafeAdd: 
anElement possiblyAt: anIndex", and added a method "makeSafe" that 
restores any invariants after a batch of additions, for example when 
creating something like a SortedCollection. 
Collections that don't support indexing just ignore "anIndex"; 
collections that have no cleaning-up to do after a bunch of additions 
can implement makeSafe as a null method.
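
As a rough illustration of the protocol just described, the three cases 
might look like the following.  This is only a sketch: the class 
placement and the method bodies are assumptions, not the code from the 
trait refactoring, and the SortedCollection version assumes that the 
inherited addLast: is reachable through super and that a reSort method 
exists.

Array>>unsafeAdd: anElement possiblyAt: anIndex
	"Indexable: store the element at the given index."
	self at: anIndex put: anElement

Set>>unsafeAdd: anElement possiblyAt: anIndex
	"Not indexable: ignore anIndex."
	self add: anElement

SortedCollection>>unsafeAdd: anElement possiblyAt: anIndex
	"Append without sorting; makeSafe restores the order afterwards."
	super addLast: anElement

SortedCollection>>makeSafe
	"Re-sort once, after the whole batch of unsafe additions."
	self reSort

Collection>>makeSafe
	"Default: nothing to clean up, so just answer the receiver."
	^ self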

This refactoring in no way addresses Richard's other point, which is 
that the type of the element as well as the type of the collection 
determine the correct answer to emptyCopyOfSameSize.  We had not 
thought about this.  Richard's collect:into: is an elegant solution 
to this problem, but it needs to be combined with our "unsafeAdd: 
anElement possiblyAt: anIndex" trick.  Why is this?  Because if the 
client gets to choose the kind of collection, then code in 
collect:into: can't know whether the elements should be inserted with 
add: or at:put:.
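
To make the difficulty concrete, consider two hypothetical calls.  They 
assume a collect:into: that takes the block first and the target 
collection second; Richard's actual implementation is not reproduced 
here.

	"Target is an Array: the results must be stored with at:put:"
	#(1 2 3) asSet collect: [:x | x * x] into: (Array new: 3)

	"Target is a Set: the results must be stored with add:"
	#(1 2 3) collect: [:x | x * x] into: Set new

One implementation of collect:into: can serve both calls only if it 
sends a single message, such as unsafeAdd:possiblyAt:, that each target 
interprets in its own way.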

Of course, inject:into: can be used to parameterize over both the 
initial collection and the message used to add the successive 
elements.  collect:into: has readability arguments in its favour, and 
can be implemented more efficiently with withIndexDo: than with 
inject:into:.  However, there is potential for confusion between

	aCollection inject: anInitialValue into: aBinaryBlock
and
	aCollection collect: aUnaryBlock into: anInitialValue

Perhaps

	aCollection inject: anInitialValue andCollect: aUnaryBlock

is more mnemonic?

So, here is my suggestion:

collect: aBlock
	"Evaluate aBlock with each of the receiver's elements as the argument.
	Collect the resulting values into a collection like the receiver. Answer
	the new collection."

	| newCollection |
	newCollection := self emptyCopyOfSameSize.
	^ self inject: newCollection andCollect: aBlock

inject: newCollection andCollect: aBlock
	self withIndexDo: [:each :index |
		newCollection unsafeAdd: (aBlock value: each) possiblyAt: index].
	^ newCollection makeSafe.


This is perfectly generic, i.e., it will work for all collections, as 
are these methods:


withIndexDo: elementAndIndexBlock
	| index |
	index := 1.
	self do: [:each |
		elementAndIndexBlock value: each value: index.
		index := index + 1].

inject: thisValue into: binaryBlock
	"Accumulate a running value associated with evaluating the argument,
	binaryBlock, with the current value of the argument, thisValue, and the
	receiver as block arguments. For instance, to sum the numeric elements
	of a collection,
		aCollection inject: 0 into: [:subTotal :next | subTotal + next]."

	| nextValue |
	nextValue := thisValue.
	self do: [:each | nextValue := binaryBlock value: nextValue value: each].
	^nextValue


I believe that the perfectly impenetrable comment in inject:into: has 
a lot to answer for in the lack of use of this powerful method!
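
For illustration, here are two small uses of inject:into:, written as 
ordinary workspace expressions (a sketch added for clarity, not part of 
the refactoring).  The second shows the point made earlier: the initial 
value chooses the result collection, and the block chooses the message 
used to add to it.

	"Sum the elements; answers 6."
	#(1 2 3) inject: 0 into: [:subTotal :next | subTotal + next]

	"Collect the squares into a Set; answers a Set containing 1, 4 and 9."
	#(1 2 3) inject: Set new into: [:result :each |
		result add: each * each; yourself]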

	Andrew


