comparing Collections. Can we do better!?

Richard A. O'Keefe ok at cs.otago.ac.nz
Tue Dec 16 00:17:11 UTC 2003


Ralph Boland <rboland at porchlight.ca> wrote:
	Finally, a question.
	Would it not be a good idea for all (or at least most) Smalltalk
	Collections (Sequenceable or otherwise) to store the hash value?
	
	I realize that this means that each at:  or at:put:  or similar
	operations then have to do additional work and perhaps since these
	operations are very common it might not actually be a good idea.

Exactly so.  Like to many "speed" questions, it's an empirical one.

 - How often are collections modified?
 - How much does it cost to update a stored hash value when this is done?
 - How often are collections used as elements of Sets or Bags or as keys
   of Dictionaries, as opposed to elements of IdentitySets or keys of
   IdentityDictionaries?

On the other hand, there are some problems in principle:

 - Given that it is notoriously dangerous to have mutable objects as
   elements of sets or bags or as keys of dictionaries, how often is it
   necessary to use *mutable* collections in this way?
 - Given that changing an element of an Array can change the hash value
   of an Array even though the array itself does not change, is it really
   sufficient to cache the hash value of an Array and only update it when
   the Array itself is changed?

	If this is the case, should there not at least be hashed
	versions of collection classes so that people like me don't have
	to implement their own???
	
My suggestion would be to have a *read-only* prehashed array, where you
would build up a collection using an OrderedCollection and then use an
#asHashedReadOnlyArray to get your set/bag element/dictionary key.

This would still be no defence against the array elements themselves
changing, but it would at least reduce the risk.



More information about the Squeak-dev mailing list