faster sets and dictionaries

Ralph Boland ralph.boland at acadiau.ca
Wed Apr 7 18:30:43 UTC 2004


About a month ago I posted here on the fact
that set add operations could be made faster by
cutting out the element comparisons when the size of a set
is increased. I received few comments but
someone posted asking for the code so I am obliging.

Attached are two .st files.
The first file contains two classes:
FastSet which is a subclass of Set and
FastDictionary which is a subclass of Dictionary.
Both FastSet and FastDictionary override the methods
"noCheckAdd" with new versions that invoke "scanForQuick" instead of 
"scanFor".
FastSet implements method "scanForQuick" whcih does no element 
comparisons and
is thus faster than "scanFor".


For the code that I am writing whichs uses set and dictionary operations 
A LOT
and with sometimes large objects to be compared this simple change 
results in a performance
improvement of 1-15%.

The second file contains a a "scanForQuick" method and sometimes a 
"noCheckAdd"
method (overriding the method in Squeak) for each class that gains from 
the new code.
The classes affected are:

Set,
Dictionary,
IdentitySet,
IdentityDictionary,
LiteralDictionary,
methodDictionary,
PluggableDictionary,
PluggableSet,
WeakIdentityKeyDictionary,
WeakKeyDictionary, and
WeakSet.

I don't know what half of these classes do so perhaps someone who does
could test them properly
I have only tested this code for a few hours but I haven't crashed yet.
Over the Easter holiday I will test it more throughly. If I find 
problems I will post
a fix early next week. In the meantime, the adventurous may load it into 
their systems.
I would be interested if anybody does some kind of system test to 
measure the difference
in performance of Squeak with the new code. I imagine the improvement
is probably 1% or less but given how
simple the modification is, it is a worthwhile modification.
I am hoping that this modification can be added to the next release of 
Squeak.

NOTE:
Perhaps a better name for "scanForQuick" is "scanForNil" which exists in 
Squeak.
However "scanForNil" returns 0 if there is no unused entry in the set while
"scanForQuick" reports an error since presumably the size of the array 
has just been
increased. Perhaps some would prefer "scanForDirty".

Ralph Boland

(Hoping someone will take notice this time.)

-------------- next part --------------
SystemOrganization addCategory: #FastSets!


Dictionary subclass: #FastDictionary
	instanceVariableNames: ''
	classVariableNames: ''
	poolDictionaries: ''
	category: 'FastSets'!

Set subclass: #FastSet
	instanceVariableNames: ''
	classVariableNames: ''
	poolDictionaries: ''
	category: 'FastSets'!

FastDictionary class
	instanceVariableNames: ''!

FastSet class
	instanceVariableNames: ''!


!FastDictionary methodsFor: 'as yet unclassified' stamp: 'RPB 4/7/2004 09:51'!
noCheckAdd: anObject
	"Must be defined separately for Dictionary because (self findElementOrNil:) expects a key, not an association.  9/7/96 tk"

	array at: (self scanForQuick: anObject key) put: anObject.
	tally _ tally + 1! !

!FastSet methodsFor: 'as yet unclassified' stamp: 'RPB 4/7/2004 09:49'!
noCheckAdd: anObject

	array at: (self scanForQuick: anObject) put: anObject.
	tally _ tally + 1! !

!FastSet methodsFor: 'as yet unclassified' stamp: 'RPB 4/7/2004 09:39'!
scanForQuick: anObject

	"Scan the key array for the first slot containing either a nil (indicating an empty slot) or an element that matches anObject. Answer the index of that slot or zero if no slot is found. This method will be overridden in various subclasses that have different interpretations for matching elements."

	| start finish |

	start _ (anObject hash \\ array size) + 1.
	finish _ array size.

	"Search from (hash mod size) to the end."
	start to: finish do:
		[:index | ((array at: index) == nil)
			ifTrue: [^ index ]].

	"Search from 1 to where we started."
	1 to: start-1 do:
		[:index | ((array at: index) == nil)
			ifTrue: [^ index ]].

	self error:  'Set failed to grow'.! !
-------------- next part --------------
SystemOrganization addCategory: #WorkRightOrDie!



!Dictionary methodsFor: '*WorkRightOrDie' stamp: 'RPB 4/7/2004 10:45'!
noCheckAdd: anObject
	"Must be defined separately for Dictionary because (self findElementOrNil:) expects a key, not an association.  9/7/96 tk"

	array at: (self scanForQuick: anObject key) put: anObject.
	tally _ tally + 1! !

!Set methodsFor: '*WorkRightOrDie' stamp: 'RPB 4/7/2004 10:44'!
noCheckAdd: anObject
	array at: (self scanForQuick: anObject) put: anObject.
	tally _ tally + 1! !

!IdentityDictionary methodsFor: '*WorkRightOrDie' stamp: 'RPB 4/7/2004 10:17'!
scanForQuick: anObject
	"Scan the key array for the first slot containing either a nil (indicating an empty slot) or an element that matches anObject. Answer the index of that slot or zero if no slot is found. This method will be overridden in various subclasses that have different interpretations for matching elements."
	| finish hash start |
	finish _ array size.
	finish > 4096
		ifTrue: [hash _ anObject identityHash * (finish // 4096)]
		ifFalse: [hash _ anObject identityHash].
	start _ (hash \\ array size) + 1.

	"Search from (hash mod size) to the end."
	start to: finish do:
		[:index | ((array at: index) == nil)
			ifTrue: [^ index ]].

	"Search from 1 to where we started."
	1 to: start-1 do:
		[:index | ((array at: index) == nil)
			ifTrue: [^ index ]].

	self error:  'Dictionary failed to grow'.! !

!IdentitySet methodsFor: '*WorkRightOrDie' stamp: 'RPB 4/7/2004 10:20'!
scanForQuick: anObject
	"Scan the key array for the first slot containing either a nil (indicating an empty slot) or an element that matches anObject. Answer the index of that slot or zero if no slot is found. This method will be overridden in various subclasses that have different interpretations for matching elements."

	| finish hash start  |
	finish _ array size.
	finish > 4096
		ifTrue: [hash _ anObject identityHash * (finish // 4096)]
		ifFalse: [hash _ anObject identityHash].
	start _ (hash \\ array size) + 1.

	"Search from (hash mod size) to the end."
	start to: finish do:
		[:index | ((array at: index) == nil)
			ifTrue: [^ index ]].

	"Search from 1 to where we started."
	1 to: start-1 do:
		[:index | ((array at: index) == nil)
			ifTrue: [^ index ]].
	
	self error:  'Set failed to grow'.! !

!LiteralDictionary methodsFor: '*WorkRightOrDie' stamp: 'RPB 4/7/2004 15:24'!
scanForQuick: anObject
	"Scan the key array for the first slot containing either a nil (indicating an empty slot) or an element that matches anObject. Answer the index of that slot or zero if no slot is found. This method will be overridden in various subclasses that have different interpretations for matching elements."
	| start finish |
	start _ (anObject hash \\ array size) + 1.
	finish _ array size.

	"Search from (hash mod size) to the end."
	start to: finish do:
		[:index | ((array at: index) == nil)
					ifTrue: [^ index ]].

	"Search from 1 to where we started."
	1 to: start-1 do:
		[:index | ((array at: index) == nil)
					ifTrue: [^ index ]].

	self error:  'LiteralDictionary failed to grow'.! !

!MethodDictionary methodsFor: '*WorkRightOrDie' stamp: 'RPB 4/7/2004 11:15'!
scanForQuick: anObject
	"Scan the key array for the first slot containing either a nil (indicating an empty slot) or an element that matches anObject. Answer the index of that slot or zero if no slot is found. This method will be overridden in various subclasses that have different interpretations for matching elements."
	| start finish |
	start _ (anObject identityHash \\ array size) + 1.
	finish _ array size.

	"Search from (hash mod size) to the end."
	start to: finish do:
		[:index | ((self basicAt: index) == nil)
			ifTrue: [^ index ]].

	"Search from 1 to where we started."
	1 to: start-1 do:
		[:index | ((self basicAt: index) == nil)
			ifTrue: [^ index ]].

	self error:  'MethodDictionary failed to grow'! !

!PluggableDictionary methodsFor: '*WorkRightOrDie' stamp: 'RPB 4/7/2004 11:15'!
scanForQuick: anObject 
	"Scan the key array for the first slot containing either a nil
(indicating 
	  an empty slot) or an element that matches anObject. Answer the index 
	  
	of that slot or zero if no slot is found. This  method will be
overridden   
	in various subclasses that have different interpretations for matching 
 
	elements."
	| start finish |
	start _ (hashBlock ifNil: [anObject hash]
				ifNotNil: [hashBlock value: anObject])
				\\ array size + 1.
	finish _ array size.
	"Search from (hash mod size) to the end."
	start to: finish do: [:index | ((array at: index) == nil)
			ifTrue: [^ index]].
	"Search from 1 to where we started."
	1 to: start - 1 do: [:index | ((array at: index) == nil)
			ifTrue: [^ index]].


	self error:  'PluggableDictionary failed to grow'.! !

!PluggableSet methodsFor: '*WorkRightOrDie' stamp: 'RPB 4/7/2004 11:00'!
scanForQuick: anObject 
	"Scan the key array for the first slot containing either a nil
(indicating 
	  an empty slot) or an element that matches anObject. Answer the index 
	  
	of that slot or zero if no slot is found. This  method will be
overridden   
	in various subclasses that have different interpretations for matching 
 
	elements."
	| start finish |
	start _ (hashBlock ifNil: [anObject hash]
				ifNotNil: [hashBlock value: anObject])
				\\ array size + 1.
	finish _ array size.
	"Search from (hash mod size) to the end."
	start to: finish do: [:index | ((array at: index) == nil)
			ifTrue: [^ index]].
	"Search from 1 to where we started."
	1 to: start - 1 do: [:index | ((array at: index) == nil)
			ifTrue: [^ index]].
	
	self error:  'PluggableSet failed to grow'.! !

!Set methodsFor: '*WorkRightOrDie' stamp: 'RPB 4/7/2004 10:43'!
scanForQuick: anObject

	"Scan the key array for the first slot containing either a nil (indicating an empty slot) or an element that matches anObject. Answer the index of that slot or zero if no slot is found. This method will be overridden in various subclasses that have different interpretations for matching elements."

	| start finish |

	start _ (anObject hash \\ array size) + 1.
	finish _ array size.

	"Search from (hash mod size) to the end."
	start to: finish do:
		[:index | ((array at: index) == nil)
			ifTrue: [^ index ]].

	"Search from 1 to where we started."
	1 to: start-1 do:
		[:index | ((array at: index) == nil)
			ifTrue: [^ index ]].

	self error:  'Set failed to grow'.! !

!WeakIdentityKeyDictionary methodsFor: '*WorkRightOrDie' stamp: 'RPB 4/7/2004 11:07'!
scanForQuick: anObject
	"Private. Scan the key array for the first slot containing nil (indicating an empty slot). Answer the index of that slot."
	| start finish hash |
	finish _ array size.
	finish > 4096
		ifTrue: [hash _ anObject identityHash * (finish // 4096)]
		ifFalse: [hash _ anObject identityHash].
	start _ (hash \\ array size) + 1.

	"Search from (hash mod size) to the end."
	start to: finish do:
		[:index | (array at: index) == nil ifTrue: [^ index ]].

	"Search from 1 to where we started."
	1 to: start-1 do:
		[:index | (array at: index) == nil ifTrue: [^ index ]].

	self error:  'WeakIdentityKeyDictionary could not grow'.! !

!WeakKeyDictionary methodsFor: '*WorkRightOrDie' stamp: 'RPB 4/7/2004 11:08'!
scanForQuick: anObject
	"Private. Scan the key array for the first slot containing nil (indicating an empty slot). Answer the index of that slot."
	| start finish |
	start _ (anObject hash \\ array size) + 1.
	finish _ array size.

	"Search from (hash mod size) to the end."
	start to: finish do:
		[:index | (array at: index) == nil ifTrue: [^ index ]].

	"Search from 1 to where we started."
	1 to: start-1 do:
		[:index | (array at: index) == nil ifTrue: [^ index ]].

	self error:  'WeakIdentityKeyDictionary could not grow'.! !

!WeakSet methodsFor: '*WorkRightOrDie' stamp: 'RPB 4/7/2004 11:12'!
scanForQuick: anObject
	"Scan the key array for the first slot containing either a nil (indicating an empty slot) or an element that matches anObject. Answer the index of that slot or zero if no slot is found. This method will be overridden in various subclasses that have different interpretations for matching elements"

	| start finish |

	start _ (anObject hash \\ array size) + 1.
	finish _ array size.

	"Search from (hash mod size) to the end."
	start to: finish do:
		[:index | ((array at: index) == flag)
			ifTrue: [^ index ]].

	"Search from 1 to where we started."
	1 to: start-1 do:
		[:index | ((array at: index) == flag)
			ifTrue: [^ index ]].

	 self error:  'WeakSet failed to grow'.! !


More information about the Squeak-dev mailing list