Image using high CPU in WeakIdentityKeyDictionary>>scanFor:

Ross Boylan RossBoylan at stanfordalumni.org
Tue Sep 20 19:26:27 UTC 2005


On Tue, Sep 20, 2005 at 03:13:28AM -0700, Daniel Vainsencher wrote:
> Hi Ross.
> 
> There you go -
> 
> Your scanFor: is indeed screwy. Since it's an identity dictionary, your 
> hash values are 0-4095. Therefore once array size >= 4096, each key goes 
> directly to its own hash value, instead of being more or less evenly 
> spread out over the array.
My array size (though not the contents size) is already over the
limit:

	WeakIdentityKeyDictionary allInstances collect: [:x | x capacity]
		"=> #(24576 192 12288)"

> The effect is seen easily by inspection - in that image, inspect the 
> array, and you'll see half is very congested, and half is completely
> empty.

When I tried to open an inspector on Object's class var EventsField it
took so long I had to kill it -- I'm not sure whether it got stuck in
some kind of loop.  But if identity hashes are < 4096 (I didn't
realize they were so small), things are clearly in bad shape.

> 
> Look at the equivalent code in a modern version, you'll see some simple 
> arithmetic spreading the 4095 values over the size of the array.
In 3.8 final there's this:
	finish _ array size.
	finish > 4096
		ifTrue: [hash _ anObject identityHash * (finish // 4096)]
		ifFalse: [hash _ anObject identityHash].
	start _ (hash \\ array size) + 1.
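For what it's worth, here is a small sketch of that arithmetic (in
Python only because it's easy to run standalone; the function name
start_index is mine, 24576 is one of the capacities reported above,
and the 0-4095 identityHash range is as Daniel described).  It shows
where the home slots land with and without the multiplier:

```python
def start_index(identity_hash, array_size, spread=True):
    """Mimic the 3.8 scanFor: start computation (1-based, like Smalltalk)."""
    if spread and array_size > 4096:
        h = identity_hash * (array_size // 4096)
    else:
        h = identity_hash  # pre-3.8 behavior: raw identityHash
    return (h % array_size) + 1

size = 24576  # one of the capacities reported above
old = {start_index(h, size, spread=False) for h in range(4096)}
new = {start_index(h, size, spread=True) for h in range(4096)}

# Without spreading, every home slot falls in the first 4096 entries,
# so collision chains from the many objects sharing each 12-bit hash
# pile up in one crowded region while the rest of the array sits empty.
print(max(old))   # 4096
# With the multiplier, home slots reach across the whole array, leaving
# gaps between them that absorb the linear-probe overflow.
print(max(new))   # 24571
```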

> 
> Your arrays are of an under critical size - at >4096 entries you would 
> be effectively working with a slow linear search. At the moment, you 
> probably occasionally get some mileage from the hash... but it's still 
> far slower than it should be.
> 
> So you should definitely think about upgrading, either the image or at 
> least that method. Of course you should study the matter and take the 
> appropriate precautions, and test that it really solves your problem, 
> but strictly at the information level, don't forget to rehash very 
> quickly after any change in the hashing scheme of any affected data 
> structures.

Upgrading the image is a big project, which I've started but will not
soon finish :(.

Should changing those two methods be sufficient?  This looks like a
hazardous operation.  I have a process which tickles the image every
few seconds and causes updates that would use some of these
dictionaries.  I'll suspend that process, but the operation still
looks very thread-unsafe.

I'm sure this won't end my performance problems, but I hope it will
help.

> 
> Sheesh, amazing how long lived this bug is... if someone has some time 
> and the inclination to finally put a stake through it, I have some advice.
> 

I think the bug is already dead, so nothing needs to be filed about
it.  Recall that the image with the problem is basically at 3.6 final;
3.8 final has the improved code quoted above.



More information about the Squeak-dev mailing list