[Magma] performance (was: test result)

Stephen Pair spair at acm.org
Wed Aug 7 13:04:38 UTC 2002


So what is the logic behind: 

Integer>>hash
	"Hash is reimplemented because = is implemented."

	^(self lastDigit bitShift: 8) + (self digitAt: 1)

????

That looks to me like it's only going to offer 16 bits of hash values
(correct?).  Also, why arbitrarily choose to incorporate the last digit
into the hash?  That seems to me to be a completely random choice that
may or may not be valid for any given set of large integers, yet is
presumably more expensive than "self bitAnd: 16r3FFFFFFF" (which is also
a random choice, but would hopefully perform better).

Is this a hold-over from the days of 16bit pointers?  Should this method
be changed to "self bitAnd: 16r3FFFFFFF"?

- Stephen


> -----Original Message-----
> From: squeak-dev-admin at lists.squeakfoundation.org 
> [mailto:squeak-dev-admin at lists.squeakfoundation.org] On 
> Behalf Of Andreas Raab
> Sent: Wednesday, August 07, 2002 6:57 AM
> To: squeak-dev at lists.squeakfoundation.org
> Subject: RE: [Magma] performance (was: test result)
> 
> 
> Chris,
> 
> > Even with a few collisions, I'm perplexed why removing 2001
> > keys takes **5 hours**..!
> 
> A few collisions?! I don't think so ;-) Let's do a little math:
> 
> Dictionaries use linear searches in order to fix the 
> collision chains. Your objects had hashes ranging from h to 
> h+256. You added roughly 2K objects to the dictionary. Let's look:
> 
> * the first 256 adds had no collisions
> * the next 256 adds had collision chains of approx. length 128
> * the next 256 adds had collision chains of approx. length 
> 384 ... So on average we have (after adding 2K objects) a 
> collision chain of length 1000 for each object (very rough 
> estimation).
> 
> Collision fixing does the following: It traverses the 
> collision chain for the object being removed and finds the 
> new position for each object in the chain. Meaning that we 
> have 1000 loops (traversing the collision
> chain) of 1000 operations (finding new index for object in 
> collision chain - which will naturally also traverse any 
> existing collision chain up to the end). Meaning we got 
> roughly 1,000,000 comparisons for removing a single object.
> 
> Do this 2000 times and you end up with
> 
> 	2,000,000,000 large integer comparisons.
> 
> That *may* take a little, right?! ;-)
> 
> 
> >   ^self bitAnd: 16r3FFFFFFF
> 
> I think that's the right solution.
> 
> Cheers,
>   - Andreas
> 
> 
> 




More information about the Squeak-dev mailing list