identityHash patch discussion.

Stephen Pair spair at advantive.com
Mon Dec 10 21:01:08 UTC 2001


Why not just add a method like largeHash, and implement "large" variants
of the existing hashed structures (e.g., LargeSet)?  Then you'd only use
the large variant for the collections that you anticipate will grow very
large.  That would seem to sidestep the problem of getting the change
into the image.
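A minimal sketch of that suggestion, in Python rather than Smalltalk (the LargeSet and largeHash names come from the message above; the widening arithmetic, an odd-constant multiply, is an assumption for illustration, not from the original):

```python
class LargeSet:
    """Open-addressed set that buckets by a widened ("large") hash.

    Hypothetical sketch: only collections expected to grow large would
    use this variant, so existing small collections need no rehashing.
    """
    MULTIPLIER = 0x9E3779B1  # illustrative odd constant, not from the patch

    def __init__(self, size=1 << 16):
        self._slots = [None] * size

    def _large_hash(self, narrow_hash):
        # Spread a 12-bit (0..4095) hash across a much larger index range.
        return (narrow_hash & 0xFFF) * self.MULTIPLIER

    def add(self, key, narrow_hash):
        i = self._large_hash(narrow_hash) % len(self._slots)
        while self._slots[i] is not None and self._slots[i] != key:
            i = (i + 1) % len(self._slots)   # linear probe
        self._slots[i] = key

    def includes(self, key, narrow_hash):
        i = self._large_hash(narrow_hash) % len(self._slots)
        while self._slots[i] is not None:
            if self._slots[i] == key:
                return True
            i = (i + 1) % len(self._slots)
        return False

# Usage: many keys share each 12-bit hash, yet lookups stay short
# because the starting indices are spread across the whole table.
s = LargeSet()
for k in range(10000):
    s.add(k, k & 0xFFF)
print(s.includes(1234, 1234 & 0xFFF))
```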

- Stephen

> -----Original Message-----
> From: squeak-dev-admin at lists.squeakfoundation.org 
> [mailto:squeak-dev-admin at lists.squeakfoundation.org] On 
> Behalf Of Scott A Crosby
> Sent: Monday, December 10, 2001 3:33 PM
> To: David Chase
> Cc: squeak-dev at lists.squeakfoundation.org
> Subject: identityHash patch discussion.
> 
> 
> On Mon, 10 Dec 2001, David Chase wrote:
> 
> > Perhaps I am misunderstanding this proposed longer hash
> 
> Yes. All objects in Squeak have 12 bits in their header that 
> form the default hash of that object (the identityHash).
> 
> The problem is that those bits are used as-is, to construct a 
> number in the *specific* range 0...4095, which is used as the 
> starting index in Set>>scanFor:. This flaw means that *any* set 
> and *any* dictionary that has >4000 elements in it will devolve 
> into a linear search through all of the elements, when it could 
> be 4000 times faster.
> 
> The problem is *not* that there are only 12 bits in the 
> object header, but that those 12 bits are interpreted as 
> values in the *specific* range 0...4095.
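The degradation Scott describes can be simulated outside Squeak. A sketch in Python (not Squeak; the table size, key count, and the 64-value cap are arbitrary, scaled down so it runs quickly): when every hash collapses into a small fixed range, linear probing in a large open-addressed table degenerates into a scan.

```python
def build(keys, size, hash_fn):
    """Insert keys into an open-addressed, linear-probing table."""
    table = [None] * size
    for k in keys:
        i = hash_fn(k) % size
        while table[i] is not None:
            i = (i + 1) % size
        table[i] = k
    return table

def probes_to_find(table, key, hash_fn):
    """Count linear-probe steps needed to locate key."""
    i = hash_fn(key) % len(table)
    steps = 1
    while table[i] != key:
        i = (i + 1) % len(table)
        steps += 1
    return steps

keys = list(range(1000))
size = 4096

capped = lambda k: k % 64    # stand-in for a hash stuck in a tiny range
full   = lambda k: k         # hash that can span the whole table

t_bad  = build(keys, size, capped)
t_good = build(keys, size, full)

avg_bad  = sum(probes_to_find(t_bad,  k, capped) for k in keys) / len(keys)
avg_good = sum(probes_to_find(t_good, k, full)   for k in keys) / len(keys)

print(avg_bad)   # hundreds of probes per lookup
print(avg_good)  # 1.0: every key found on the first probe
```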
> 
> My patch just multiplies that 12-bit default-hash number by a 
> constant, to spread the numbers out to avoid this horrible 
> unscalability. The hard part is getting it into the image, 
> because you have to rebuild all existing hash tables.
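The multiply-to-spread idea can be sketched as follows (the constant below is an illustrative odd multiplier, the golden-ratio hash constant, not necessarily the one in Scott's actual patch; an odd multiplier keeps all 4096 values distinct modulo any power-of-two table size, whereas a power-of-two multiplier would collapse them):

```python
MULTIPLIER = 0x9E3779B1   # illustrative odd constant, not from the patch

table_size = 1 << 20      # a large hash table

# Starting indices with the raw 12-bit hash vs. the spread hash.
raw_starts    = {h % table_size for h in range(4096)}
spread_starts = {(h * MULTIPLIER) % table_size for h in range(4096)}

print(len(raw_starts), max(raw_starts))        # 4096 4095: all clustered low
print(len(spread_starts), max(spread_starts))  # 4096 values across the table
```

The raw hash leaves a million-slot table with only its bottom 4096 slots as probe starting points; after multiplying, the same 4096 distinct values are scattered over the whole index range.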
> 
> See my 'Horrible unscalability in collections' thread from a 
> few months back.
> 
> 
> Scott
More information about the Squeak-dev mailing list