[Magma] performance (was: test result)

Chris Muller afunkyobject at yahoo.com
Wed Aug 7 18:30:32 UTC 2002


Stephen Pair wrote:

>So what is the logic behind: 
>
>Integer>>hash
>	"Hash is reimplemented because = is implemented."
>
>	^(self lastDigit bitShift: 8) + (self digitAt: 1)

I was curious about that as well.  I haven't attempted a deep understanding of
it, but for LargePositiveInteger, my first inclination was to answer the
modulo against the highest SmallInteger because it would seem to "repeat" the
cycle of SmallInteger hash values over and over for Large ones.
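
For reference, here is a rough model of the quoted Integer>>hash, written in
Python only so it can be run outside an image.  It assumes the digits are
bytes (base 256) with digit 1 the least significant, as in a 32-bit Squeak;
the names digits and quoted_hash are made up for this sketch.  Under that
assumption only the top and bottom bytes feed the hash, so there are at most
65536 distinct values, and integers differing only in their middle bytes
collide:

```python
def digits(n):
    """Base-256 digits of a positive integer, least significant first."""
    if n <= 0:
        raise ValueError("this model covers positive integers only")
    return list(n.to_bytes((n.bit_length() + 7) // 8, "little"))


def quoted_hash(n):
    """Model of: ^(self lastDigit bitShift: 8) + (self digitAt: 1)"""
    d = digits(n)
    # d[-1] plays the role of lastDigit, d[0] the role of digitAt: 1
    return (d[-1] << 8) + d[0]


a = 0x1234567890
b = 0x12FFFFFF90  # same top and bottom byte as a, different middle bytes
print(hex(quoted_hash(a)), hex(quoted_hash(b)))
```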

Then, when reworking my oid ranges, I noticed that 16r3FFFFFFF *is* actually
the largest SmallInteger, not 16rFFFFFFFF, and the 16r3FFFFFFF suddenly made
more sense.  Maybe a slightly more general answer for
LargePositiveInteger>>hash would be:

  ^self bitAnd: SmallInteger maxVal
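
A small Python sketch of what that bitAnd: would answer, assuming
SmallInteger maxVal is 16r3FFFFFFF (the name masked_hash is hypothetical).
Note that masking keeps the low 30 bits, which is the same as taking the
value modulo 2^30 rather than modulo maxVal itself, so the two ideas differ
slightly:

```python
SMALL_INTEGER_MAX = 0x3FFFFFFF  # assumed value of SmallInteger maxVal (2**30 - 1)


def masked_hash(n):
    """Model of: ^self bitAnd: SmallInteger maxVal"""
    return n & SMALL_INTEGER_MAX


n = 2**30  # smallest LargePositiveInteger under the assumption above
print(masked_hash(n), n % 2**30, n % SMALL_INTEGER_MAX)

# Neighbouring large values keep distinct hashes, unlike a hash built
# from only the top and bottom bytes.
print(masked_hash(2**32 + 1), masked_hash(2**32 + 256 + 1))
```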

In answering whether such a change would be "safe", Cees wrote:

> I wouldn't know why not. As we've seen, all the damage that bad hash values
> can do is to cause excessive collisions, so if you have a better hash
> function, I'd say add it.

The only thing I was wondering..  I haven't checked a true base 3.2final image
yet, but I had 106000 instances of LargePositiveInteger in my fairly stock 3.2
image.  If any of those are referenced by hashed collections needed by the
Squeak system, changing the #hash method would presumably require a rehash of
those collections to avoid potential system instability??  I just don't know..
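
The rehash concern can be illustrated with a toy bucketed table in Python.
This is only a sketch: BucketTable is invented for the example and is far
simpler than Squeak's Set or Dictionary, and the two hash functions are the
models assumed above.  An element filed under the old hash is not found once
the hash function changes, until the table is rehashed:

```python
def old_hash(n):
    """Model of the existing hash: top byte shifted left 8, plus low byte."""
    raw = n.to_bytes((n.bit_length() + 7) // 8, "little")
    return (raw[-1] << 8) + raw[0]


def new_hash(n):
    """Model of the proposed hash: bitAnd: 16r3FFFFFFF."""
    return n & 0x3FFFFFFF


class BucketTable:
    """Toy hashed collection; the bucket index is hash modulo table size."""

    def __init__(self, hash_fn, size=61):
        self.hash_fn = hash_fn
        self.buckets = [[] for _ in range(size)]

    def _bucket(self, key):
        return self.buckets[self.hash_fn(key) % len(self.buckets)]

    def add(self, key):
        bucket = self._bucket(key)
        if key not in bucket:
            bucket.append(key)

    def includes(self, key):
        return key in self._bucket(key)

    def rehash(self):
        """Re-file every element using the current hash function."""
        keys = [k for bucket in self.buckets for k in bucket]
        self.buckets = [[] for _ in self.buckets]
        for k in keys:
            self.add(k)


table = BucketTable(old_hash)
key = 0x1234567890
table.add(key)
table.hash_fn = new_hash           # simulate changing #hash in a live image
found_before = table.includes(key) # looks in the wrong bucket
table.rehash()
found_after = table.includes(key)
print(found_before, found_after)
```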

Some good news is, I may be able to defer this to someone more qualified and
still start user-object oids below 32-bits without having to sacrifice the
64-bit maximum potential size of a repository.  By assigning the
"special-oids", such as the non-integral atomics starting at 0, *below* the
first user-object oid instead of above it, the user-object oids can be a
contiguous range from something still well-below 32-bits to just a couple
billion shy of 16rFFFFFFFFFFFFFFFF, which SmallIntegers will now occupy.
Output from the serializer and, thus, network transmission, will also be
reduced, making for even better performance.

SmallIntegers will be just a *little* more costly, because they'll have to be
subtracted down to their real value (w/ LargeInteger arithmetic), but in
exchange for the huge improvement in user-objects, it's way worth it.  I sure
had it backward!  I originally put SmallIntegers at the low-end to avoid the
LargeInteger arithmetic, but that doesn't appear to be an issue.
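
A sketch of how SmallIntegers could sit at the top of a 64-bit oid space, in
Python.  The exact boundaries are not stated above, so every constant and
function name here is an assumption for illustration: the SmallInteger range
is taken as -2^30 through 2^30 - 1, and it is mapped onto the last 2^31 oids
(the "couple billion") so that maxVal lands on 16rFFFFFFFFFFFFFFFF.
Recovering the real value is then one subtraction:

```python
MAX_OID = 0xFFFFFFFFFFFFFFFF      # top of the 64-bit oid space
SMALL_INTEGER_MIN = -0x40000000   # assumed SmallInteger minVal
SMALL_INTEGER_MAX = 0x3FFFFFFF    # assumed SmallInteger maxVal

# Hypothetical layout: the oid that stands for the SmallInteger 0.
SMALL_INTEGER_ZERO_OID = MAX_OID - SMALL_INTEGER_MAX
FIRST_SMALL_INTEGER_OID = SMALL_INTEGER_ZERO_OID + SMALL_INTEGER_MIN


def oid_for_small_integer(n):
    """Map a SmallInteger value to its oid near the top of the range."""
    if not SMALL_INTEGER_MIN <= n <= SMALL_INTEGER_MAX:
        raise ValueError("not a SmallInteger under the assumed range")
    return SMALL_INTEGER_ZERO_OID + n


def is_small_integer_oid(oid):
    return FIRST_SMALL_INTEGER_OID <= oid <= MAX_OID


def small_integer_for_oid(oid):
    """Subtract the oid back down to the real value."""
    if not is_small_integer_oid(oid):
        raise ValueError("oid is outside the SmallInteger block")
    return oid - SMALL_INTEGER_ZERO_OID


print(hex(FIRST_SMALL_INTEGER_OID), hex(oid_for_small_integer(SMALL_INTEGER_MAX)))
```

Everything below FIRST_SMALL_INTEGER_OID stays available for the special
oids and the contiguous user-object range.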

I also want to include the stubOut: method that Avi helped me understand is
truly needed.  I hope to finish this up and run tests over the next couple
of days.

Thank you very much for your assistance!

Regards,
  Chris

