MethodDictionary (was Re: Performing a method on only those objects which will understand it)

Scott A Crosby crosby at qwes.math.cmu.edu
Tue Jan 22 02:09:10 UTC 2002


On Mon, 21 Jan 2002, Lex Spoon wrote:

> I'll pipe up with the contrary view.  It's not a big deal, up until
> people start trying to define selectors with the *same* name.  I've

Actually, it is a big deal..

The identityHash stuff I was talking about last time also applies to
MethodDictionary, except its slightly worse because of the multiple
lookups. [**]

There's a brick wall, *any* MethodDictionary with more than 4000 elements
will be 1000 times slower than what we have now. The threshold for things
to start to get bad is about 3000. Thus, you don't want a class with more
than 3000 methods, unless you want performance degradation. (Lower down, I
propose that the limit should be 2000.)

So,
 2000-3000 is the limit you don't want to go over for MethodDictionary.
 3800 is the absolute limit, unless you want degradation of more than 10x
>4000 will be horrible. (40x-1000x degradation)

There are two ways of working around this:
  1. Take my suggested changes for identityHash, and apply them to
methodDictionary. This would require rebuilding both a new VM and
incompatibly translating an image. [*]
  2. Use doesNotUnderstand, and have it catch the extra
selectors and dispatch into an IdentityDictionary in an image using my
identityHash patch.
  3. Introduce a special hack for >2000 methods, where you shift the
hashBits over.


[*] This function seems good for the lookup key:
      ''self hashBits xor: (self hashBits leftShift: 5)
which should allow scalability to 20-40k methods in one class.

[**] I never looking at this code before, I may be able to speed up the
slow lookup-the-hierarchy by 50% in cases where superclasses handle most
of the work.

The problem:

Currently, MethodDictionaries are only doubled in size when they're 3/4
full, thus meaning that they're between 3/8 and 6/8 loaded:

I give the expected costs (in terms of probes into the table) of both a
hit and a miss as a function of different load factors.

      Miss   Hit
1/8   1.14  1.07
2/8   1.33  1.16
3/8   1.60  1.30
4/8   2.00  1.50
5/8   2.66  1.83
6/8   3.00  2.00
7/8   8.00  4.50

NOW   2.13  1.57  (average load: .53)
ME    1.59  1.29  (average load: .37)

NOW is the costs as computed given the current average load in the system.

ME is what we'd have if we changed the load factor for all
methodDictionaries to make sure they're 25-50% full.

The cost is about 50% larger methodDictionaries. (256kb)

Note that we pay the above miss cost each time the method search has to go
up one class level in the heirarchy to try to find a method.

This change wouldn't help inner loops that use the methodCache, but
anything touching a lot of methods, most of which are in superclasses of
the object, would be helped, perhaps by 30-40%.

--

Implementing it is easy, just redefine methodDictionary fullCheck to
rehash when the set gets over half-full.

So what do y'all say?

Scott





More information about the Squeak-dev mailing list