New method cache, 30% faster macrobenchmarks and inefficiencies.

Scott A Crosby crosby at qwes.math.cmu.edu
Mon Dec 10 07:04:37 UTC 2001


On Mon, 10 Dec 2001, Stephan Rudlof wrote:

> Scott A Crosby wrote:
> >
> > On Sun, 9 Dec 2001, Stephan Rudlof wrote:
> >
> >
> > > But for ordinary - not VM related - hashed collections an alternative hash
> > > scheme with VM support (named #longHash?) could make sense. And the use of
> >
> > It's probably not relevant. My hashed collection stuff avoids a nasty big
> > performance degradation for when you get to >4000 elements (where
> > performance will fall by a factor of 1000 as the algorithm devolves into a
> > linear search). If someone needs to store >100k objects in a hash table,
> > they'll probably be far better served by writing a custom 'hash' function
> > and storing a 30-bit hash value in a slot.
>
> One last - hopefully ;-) - point:
> Why not build support for the changed ProtoObject>>identityHash named
> e.g. #longHash into the VM?

Smalltalk already has two notions of hash: the built-in hash
'identityHash' and a user-chosen custom hash 'hash'. Many classes
define their own notion of 'hash' (String, the collections, etc.).

Thus, the added layer of 'longHash' seems ill-defined and pointless. If an
object needs more hash bits, can it not store those bits as an internal
field, and override 'hash' with a new function returning those bits?
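To make the "store the bits in a field and override 'hash'" idea concrete, here is a minimal sketch. It is written in Python rather than Smalltalk, purely as an analogue: `__hash__` stands in for #hash, and `narrow_identity_hash` stands in for a small built-in identity hash. The class name, the helper, the counter-based scramble, and the 12-bit width are all assumptions chosen for illustration (12 bits gives 4096 values, in line with the >4000-element threshold quoted above); only the 30-bit figure comes from this thread.

```python
import itertools

# Hypothetical creation counter; each new object draws the next value.
_creation_counter = itertools.count(1)

HASH_BITS = 30          # width of the stored hash (figure from the thread)
NARROW_BITS = 12        # assumed width of the built-in identity hash


class WideHashed:
    """Hypothetical object that keeps its own wide hash in a slot."""

    def __init__(self, name):
        self.name = name
        # Assumption: scramble the counter with an odd multiplier so that
        # consecutive objects get distinct, well-spread 30-bit values.
        self._hash_bits = (next(_creation_counter) * 2654435761) % (1 << HASH_BITS)

    def __hash__(self):
        # The override: answer the stored bits instead of a narrow default.
        return self._hash_bits

    def __eq__(self, other):
        # Identity comparison, as an identity-based collection would use.
        return self is other


def narrow_identity_hash(obj):
    """Stand-in for a narrow built-in hash: keep only the low 12 bits."""
    return hash(obj) & ((1 << NARROW_BITS) - 1)


def distinct_hash_counts(n):
    """Create n objects; return (distinct wide hashes, distinct narrow hashes)."""
    objects = [WideHashed(i) for i in range(n)]
    wide = {hash(o) for o in objects}
    narrow = {narrow_identity_hash(o) for o in objects}
    return len(wide), len(narrow)
```

With 10,000 objects, the narrow hash can take at most 4096 distinct values, so collisions are unavoidable, while the stored 30-bit values in this sketch stay distinct. The hashed collection itself needs no new protocol; it just calls the ordinary hash method.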

I.e., #identityHash returns SOME number based on the object's as-constructed
ID, with nothing forcing that number to be exactly #hashBits. And your
notion of #longHash is equivalent to #hash?

> Answer: there is only about a 25% performance loss for a *very* simple loop
> for the ST version:
>

Not too surprised; it's an extra method invocation. Note that this
overstates the numbers: usually we invoke #hash, not #identityHash, so
we're going from 2 levels of indirection to 3, and the relative
degradation will be lower.

>
> This isn't worth introducing a new prim, is it?
>

Not yet. :) Well, not unless the next incompatible VM comes out; then it
probably makes sense to reclaim the bytecode and make both #hashBits and
#identityHash named prims.


Scott





