Color>>#hash

Andres Valloud sqrmax at cvtci.com.ar
Sat Feb 27 23:58:26 UTC 1999


Hi.

Colors are implemented by keeping the RGB components into 10 bits each, and all
of them fit in a SmallInteger called rgb. The hash of colors is ^rgb. The
instance variable rgb stores all what the color is, in terms of the RGB space
model. So, rgb is pretty much like this:

[ 10 bits red | 10 bits green | 10 bits blue ] [sign] [SmallInteger mark bit]

And so is the hash of colors. Let's put a ton (let's say 1024 to make round
numbers) of colors in a Dictionary as keys. When stored there, the keys are put
inside an Array and the index at which they will be is determined from the hash
value of such objects. It's pretty like looking up in a real dictionary by
saying "well I'm looking for a word that starts with A, I won't binary search
the whole dictionary, just a subsection of the whole A section".

To determine such indexes, Dictionaries get the size of the key Array and then
take the remainder of dividing the hash of the key by that size. Then what
happens?

Colors are just hashed by their red component.

And that's bad. Example.

When the nearest color from an array of colors to a given color is computer, the
result gets cached because once it's determined it won't change and the
computation is quite hard on the cpu cycles. The operation that takes most time
to complete, when rendering colors from a Form to another one with a different
color map, is precisely to access the cache with at:, because the cache becomes
a Dictionary. About 20 something % of the time is spent doing #at:ifAbsent:.
Here are the leaves of a tally on the whole process, worth 4m 16s 547ms.

24.7 Dictionary>>scanFor:
7.0 Color>>distanceTo:
6.4 BitBlt>>pixelAt:
6.0 BitBlt>>pixelAt:put:
5.4 Form class>>extent:depth:
5.3 BitBlt class>>destForm:sourceForm:hal...rigin:extent:clipRect:
5.0 Form>>setExtent:depth:
4.5 Rectangle class>>origin:extent:
2.5 Color class>>r:g:b:range:
2.1 ColorReducer>>closestColorIndexFrom:to:

In this case, the dictionary in question has 630 keys and 256 different value
colors. So, to make 512x384 = 196608 accesses to a dictionary takes roughly a
minute and change, which gives around 3200 accesses per second. I wonder how
well a sorted collection of colors will perform (sorting by hash which would be
lexicographical order). In this case Dictionaries perform slowly since there are
a lot of colors that are very close to each other and so have similar red
components and so hash collisions.

Then, what to do? It says that Color was implemented that way on purpose to save
space. Is it ok to compute a slower hash? What slower hash? IMHO, it should be
representative of the color in question more evenly through the bits of the
hash.

What, if anything, should be done with the hash?

Andres.





More information about the Squeak-dev mailing list