[squeak-dev] hashing question
Levente Uzonyi
leves at elte.hu
Wed Nov 3 18:55:50 UTC 2010
On Wed, 3 Nov 2010, Chris Muller wrote:
> Hi all, assuming I have a special-kind of array that can grow without
> enumerating the entire old array, is there any way to implement a
> Dictionary that can grow without enumerating its internal array? I'll
> elaborate.
I guess you are looking for something like this:
http://en.wikipedia.org/wiki/Linear_hashing
http://en.wikipedia.org/wiki/Consistent_hashing
Levente
>
> In our own Dictionary, when we grow there are two reasons why we must
> enumerate the entire old array:
>
> 1) because Array's in Squeak can't grow; for that we have
> OrderedCollection. So we must allocate a new Array and copy the
> existing elements to the new-sized array.
> 2) during this time, the elements are redistributed based on modulo
> the new array's size.
>
> I have a special kind of Array (let's call it SpecialArray) that can
> grow without having to re-allocate (because it's on disk and files can
> grow). I want to use this as the basis for a new kind of Dictionary,
> but I want to avoid the need to _redistribute_ the elements.
>
> One idea is to pre-allocate as large a SpecialArray as I think I'll
> ever need (say, 30 million) and simply don't grow. At 6-bytes per
> pointer, that's a 90MB file, but at least it won't ever get any
> bigger. "Disk space is cheap," and, in most cases, this actually is
> fine for my needs.
>
> Still, it prevents one from having a lot of instances of SpecialArray
> if they wanted, and one doesn't always know what will be "more than
> enough".
>
> Is there a more elegant solution that could make use of SpecialArray's
> ability to grow, but not redistribute? It seems like not, since not
> only we need to use the new "upper half" of the Array for new
> elements, we still need to find elements in the "lower-half" (the
> first half). The size of the internal array is used in the hash key
> calculation, so that would be a problem for finding the old elements..
>
> - Chris
>
> PS - I thought I remember someone posting a book about hashing
> algorithms, but I cannot find it..
>
>
More information about the Squeak-dev
mailing list
|