[FIX][BUG] Set>>atRandom: is not random ( how to close the matter? )

Lothar Schenk lothar.schenk at gmx.de
Sat Feb 28 11:25:48 UTC 2004


Ned Konz wrote:

> > I think, I have now a better solution. It takes advantage of the fact,
> > that Sets are not guaranteed to be in a particular order, i.e. it should
> > be possible to move the elements around within the array used for
> > implementation without affecting the outside behavior. (SOMEBODY PLEASE
> > CONFIRM THIS).
>
> No, the array is in a particular order: its members are in slots whose
> indices are computed using the hash of the object modulo the size of the
> array.
>
> So re-ordering the array (at least without changing the rest of the logic)
> will cause lookup failures, or big slowdowns. Don't do that.

I see. That's ruled out for good, then.

How can we close this matter, then? I think we have three possibilities:

(1) Accept the status quo, i.e. have an #atRandom: method in Set which is not 
satisfactorily distributed for small Sets or Sets with many empty slots. In 
that case I think there should be a caveat appended to the method commentary.

(2) Adopt Torge Husfeldts first solution involving repeated calls to the 
random number generator until one hits an occupied slot. It's essentially up 
to chance how long that takes.

(3) Adopt Torge Husfeldts second solution, which picks the index of an 
existing element randomly and scans the Set from the beginning until it has 
found the element. This solution has the advantage that it seems to work 
reliably and presumably satisfies the requirements on uniform distribution of 
the results.

Personally, I think that anyone who seriously wants to use random picking of 
elements from a Set would probably also require that the distribution of 
those elements follow a theoretical norm as a primary consideration. 
Efficiency would only be of secondary importance.

My vote goes to solution (3), as I can understand this most readily, with 
solution (2) as my preferred secondary option.

Regards, Lothar




More information about the Squeak-dev mailing list