[FIX][BUG] Set>>atRandom: is not random ( another - better? -
solution)
Lothar Schenk
lothar.schenk at gmx.de
Fri Feb 27 17:42:14 UTC 2004
I had tried to treat this bug, because Samir's original post was listed as
'Unreviewed' in BFAV. Now, I see that there has already been a discussion
about possible solutions by Torge Husfeldt and Daniel Vainsencher. Torge
Husfeldt has already presented 2 possible solutions, both of which were not
quite satisfactory due to different drawbacks. Torge's second solution is
essentially the same as mine (though more succinctly implemented). I agree
that it has the drawback of being time-consuming in that it has to iterate
(on the average) over half the Set, which may be significant for large Sets.
I think, I have now a better solution. It takes advantage of the fact, that
Sets are not guaranteed to be in a particular order, i.e. it should be
possible to move the elements around within the array used for implementation
without affecting the outside behavior. (SOMEBODY PLEASE CONFIRM THIS).
My new solution is essentially the same as the original one, except that
afterwards (when a scan from a nil element was necessary) it exchanges the
place of the element that is found with the one which was originally randomly
chosen. This has the effect that subsequent hits behind the first slot will
no longer lead to the same element.
As an example, if the Set is implemented by array
( nil 'a' 'b' nil nil nil)
and our random sequence begins with the elements #5 #1 ...)
then the original implementation hits nil at spot #5 and scans onward until it
finds 'a' at spot #2, then it hits nil at spot #1 and scans onward, again
finding 'a' at spot #2.
In my implementation, after the first scan, the 'a' gets moved to spot #5, so
the array then is
(nil nil 'b' nil 'a' nil)
and the subsequent scan after the hit at #1 now produces 'b' as result.
I have tested this with the code given by Samir and found that this trick does
indeed produce a more ideally uniform distribution, while not being as costly
as the first solution.
Regards, Lothar
===============================================
Subject: [FIX][BUG] Set>>atRandom: is not random
Date: Thu, 26 Feb 2004 21:22:49 +0100
From: Lothar Schenk <lothar.schenk at gmx.de>
To: The general-purpose Squeak developers list
<squeak-dev at lists.squeakfoundation.org>
The problem still persists in a 3.7a image. Samir's explanation is correct.
The current implementation picks a random element in the array internal to
Set, and if this element is nil it starts to scan for a non-nil element from
there. This leads to a non-uniform distribution of the sampled elements when
nil-elements are present.
I have attached a fix.
Regards, Lothar
==============================================
Subject: [BUG] Set>>atRandom: is not random
Author: Samir Saidani
Date Posted: 26 June 2003
Archive ID: 10798
Comments: Here is a little piece of code to reproduce the bug:
|s a|
s_ Set new add: 'a'; add: 'b'; yourself.
a_Random new.
i_0.
500 timesRepeat: [ ((s atRandom: a)='a') ifTrue:[ i_i+1 ]. ].
Transcript show: i; cr.
Results (~100) show that you have 1/5 to have the first element.
A workaround is to change a set as an array (probability is 1/2).
|s a|
s_ Set new add: 'a'; add: 'b'; yourself.
a_Random new.
i_0.
500 timesRepeat: [ ((s asArray atRandom: a)='a') ifTrue:[ i_i+1 ]. ].
Transcript show: i; cr.
Don't have time for the moment to fix it, but certainly due to the
"array" representation of a set (Set(1,2) appears as an array of size
5 #(nil 1 2 nil nil nil).
--
Samir SAIDANI
-------------------------------------------------------
-------------- next part --------------
'From Squeak3.7alpha of 11 September 2003 [latest update: #5707] on 27 February 2004 at 5:54:21 pm'!
!Set methodsFor: 'accessing' stamp: 'los 2/27/2004 17:54'!
atRandom: aGenerator
"Answer a random element of the receiver. Uses aGenerator which
should be kept by the user in a variable and used every time. Use
this instead of #atRandom for better uniformity of random numbers
because only you use the generator. Causes an error if self has no
elements."
| ind ind2 |
self emptyCheck. "There must be at least one regular (not-nil) element in array"
ind := aGenerator nextInt: array size.
(array at: ind) ifNotNil: [ ^ array at: ind ].
"We hit a nil element (empty slot) -
scan for the next regular (non-nil) element"
ind2 := ind.
[ (array at: ind2) isNil ] whileTrue:
[ ind2 := ind2 + 1.
(ind2 > array size) ifTrue: [ ind2 := 1 ] "wrap if beyond end of array" ].
"Move the element we found to the original spot that was first selected
to counteract the effect that nil elements may have on the uniformity
of the distribution of selected elements"
array at: ind put: (array at: ind2).
array at: ind2 put: nil.
^ array at: ind! !
More information about the Squeak-dev
mailing list
|