On Jun 28, 9:23 pm, Generic Usenet Account <use...@sta.samsung.com>

wrote:

On Jun 27, 5:55 am, James Kanze <james.ka...@gmail.comwrote:

I wonder if it's really necessary to bother. He's using rand()

to choose an element, and RAND_MAX can be (and far too often is)

as little as 32767. So if he has more elements that that, his

code isn't going to work. Also, he used the expression

rand() % distance to choose the element. This introduces a bias

toward the smaller elements; the closer distance is to RAND_MAX,

the larger the biais. Unless the number of elements is actually

very small, the biais will be noticeable. So we can conclude

that the actual number of elements will probably not be much

more than 10 or 20. And that a simple int will work just fine.

I would really appreciate if you

could tell me how I can avoid the "bias towards smaller elements". I

have always used the "modulo division by random number" trick to

restrict my pseudo-random numbers to a certain range.

That's usually sufficient for fairly small values. The bias

towards the smaller elements problems is easily understood,

however, if you consider a perfect random generator with a very

small RAND_MAX. Suppose RAND_MAX is 9, i.e. your generator

generates all of the numbers from 0 to 9, with equal

probability. Now consider what happens if you do a modulo 6.

What are the results for each number generated:

0 % 6 = 0

1 % 6 = 1

2 % 6 = 2

3 % 6 = 3

4 % 6 = 4

5 % 6 = 5

6 % 6 = 0

7 % 6 = 1

8 % 6 = 2

9 % 6 = 3

It's obvious that the values 0-3 each appear 20% of the time,

whereas 4 and 5 only appear 10%.

The problem doesn't occur if RAND_MAX + 1 is a multiple of your

value, so the usual solution is just to throw out any values

above the last multiple. For a random value in the range

0...N-1, I use something like:

unsigned const limit = RAND_MAX+1 - (RAND_MAX+1) % N ;

unsigned result = next() ;

while ( result >= limit ) {

result = next() ;

}

return result % N ;

I am keen to

avoid this mistake at some future date. A sample code snippet would

be great. Although I do not expect anywhere close to 32767 entries in

my collection, it is certainly going to be much more than 10 or 20.

Be careful with rand(). In the past, at least, a lot of

implementations have been very bad... some systematically

alternate even and odd, for example. For anything more than

just playing around, get a good generator with known behavior.

(Boost has quite a few.)

--

James Kanze (GABI Software) email:ja*********@gmail.com

Conseils en informatique orientée objet/

Beratung in objektorientierter Datenverarbeitung

9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34