Hi,
I've found several sites on google telling me that I shouldn't use
rand() % range+1
and I should, instead, use something like:
lowest+int(range*rand()/(RAND_MAX + 1.0))
They fail to make it clear why. There seems to be less randomness in the
lower bits or something like that. I don't know what less randomness
would mean. Would it not be normally distributed or something like that? 10 2764
On 25 Aug., 22:35, Rafael Cunha de Almeida <n...@email.provided>
wrote:
Hi,
I've found several sites on google telling me that I shouldn't use
* * * * rand() % range+1
and I should, instead, use something like:
* * * * lowest+int(range*rand()/(RAND_MAX + 1.0))
They fail to make it clear why. There seems to be less randomness in the
lower bits or something like that. I don't know what less randomness
would mean. Would it not be normally distributed or something like that?
Take an example of a random generator delivering numbers in the range
0..99. If you want numbers in the range 0..79, your solution would
give an expectancy of numbers 0..19 twice as high as the range 20..79.
So your method is bad if you require an even distribution (but might
be ok if you just want somewhat random distribution as eg for a game).
The other method is better, but not perfect.
/Peter
On 20080825 16:35:21 0400, Rafael Cunha de Almeida <no@email.providedsaid:
>
I've found several sites on google telling me that I shouldn't use
rand() % range+1
and I should, instead, use something like:
lowest+int(range*rand()/(RAND_MAX + 1.0))
They fail to make it clear why.
That's because it's nonsense.
There seems to be less randomness in the
lower bits or something like that. I don't know what less randomness
would mean.
Both have the same problem. Pretend for the moment that RAND_MAX is 4
(that is, rand() returns 0, 1, 2, 3, or 4) and range is also 4 (that
is, you want values of 0, 1, 2, or 3). Since rand() generates five
different values, you can't get four distinct values with equal
likelihood. Try it with pencil and paper. Both ways.
The number of values that you start with has to be an exact multiple of
the number of values that you want to get. The way to do that is to
throw away some values from rand(), namely, those that exceed the
maximum multiple of range that's less than RAND_MAX+1. Like this:
int mult = (RAND_MAX + 1) / range;
int max = range * mult;
int temp = rand();
while (max < temp)
temp = rand();
return temp % range;
Now, that doesn't address the possibility of less randomness in the low
bits, but that's a separate issue, and depends on how rand() is
implemented.
Would it not be normally distributed or something like that?
rand() produces a uniform distribution, not a normal distribution, and
usually when someone is looking for a restricted range they're still
looking for a uniform distribution.

Pete
Roundhouse Consulting, Ltd. ( www.versatilecoding.com) Author of "The
Standard C++ Library Extensions: a Tutorial and Reference
( www.petebecker.com/tr1book)
On 20080825 17:00:48 0400, peter koch <pe***************@gmail.comsaid:
On 25 Aug., 22:35, Rafael Cunha de Almeida <n...@email.provided>
wrote:
>Hi,
I've found several sites on google telling me that I shouldn't use
Â* Â* Â* Â* rand() % range+1
and I should, instead, use something like:
Â* Â* Â* Â* lowest+int(range*rand()/(RAND_MAX + 1.0))
They fail to make it clear why. There seems to be less randomness in the lower bits or something like that. I don't know what less randomness would mean. Would it not be normally distributed or something like that?
Take an example of a random generator delivering numbers in the range
0..99. If you want numbers in the range 0..79, your solution would
give an expectancy of numbers 0..19 twice as high as the range 20..79.
So your method is bad if you require an even distribution (but might
be ok if you just want somewhat random distribution as eg for a game).
The other method is better, but not perfect.
Well, the other method is better in that it makes the nonrandomness
less obvious. <gBut with the same sample ranges it still produces
twenty values twice as often as the rest; it's just that they're not
the twenty lowest ones any more.
Probability is fundamentally just counting. If you can reduce the
problem to a small enough set of values, all you need to do is count
the results. For each possible value that rand() can produce, what
value does each of the formulae above produce? It's a bit tedious to do
for rand() generating numbers in the range 0..99, but for 0..9, with a
target range of 0..7, it won't take more than a few minutes. Don't
trust your intuition until you've gone through several examples and
seen how they really work.

Pete
Roundhouse Consulting, Ltd. ( www.versatilecoding.com) Author of "The
Standard C++ Library Extensions: a Tutorial and Reference
( www.petebecker.com/tr1book)
On 25 Aug., 23:15, Pete Becker <p...@versatilecoding.comwrote:
On 20080825 17:00:48 0400, peter koch <peter.koch.lar...@gmail.comsaid:
On 25 Aug., 22:35, Rafael Cunha de Almeida <n...@email.provided>
wrote:
Hi,
I've found several sites on google telling me that I shouldn't use
* * * * rand() % range+1
and I should, instead, use something like:
* * * * lowest+int(range*rand()/(RAND_MAX + 1.0))
They fail to make it clear why. There seems to be less randomness in the
lower bits or something like that. I don't know what less randomness
would mean. Would it not be normally distributed or something like that?
Take an example of a random generator delivering numbers in the range
0..99. If you want numbers in the range 0..79, your solution would
give an expectancy of numbers 0..19 twice as high as the range 20..79.
So your method is bad if you require an even distribution (but might
be ok if you just want somewhat random distribution as eg for a game).
The other method is better, but not perfect.
Well, the other method is better in that it makes the nonrandomness
less obvious. <gBut with the same sample ranges it still produces
twenty values twice as often as the rest; it's just that they're not
the twenty lowest ones any more.
You are right  the other method also produces twenty numbers twice as
frequent as the first one  just spread evenly. To my embarassment, I
must admit that I did not realise this when I replied. So much for
perfection!
/Peter
On Aug 25, 2:05 pm, Pete Becker <p...@versatilecoding.comwrote:
The number of values that you start with has to be an exact multiple of
the number of values that you want to get. The way to do that is to
throw away some values from rand(), namely, those that exceed the
maximum multiple of range that's less than RAND_MAX+1. Like this:
int mult = (RAND_MAX + 1) / range;
There is a problem if RAND_MAX equals INT_MAX. Adding one would wrap
the value and mult would become zero. So RAND_MAX could be casted to a
suitable type and I think 'unsigned int' works. Or an unsigned 1 would
work:
int mult = (RAND_MAX + 1u) / range;
I remember reading about this algorithm and that correction to it by
Andrew Koenig on clc++m posts.
int max = range * mult;
int temp = rand();
while (max < temp)
temp = rand();
return temp % range;
Ali
On 20080825 17:21:18 0400, ac******@gmail.com said:
On Aug 25, 2:05 pm, Pete Becker <p...@versatilecoding.comwrote:
>The number of values that you start with has to be an exact multiple of the number of values that you want to get. The way to do that is to throw away some values from rand(), namely, those that exceed the maximum multiple of range that's less than RAND_MAX+1. Like this:
int mult = (RAND_MAX + 1) / range;
There is a problem if RAND_MAX equals INT_MAX. Adding one would wrap
the value and mult would become zero.
Yes, usually when I post this information I put in a note that it's
only a sketch, and probably has an offbyone error. Or maybe two of
them.
So RAND_MAX could be casted to a
suitable type and I think 'unsigned int' works. Or an unsigned 1 would
work:
Maybe. RAND_MAX is required to be an integer constant expression, which
doesn't preclude it being unsigned to begin with.

Pete
Roundhouse Consulting, Ltd. ( www.versatilecoding.com) Author of "The
Standard C++ Library Extensions: a Tutorial and Reference
( www.petebecker.com/tr1book)
On 20080825 17:18:54 0400, peter koch <pe***************@gmail.comsaid:
You are right  the other method also produces twenty numbers twice as
frequent as the first one  just spread evenly. To my embarassment, I
must admit that I did not realise this when I replied. So much for
perfection!
Don't be embarassed. It's a very common mistake. Most people's
intuition here leads them astray, which is why it's useful to work
through some simple examples to retrain it.

Pete
Roundhouse Consulting, Ltd. ( www.versatilecoding.com) Author of "The
Standard C++ Library Extensions: a Tutorial and Reference
( www.petebecker.com/tr1book)
On Aug 25, 3:35 pm, Rafael Cunha de Almeida <n...@email.provided>
wrote:
Hi,
I've found several sites on google telling me that I shouldn't use
rand() % range+1
and I should, instead, use something like:
lowest+int(range*rand()/(RAND_MAX + 1.0))
Or function doing roughly the equivalent using integer math, which
might be useful if range is large (to avoid signed integer overflow).
They fail to make it clear why. There seems to be less randomness in the
lower bits or something like that. I don't know what less randomness
would mean. Would it not be normally distributed or something like that?
If rand() is implemented with a linear congruential generator f(x)=ax
+b mod m, then it is a relatively easy automated check that the lower
bits are in general highly correlated (given the first, the second is
somewhat predictable). Directly verifying algebraically is harder,
but a standard homework exercise in the field. The lowest separation
required to demonstrate correlation is usually two through six
invocations apart.
More sophisticated algorithms are harder to verify, but reputedly have
similar problems unless explicitly designed otherwise. [As an extreme
case, the Mersenne Twister has been formally verified to not to be
correlated across 637 consecutive invocations.]
It's moderately unusual to use a highquality RNG to implement rand().
The overall results would still be as approximately uniformly
distributed as rand(), except for unavoidable discretization errors as
mentioned in earlier replies (which wouldn't be that large for small
moduli since RAND_MAX should be at least 32767).
On 20080825 17:47:03 0400, za*****@zaimoni.com said:
On Aug 25, 3:35 pm, Rafael Cunha de Almeida <n...@email.provided>
wrote:
>Hi,
I've found several sites on google telling me that I shouldn't use
rand() % range+1
and I should, instead, use something like:
lowest+int(range*rand()/(RAND_MAX + 1.0))
Or function doing roughly the equivalent using integer math, which
might be useful if range is large (to avoid signed integer overflow).
>They fail to make it clear why. There seems to be less randomness in the lower bits or something like that. I don't know what less randomness would mean. Would it not be normally distributed or something like that?
If rand() is implemented with a linear congruential generator f(x)=ax
+b mod m, then it is a relatively easy automated check that the lower
bits are in general highly correlated (given the first, the second is
somewhat predictable). Directly verifying algebraically is harder,
but a standard homework exercise in the field. The lowest separation
required to demonstrate correlation is usually two through six
invocations apart.
More sophisticated algorithms are harder to verify, but reputedly have
similar problems unless explicitly designed otherwise. [As an extreme
case, the Mersenne Twister has been formally verified to not to be
correlated across 637 consecutive invocations.]
It's moderately unusual to use a highquality RNG to implement rand().
The overall results would still be as approximately uniformly
distributed as rand(), except for unavoidable discretization errors as
mentioned in earlier replies (which wouldn't be that large for small
moduli since RAND_MAX should be at least 32767).
Of course, if the goal is to get working code quickly, the best answer
to all of this is to unask the question. Don't try to implement this
stuff; use the random number generators in TR1. There's an adaptor to
generate uniformly distributed values within a specified range from a
generator that produces uniformly distributed values in a different
range.

Pete
Roundhouse Consulting, Ltd. ( www.versatilecoding.com) Author of "The
Standard C++ Library Extensions: a Tutorial and Reference
( www.petebecker.com/tr1book)
On Aug 25, 4:47 pm, zaim...@zaimoni.com wrote:
More sophisticated algorithms are harder to verify, but reputedly have
similar problems unless explicitly designed otherwise. [As an extreme
case, the Mersenne Twister has been formally verified to not to be
correlated across 637 consecutive invocations.]
Wrong: 623 consecutive invocations. Please accept my apologies. This discussion thread is closed Replies have been disabled for this discussion. Similar topics
4 posts
views
Thread by August1 
last post: by

41 posts
views
Thread by AngleWyrm 
last post: by

6 posts
views
Thread by ChasW 
last post: by

6 posts
views
Thread by SenLung Chen 
last post: by

4 posts
views
Thread by Bill Burris 
last post: by

10 posts
views
Thread by Frank Silvermann 
last post: by

26 posts
views
Thread by Gary Wessle 
last post: by

13 posts
views
Thread by Spiros Bousbouras 
last post: by

18 posts
views
Thread by Jordan Glassman 
last post: by
          