Problem with rand() % range+1

On 2008-08-25 16:35:21 -0400, Rafael Cunha de Almeida <no@email.providedsaid:

>
I've found several sites on google telling me that I shouldn't use

rand() % range+1

and I should, instead, use something like:

lowest+int(range*rand()/(RAND_MAX + 1.0))

They fail to make it clear why.

That's because it's nonsense.

There seems to be less randomness in the
lower bits or something like that. I don't know what less randomness
would mean.

Both have the same problem. Pretend for the moment that RAND_MAX is 4
(that is, rand() returns 0, 1, 2, 3, or 4) and range is also 4 (that
is, you want values of 0, 1, 2, or 3). Since rand() generates five
different values, you can't get four distinct values with equal
likelihood. Try it with pencil and paper. Both ways.

The number of values that you start with has to be an exact multiple of
the number of values that you want to get. The way to do that is to
throw away some values from rand(), namely, those that exceed the
maximum multiple of range that's less than RAND_MAX+1. Like this:

int mult = (RAND_MAX + 1) / range;
int max = range * mult;
int temp = rand();
while (max < temp)
temp = rand();
return temp % range;

Now, that doesn't address the possibility of less randomness in the low
bits, but that's a separate issue, and depends on how rand() is
implemented.

Would it not be normally distributed or something like that?

rand() produces a uniform distribution, not a normal distribution, and
usually when someone is looking for a restricted range they're still
looking for a uniform distribution.

--
Pete
Roundhouse Consulting, Ltd. (www.versatilecoding.com) Author of "The
Standard C++ Library Extensions: a Tutorial and Reference
(www.petebecker.com/tr1book)

Aug 25 '08 #3

On 2008-08-25 17:00:48 -0400, peter koch <pe***************@gmail.comsaid:

On 25 Aug., 22:35, Rafael Cunha de Almeida <n...@email.provided>
wrote:
>Hi,

I've found several sites on google telling me that I shouldn't use

Â* Â* Â* Â* rand() % range+1

and I should, instead, use something like:

Â* Â* Â* Â* lowest+int(range*rand()/(RAND_MAX + 1.0))

They fail to make it clear why. There seems to be less randomness in the
lower bits or something like that. I don't know what less randomness
would mean. Would it not be normally distributed or something like that?

Take an example of a random generator delivering numbers in the range
0..99. If you want numbers in the range 0..79, your solution would
give an expectancy of numbers 0..19 twice as high as the range 20..79.
So your method is bad if you require an even distribution (but might
be ok if you just want somewhat random distribution as eg for a game).
The other method is better, but not perfect.

Well, the other method is better in that it makes the non-randomness
less obvious. <gBut with the same sample ranges it still produces
twenty values twice as often as the rest; it's just that they're not
the twenty lowest ones any more.

Probability is fundamentally just counting. If you can reduce the
problem to a small enough set of values, all you need to do is count
the results. For each possible value that rand() can produce, what
value does each of the formulae above produce? It's a bit tedious to do
for rand() generating numbers in the range 0..99, but for 0..9, with a
target range of 0..7, it won't take more than a few minutes. Don't
trust your intuition until you've gone through several examples and
seen how they really work.

--
Pete
Roundhouse Consulting, Ltd. (www.versatilecoding.com) Author of "The
Standard C++ Library Extensions: a Tutorial and Reference
(www.petebecker.com/tr1book)

Aug 25 '08 #4

peter koch

On 25 Aug., 23:15, Pete Becker <p...@versatilecoding.comwrote:

On 2008-08-25 17:00:48 -0400, peter koch <peter.koch.lar...@gmail.comsaid:

On 25 Aug., 22:35, Rafael Cunha de Almeida <n...@email.provided>
wrote:
Hi,

I've found several sites on google telling me that I shouldn't use

* * * * rand() % range+1

and I should, instead, use something like:

* * * * lowest+int(range*rand()/(RAND_MAX + 1.0))

They fail to make it clear why. There seems to be less randomness in the
lower bits or something like that. I don't know what less randomness
would mean. Would it not be normally distributed or something like that?

Take an example of a random generator delivering numbers in the range
0..99. If you want numbers in the range 0..79, your solution would
give an expectancy of numbers 0..19 twice as high as the range 20..79.
So your method is bad if you require an even distribution (but might
be ok if you just want somewhat random distribution as eg for a game).
The other method is better, but not perfect.

Well, the other method is better in that it makes the non-randomness
less obvious. <gBut with the same sample ranges it still produces
twenty values twice as often as the rest; it's just that they're not
the twenty lowest ones any more.

You are right - the other method also produces twenty numbers twice as
frequent as the first one - just spread evenly. To my embarassment, I
must admit that I did not realise this when I replied. So much for
perfection!

/Peter

Aug 25 '08 #5

acehreli

On Aug 25, 2:05 pm, Pete Becker <p...@versatilecoding.comwrote:

The number of values that you start with has to be an exact multiple of
the number of values that you want to get. The way to do that is to
throw away some values from rand(), namely, those that exceed the
maximum multiple of range that's less than RAND_MAX+1. Like this:

int mult = (RAND_MAX + 1) / range;

There is a problem if RAND_MAX equals INT_MAX. Adding one would wrap
the value and mult would become zero. So RAND_MAX could be casted to a
suitable type and I think 'unsigned int' works. Or an unsigned 1 would
work:

int mult = (RAND_MAX + 1u) / range;

I remember reading about this algorithm and that correction to it by
Andrew Koenig on clc++m posts.

int max = range * mult;
int temp = rand();
while (max < temp)
temp = rand();
return temp % range;

Ali

Aug 25 '08 #6

On 2008-08-25 17:21:18 -0400, ac******@gmail.com said:

On Aug 25, 2:05 pm, Pete Becker <p...@versatilecoding.comwrote:

>The number of values that you start with has to be an exact multiple of
the number of values that you want to get. The way to do that is to
throw away some values from rand(), namely, those that exceed the
maximum multiple of range that's less than RAND_MAX+1. Like this:

int mult = (RAND_MAX + 1) / range;

There is a problem if RAND_MAX equals INT_MAX. Adding one would wrap
the value and mult would become zero.

Yes, usually when I post this information I put in a note that it's
only a sketch, and probably has an off-by-one error. Or maybe two of
them.

So RAND_MAX could be casted to a
suitable type and I think 'unsigned int' works. Or an unsigned 1 would
work:

Maybe. RAND_MAX is required to be an integer constant expression, which
doesn't preclude it being unsigned to begin with.

--
Pete
Roundhouse Consulting, Ltd. (www.versatilecoding.com) Author of "The
Standard C++ Library Extensions: a Tutorial and Reference
(www.petebecker.com/tr1book)

Aug 25 '08 #7

On 2008-08-25 17:18:54 -0400, peter koch <pe***************@gmail.comsaid:

You are right - the other method also produces twenty numbers twice as
frequent as the first one - just spread evenly. To my embarassment, I
must admit that I did not realise this when I replied. So much for
perfection!

Don't be embarassed. It's a very common mistake. Most people's
intuition here leads them astray, which is why it's useful to work
through some simple examples to retrain it.

--
Pete
Roundhouse Consulting, Ltd. (www.versatilecoding.com) Author of "The
Standard C++ Library Extensions: a Tutorial and Reference
(www.petebecker.com/tr1book)

Aug 25 '08 #8

zaimoni

On Aug 25, 3:35 pm, Rafael Cunha de Almeida <n...@email.provided>
wrote:

Hi,

I've found several sites on google telling me that I shouldn't use

rand() % range+1

and I should, instead, use something like:

lowest+int(range*rand()/(RAND_MAX + 1.0))

Or function doing roughly the equivalent using integer math, which
might be useful if range is large (to avoid signed integer overflow).

They fail to make it clear why. There seems to be less randomness in the
lower bits or something like that. I don't know what less randomness
would mean. Would it not be normally distributed or something like that?

If rand() is implemented with a linear congruential generator f(x)=ax
+b mod m, then it is a relatively easy automated check that the lower
bits are in general highly correlated (given the first, the second is
somewhat predictable). Directly verifying algebraically is harder,
but a standard homework exercise in the field. The lowest separation
required to demonstrate correlation is usually two through six
invocations apart.

More sophisticated algorithms are harder to verify, but reputedly have
similar problems unless explicitly designed otherwise. [As an extreme
case, the Mersenne Twister has been formally verified to not to be
correlated across 637 consecutive invocations.]

It's moderately unusual to use a high-quality RNG to implement rand().

The overall results would still be as approximately uniformly
distributed as rand(), except for unavoidable discretization errors as
mentioned in earlier replies (which wouldn't be that large for small
moduli since RAND_MAX should be at least 32767).

Aug 25 '08 #9

On 2008-08-25 17:47:03 -0400, za*****@zaimoni.com said:

On Aug 25, 3:35 pm, Rafael Cunha de Almeida <n...@email.provided>
wrote:
>Hi,

I've found several sites on google telling me that I shouldn't use

rand() % range+1

and I should, instead, use something like:

lowest+int(range*rand()/(RAND_MAX + 1.0))

Or function doing roughly the equivalent using integer math, which
might be useful if range is large (to avoid signed integer overflow).

>They fail to make it clear why. There seems to be less randomness in the
lower bits or something like that. I don't know what less randomness
would mean. Would it not be normally distributed or something like that?

If rand() is implemented with a linear congruential generator f(x)=ax
+b mod m, then it is a relatively easy automated check that the lower
bits are in general highly correlated (given the first, the second is
somewhat predictable). Directly verifying algebraically is harder,
but a standard homework exercise in the field. The lowest separation
required to demonstrate correlation is usually two through six
invocations apart.

More sophisticated algorithms are harder to verify, but reputedly have
similar problems unless explicitly designed otherwise. [As an extreme
case, the Mersenne Twister has been formally verified to not to be
correlated across 637 consecutive invocations.]

It's moderately unusual to use a high-quality RNG to implement rand().

The overall results would still be as approximately uniformly
distributed as rand(), except for unavoidable discretization errors as
mentioned in earlier replies (which wouldn't be that large for small
moduli since RAND_MAX should be at least 32767).

Of course, if the goal is to get working code quickly, the best answer
to all of this is to unask the question. Don't try to implement this
stuff; use the random number generators in TR1. There's an adaptor to
generate uniformly distributed values within a specified range from a
generator that produces uniformly distributed values in a different
range.

--
Pete
Roundhouse Consulting, Ltd. (www.versatilecoding.com) Author of "The
Standard C++ Library Extensions: a Tutorial and Reference
(www.petebecker.com/tr1book)

Aug 25 '08 #10

zaimoni

On Aug 25, 4:47 pm, zaim...@zaimoni.com wrote:

More sophisticated algorithms are harder to verify, but reputedly have
similar problems unless explicitly designed otherwise. [As an extreme
case, the Mersenne Twister has been formally verified to not to be
correlated across 637 consecutive invocations.]

Wrong: 623 consecutive invocations. Please accept my apologies.

Aug 26 '08 #11

by: August1 | last post by:

A handful of articles have been posted requesting information on how to use these functions in addition to the time() function as the seed to generate unique groups (sets) of numbers - each group...

"hat" container class [C++]

by: AngleWyrm | last post by:

I have created a new container class, called the hat. It provides random selection of user objects, both with and without replacement, with non-uniform probabilities. Uniform probabilities are a...

problem with rand()

by: ChasW | last post by:

given the following example: Using gcc, this compiles, runs, and outputs as expected, but on vc++.net 2003, the first number is always the same despite the time seed. #include <ctime>...

Problem of random number generate..

by: Sen-Lung Chen | last post by:

Dear All: I have a question about this below function.The purpose of this function is to generate one number between a and b. -------------------------- int gennum(int a, int b) {...

rand in a closed interval on the ints

rand()

by: Bill Burris | last post by:

Hi, With VS .NET 2003 the rand() function sometimes returns a number equal to RAND_MAX. The docs say: The rand function returns a pseudorandom integer in the range 0 to RAND_MAX. Does this...

.NET Framework

by: Frank Silvermann | last post by:

#include <stdio.h> #include <stdlib.h> #include <time.h> #include <math.h> #define MIN_WORD_LENGTH 9

rand() between m and n

by: Gary Wessle | last post by:

Hi I need help to generate some random numbers between 2 and 8. #include <cstdlib> using std::rand; the following was out of my range, int main() {

About rand()

by: Spiros Bousbouras | last post by:

The standard says that rand() should return a pseudo-random number but what does pseudorandom mean ? If an implementation of rand() always returned the same number would it be conforming ? What if...

Redirect to STDOUT Problem

by: Jordan Glassman | last post by:

Trying to do something fairly routine... drop output into a file to graph, but the following command at the bash command line: ising output produces a blinking cursor, an empty file named...

Migrating Website to Cloud - Emmanuel Katto

by: emmanuelkatto | last post by:

Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel

Navigating the Data Structures and Algorithms (DSA)

by: BarryA | last post by:

What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...

Algorithms / Advanced Math

Looking to do Android software development, any suggestions? Is flutter better?

by: nemocccc | last post by:

hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?

How to build RAID in BIOS?

by: Hystou | last post by:

There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...

Computer Hardware

What is ONU?

by: marktang | last post by:

ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...

Maximizing Business Potential: The Nexus of Website Design and Digital Marketing

by: jinu1996 | last post by:

In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...

Online Marketing

The easy way to turn off automatic updates for Windows 10/11

by: Hystou | last post by:

Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows...

Windows Server

Discussion: How does Zigbee compare with other wireless protocols in smart home applications?

by: tracyyun | last post by:

Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each...