rand() % n Revisited - Page 2

Rich Fife

Quick rand() question:

I know you're not supposed to use "rand() % 1024" for instance,
because it focuses on the lower bits. However, it seems to me that
given that the argument is not a power of two (or near a power of
two), that this is not an issue. The upper bits will participate
equally in the result with the lower. Am I missing something?

Thanks!

-- Rich --

Oct 23 '08

Subscribe Reply

3946

William Hughes

On Oct 22, 11:48 pm, Paul Hsieh <websn...@gmail .comwrote:

On Oct 22, 6:04 pm, Rich Fife <rf...@amug.org wrote:

Quick rand() question:

I know you're not supposed to use "rand() % 1024" for instance,
because it focuses on the lower bits. However, it seems to me
that given that the argument is not a power of two (or near a
power of two), that this is not an issue. The upper bits will
participate equally in the result with the lower. Am I missing
something?

Yes, you are missing some mathematical analysis to back up what you
just said. If you do (rand() % 1023) on Microsoft Visual C++ or
WATCOM C/C++, 32 of the possible outputs will have an extra 3% bias no
matter how good your random number generator is.

Well, 3% certainly meets the informal meaning of small. If your
problem
is such that you are worried about the 3% you should probably be more
worried about the fact that the rand() you are using has an output
space of only 16 bits.

No C compiler's
rand() that I have ever seen has, by itself, a worse effect on random
output than that.

Then you have not seen a rand() implementation that switched parity
on each call. I understand such an implementation not only existed
but
was relatively widespread.

>
The C.L.C. FAQ about this gives extremely misleading advice on this
point and it should seriously be ignored.

No. Using the advice given in the C.L.C. means that you will get
reasonable
results, even if rand() implementation is poor, as long as rand()
produces
integers that are more or less uniformly distributed in 0...RAND_MAX.
If you use the "rand() % n" technique you have no such guarantee.
The bias is small. Do not confuse detectablity with importance.
(The use of "signifcanc e" in the term "statistica l significance"
leads many people astray).

- William Hughes

Oct 24 '08 #11

Nate Eldredge

William Hughes <wp*******@hotm ail.comwrites:

>No C compiler's
rand() that I have ever seen has, by itself, a worse effect on random
output than that.

Then you have not seen a rand() implementation that switched parity
on each call. I understand such an implementation not only existed
but
was relatively widespread.

Right you are. Here is the rand() implementation from the very
influential 4.4BSD-Lite.

#define RAND_MAX 0x7fffffff

static u_long next = 1;

int
rand()
{
return ((next = next * 1103515245 + 12345) % ((u_long)RAND_M AX + 1));
}

void
srand(seed)
u_int seed;
{
next = seed;
}

Oct 24 '08 #12

Paul Hsieh

On Oct 23, 7:45*pm, William Hughes <wpihug...@hotm ail.comwrote:

On Oct 22, 11:48 pm, Paul Hsieh <websn...@gmail .comwrote:
On Oct 22, 6:04 pm, Rich Fife <rf...@amug.org wrote:
Quick rand() question:

I know you're not supposed to use "rand() % 1024" for instance,
because it focuses on the lower bits. *However, it seems to me
that given that the argument is not a power of two (or near a
power of two), that this is not an issue. *The upper bits will
participate equally in the result with the lower. *Am I missing
something?

Yes, you are missing some mathematical analysis to back up what you
just said. *If you do (rand() % 1023) on Microsoft Visual C++ or
WATCOM C/C++, 32 of the possible outputs will have an extra 3% bias no
matter how good your random number generator is.

Well, 3% certainly meets the informal meaning of small.

You might like to explain that to the average casino. Card counting
in black
jack gives player around a 1% advantage over the house, and the
casinos kick
out such people whenever they are discovered. People who attack
defective
casino games rely on people with attitudes like yours.

Even if you are implementing something as simple as a 1d20 (where each
choice
itself is only 5%) in a dungeons and dragons game, the players will
easily see
that bias over time.

[...]*If your
problem is such that you are worried about the 3% you should probably
be more worried about the fact that the rand() you are using has an
output space of only 16 bits.

That statement doesn't follow any line of logic of any relevance. If
you
care, then you care, and you want to get a correct ranged random
number
generator. If you go up to 32 bits, but still have bias that's just
a
little smaller, how can you be happy? And if you want to write
portable
code, then what are you going to do?

No C compiler's
rand() that I have ever seen has, by itself, a worse effect on random
output than that.

Then you have not seen a rand() implementation that switched parity
on each call. *I understand such an implementation not only existed
but was relatively widespread.

True enough, but this fundamentally comes from the lack of analysis.
The
C.L.C. FAQ just continues this tradition by failing to give effective
analysis of the problem.

The C.L.C. FAQ about this gives extremely misleading advice on this
point and it should seriously be ignored.

No. *Using the advice given in the C.L.C. *means that you will get
reasonable results, even if rand() implementation is poor, as long
as rand() produces integers that are more or less uniformly
distributed in 0...RAND_MAX.

Did you know that a simple counter will produce numbers that are
exactly
uniformly distributed in 0 ... RAND_MAX? You know, basic
understanding is
sometimes actually useful on occasion.

If you use the "rand() % n" technique you have no such guarantee.

The technique shown in the CLC FAQ also has no such guarantee. Its
totally besides the point.

The bias is small.

Define small. If you want to test how often a hash function will map
to
a common bucket either (rand() % n) or
(rand() * (double) n / (RAND_MAX + 1)) will make no difference. It
will
produce worthless results no matter what.

[...]*Do not confuse detectablity with importance.

I assure you, I am not the one confused. The C.L.C. FAQ is giving a
solution that assumes a policy where low bit determinism is a worse
problem than pure measurable bias and also a worse problem than a
simple
range issue. In fact the CLC FAQ is promoting confusion by not
explaining the issue correctly and consequently how one might deal
with
the problem.

(The use of "significan ce" in the term "statistica l significance"
leads many people astray).

What has that got to do with anything? If you wish to test something
with a very small probability which is lower than the bias being
introduced by such short-sighted techniques then what good is the
C.L.C.
FAQs discussion on the subject?

Who actually wants to use a PRNG which is biased or incapable of even
measuring what you want? The subject deserves to be discussed
usefully.
The C.L.C. just harps one single anomaly that has resulted for the
weakness of the ANSI C standard.

--
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/

Oct 24 '08 #13

Richard Bos

Paul Hsieh <we******@gmail .comwrote:

The C.L.C. FAQ about this gives extremely misleading advice on this
point and it should seriously be ignored. If you want to seriously
deal with random numbers just read my page about it:

Or possibly don't. If the code on that page is as rotten as the HTML, I
wouldn't trust it.

Then, search this and other newsgroups for posts by George Marsaglia,
who _really_ knows what a good PRNG is like.

Richard

Oct 24 '08 #14

William Hughes

On Oct 23, 11:29 pm, Paul Hsieh <websn...@gmail .comwrote:

On Oct 23, 7:45 pm, William Hughes <wpihug...@hotm ail.comwrote:

On Oct 22, 11:48 pm, Paul Hsieh <websn...@gmail .comwrote:
On Oct 22, 6:04 pm, Rich Fife <rf...@amug.org wrote:
Quick rand() question:

I know you're not supposed to use "rand() % 1024" for instance,
because it focuses on the lower bits. However, it seems to me
that given that the argument is not a power of two (or near a
power of two), that this is not an issue. The upper bits will
participate equally in the result with the lower. Am I missing
something?

Yes, you are missing some mathematical analysis to back up what you
just said. If you do (rand() % 1023) on Microsoft Visual C++ or
WATCOM C/C++, 32 of the possible outputs will have an extra 3% bias no
matter how good your random number generator is.

Well, 3% certainly meets the informal meaning of small.

You might like to explain that to the average casino.

The explanation goes "Even differences that are usually
considered small, e.g. 3%. can be very very important."
If the casino is still in business they will tell
me to teach my Grandmother to suck eggs.

Card counting
in black
jack gives player around a 1% advantage over the house,

Teach your Grandmother to suck eggs

and the casinos kick
out such people whenever they are discovered. People who attack
defective casino games rely on people with attitudes like yours.

Even if you are implementing something as simple as a 1d20 (where each
choice
itself is only 5%) in a dungeons and dragons game, the players will
easily see
that bias over time.

Piffle (even if we are talking about a 3% bias and not
the less than .1% bias you get with rand()%20 ).
And even if someone took the trouble to notice (e.g. tabulated
1000's of rolls and applied statistical techniques) they would notice
that the bias had no practical import.

>
[...] If your
problem is such that you are worried about the 3% you should probably
be more worried about the fact that the rand() you are using has an
output space of only 16 bits.

That statement doesn't follow any line of logic of any relevance. If
you
care, then you care, and you want to get a correct ranged random
number
generator. If you go up to 32 bits, but still have bias that's just
a
little smaller, how can you be happy?

E.g. I am interested at tail distributions. The performance of
my random generator has gone from terrible to reasonable.

>And if you want to write portable code, then what are you going to do?

Either I don't need much, in which case I can use
the system rand() or I provide my_rand().

>

No C compiler's
rand() that I have ever seen has, by itself, a worse effect on random
output than that.

Then you have not seen a rand() implementation that switched parity
on each call. I understand such an implementation not only existed
but was relatively widespread.

True enough, but this fundamentally comes from the lack of analysis.
The
C.L.C. FAQ just continues this tradition by failing to give effective
analysis of the problem.

The C.L.C. FAQ about this gives extremely misleading advice on this
point and it should seriously be ignored.

No. Using the advice given in the C.L.C. means that you will get
reasonable results, even if rand() implementation is poor, as long
as rand() produces integers that are more or less uniformly
distributed in 0...RAND_MAX.

Did you know that a simple counter will produce numbers that are
exactly
uniformly distributed in 0 ... RAND_MAX?

Indeed, one needs more than uniformly distributed.
The basic point, that the rand() implementation needs
to be really bad to produce unreasonable results
with the FAQ technique, but the rand() implementation
only needs to be a bit bad to produce unreasonable
results with the rand()%n technique remains.

>You know, basic
understanding is
sometimes actually useful on occasion.

If you use the "rand() % n" technique you have no such guarantee.

The technique shown in the CLC FAQ also has no such guarantee. Its
totally besides the point.

The bias is small.

Define small. If you want to test how often a hash function will map
to
a common bucket either (rand() % n) or
(rand() * (double) n / (RAND_MAX + 1)) will make no difference. It
will
produce worthless results no matter what.

No. A test that looks for perfection vs bias, will find a bias,
but since there are lots and lots of ways of introducing a
insignficant (note I did _not_ say "statistica lly insignificant")
bias, a test that looks for perfection vs bias is stupid.

[...] Do not confuse detectablity with importance.

I assure you, I am not the one confused. The C.L.C. FAQ is giving a
solution that assumes a policy where low bit determinism is a worse
problem than pure measurable bias and also a worse problem than a
simple
range issue. In fact the CLC FAQ is promoting confusion by not
explaining the issue correctly and consequently how one might deal
with
the problem.

(The use of "significan ce" in the term "statistica l significance"
leads many people astray).

What has that got to do with anything? If you wish to test something
with a very small probability which is lower than the bias being
introduced by such short-sighted techniques then what good is the
C.L.C.
FAQs discussion on the subject?

Who actually wants to use a PRNG which is biased

[I recall a wonderful poem about an archer who claimed
he was best, because, although he never came near
the target, he was unbiased, Lack of bias is not
everything!]
Lots of people don't care a fig. If I want to shuffle cards for a
bridge game
then I don't care about a 3% bias. (If I want to shuffle cards for
a computer poker game, then the fact that the average rand()
is about as cryptographicly secure as a Ceasar cypher is more
important than a 3% bias). If the wumpus alternates between
being in the left half of the maze and the right half of the maze,
I care a lot!

The CLC FAQ solves a real (although probably now historical) problem.

The system rand() may not be suitable for many applicatins.
Fixing one problem, which is not a problem in most applications
where the system rand() is suitable, does not magically make rand()
produce high quality random numbers.

- William Hughes

Oct 24 '08 #15

user923005

On Oct 24, 5:04*am, William Hughes <wpihug...@hotm ail.comwrote:
[snip]

The CLC FAQ solves a real (although probably now historical) problem.

The system rand() may not be suitable for many applicatins.
Fixing one problem, which is not a problem in most applications
where the system rand() is suitable, does not magically make rand()
produce high quality random numbers.

An easy solution for this problem is to use the Mersenne Twister.
True, the quality of C compiler packaged rand() implementations is
spotty.
Equally true, the excellence of the open source Mersenne Twister is
well documented.

Suggestions to the ANSI C committee:
1. Make a version of the Mersenne Twister the standard C library
implementation.
2. Have versions that produce int (0-INT_MAX), long long (0-
LLONG_MAX), and double (0.0-1.0) outputs.

A final suggestion would be to carefully analyze available C library
source to find a "best of breed" class of standard library functions
{for the subset of functions that are fully portable} and have this
code base become a default library implementation. Then, when the
compiler vendor wants to write a snappy compiler, the workload will
consist only of files that improve upon the default set.

Oct 24 '08 #16

Similar topics

4222

Quality of rand()

by: Tom McCallum | last post by:

Can any tell me why rand() is such a bad pseudo-random number generator. In all the articles I have read they say that you can predict the outcome of rand() but when I used its output with NIST's random number testsuite it passed all of the tests thrown at it for randomness. Can anyone suggest a test I could use that it would fail? A pointer to a suitable article would be appreciated. Thanks for your help in advance, Tom

C / C++

3247

Problem using rand() function

by: Chris Gordon-Smith | last post by:

I have a simulation program that calls the rand() function at various points while it executes. For the user interface that displays statistics etc. while the program runs, I use the Lazarus GUI library, which is based on GTK. Depending on which libraries I link in, I can have either a GTK1 or GTK2 user interface. I'm finding that the simulation behaves differently depending on whether I link with GTK1 or GTK2.

C / C++

7139

Better Rand() Algorithim

by: Ben Justice | last post by:

For a program in c, I need some random numbers for a system were people are placing bets. This is not a commerical project btw. Generally, I tend to rely on things from the standard library, because they're written by people with skills far above mine. Hence, I've always used rand() and FALSELY assumed it could produce unpredictable random numbers and could be used in many situations even if security was an issue. Firstly, lets assume...

C / C++

2695

Rand() with base

by: Profetas | last post by:

Hi, I want to generate a random 8 bit number using rand(0 is that possible? to expecifu the base and the lenght? thanks

C / C++

4394

rand - is RAND_MAX how long before the same # will occur again?

by: Jack | last post by:

When I use rand(), is the RAND_MAX value how long I am guaranteed that the same value will not appear twice? And is this a floating window? For example, if RAND_MAX is 32767, and I make 500,000 consecutive rand calls then is the rand() algorithm going to guarantee me that no floating window of calls over that 500,000 rand calls will have the same value twice? The only way that would work is if the series was identical everytime.

C / C++

2797

oddly behaving rand( ) in microsoft compiler

by: Siam | last post by:

Hi all, I'm writing a shell language in c++ that supports the generation of random numbers, and by its nature, each command must be executed in a new thread. The call to the random function from my language simply propogates up to the rand( ) function from c++. For some reason, C++ will give each thread independent use of their own rand( ) function, so that a rand( ) call from one thread won't affect another thread's call to rand( )....

C / C++

3670

About rand()

by: Spiros Bousbouras | last post by:

The standard says that rand() should return a pseudo-random number but what does pseudorandom mean ? If an implementation of rand() always returned the same number would it be conforming ? What if it always alternated between 2 values ? On the practical side do you have any thoughts on what one could realistically expect from the behaviour of rand() ? Could for example one expect that eventually any value in the range will be returned ?

C / C++

2324

Peculiar rand() case and threads

by: ds | last post by:

Hi all, rand() is not thread safe, a fact that may not be so bad after all.. However, I face the following problem: a piece of code uses rand() to get a random sequence, but always seeds with the same seed for reproducibility of the results. Now I am porting this (old C89) code and have setup a nice app with threads that drives on one thread the old code and on another the new code, so that I can compare the results and see that nothing...

C / C++

3021

Problem with rand() % range+1

by: Rafael Cunha de Almeida | last post by:

Hi, I've found several sites on google telling me that I shouldn't use rand() % range+1 and I should, instead, use something like: lowest+int(range*rand()/(RAND_MAX + 1.0))

C / C++

9647

What is ONU?

by: marktang | last post by:

ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However, people are often confused as to whether an ONU can Work As a Router. In this blog post, we’ll explore What is ONU, What Is Router, ONU & Router’s main usage, and What is the difference between ONU and Router. Let’s take a closer look ! Part I. Meaning of...

General

9491

Changing the language in Windows 10

by: Hystou | last post by:

Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can effortlessly switch the default language on Windows 10 without reinstalling. I'll walk you through it. First, let's disable language synchronization. With a Microsoft account, language settings sync across devices. To prevent any complications,...

Windows Server

10357

Problem With Comparison Operator <=> in G++

by: Oralloy | last post by:

Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed. This is as boiled down as I can make it. Here is my compilation command: g++-12 -std=c++20 -Wnarrowing bit_field.cpp Here is the code in...

C / C++

10163

Maximizing Business Potential: The Nexus of Website Design and Digital Marketing

by: jinu1996 | last post by:

In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that captivates audiences and drives business growth. The Art of Business Website Design Your website is...

Online Marketing

10104

The easy way to turn off automatic updates for Windows 10/11

by: Hystou | last post by:

Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows Update option using the Control Panel or Settings app; it automatically checks for updates and installs any it finds, whether you like it or not. For most users, this new feature is actually very convenient. If you want to control the update process,...

Windows Server

8988

AI Job Threat for Devs

by: agi2029 | last post by:

Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing, and deployment—without human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then launch it, all on its own.... Now, this would greatly impact the work of software developers. The idea...

Career Advice

5397

Trying to create a lan-to-lan vpn between two differents networks

by: TSSRALBI | last post by:

Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols. I succeeded, with both firewalls in the same network. But I'm wondering if it's possible to do the same thing, with 2 Pfsense firewalls...

Networking - Hardware / Configuration

5532

Windows Forms - .Net 8.0

by: adsilva | last post by:

A Windows Forms form does not have the event Unload, like VB6. What one acts like?

Visual Basic .NET

3668

How to add payments to a PHP MySQL app.

by: muto222 | last post by:

How can i add a mobile payment intergratation into php mysql website.

PHP