Random Integers from 0 to 999

Michael B Allen

Someone once posted the following macro on clc:

#define randint(a,b) (a)+(((b)-(a)+1)*(float)rand()/RAND_MAX)

Unfortunately it's flawed. If rand() returns RAND_MAX the result can be
one larger than b.

So can someone provide a *proper* macro (or function) that returns a
random integer between (actually in) a range of values? For example
randint(0, 999) could return:

0
10
777
999

Mike

Nov 14 '05 #1

Subscribe Post Reply

5927

Michael Mair

Michael B Allen wrote:

Someone once posted the following macro on clc:

#define randint(a,b) (a)+(((b)-(a)+1)*(float)rand()/RAND_MAX)

Unfortunately it's flawed. If rand() returns RAND_MAX the result can be
one larger than b.

So can someone provide a *proper* macro (or function) that returns a
random integer between (actually in) a range of values? For example
randint(0, 999) could return:

0
10
777
999

Where is your shot at it? Without attribution, nobody knows
from which source this comes.

#define RANDINT(a,b) (a)+(((b)-(a)+1)*(double)rand()/(1u+RAND_MAX))

may serve you better; note the double which serves you well
if RAND_MAX has more digits than float can represent.
There are other deficiencies for this approach; why don't you
use the suggestions of, say, Lawrence Kirby (or other regulars)
which make all values equally probable?
Apart from that: I would rather use a function for this purpose.
If you need many random numbers, consider filling an array and
retrieving from there until it is "used up", then refilling.
Cheers
Michael
--
E-Mail: Mine is an /at/ gmx /dot/ de address.

Nov 14 '05 #2

Grumble

Michael B Allen wrote:

Someone once posted the following macro on clc:

#define randint(a,b) (a)+(((b)-(a)+1)*(float)rand()/RAND_MAX)

Unfortunately it's flawed. If rand() returns RAND_MAX the result can be
one larger than b.

So can someone provide a *proper* macro (or function) that returns a
random integer between (actually in) a range of values? For example
randint(0, 999) could return:

0
10
777
999

int randint(int min, int max)
{
assert(min <= max);
return min + rand() % (max - min + 1);
}

0 <= rand() <= RAND_MAX
0 <= rand()%(max-min+1) < max-min+1
0 <= rand()%(max-min+1) <= max-min
min <= min+rand()%(max-min+1) <= max

Nov 14 '05 #3

Mark Piffer

Michael Mair wrote:

Michael B Allen wrote:
Someone once posted the following macro on clc:

#define randint(a,b) (a)+(((b)-(a)+1)*(float)rand()/RAND_MAX)

Unfortunately it's flawed. If rand() returns RAND_MAX the result can be one larger than b.

Where is your shot at it? Without attribution, nobody knows
from which source this comes.

#define RANDINT(a,b) (a)+(((b)-(a)+1)*(double)rand()/(1u+RAND_MAX))

may serve you better; note the double which serves you well
if RAND_MAX has more digits than float can represent.
There are other deficiencies for this approach; why don't you
use the suggestions of, say, Lawrence Kirby (or other regulars)
which make all values equally probable?
Apart from that: I would rather use a function for this purpose.
If you need many random numbers, consider filling an array and
retrieving from there until it is "used up", then refilling.

Unluckily, both, your and Grumble's snippet produce UB due to integer
overflow when RAND_MAX happens to equal INT_MAX (like on all my 16-Bit
compilers) and the arguments are (0,RAND_MAX). It looks like

#define RANDINT(a,b)\
((b)-(a)==RAND_MAX?(a)+rand():(a)+rand()%((b)-(a)+1))

will do it, although the distribution will be abysmal. Then again, for
embedded architectures, where neither floating point nor much RAM is an
option, I use such generators exactly for their simplicity and not the
randomness. Most of my test data just needs to be better than
iterative, but not truly random. Where "real" randomness is needed I go
and ask the big boys (the mathematicians); uniform distributions won't
cut it most of the time anyway.

Mark

Nov 14 '05 #4

Michael Mair

Mark Piffer wrote:

Michael Mair wrote:
Michael B Allen wrote:
Someone once posted the following macro on clc:

#define randint(a,b) (a)+(((b)-(a)+1)*(float)rand()/RAND_MAX)

Unfortunately it's flawed. If rand() returns RAND_MAX the result
can be
one larger than b.

Where is your shot at it? Without attribution, nobody knows
from which source this comes.

#define RANDINT(a,b) (a)+(((b)-(a)+1)*(double)rand()/(1u+RAND_MAX))

may serve you better; note the double which serves you well
if RAND_MAX has more digits than float can represent.
There are other deficiencies for this approach; why don't you
use the suggestions of, say, Lawrence Kirby (or other regulars)
which make all values equally probable?
Apart from that: I would rather use a function for this purpose.
If you need many random numbers, consider filling an array and
retrieving from there until it is "used up", then refilling.

Unluckily, both, your and Grumble's snippet produce UB due to integer
overflow when RAND_MAX happens to equal INT_MAX (like on all my 16-Bit
compilers) and the arguments are (0,RAND_MAX). It looks like

#define RANDINT(a,b)\
((b)-(a)==RAND_MAX?(a)+rand():(a)+rand()%((b)-(a)+1))

will do it, although the distribution will be abysmal. Then again, for
embedded architectures, where neither floating point nor much RAM is an
option, I use such generators exactly for their simplicity and not the
randomness. Most of my test data just needs to be better than
iterative, but not truly random. Where "real" randomness is needed I go
and ask the big boys (the mathematicians); uniform distributions won't
cut it most of the time anyway.

You are right, I forgot to mention this particular problem and
did not correct it; however IIRC it is covered in the mentioned
message by Lawrence Kirby.
If I have too much time on my hands, I will search for it.
I still hold that writing a function is the better way; you
can cut off the excess random values and handle special cases in a
transparent way.

If we speak of overkill: The Mersenne Twister should do for
a start :-)
Cheers
Michael
--
E-Mail: Mine is an /at/ gmx /dot/ de address.

Nov 14 '05 #5

Chris Croughton

On Thu, 24 Mar 2005 09:30:34 +0100, Grumble
<de*****@kma.eu.org> wrote:

Michael B Allen wrote:
Someone once posted the following macro on clc:

#define randint(a,b) (a)+(((b)-(a)+1)*(float)rand()/RAND_MAX)

Unfortunately it's flawed. If rand() returns RAND_MAX the result can be
one larger than b.
The simple solution is to use (RAND_MAX+1) as the divisor. However, it
has the same probem as below.
So can someone provide a *proper* macro (or function) that returns a
random integer between (actually in) a range of values? For example
randint(0, 999) could return:

0
10
777
999
int randint(int min, int max)
{
assert(min <= max);
return min + rand() % (max - min + 1);
}

That just uses different bits to do the same thing (except that you
corrected the "off by one" error). However, there are a number of poor
implementations of rand() where the bottom bits are a lot more
predictable than the higher ones (rand() % 4 returning a continuous
repeated sequence of 0, 1, 2, 3 in one of them!).
0 <= rand() <= RAND_MAX
0 <= rand()%(max-min+1) < max-min+1
0 <= rand()%(max-min+1) <= max-min
min <= min+rand()%(max-min+1) <= max

If the range is not a submultiple of (RAND_MAX + 1) then it does not
give equal probabilities of all of the numbers. For instance, take a
small RAND_MAX of 7 (0 <= rand() <= 7) and a range of [0..4]:

rand() randint()
0 0
1 1
2 2
3 3
4 4
5 0
6 1
7 2

results 0..2 occur twice as often as 3..4. Granted that when RAND_MAX
is very much bigger than the range the error becomes smaller, it is
still there (the potential error is range/(RAND_MAX+1)).

Chris C

Nov 14 '05 #6

Rouben Rostamian

In article <d1**********@news-rocq.inria.fr>,
Grumble <de*****@kma.eu.org> wrote:

int randint(int min, int max)
{
assert(min <= max);
return min + rand() % (max - min + 1);
}

0 <= rand() <= RAND_MAX
0 <= rand()%(max-min+1) < max-min+1
0 <= rand()%(max-min+1) <= max-min
min <= min+rand()%(max-min+1) <= max

That's a terrible way of generating "random" numbers.

When we ask for a "random number" in the range min..max,
we want every number within that range to occur with equal
probability.

The following example shows that your method does not give a
uniformly distributed random number.

For the sake of illustration, let's say RAND_MAX is 7.
Suppose you want random numbers in the set {0,1,2,3,4,5}.
Then according to your algorithm:

rand() randint()
0 -> 0
1 -> 1
2 -> 2
3 -> 3
4 -> 4
5 -> 5
6 -> 0
7 -> 1

Therefore the numbers 0 and 1 are twice as likely to show up
than other numbers.

Here is a better way:

int randint(int min, int max)
{
assert(min <= max);
return min + (int)((max-min+1.0)*rand()/(1.0+RAND_MAX));
}

--
Rouben Rostamian

Nov 14 '05 #7

Grumble

Rouben Rostamian wrote:

Grumble wrote:
int randint(int min, int max)
{
assert(min <= max);
return min + rand() % (max - min + 1);
}

That's a terrible way of generating "random" numbers.

When we ask for a "random number" in the range min..max,
we want every number within that range to occur with equal
probability.

The following example shows that your method does not give a
uniformly distributed random number.

For the sake of illustration, let's say RAND_MAX is 7.
Suppose you want random numbers in the set {0,1,2,3,4,5}.
Then according to your algorithm:

rand() randint()
0 -> 0
1 -> 1
2 -> 2
3 -> 3
4 -> 4
5 -> 5
6 -> 0
7 -> 1

Therefore the numbers 0 and 1 are twice as likely to show up
than other numbers.

Here is a better way:

int randint(int min, int max)
{
assert(min <= max);
return min + (int)((max-min+1.0)*rand()/(1.0+RAND_MAX));
}

You algorithm is as "broken" as mine because of a fundamental problem:

If you try to place 8 balls in 6 buckets, then, no matter how you slice
and dice it, you'll end up with more balls in some buckets. The only way
out is to discard 2 balls.

--
Regards, Grumble

Nov 14 '05 #8

CBFalconer

Michael B Allen wrote:

Someone once posted the following macro on clc:

#define randint(a,b) (a)+(((b)-(a)+1)*(float)rand()/RAND_MAX)

Unfortunately it's flawed. If rand() returns RAND_MAX the result
can be one larger than b.
#define ranrange(a, b) (int)((a) + rand()/((double)RAND_MAX + 1) \
* ((b) - (a) + 1))

assuming 0 == rand() can occur. Many systems will never return 0,
so:

#define ranrange(a, b) (int)((a) + (rand() - 1)/((double)RAND_MAX)
\
* ((b) - (a) + 1))

(untested)

So can someone provide a *proper* macro (or function) that returns
a random integer between (actually in) a range of values?
For example randint(0, 999) could return:

0
10
777
999

--
"If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers." - Keith Thompson

Nov 14 '05 #9

CBFalconer

Grumble wrote:

Rouben Rostamian wrote:
Grumble wrote:
int randint(int min, int max)
{
assert(min <= max);
return min + rand() % (max - min + 1);
}

.... snip ...
For the sake of illustration, let's say RAND_MAX is 7.
Suppose you want random numbers in the set {0,1,2,3,4,5}.
Then according to your algorithm:

rand() randint()
0 -> 0
1 -> 1
2 -> 2
3 -> 3
4 -> 4
5 -> 5
6 -> 0
7 -> 1

Therefore the numbers 0 and 1 are twice as likely to show up
than other numbers.

Here is a better way:

int randint(int min, int max)
{
assert(min <= max);
return min + (int)((max-min+1.0)*rand()/(1.0+RAND_MAX));
}

You algorithm is as "broken" as mine because of a fundamental
problem:

If you try to place 8 balls in 6 buckets, then, no matter how you
slice and dice it, you'll end up with more balls in some buckets.
The only way out is to discard 2 balls.

Since RAND_MAX is normally much larger than (max - min) that effect
is minimal. However, you can allow for it by:

unsigned int ranrange(unsigned int min, unsigned int max)
{
unsigned int t, d, n;

if (min > max) { /* No assert, just an interval */
t = min; min = max; max = t;
}
t = max - min + 1;
d = RAND_MAX / t;
d *= t;
do { /* discard the few biasing values */
n = rand(); /* assuming rand() unbiased */
} while (n > d);
return min + (int)((double)t * n/(1.0 + d));
} /* untested */

It probably requires tweaking for the minimum return from rand,
which may well be either 1 or 0.

--
"If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers." - Keith Thompson

Nov 14 '05 #10

Andrey Tarasevich

Rouben Rostamian wrote:

...
Then according to your algorithm:

rand() randint()
0 -> 0
1 -> 1
2 -> 2
3 -> 3
4 -> 4
5 -> 5
6 -> 0
7 -> 1

Therefore the numbers 0 and 1 are twice as likely to show up
than other numbers.

Here is a better way:

int randint(int min, int max)
{
assert(min <= max);
return min + (int)((max-min+1.0)*rand()/(1.0+RAND_MAX));
}
...

There's absolutely no way to implement the required functionality by a
stateless function that simply maps 'int' to 'int'. Regardless of how
you do it, some numbers will appear with higher probability than other
numbers. The version you provided is as "terrible" as the previous one
for the very same reason - for the same ranges of input and output
values some numbers will be "twice as likely to show up than other
numbers" (you should've tested it before posting here).

When [0, RAND_MAX] range is significantly wider than the requiested
[min, max] range, this is not really a problem. In other cases, only a
stateful mapping function will help.

--
Best regards,
Andrey Tarasevich

Nov 14 '05 #11

Chris Croughton

On Thu, 24 Mar 2005 15:09:47 +0000 (UTC), Rouben Rostamian
<ro****@pc18.math.umbc.edu> wrote:

In article <d1**********@news-rocq.inria.fr>,
Grumble <de*****@kma.eu.org> wrote:

int randint(int min, int max)
{
assert(min <= max);
return min + rand() % (max - min + 1);
}

0 <= rand() <= RAND_MAX
0 <= rand()%(max-min+1) < max-min+1
0 <= rand()%(max-min+1) <= max-min
min <= min+rand()%(max-min+1) <= max

That's a terrible way of generating "random" numbers.

When we ask for a "random number" in the range min..max,
we want every number within that range to occur with equal
probability.

The following example shows that your method does not give a
uniformly distributed random number.

For the sake of illustration, let's say RAND_MAX is 7.
Suppose you want random numbers in the set {0,1,2,3,4,5}.
Then according to your algorithm:

rand() randint()
0 -> 0
1 -> 1
2 -> 2
3 -> 3
4 -> 4
5 -> 5
6 -> 0
7 -> 1

Therefore the numbers 0 and 1 are twice as likely to show up
than other numbers.

Here is a better way:

int randint(int min, int max)
{
assert(min <= max);
return min + (int)((max-min+1.0)*rand()/(1.0+RAND_MAX));
}

Why is that any better? Assuming your example of RAND_MAX==7, and
wanting numbers in the set [0,5]:

rand() randint()
0 -> 0*6/8 0/8 -> 0
1 -> 1*6/8 6/8 -> 0
2 -> 2*6/8 12/8 -> 1
3 -> 3*6/8 18/8 -> 2
4 -> 4*6/8 24/8 -> 3
5 -> 5*6/8 30/8 -> 3
6 -> 6*6/8 36/8 -> 4
7 -> 7*6/8 42/8 -> 5

All you've done is to change it so that 0 and 3 get the extra hits
instead of 0 and 1. OK, it looks slightly more uniform (the mean will be
better,2.25 instead of 2, it should be 2.5) but it's still got the same
problem of two of the numbers occuring twice as often as the others.

A way of getting round the problem is to use an iterative method and
discard values out of range:

int randint(int min, int max)
{
unsigned range = max - min + 1;
int bits = 1;
int result;
assert(range > 0 && range <= RAND_MAX);
while (range-1 > bits)
bits = bits*2 + 1;
do result = rand() & bits; while (result > range);
return result + min;
}

This only works if RAND_INT is 2^n - 1, but that's the case on all
implementations I've found. It is also susceptible to the quality of
the lower bits of rand(), which on some implementations are not very
random...

Chris C

Nov 14 '05 #12

websnarf

Rouben Rostamian wrote:

[...]
rand() randint()
0 -> 0
1 -> 1
2 -> 2
3 -> 3
4 -> 4
5 -> 5
6 -> 0
7 -> 1

Therefore the numbers 0 and 1 are twice as likely to show up
than other numbers.

Here is a better way:

int randint(int min, int max) {
assert(min <= max);
return min + (int)((max-min+1.0)*rand()/(1.0+RAND_MAX));
}

This solution (which, unfortunately, is endorsed by the comp.lang.c
FAQ) is no better. As others have posted, you're up against the
pigeon-hole principle, you can't fix it by just changing the mapping in
some way.

And its a real problem in practice -- RAND_MAX need not be higher than
65535, and in most of the compilers on the most popular platform it is
indeed exactly this. I have a more complete summary of the issue here:

http://www.pobox.com/~qed/random.html

---
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/

Nov 14 '05 #13

Dave Thompson

On 26 Mar 2005 16:50:31 -0800, we******@gmail.com wrote:

Rouben Rostamian wrote: <snip>
return min + (int)((max-min+1.0)*rand()/(1.0+RAND_MAX));

<snip> This solution (which, unfortunately, is endorsed by the comp.lang.c
FAQ) is no better. As others have posted, you're up against the
pigeon-hole principle, you can't fix it by just changing the mapping in
some way.

And its a real problem in practice -- RAND_MAX need not be higher than
65535, and in most of the compilers on the most popular platform it is
indeed exactly this. I have a more complete summary of the issue here:

http://www.pobox.com/~qed/random.html

Need not be higher than, and often is, 32767 (= minimum INT_MAX).

Although for a desired choice range of no more than about 30, I might
consider the <.1% nonuniformity acceptable.

- David.Thompson1 at worldnet.att.net

Nov 14 '05 #14

CBFalconer

Dave Thompson wrote:

On 26 Mar 2005 16:50:31 -0800, we******@gmail.com wrote:
Rouben Rostamian wrote:

<snip>
return min + (int)((max-min+1.0)*rand()/(1.0+RAND_MAX));

<snip>
This solution (which, unfortunately, is endorsed by the comp.lang.c
FAQ) is no better. As others have posted, you're up against the
pigeon-hole principle, you can't fix it by just changing the
mapping in some way.

And its a real problem in practice -- RAND_MAX need not be higher
than 65535, and in most of the compilers on the most popular
platform it is indeed exactly this. I have a more complete

summary of the issue here:

http://www.pobox.com/~qed/random.html

Need not be higher than, and often is, 32767 (= minimum INT_MAX).

Although for a desired choice range of no more than about 30, I
might consider the <.1% nonuniformity acceptable.

AFAIK there is no specification for RAND_MIN, which will be at
least 1 for many implementations. If you require specific
characteristics the only answer is to supply your own random
implmentation. Even then you need to be aware of what is and is
not guaranteed. For example, 32 bit ints are not guaranteed,
contrary to Hsiehs (websnarf) hidden assumptions in at least some
of his code.

--
"If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers." - Keith Thompson

Nov 14 '05 #15

Keith Thompson

CBFalconer <cb********@yahoo.com> writes:
[...]

AFAIK there is no specification for RAND_MIN, which will be at
least 1 for many implementations.

[...]

There is no RAND_MIN. The standard says that rand() "computes a
sequence of pseudo-random integers in the range 0 to RAND_MAX".

If an implementation's rand() function never returns 0, the implementation
is either non-conforming or of poor quality.

Do you know of an implementation in which rand() never returns 0?

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Nov 14 '05 #16

CBFalconer

Keith Thompson wrote:

CBFalconer <cb********@yahoo.com> writes:
[...]
AFAIK there is no specification for RAND_MIN, which will be at
least 1 for many implementations.

[...]

There is no RAND_MIN. The standard says that rand() "computes a
sequence of pseudo-random integers in the range 0 to RAND_MAX".

If an implementation's rand() function never returns 0, the
implementation is either non-conforming or of poor quality.

Do you know of an implementation in which rand() never returns 0?

Yes. Any Linear Congruential with a zero adder. Any CRC based
system.

--
"If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers." - Keith Thompson

Nov 14 '05 #17

Eric Sosman

Keith Thompson wrote:

CBFalconer <cb********@yahoo.com> writes:
[...]
AFAIK there is no specification for RAND_MIN, which will be at
least 1 for many implementations.

[...]

There is no RAND_MIN. The standard says that rand() "computes a
sequence of pseudo-random integers in the range 0 to RAND_MAX".

If an implementation's rand() function never returns 0, the implementation
is either non-conforming or of poor quality.

Do you know of an implementation in which rand() never returns 0?

The trouble is in the testing: If you sample three
hundred twenty-seven quadrillion rand() values and see
no zeroes, can you conclude that rand() will never, ever,
under any circumstances return a zero?

It's reminiscent of the demonstration that all crows are
black. You can travel the world observing crows and noting
that all seen are black; with each successive black crow your
confidence in the hypothesis increases. But "all crows are
black" is equivalent to "all non-black things are non-crows,"
so there's a less strenuous way to proceed: Just sit in your
armchair (none of that tedious travelling), cast your eyes
about you, and observe non-black objects in your field of view.
Behold the butterknife: a non-black non-crow that raises your
confidence. Behold the white cover of K&R, another non-crow:
your confidence swells as if certain spammed pharmaceuticals
had been used. Behold the pinkish appurtenances of the image
of Miss Download, overtly mammalian and hence not a crow: you
are ready to submit your thesis to a journal (and stop thinking
about that swelling, okay?). The upshot is that you can raise
your confidence in "all crows are black" without every observing
a single crow ...

--
Eric Sosman
es*****@acm-dot-org.invalid

Nov 14 '05 #18

Keith Thompson

CBFalconer <cb********@yahoo.com> writes:

Keith Thompson wrote:
CBFalconer <cb********@yahoo.com> writes:
[...]
AFAIK there is no specification for RAND_MIN, which will be at
least 1 for many implementations.

[...]

There is no RAND_MIN. The standard says that rand() "computes a
sequence of pseudo-random integers in the range 0 to RAND_MAX".

If an implementation's rand() function never returns 0, the
implementation is either non-conforming or of poor quality.

Do you know of an implementation in which rand() never returns 0?

Yes. Any Linear Congruential with a zero adder. Any CRC based
system.

Do you know of a specific C library implementation that uses an
algorithm in which rand() never returns 0?

Even the sample implementation in the standard (which isn't very good)
returns 0 every few thousand iterations.

As long as the internal state is bigger than the values returned,
you'll probably get 0 every now and then.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Nov 14 '05 #19

CBFalconer

Eric Sosman wrote:

Keith Thompson wrote:
CBFalconer <cb********@yahoo.com> writes:
[...]
AFAIK there is no specification for RAND_MIN, which will be at
least 1 for many implementations.
[...]

There is no RAND_MIN. The standard says that rand() "computes a
sequence of pseudo-random integers in the range 0 to RAND_MAX".

If an implementation's rand() function never returns 0, the implementation
is either non-conforming or of poor quality.

Do you know of an implementation in which rand() never returns 0?

The trouble is in the testing: If you sample three
hundred twenty-seven quadrillion rand() values and see
no zeroes, can you conclude that rand() will never, ever,
under any circumstances return a zero?

I am thinking of cases where the generation method is known, and
thus the lack of a zero can be guaranteed.

It's reminiscent of the demonstration that all crows are
black. You can travel the world observing crows and noting
that all seen are black; with each successive black crow your
confidence in the hypothesis increases. But "all crows are

Nah, it just confirms that all crows have a black side, which they
normally present to you.

--
"If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers." - Keith Thompson

Nov 14 '05 #20

CBFalconer

Keith Thompson wrote:

CBFalconer <cb********@yahoo.com> writes:
Keith Thompson wrote:

.... snip ...

Do you know of an implementation in which rand() never returns 0?

Yes. Any Linear Congruential with a zero adder. Any CRC based
system.

Do you know of a specific C library implementation that uses an
algorithm in which rand() never returns 0?

No. :-) But I never bothered looking.

--
"If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers." - Keith Thompson

Nov 14 '05 #21

Tim Rentsch

Eric Sosman <es*****@acm-dot-org.invalid> writes:

Keith Thompson wrote:
CBFalconer <cb********@yahoo.com> writes:
[...]
AFAIK there is no specification for RAND_MIN, which will be at
least 1 for many implementations.

[...]

There is no RAND_MIN. The standard says that rand() "computes a
sequence of pseudo-random integers in the range 0 to RAND_MAX".

If an implementation's rand() function never returns 0, the implementation
is either non-conforming or of poor quality.

Do you know of an implementation in which rand() never returns 0?

The trouble is in the testing: If you sample three
hundred twenty-seven quadrillion rand() values and see
no zeroes, can you conclude that rand() will never, ever,
under any circumstances return a zero?

If RAND_MAX is something reasonable, like 2**32 - 1 or less, do
100*(RAND_MAX+1) iterations, and if zero comes up less than 10 or more
than 1000 times, the implementation of rand() is no good anyway. (If
RAND_MAX is >= 2**32, do a similar test with 100*2**32 iterations,
checking for values < RAND_MAX/2**32.)

It's not necessary to program around poor implementations of rand(),
or even implementations that look like they might be poor. Instead,
it's easy enough to follow the recipe in Knuth's GraphBase book:

http://www-cs-faculty.stanford.edu/~knuth/sgb.html

and make a simple, fast, high-quality random number generator. And
you'll get the benefit that programs using it will get (assuming the
same seed values) the same stream of random numbers on different
platforms.

Nov 14 '05 #22

Michael Wojcik

In article <kf*************@alumnus.caltech.edu>, Tim Rentsch <tx*@alumnus.caltech.edu> writes:

It's not necessary to program around poor implementations of rand(),
or even implementations that look like they might be poor. Instead,
it's easy enough to follow the recipe in Knuth's GraphBase book:

http://www-cs-faculty.stanford.edu/~knuth/sgb.html

and make a simple, fast, high-quality random number generator. And
you'll get the benefit that programs using it will get (assuming the
same seed values) the same stream of random numbers on different
platforms.

Alternatively, George Marsaglia has posted code here for a couple of
C implementations of Complementary-Multiply-With-Carry (CMWC) PRNGs,
which though not portable (IIRC) should be easy enough to adapt to
most C implementations. CMWC PRNGs are flexible, appear to have
excellent characteristics (they have long periods, pass all of
George's DIEHARD tests, etc), and are based on relatively simple
theory. (Output bits are the binary digits of the fractional portion
of a ratio with a long period in that representation, if memory
serves.) Then there's the Mersenne Twister and so forth.

At any rate, I agree with the basic point: if you're dissatisfied, or
think you might be, with rand(), better generators are not hard to
come by.

On a related note: there was a thread on sci.crypt not long ago
regarding producing an unbiased subrange of the PRNG's range (a
common task which we were recently discussing here) while consuming
the minimum possible number of bits from the generator's output. In
other words, someone presented a more sophisticated and less wasteful
approach than just throwing away everything below the largest in-
range multiple of the desired range. That may be of interest to PRNG
tinkerers. It shouldn't be hard to find with Google Groups.

--
Michael Wojcik mi************@microfocus.com

It does basically make you look fat and naked - but you see all this stuff.
-- Susan Hallowell, TSA Security Lab Director, on "backscatter" scanners

Nov 14 '05 #23

Eric Sosman

Michael Wojcik wrote:

At any rate, I agree with the basic point: if you're dissatisfied, or
think you might be, with rand(), better generators are not hard to
come by.

Since the quality of rand() is implementation-specific,
"better" is hard to justify in a universal sense. Also, even
for a finite set of generators it is hard to justify "better"
across all possible problem spaces.

It seems that everybody[*] eventually latches onto a favorite
substitute for rand(). A few years ago it was proposed that
the C Standard should mandate the "Mersenne Twister" generator
(as it was before it was fixed ...). Now there's a fan base for
Knuth (in whatever version is now current; it, too, has been fixed)
and for Marsaglia (a name to reckon with in RNG's, although it's
not clear which of his many generators is today's favorite). My
own not-so-humble opinion is that the C Standard should leave the
requirements on rand() as weak as they are today, but should describe
it as a "coarse" source of variates, suitable for "casual" use but
not for serious work.
[*] Yes, I confess it: I am as guilty as any. Permit me, if
I may, to introduce you to the "Minimal Standard Random Number
Generator" of Park and Miller, CACM volume 31 number 10 (Oct 1988).
It Rulez!

--
Eric Sosman
es*****@acm-dot-org.invalid

Nov 14 '05 #24

Ben Pfaff

Tim Rentsch <tx*@alumnus.caltech.edu> writes:

It's not necessary to program around poor implementations of rand(),
or even implementations that look like they might be poor. Instead,
it's easy enough to follow the recipe in Knuth's GraphBase book:

http://www-cs-faculty.stanford.edu/~knuth/sgb.html

Or, if you want a solution in portable C, you can use mine:
http://www.stanford.edu/~blp/writings/clc/random.html
--
int main(void){char p[]="ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuv wxyz.\
\n",*q="kl BIcNBFr.NKEzjwCIxNJC";int i=sizeof p/2;char *strchr();int putchar(\
);while(*q){i+=strchr(p,*q++)-p;if(i>=(int)sizeof p)i-=sizeof p-1;putchar(p[i]\
);}return 0;}

Nov 14 '05 #25

Tim Rentsch

mw*****@newsguy.com (Michael Wojcik) writes:

In article <kf*************@alumnus.caltech.edu>, Tim Rentsch <tx*@alumnus.caltech.edu> writes:

It's not necessary to program around poor implementations of rand(),
or even implementations that look like they might be poor. Instead,
it's easy enough to follow the recipe in Knuth's GraphBase book:

http://www-cs-faculty.stanford.edu/~knuth/sgb.html

and make a simple, fast, high-quality random number generator. And
you'll get the benefit that programs using it will get (assuming the
same seed values) the same stream of random numbers on different
platforms.
Alternatively, George Marsaglia has posted code here for a couple of
C implementations of Complementary-Multiply-With-Carry (CMWC) PRNGs,
which though not portable (IIRC) should be easy enough to adapt to
most C implementations. [snip]

The CMWC code that I was able to find (some of it in CLC postings)
usually used 'long long'. Maybe I just wasn't looking in the right
places. I'd like to see a write-up that explains enough so I can see
why and how the code is working, but that doesn't bury the reader in a
blizzard of terminology and notation that's typical of a lot of the
writing in mathematics. Also code that doesn't have to depend on a
'long long' or 'uint64_t' type. On the plus side, the CMWC PRNG's
seem pretty lightweight in terms of memory footprint, so if someone
wanted a large number of random number streams then CMWC looks like
it might be a good choice.

By the way, I had trouble tracking down web articles on CMWC, because
most of the web writing on CMWC says complImentary rather than
complEmentary.

On a related note: there was a thread on sci.crypt not long ago
regarding producing an unbiased subrange of the PRNG's range (a
common task which we were recently discussing here) while consuming
the minimum possible number of bits from the generator's output. In
other words, someone presented a more sophisticated and less wasteful
approach than just throwing away everything below the largest in-
range multiple of the desired range. [snip]

For practical purposes I'm inclined to think it's not worth the
trouble. Even in the worst case where the subrange is one bigger than
half the PRNG range, it still takes less than two calls to the PRNG on
the average to get an in-range result. Unless the time cost of
producing a PRN is pretty high (and it shouldn't be), it seems like
it would be simpler and faster to use the usual approach.

Nov 14 '05 #26

Tim Rentsch

Ben Pfaff <bl*@cs.stanford.edu> writes:

Tim Rentsch <tx*@alumnus.caltech.edu> writes:
It's not necessary to program around poor implementations of rand(),
or even implementations that look like they might be poor. Instead,
it's easy enough to follow the recipe in Knuth's GraphBase book:

http://www-cs-faculty.stanford.edu/~knuth/sgb.html

Or, if you want a solution in portable C, you can use mine:
http://www.stanford.edu/~blp/writings/clc/random.html

The GraphBase code is also portable C, I believe, except for one
dependency that is easily corrected - the 'long' operands that are
being subtracted should be cast to 'unsigned long' before subtracting.
The GraphBase code is also portable in the sense that it produces the
same values on all systems. It doesn't have implementation dependent
behavior (assuming the noted change to 'unsigned long' has been made).

Nov 14 '05 #27

Tim Rentsch

Eric Sosman <es*****@acm-dot-org.invalid> writes:

Michael Wojcik wrote:

At any rate, I agree with the basic point: if you're dissatisfied, or
think you might be, with rand(), better generators are not hard to
come by.
Since the quality of rand() is implementation-specific,
"better" is hard to justify in a universal sense. Also, even
for a finite set of generators it is hard to justify "better"
across all possible problem spaces.

I think you may have over-snipped. Michael was responding to a
comment (by myself) about implementations of rand(). Certainly it
seemed to me like his comment could or should be read about
implementations of rand(), not necessarily about rand() generally.

And, to defend the generic point - if one's worry about rand() is
that it might be bad on some (future) platform, there are good
replacements that are "better" in that they will not misbehave on
those platforms. So that's a sense of better that seems both
reasonable and defensible.

It seems that everybody[*] eventually latches onto a favorite
substitute for rand().
There certainly is some evidence for your thesis. Among the three
people responding to my mention of the GraphBase RNG, there were
four different ("favorite") alternatives suggested.

I should say that the GraphBase RNG is not my "favorite" RNG. It is
the one I recommend to people when they want to rely on something
better than rand(). Mostly that recommendation is not because it's
fantastic at generating random numbers; I do think it's reasonably
good but I don't think it's fantastic. Rather it's because it's
fairly good along several different axes, and not really bad along any
axis. All the other RNG's that I've seen are weak in one aspect or
another - slow, large memory footprint, not as portable in one way or
another, smaller cycle length, and/or not explained well enough. I
think most implementors want code that they really understand and can
use without the code being "black magic"; very few RNG's satisfy this
property, and IMO the GraphBase RNG is one of those that do. The
GraphBase RNG is also the only one I've seen (in recent reading) that
includes a test case to make sure the RNG has been coded correctly.

A few years ago it was proposed that
the C Standard should mandate the "Mersenne Twister" generator
(as it was before it was fixed ...). Now there's a fan base for
Knuth (in whatever version is now current; it, too, has been fixed)
and for Marsaglia (a name to reckon with in RNG's, although it's
not clear which of his many generators is today's favorite). My
own not-so-humble opinion is that the C Standard should leave the
requirements on rand() as weak as they are today, but should describe
it as a "coarse" source of variates, suitable for "casual" use but
not for serious work.

I agree with you - specifications for rand() should be left more or
less as is. But I hope that doesn't stop people from adding newer,
more dependable RNG's to the set that is presumed commonly available.

Nov 14 '05 #28

websnarf

Eric Sosman wrote:

The trouble is in the testing: If you sample three
hundred twenty-seven quadrillion rand() values and see
no zeroes, can you conclude that rand() will never, ever,
under any circumstances return a zero?

No, but there *are* statistical statements you *CAN* make. Either:

1. The calls to rand() are not independent

or

2. With a high degree of confidence you can say there is an
eccentricity in the distribution of the output probabilities that
underrepresents zero as an output.

This actually matters to some people, and its an interesting echo to
the "Can you use C for mathematical purposes?" thread from earlier that
nobody here seems to be sensitive to the above in any way.

---
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/

Nov 14 '05 #29

Chris Croughton

On 9 Apr 2005 04:53:46 -0700, we******@gmail.com
<we******@gmail.com> wrote:

No, but there *are* statistical statements you *CAN* make. Either:

1. The calls to rand() are not independent
At least one implementation had the bottom bits cycling 0 1 2 3 on
successive calls. Not possible to detect statistically but detectable
using other methods.
or

2. With a high degree of confidence you can say there is an
eccentricity in the distribution of the output probabilities that
underrepresents zero as an output.
Or more generally that some values are represented more than others (one
I saw never produced RAND_MAX as a value).
This actually matters to some people, and its an interesting echo to
the "Can you use C for mathematical purposes?" thread from earlier that
nobody here seems to be sensitive to the above in any way.

Not true, some of us are sensitive to it, and prefer to use truly random
sources when they are available even though they are not "portable C"
(hardware random events, Linux /dev/random, timing keystroke entry,
etc.; GPG for instance will use whichever of those is around when
generating keys)). However, for a source of repeatable "random-ish"
numbers rand() is "good enough", and the 'repeatable' part is often the
most important (to be able to reproduce a problem for debugging).

The Fortran RNG may be 'better' in some senses, but it still isn't
totally random (unless the Fortran spec. now insists that the hardware
has an atomic source of random data!). The same will apply in any
language, people really concerned about it will use their own
implementations.

Chris C

Nov 14 '05 #30

Michael Wojcik

In article <H_********************@comcast.com>, Eric Sosman <es*****@acm-dot-org.invalid> writes:

Michael Wojcik wrote:

At any rate, I agree with the basic point: if you're dissatisfied, or
think you might be, with rand(), better generators are not hard to
come by.
Since the quality of rand() is implementation-specific,
"better" is hard to justify in a universal sense.

Sure. In this case, though, a "better" one is merely one that
produces less dissatisfaction in the subject (since dissatisfaction
with rand() is a precondition). Especially perverse cases aside that
looks eminently achievable, as you yourself suggest:
It seems that everybody[*] eventually latches onto a favorite
substitute for rand().
There you are, then. Everyone will be able to find a "better"
substitute for rand, on the grounds that they will feel that it's
better, and so it will displease them less. What's programming for,
if not to console programmers?
My own not-so-humble opinion is that the C Standard should leave the
requirements on rand() as weak as they are today, but should describe
it as a "coarse" source of variates, suitable for "casual" use but
not for serious work.

Agreed.

--
Michael Wojcik mi************@microfocus.com

What is it with this warm, quiet, nauseating bond between them?
-- Rumiko Takahashi, _Maison Ikkoku_, trans. Mari Morimoto, adapt. Gerard
Jones

Nov 14 '05 #31

Michael Wojcik

In article <kf*************@alumnus.caltech.edu>, Tim Rentsch <tx*@alumnus.caltech.edu> writes:

mw*****@newsguy.com (Michael Wojcik) writes:
Alternatively, George Marsaglia has posted code here for a couple of
C implementations of Complementary-Multiply-With-Carry (CMWC) PRNGs,
which though not portable (IIRC) should be easy enough to adapt to
most C implementations. [snip]
The CMWC code that I was able to find (some of it in CLC postings)
usually used 'long long'.

Yes, I think the version I have does too, though if memory serves
there were no great differences in adapting it to use a pair of
unsigned longs, if long has fewer than 64 bits. It's been a while
since I looked at it, though.
By the way, I had trouble tracking down web articles on CMWC, because
most of the web writing on CMWC says complImentary rather than
complEmentary.

Oh, well. (Perhaps it *is* "complimentary": the multiply says
something favorable about its operands, or happens for free. I
suspect this is just a case of homophone confusion, though.)

On a related note: there was a thread on sci.crypt not long ago
regarding producing an unbiased subrange of the PRNG's range (a
common task which we were recently discussing here) while consuming
the minimum possible number of bits from the generator's output.

For practical purposes I'm inclined to think it's not worth the
trouble.

Probably not, though it's a fun discussion. I can see some areas
where it might be useful, though, such as providing a deterministic-
time unbiased generator for a real-time application or making it
easier to blind the time and power consumption of the PRNG for crypto
purposes.

--
Michael Wojcik mi************@microfocus.com

Duck: No secret what's worth a hoot ought to be kept quiet.
Pogo: Secrets is usually perty doggone fascinatin'.
Duck: Egg-zackly ... it's completely illogical to keep a secret secret.
Pogo: An' unfair. -- Walt Kelly

Nov 14 '05 #32

Richard Bos

we******@gmail.com wrote:

1. The calls to rand() are not independent

or

2. With a high degree of confidence you can say there is an
eccentricity in the distribution of the output probabilities that
underrepresents zero as an output.

This actually matters to some people, and its an interesting echo to
the "Can you use C for mathematical purposes?" thread from earlier that
nobody here seems to be sensitive to the above in any way.

I dare say that people who use C for mathematical purposes _are_ aware
of the above, and moreover are aware that

Anyone who considers arithmetical methods of producing
random digits is, of course, in a state of sin.
Alfred E. Neumann, 1951

Richard

Nov 14 '05 #33

Grumble

Keith Thompson wrote:

If an implementation's rand() function never returns 0, the implementation
is either non-conforming or of poor quality.

Do you know of an implementation in which rand() never returns 0?

I was under the impression that the PRNG discussed in [1] had been
used often in the old days?

[1] "Random number generators: good ones are hard to find"
http://www-scf.usc.edu/~csci105/links/p1192-park.pdf

u_0 = 1
u_{n+1} = 16807*u_n mod (2^31-1)

Come to think of it, an implementor could very well return u_n - 1
and set RAND_MAX to 2^31-3...

--
Regards, Grumble

Nov 14 '05 #34

Keith Thompson

rl*@hoekstra-uitgeverij.nl (Richard Bos) writes:
[snip]

I dare say that people who use C for mathematical purposes _are_ aware
of the above, and moreover are aware that

Anyone who considers arithmetical methods of producing
random digits is, of course, in a state of sin.
Alfred E. Neumann, 1951

I dare say most of them are aware that the quotation is by John von
Neumann, not the guy on the cover of Mad Magazine.

(There is, of course, a connection between Mad Magazine and computer
science; google Potrzebie Knuth for details.)

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Nov 14 '05 #35

Richard Bos

Keith Thompson <ks***@mib.org> wrote:

rl*@hoekstra-uitgeverij.nl (Richard Bos) writes:
[snip]
I dare say that people who use C for mathematical purposes _are_ aware
of the above, and moreover are aware that

Anyone who considers arithmetical methods of producing
random digits is, of course, in a state of sin.
Alfred E. Neumann, 1951
I dare say most of them are aware that the quotation is by John von
Neumann, not the guy on the cover of Mad Magazine.

<g> What, me worry?
(There is, of course, a connection between Mad Magazine and computer
science; google Potrzebie Knuth for details.)

Or buy Van der Linden's Expert C Programming.

Richard

Nov 14 '05 #36

Grumble

Grumble wrote:

Keith Thompson wrote:
If an implementation's rand() function never returns 0, the
implementation is either non-conforming or of poor quality.

Do you know of an implementation in which rand() never returns 0?

I was under the impression that the PRNG discussed in [1] had been
used often in the old days?

[1] "Random number generators: good ones are hard to find"
http://www-scf.usc.edu/~csci105/links/p1192-park.pdf

u_0 = 1
u_{n+1} = 16807*u_n mod (2^31-1)

Come to think of it, an implementor could very well return u_n - 1
and set RAND_MAX to 2^31-3...

Has this implementation of rand() been used often?

Nov 14 '05 #37

Random Integers from 0 to 999

Similar topics