459,179 Members | 1,101 Online
Need help? Post your question and get tips & solutions from a community of 459,179 IT Pros & Developers. It's quick & easy.

# Random Integers from 0 to 999

 P: n/a Someone once posted the following macro on clc: #define randint(a,b) (a)+(((b)-(a)+1)*(float)rand()/RAND_MAX) Unfortunately it's flawed. If rand() returns RAND_MAX the result can be one larger than b. So can someone provide a *proper* macro (or function) that returns a random integer between (actually in) a range of values? For example randint(0, 999) could return: 0 10 777 999 Mike Nov 14 '05 #1
36 Replies

 P: n/a Michael B Allen wrote: Someone once posted the following macro on clc: #define randint(a,b) (a)+(((b)-(a)+1)*(float)rand()/RAND_MAX) Unfortunately it's flawed. If rand() returns RAND_MAX the result can be one larger than b. So can someone provide a *proper* macro (or function) that returns a random integer between (actually in) a range of values? For example randint(0, 999) could return: 0 10 777 999 Where is your shot at it? Without attribution, nobody knows from which source this comes. #define RANDINT(a,b) (a)+(((b)-(a)+1)*(double)rand()/(1u+RAND_MAX)) may serve you better; note the double which serves you well if RAND_MAX has more digits than float can represent. There are other deficiencies for this approach; why don't you use the suggestions of, say, Lawrence Kirby (or other regulars) which make all values equally probable? Apart from that: I would rather use a function for this purpose. If you need many random numbers, consider filling an array and retrieving from there until it is "used up", then refilling. Cheers Michael -- E-Mail: Mine is an /at/ gmx /dot/ de address. Nov 14 '05 #2

 P: n/a Michael B Allen wrote: Someone once posted the following macro on clc: #define randint(a,b) (a)+(((b)-(a)+1)*(float)rand()/RAND_MAX) Unfortunately it's flawed. If rand() returns RAND_MAX the result can be one larger than b. So can someone provide a *proper* macro (or function) that returns a random integer between (actually in) a range of values? For example randint(0, 999) could return: 0 10 777 999 int randint(int min, int max) { assert(min <= max); return min + rand() % (max - min + 1); } 0 <= rand() <= RAND_MAX 0 <= rand()%(max-min+1) < max-min+1 0 <= rand()%(max-min+1) <= max-min min <= min+rand()%(max-min+1) <= max Nov 14 '05 #3

 P: n/a Michael Mair wrote: Michael B Allen wrote: Someone once posted the following macro on clc: #define randint(a,b) (a)+(((b)-(a)+1)*(float)rand()/RAND_MAX) Unfortunately it's flawed. If rand() returns RAND_MAX the result can be one larger than b. Where is your shot at it? Without attribution, nobody knows from which source this comes. #define RANDINT(a,b) (a)+(((b)-(a)+1)*(double)rand()/(1u+RAND_MAX)) may serve you better; note the double which serves you well if RAND_MAX has more digits than float can represent. There are other deficiencies for this approach; why don't you use the suggestions of, say, Lawrence Kirby (or other regulars) which make all values equally probable? Apart from that: I would rather use a function for this purpose. If you need many random numbers, consider filling an array and retrieving from there until it is "used up", then refilling. Unluckily, both, your and Grumble's snippet produce UB due to integer overflow when RAND_MAX happens to equal INT_MAX (like on all my 16-Bit compilers) and the arguments are (0,RAND_MAX). It looks like #define RANDINT(a,b)\ ((b)-(a)==RAND_MAX?(a)+rand():(a)+rand()%((b)-(a)+1)) will do it, although the distribution will be abysmal. Then again, for embedded architectures, where neither floating point nor much RAM is an option, I use such generators exactly for their simplicity and not the randomness. Most of my test data just needs to be better than iterative, but not truly random. Where "real" randomness is needed I go and ask the big boys (the mathematicians); uniform distributions won't cut it most of the time anyway. Mark Nov 14 '05 #4

 P: n/a Mark Piffer wrote: Michael Mair wrote:Michael B Allen wrote:Someone once posted the following macro on clc:#define randint(a,b) (a)+(((b)-(a)+1)*(float)rand()/RAND_MAX)Unfortunately it's flawed. If rand() returns RAND_MAX the result can beone larger than b.Where is your shot at it? Without attribution, nobody knowsfrom which source this comes. #define RANDINT(a,b) (a)+(((b)-(a)+1)*(double)rand()/(1u+RAND_MAX))may serve you better; note the double which serves you wellif RAND_MAX has more digits than float can represent.There are other deficiencies for this approach; why don't youuse the suggestions of, say, Lawrence Kirby (or other regulars)which make all values equally probable?Apart from that: I would rather use a function for this purpose.If you need many random numbers, consider filling an array andretrieving from there until it is "used up", then refilling. Unluckily, both, your and Grumble's snippet produce UB due to integer overflow when RAND_MAX happens to equal INT_MAX (like on all my 16-Bit compilers) and the arguments are (0,RAND_MAX). It looks like #define RANDINT(a,b)\ ((b)-(a)==RAND_MAX?(a)+rand():(a)+rand()%((b)-(a)+1)) will do it, although the distribution will be abysmal. Then again, for embedded architectures, where neither floating point nor much RAM is an option, I use such generators exactly for their simplicity and not the randomness. Most of my test data just needs to be better than iterative, but not truly random. Where "real" randomness is needed I go and ask the big boys (the mathematicians); uniform distributions won't cut it most of the time anyway. You are right, I forgot to mention this particular problem and did not correct it; however IIRC it is covered in the mentioned message by Lawrence Kirby. If I have too much time on my hands, I will search for it. I still hold that writing a function is the better way; you can cut off the excess random values and handle special cases in a transparent way. If we speak of overkill: The Mersenne Twister should do for a start :-) Cheers Michael -- E-Mail: Mine is an /at/ gmx /dot/ de address. Nov 14 '05 #5

 P: n/a On Thu, 24 Mar 2005 09:30:34 +0100, Grumble wrote: Michael B Allen wrote: Someone once posted the following macro on clc: #define randint(a,b) (a)+(((b)-(a)+1)*(float)rand()/RAND_MAX) Unfortunately it's flawed. If rand() returns RAND_MAX the result can be one larger than b. The simple solution is to use (RAND_MAX+1) as the divisor. However, it has the same probem as below. So can someone provide a *proper* macro (or function) that returns a random integer between (actually in) a range of values? For example randint(0, 999) could return: 0 10 777 999 int randint(int min, int max) { assert(min <= max); return min + rand() % (max - min + 1); } That just uses different bits to do the same thing (except that you corrected the "off by one" error). However, there are a number of poor implementations of rand() where the bottom bits are a lot more predictable than the higher ones (rand() % 4 returning a continuous repeated sequence of 0, 1, 2, 3 in one of them!). 0 <= rand() <= RAND_MAX 0 <= rand()%(max-min+1) < max-min+1 0 <= rand()%(max-min+1) <= max-min min <= min+rand()%(max-min+1) <= max If the range is not a submultiple of (RAND_MAX + 1) then it does not give equal probabilities of all of the numbers. For instance, take a small RAND_MAX of 7 (0 <= rand() <= 7) and a range of [0..4]: rand() randint() 0 0 1 1 2 2 3 3 4 4 5 0 6 1 7 2 results 0..2 occur twice as often as 3..4. Granted that when RAND_MAX is very much bigger than the range the error becomes smaller, it is still there (the potential error is range/(RAND_MAX+1)). Chris C Nov 14 '05 #6

 P: n/a In article , Grumble wrote:int randint(int min, int max){ assert(min <= max); return min + rand() % (max - min + 1);}0 <= rand() <= RAND_MAX0 <= rand()%(max-min+1) < max-min+10 <= rand()%(max-min+1) <= max-minmin <= min+rand()%(max-min+1) <= max That's a terrible way of generating "random" numbers. When we ask for a "random number" in the range min..max, we want every number within that range to occur with equal probability. The following example shows that your method does not give a uniformly distributed random number. For the sake of illustration, let's say RAND_MAX is 7. Suppose you want random numbers in the set {0,1,2,3,4,5}. Then according to your algorithm: rand() randint() 0 -> 0 1 -> 1 2 -> 2 3 -> 3 4 -> 4 5 -> 5 6 -> 0 7 -> 1 Therefore the numbers 0 and 1 are twice as likely to show up than other numbers. Here is a better way: int randint(int min, int max) { assert(min <= max); return min + (int)((max-min+1.0)*rand()/(1.0+RAND_MAX)); } -- Rouben Rostamian Nov 14 '05 #7

 P: n/a Rouben Rostamian wrote: Grumble wrote: int randint(int min, int max) { assert(min <= max); return min + rand() % (max - min + 1); } That's a terrible way of generating "random" numbers. When we ask for a "random number" in the range min..max, we want every number within that range to occur with equal probability. The following example shows that your method does not give a uniformly distributed random number. For the sake of illustration, let's say RAND_MAX is 7. Suppose you want random numbers in the set {0,1,2,3,4,5}. Then according to your algorithm: rand() randint() 0 -> 0 1 -> 1 2 -> 2 3 -> 3 4 -> 4 5 -> 5 6 -> 0 7 -> 1 Therefore the numbers 0 and 1 are twice as likely to show up than other numbers. Here is a better way: int randint(int min, int max) { assert(min <= max); return min + (int)((max-min+1.0)*rand()/(1.0+RAND_MAX)); } You algorithm is as "broken" as mine because of a fundamental problem: If you try to place 8 balls in 6 buckets, then, no matter how you slice and dice it, you'll end up with more balls in some buckets. The only way out is to discard 2 balls. -- Regards, Grumble Nov 14 '05 #8

 P: n/a Michael B Allen wrote: Someone once posted the following macro on clc: #define randint(a,b) (a)+(((b)-(a)+1)*(float)rand()/RAND_MAX) Unfortunately it's flawed. If rand() returns RAND_MAX the result can be one larger than b. #define ranrange(a, b) (int)((a) + rand()/((double)RAND_MAX + 1) \ * ((b) - (a) + 1)) assuming 0 == rand() can occur. Many systems will never return 0, so: #define ranrange(a, b) (int)((a) + (rand() - 1)/((double)RAND_MAX) \ * ((b) - (a) + 1)) (untested) So can someone provide a *proper* macro (or function) that returns a random integer between (actually in) a range of values? For example randint(0, 999) could return: 0 10 777 999 -- "If you want to post a followup via groups.google.com, don't use the broken "Reply" link at the bottom of the article. Click on "show options" at the top of the article, then click on the "Reply" at the bottom of the article headers." - Keith Thompson Nov 14 '05 #9

 P: n/a Grumble wrote: Rouben Rostamian wrote: Grumble wrote: int randint(int min, int max) { assert(min <= max); return min + rand() % (max - min + 1); } .... snip ... For the sake of illustration, let's say RAND_MAX is 7. Suppose you want random numbers in the set {0,1,2,3,4,5}. Then according to your algorithm: rand() randint() 0 -> 0 1 -> 1 2 -> 2 3 -> 3 4 -> 4 5 -> 5 6 -> 0 7 -> 1 Therefore the numbers 0 and 1 are twice as likely to show up than other numbers. Here is a better way: int randint(int min, int max) { assert(min <= max); return min + (int)((max-min+1.0)*rand()/(1.0+RAND_MAX)); } You algorithm is as "broken" as mine because of a fundamental problem: If you try to place 8 balls in 6 buckets, then, no matter how you slice and dice it, you'll end up with more balls in some buckets. The only way out is to discard 2 balls. Since RAND_MAX is normally much larger than (max - min) that effect is minimal. However, you can allow for it by: unsigned int ranrange(unsigned int min, unsigned int max) { unsigned int t, d, n; if (min > max) { /* No assert, just an interval */ t = min; min = max; max = t; } t = max - min + 1; d = RAND_MAX / t; d *= t; do { /* discard the few biasing values */ n = rand(); /* assuming rand() unbiased */ } while (n > d); return min + (int)((double)t * n/(1.0 + d)); } /* untested */ It probably requires tweaking for the minimum return from rand, which may well be either 1 or 0. -- "If you want to post a followup via groups.google.com, don't use the broken "Reply" link at the bottom of the article. Click on "show options" at the top of the article, then click on the "Reply" at the bottom of the article headers." - Keith Thompson Nov 14 '05 #10

 P: n/a Rouben Rostamian wrote: ... Then according to your algorithm: rand() randint() 0 -> 0 1 -> 1 2 -> 2 3 -> 3 4 -> 4 5 -> 5 6 -> 0 7 -> 1 Therefore the numbers 0 and 1 are twice as likely to show up than other numbers. Here is a better way: int randint(int min, int max) { assert(min <= max); return min + (int)((max-min+1.0)*rand()/(1.0+RAND_MAX)); } ... There's absolutely no way to implement the required functionality by a stateless function that simply maps 'int' to 'int'. Regardless of how you do it, some numbers will appear with higher probability than other numbers. The version you provided is as "terrible" as the previous one for the very same reason - for the same ranges of input and output values some numbers will be "twice as likely to show up than other numbers" (you should've tested it before posting here). When [0, RAND_MAX] range is significantly wider than the requiested [min, max] range, this is not really a problem. In other cases, only a stateful mapping function will help. -- Best regards, Andrey Tarasevich Nov 14 '05 #11

 P: n/a On Thu, 24 Mar 2005 15:09:47 +0000 (UTC), Rouben Rostamian wrote: In article , Grumble wrote:int randint(int min, int max){ assert(min <= max); return min + rand() % (max - min + 1);}0 <= rand() <= RAND_MAX0 <= rand()%(max-min+1) < max-min+10 <= rand()%(max-min+1) <= max-minmin <= min+rand()%(max-min+1) <= max That's a terrible way of generating "random" numbers. When we ask for a "random number" in the range min..max, we want every number within that range to occur with equal probability. The following example shows that your method does not give a uniformly distributed random number. For the sake of illustration, let's say RAND_MAX is 7. Suppose you want random numbers in the set {0,1,2,3,4,5}. Then according to your algorithm: rand() randint() 0 -> 0 1 -> 1 2 -> 2 3 -> 3 4 -> 4 5 -> 5 6 -> 0 7 -> 1 Therefore the numbers 0 and 1 are twice as likely to show up than other numbers. Here is a better way: int randint(int min, int max) { assert(min <= max); return min + (int)((max-min+1.0)*rand()/(1.0+RAND_MAX)); } Why is that any better? Assuming your example of RAND_MAX==7, and wanting numbers in the set [0,5]: rand() randint() 0 -> 0*6/8 0/8 -> 0 1 -> 1*6/8 6/8 -> 0 2 -> 2*6/8 12/8 -> 1 3 -> 3*6/8 18/8 -> 2 4 -> 4*6/8 24/8 -> 3 5 -> 5*6/8 30/8 -> 3 6 -> 6*6/8 36/8 -> 4 7 -> 7*6/8 42/8 -> 5 All you've done is to change it so that 0 and 3 get the extra hits instead of 0 and 1. OK, it looks slightly more uniform (the mean will be better,2.25 instead of 2, it should be 2.5) but it's still got the same problem of two of the numbers occuring twice as often as the others. A way of getting round the problem is to use an iterative method and discard values out of range: int randint(int min, int max) { unsigned range = max - min + 1; int bits = 1; int result; assert(range > 0 && range <= RAND_MAX); while (range-1 > bits) bits = bits*2 + 1; do result = rand() & bits; while (result > range); return result + min; } This only works if RAND_INT is 2^n - 1, but that's the case on all implementations I've found. It is also susceptible to the quality of the lower bits of rand(), which on some implementations are not very random... Chris C Nov 14 '05 #12

 P: n/a Rouben Rostamian wrote: [...] rand() randint() 0 -> 0 1 -> 1 2 -> 2 3 -> 3 4 -> 4 5 -> 5 6 -> 0 7 -> 1 Therefore the numbers 0 and 1 are twice as likely to show up than other numbers. Here is a better way: int randint(int min, int max) { assert(min <= max); return min + (int)((max-min+1.0)*rand()/(1.0+RAND_MAX)); } This solution (which, unfortunately, is endorsed by the comp.lang.c FAQ) is no better. As others have posted, you're up against the pigeon-hole principle, you can't fix it by just changing the mapping in some way. And its a real problem in practice -- RAND_MAX need not be higher than 65535, and in most of the compilers on the most popular platform it is indeed exactly this. I have a more complete summary of the issue here: http://www.pobox.com/~qed/random.html --- Paul Hsieh http://www.pobox.com/~qed/ http://bstring.sf.net/ Nov 14 '05 #13

 P: n/a On 26 Mar 2005 16:50:31 -0800, we******@gmail.com wrote: Rouben Rostamian wrote: return min + (int)((max-min+1.0)*rand()/(1.0+RAND_MAX)); This solution (which, unfortunately, is endorsed by the comp.lang.c FAQ) is no better. As others have posted, you're up against the pigeon-hole principle, you can't fix it by just changing the mapping in some way. And its a real problem in practice -- RAND_MAX need not be higher than 65535, and in most of the compilers on the most popular platform it is indeed exactly this. I have a more complete summary of the issue here: http://www.pobox.com/~qed/random.html Need not be higher than, and often is, 32767 (= minimum INT_MAX). Although for a desired choice range of no more than about 30, I might consider the <.1% nonuniformity acceptable. - David.Thompson1 at worldnet.att.net Nov 14 '05 #14

 P: n/a Dave Thompson wrote: On 26 Mar 2005 16:50:31 -0800, we******@gmail.com wrote: Rouben Rostamian wrote: return min + (int)((max-min+1.0)*rand()/(1.0+RAND_MAX)); This solution (which, unfortunately, is endorsed by the comp.lang.c FAQ) is no better. As others have posted, you're up against the pigeon-hole principle, you can't fix it by just changing the mapping in some way. And its a real problem in practice -- RAND_MAX need not be higher than 65535, and in most of the compilers on the most popular platform it is indeed exactly this. I have a more complete summary of the issue here: http://www.pobox.com/~qed/random.html Need not be higher than, and often is, 32767 (= minimum INT_MAX). Although for a desired choice range of no more than about 30, I might consider the <.1% nonuniformity acceptable. AFAIK there is no specification for RAND_MIN, which will be at least 1 for many implementations. If you require specific characteristics the only answer is to supply your own random implmentation. Even then you need to be aware of what is and is not guaranteed. For example, 32 bit ints are not guaranteed, contrary to Hsiehs (websnarf) hidden assumptions in at least some of his code. -- "If you want to post a followup via groups.google.com, don't use the broken "Reply" link at the bottom of the article. Click on "show options" at the top of the article, then click on the "Reply" at the bottom of the article headers." - Keith Thompson Nov 14 '05 #15

 P: n/a CBFalconer writes: [...] AFAIK there is no specification for RAND_MIN, which will be at least 1 for many implementations. [...] There is no RAND_MIN. The standard says that rand() "computes a sequence of pseudo-random integers in the range 0 to RAND_MAX". If an implementation's rand() function never returns 0, the implementation is either non-conforming or of poor quality. Do you know of an implementation in which rand() never returns 0? -- Keith Thompson (The_Other_Keith) ks***@mib.org San Diego Supercomputer Center <*> We must do something. This is something. Therefore, we must do this. Nov 14 '05 #16

 P: n/a Keith Thompson wrote: CBFalconer writes: [...] AFAIK there is no specification for RAND_MIN, which will be at least 1 for many implementations. [...] There is no RAND_MIN. The standard says that rand() "computes a sequence of pseudo-random integers in the range 0 to RAND_MAX". If an implementation's rand() function never returns 0, the implementation is either non-conforming or of poor quality. Do you know of an implementation in which rand() never returns 0? Yes. Any Linear Congruential with a zero adder. Any CRC based system. -- "If you want to post a followup via groups.google.com, don't use the broken "Reply" link at the bottom of the article. Click on "show options" at the top of the article, then click on the "Reply" at the bottom of the article headers." - Keith Thompson Nov 14 '05 #17

 P: n/a CBFalconer writes: Keith Thompson wrote: CBFalconer writes: [...] AFAIK there is no specification for RAND_MIN, which will be at least 1 for many implementations. [...] There is no RAND_MIN. The standard says that rand() "computes a sequence of pseudo-random integers in the range 0 to RAND_MAX". If an implementation's rand() function never returns 0, the implementation is either non-conforming or of poor quality. Do you know of an implementation in which rand() never returns 0? Yes. Any Linear Congruential with a zero adder. Any CRC based system. Do you know of a specific C library implementation that uses an algorithm in which rand() never returns 0? Even the sample implementation in the standard (which isn't very good) returns 0 every few thousand iterations. As long as the internal state is bigger than the values returned, you'll probably get 0 every now and then. -- Keith Thompson (The_Other_Keith) ks***@mib.org San Diego Supercomputer Center <*> We must do something. This is something. Therefore, we must do this. Nov 14 '05 #19

 P: n/a Eric Sosman wrote: Keith Thompson wrote: CBFalconer writes: [...] AFAIK there is no specification for RAND_MIN, which will be at least 1 for many implementations. [...] There is no RAND_MIN. The standard says that rand() "computes a sequence of pseudo-random integers in the range 0 to RAND_MAX". If an implementation's rand() function never returns 0, the implementation is either non-conforming or of poor quality. Do you know of an implementation in which rand() never returns 0? The trouble is in the testing: If you sample three hundred twenty-seven quadrillion rand() values and see no zeroes, can you conclude that rand() will never, ever, under any circumstances return a zero? I am thinking of cases where the generation method is known, and thus the lack of a zero can be guaranteed. It's reminiscent of the demonstration that all crows are black. You can travel the world observing crows and noting that all seen are black; with each successive black crow your confidence in the hypothesis increases. But "all crows are Nah, it just confirms that all crows have a black side, which they normally present to you. -- "If you want to post a followup via groups.google.com, don't use the broken "Reply" link at the bottom of the article. Click on "show options" at the top of the article, then click on the "Reply" at the bottom of the article headers." - Keith Thompson Nov 14 '05 #20

 P: n/a Keith Thompson wrote: CBFalconer writes: Keith Thompson wrote: .... snip ... Do you know of an implementation in which rand() never returns 0? Yes. Any Linear Congruential with a zero adder. Any CRC based system. Do you know of a specific C library implementation that uses an algorithm in which rand() never returns 0? No. :-) But I never bothered looking. -- "If you want to post a followup via groups.google.com, don't use the broken "Reply" link at the bottom of the article. Click on "show options" at the top of the article, then click on the "Reply" at the bottom of the article headers." - Keith Thompson Nov 14 '05 #21

 P: n/a Eric Sosman writes: Keith Thompson wrote: CBFalconer writes: [...]AFAIK there is no specification for RAND_MIN, which will be atleast 1 for many implementations. [...] There is no RAND_MIN. The standard says that rand() "computes a sequence of pseudo-random integers in the range 0 to RAND_MAX". If an implementation's rand() function never returns 0, the implementation is either non-conforming or of poor quality. Do you know of an implementation in which rand() never returns 0? The trouble is in the testing: If you sample three hundred twenty-seven quadrillion rand() values and see no zeroes, can you conclude that rand() will never, ever, under any circumstances return a zero? If RAND_MAX is something reasonable, like 2**32 - 1 or less, do 100*(RAND_MAX+1) iterations, and if zero comes up less than 10 or more than 1000 times, the implementation of rand() is no good anyway. (If RAND_MAX is >= 2**32, do a similar test with 100*2**32 iterations, checking for values < RAND_MAX/2**32.) It's not necessary to program around poor implementations of rand(), or even implementations that look like they might be poor. Instead, it's easy enough to follow the recipe in Knuth's GraphBase book: http://www-cs-faculty.stanford.edu/~knuth/sgb.html and make a simple, fast, high-quality random number generator. And you'll get the benefit that programs using it will get (assuming the same seed values) the same stream of random numbers on different platforms. Nov 14 '05 #22

 P: n/a In article , Tim Rentsch writes: It's not necessary to program around poor implementations of rand(), or even implementations that look like they might be poor. Instead, it's easy enough to follow the recipe in Knuth's GraphBase book: http://www-cs-faculty.stanford.edu/~knuth/sgb.html and make a simple, fast, high-quality random number generator. And you'll get the benefit that programs using it will get (assuming the same seed values) the same stream of random numbers on different platforms. Alternatively, George Marsaglia has posted code here for a couple of C implementations of Complementary-Multiply-With-Carry (CMWC) PRNGs, which though not portable (IIRC) should be easy enough to adapt to most C implementations. CMWC PRNGs are flexible, appear to have excellent characteristics (they have long periods, pass all of George's DIEHARD tests, etc), and are based on relatively simple theory. (Output bits are the binary digits of the fractional portion of a ratio with a long period in that representation, if memory serves.) Then there's the Mersenne Twister and so forth. At any rate, I agree with the basic point: if you're dissatisfied, or think you might be, with rand(), better generators are not hard to come by. On a related note: there was a thread on sci.crypt not long ago regarding producing an unbiased subrange of the PRNG's range (a common task which we were recently discussing here) while consuming the minimum possible number of bits from the generator's output. In other words, someone presented a more sophisticated and less wasteful approach than just throwing away everything below the largest in- range multiple of the desired range. That may be of interest to PRNG tinkerers. It shouldn't be hard to find with Google Groups. -- Michael Wojcik mi************@microfocus.com It does basically make you look fat and naked - but you see all this stuff. -- Susan Hallowell, TSA Security Lab Director, on "backscatter" scanners Nov 14 '05 #23

 P: n/a Michael Wojcik wrote: At any rate, I agree with the basic point: if you're dissatisfied, or think you might be, with rand(), better generators are not hard to come by. Since the quality of rand() is implementation-specific, "better" is hard to justify in a universal sense. Also, even for a finite set of generators it is hard to justify "better" across all possible problem spaces. It seems that everybody[*] eventually latches onto a favorite substitute for rand(). A few years ago it was proposed that the C Standard should mandate the "Mersenne Twister" generator (as it was before it was fixed ...). Now there's a fan base for Knuth (in whatever version is now current; it, too, has been fixed) and for Marsaglia (a name to reckon with in RNG's, although it's not clear which of his many generators is today's favorite). My own not-so-humble opinion is that the C Standard should leave the requirements on rand() as weak as they are today, but should describe it as a "coarse" source of variates, suitable for "casual" use but not for serious work. [*] Yes, I confess it: I am as guilty as any. Permit me, if I may, to introduce you to the "Minimal Standard Random Number Generator" of Park and Miller, CACM volume 31 number 10 (Oct 1988). It Rulez! -- Eric Sosman es*****@acm-dot-org.invalid Nov 14 '05 #24

 P: n/a Tim Rentsch writes: It's not necessary to program around poor implementations of rand(), or even implementations that look like they might be poor. Instead, it's easy enough to follow the recipe in Knuth's GraphBase book: http://www-cs-faculty.stanford.edu/~knuth/sgb.html Or, if you want a solution in portable C, you can use mine: http://www.stanford.edu/~blp/writings/clc/random.html -- int main(void){char p[]="ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuv wxyz.\ \n",*q="kl BIcNBFr.NKEzjwCIxNJC";int i=sizeof p/2;char *strchr();int putchar(\ );while(*q){i+=strchr(p,*q++)-p;if(i>=(int)sizeof p)i-=sizeof p-1;putchar(p[i]\ );}return 0;} Nov 14 '05 #25

 P: n/a mw*****@newsguy.com (Michael Wojcik) writes: In article , Tim Rentsch writes: It's not necessary to program around poor implementations of rand(), or even implementations that look like they might be poor. Instead, it's easy enough to follow the recipe in Knuth's GraphBase book: http://www-cs-faculty.stanford.edu/~knuth/sgb.html and make a simple, fast, high-quality random number generator. And you'll get the benefit that programs using it will get (assuming the same seed values) the same stream of random numbers on different platforms. Alternatively, George Marsaglia has posted code here for a couple of C implementations of Complementary-Multiply-With-Carry (CMWC) PRNGs, which though not portable (IIRC) should be easy enough to adapt to most C implementations. [snip] The CMWC code that I was able to find (some of it in CLC postings) usually used 'long long'. Maybe I just wasn't looking in the right places. I'd like to see a write-up that explains enough so I can see why and how the code is working, but that doesn't bury the reader in a blizzard of terminology and notation that's typical of a lot of the writing in mathematics. Also code that doesn't have to depend on a 'long long' or 'uint64_t' type. On the plus side, the CMWC PRNG's seem pretty lightweight in terms of memory footprint, so if someone wanted a large number of random number streams then CMWC looks like it might be a good choice. By the way, I had trouble tracking down web articles on CMWC, because most of the web writing on CMWC says complImentary rather than complEmentary. On a related note: there was a thread on sci.crypt not long ago regarding producing an unbiased subrange of the PRNG's range (a common task which we were recently discussing here) while consuming the minimum possible number of bits from the generator's output. In other words, someone presented a more sophisticated and less wasteful approach than just throwing away everything below the largest in- range multiple of the desired range. [snip] For practical purposes I'm inclined to think it's not worth the trouble. Even in the worst case where the subrange is one bigger than half the PRNG range, it still takes less than two calls to the PRNG on the average to get an in-range result. Unless the time cost of producing a PRN is pretty high (and it shouldn't be), it seems like it would be simpler and faster to use the usual approach. Nov 14 '05 #26

 P: n/a Ben Pfaff writes: Tim Rentsch writes: It's not necessary to program around poor implementations of rand(), or even implementations that look like they might be poor. Instead, it's easy enough to follow the recipe in Knuth's GraphBase book: http://www-cs-faculty.stanford.edu/~knuth/sgb.html Or, if you want a solution in portable C, you can use mine: http://www.stanford.edu/~blp/writings/clc/random.html The GraphBase code is also portable C, I believe, except for one dependency that is easily corrected - the 'long' operands that are being subtracted should be cast to 'unsigned long' before subtracting. The GraphBase code is also portable in the sense that it produces the same values on all systems. It doesn't have implementation dependent behavior (assuming the noted change to 'unsigned long' has been made). Nov 14 '05 #27

 P: n/a Eric Sosman wrote: The trouble is in the testing: If you sample three hundred twenty-seven quadrillion rand() values and see no zeroes, can you conclude that rand() will never, ever, under any circumstances return a zero? No, but there *are* statistical statements you *CAN* make. Either: 1. The calls to rand() are not independent or 2. With a high degree of confidence you can say there is an eccentricity in the distribution of the output probabilities that underrepresents zero as an output. This actually matters to some people, and its an interesting echo to the "Can you use C for mathematical purposes?" thread from earlier that nobody here seems to be sensitive to the above in any way. --- Paul Hsieh http://www.pobox.com/~qed/ http://bstring.sf.net/ Nov 14 '05 #29

 P: n/a On 9 Apr 2005 04:53:46 -0700, we******@gmail.com wrote: No, but there *are* statistical statements you *CAN* make. Either: 1. The calls to rand() are not independent At least one implementation had the bottom bits cycling 0 1 2 3 on successive calls. Not possible to detect statistically but detectable using other methods. or 2. With a high degree of confidence you can say there is an eccentricity in the distribution of the output probabilities that underrepresents zero as an output. Or more generally that some values are represented more than others (one I saw never produced RAND_MAX as a value). This actually matters to some people, and its an interesting echo to the "Can you use C for mathematical purposes?" thread from earlier that nobody here seems to be sensitive to the above in any way. Not true, some of us are sensitive to it, and prefer to use truly random sources when they are available even though they are not "portable C" (hardware random events, Linux /dev/random, timing keystroke entry, etc.; GPG for instance will use whichever of those is around when generating keys)). However, for a source of repeatable "random-ish" numbers rand() is "good enough", and the 'repeatable' part is often the most important (to be able to reproduce a problem for debugging). The Fortran RNG may be 'better' in some senses, but it still isn't totally random (unless the Fortran spec. now insists that the hardware has an atomic source of random data!). The same will apply in any language, people really concerned about it will use their own implementations. Chris C Nov 14 '05 #30

 P: n/a In article , Eric Sosman writes: Michael Wojcik wrote: At any rate, I agree with the basic point: if you're dissatisfied, or think you might be, with rand(), better generators are not hard to come by. Since the quality of rand() is implementation-specific, "better" is hard to justify in a universal sense. Sure. In this case, though, a "better" one is merely one that produces less dissatisfaction in the subject (since dissatisfaction with rand() is a precondition). Especially perverse cases aside that looks eminently achievable, as you yourself suggest: It seems that everybody[*] eventually latches onto a favorite substitute for rand(). There you are, then. Everyone will be able to find a "better" substitute for rand, on the grounds that they will feel that it's better, and so it will displease them less. What's programming for, if not to console programmers? My own not-so-humble opinion is that the C Standard should leave the requirements on rand() as weak as they are today, but should describe it as a "coarse" source of variates, suitable for "casual" use but not for serious work. Agreed. -- Michael Wojcik mi************@microfocus.com What is it with this warm, quiet, nauseating bond between them? -- Rumiko Takahashi, _Maison Ikkoku_, trans. Mari Morimoto, adapt. Gerard Jones Nov 14 '05 #31

 P: n/a In article , Tim Rentsch writes: mw*****@newsguy.com (Michael Wojcik) writes: Alternatively, George Marsaglia has posted code here for a couple of C implementations of Complementary-Multiply-With-Carry (CMWC) PRNGs, which though not portable (IIRC) should be easy enough to adapt to most C implementations. [snip] The CMWC code that I was able to find (some of it in CLC postings) usually used 'long long'. Yes, I think the version I have does too, though if memory serves there were no great differences in adapting it to use a pair of unsigned longs, if long has fewer than 64 bits. It's been a while since I looked at it, though. By the way, I had trouble tracking down web articles on CMWC, because most of the web writing on CMWC says complImentary rather than complEmentary. Oh, well. (Perhaps it *is* "complimentary": the multiply says something favorable about its operands, or happens for free. I suspect this is just a case of homophone confusion, though.) On a related note: there was a thread on sci.crypt not long ago regarding producing an unbiased subrange of the PRNG's range (a common task which we were recently discussing here) while consuming the minimum possible number of bits from the generator's output. For practical purposes I'm inclined to think it's not worth the trouble. Probably not, though it's a fun discussion. I can see some areas where it might be useful, though, such as providing a deterministic- time unbiased generator for a real-time application or making it easier to blind the time and power consumption of the PRNG for crypto purposes. -- Michael Wojcik mi************@microfocus.com Duck: No secret what's worth a hoot ought to be kept quiet. Pogo: Secrets is usually perty doggone fascinatin'. Duck: Egg-zackly ... it's completely illogical to keep a secret secret. Pogo: An' unfair. -- Walt Kelly Nov 14 '05 #32

 P: n/a we******@gmail.com wrote: 1. The calls to rand() are not independent or 2. With a high degree of confidence you can say there is an eccentricity in the distribution of the output probabilities that underrepresents zero as an output. This actually matters to some people, and its an interesting echo to the "Can you use C for mathematical purposes?" thread from earlier that nobody here seems to be sensitive to the above in any way. I dare say that people who use C for mathematical purposes _are_ aware of the above, and moreover are aware that Anyone who considers arithmetical methods of producing random digits is, of course, in a state of sin. Alfred E. Neumann, 1951 Richard Nov 14 '05 #33

 P: n/a Keith Thompson wrote: If an implementation's rand() function never returns 0, the implementation is either non-conforming or of poor quality. Do you know of an implementation in which rand() never returns 0? I was under the impression that the PRNG discussed in [1] had been used often in the old days? [1] "Random number generators: good ones are hard to find" http://www-scf.usc.edu/~csci105/links/p1192-park.pdf u_0 = 1 u_{n+1} = 16807*u_n mod (2^31-1) Come to think of it, an implementor could very well return u_n - 1 and set RAND_MAX to 2^31-3... -- Regards, Grumble Nov 14 '05 #34

 P: n/a rl*@hoekstra-uitgeverij.nl (Richard Bos) writes: [snip] I dare say that people who use C for mathematical purposes _are_ aware of the above, and moreover are aware that Anyone who considers arithmetical methods of producing random digits is, of course, in a state of sin. Alfred E. Neumann, 1951 I dare say most of them are aware that the quotation is by John von Neumann, not the guy on the cover of Mad Magazine. (There is, of course, a connection between Mad Magazine and computer science; google Potrzebie Knuth for details.) -- Keith Thompson (The_Other_Keith) ks***@mib.org San Diego Supercomputer Center <*> We must do something. This is something. Therefore, we must do this. Nov 14 '05 #35

 P: n/a Keith Thompson wrote: rl*@hoekstra-uitgeverij.nl (Richard Bos) writes: [snip] I dare say that people who use C for mathematical purposes _are_ aware of the above, and moreover are aware that Anyone who considers arithmetical methods of producing random digits is, of course, in a state of sin. Alfred E. Neumann, 1951 I dare say most of them are aware that the quotation is by John von Neumann, not the guy on the cover of Mad Magazine. What, me worry? (There is, of course, a connection between Mad Magazine and computer science; google Potrzebie Knuth for details.) Or buy Van der Linden's Expert C Programming. Richard Nov 14 '05 #36

 P: n/a Grumble wrote: Keith Thompson wrote: If an implementation's rand() function never returns 0, the implementation is either non-conforming or of poor quality. Do you know of an implementation in which rand() never returns 0? I was under the impression that the PRNG discussed in [1] had been used often in the old days? [1] "Random number generators: good ones are hard to find" http://www-scf.usc.edu/~csci105/links/p1192-park.pdf u_0 = 1 u_{n+1} = 16807*u_n mod (2^31-1) Come to think of it, an implementor could very well return u_n - 1 and set RAND_MAX to 2^31-3... Has this implementation of rand() been used often? Nov 14 '05 #37

### This discussion thread is closed

Replies have been disabled for this discussion.