By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
448,562 Members | 1,237 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 448,562 IT Pros & Developers. It's quick & easy.

Random integer program: critique?

P: n/a

I needed a quick program called "random" that gives a random
positive integer from n1 to n2. For example, if I type
"random 38, 43", I want the program to print a random member
of the set {38, 39, 40, 41, 42, 43}.

Also, I read in my compiler's documentation the following:

To get a random number in the range 0..N, use rand()%(N+1).
Note that the low bits of the rand's return value are not
very random, so rand()%N for small values of N could be not
enough random. The alternative, but non-ANSI, function
"random()" is better if N is small. See "random()".

To combat the "least significant bits aren't very random" problem,
instead of using a non-ANSI function, I used rand(), but I discarded
the 4 least significant bits and shifted the remaining bits 4 bits to
the right.

So I wrote the following, which seems to work, but maybe there's ways
this could be improved:

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main(int argc, char *argv[])
{
int LoNum = 0;
int HiNum = 0;
int Span = 0;
int RnNum = 0;
int Shifted = 0;
int Output = 0;

/* Seed the random-number generator with the time: */
srand(time(0));

/* Get the low and high numbers: */
LoNum = atoi(argv[1]);
HiNum = atoi(argv[2]);

/* Get the span (add one to avoid fencepost error): */
Span = HiNum - LoNum + 1;

/* Get random number between 0 and 2147483647 inclusive: */
RnNum = rand();

/* Shift RnNum rightward 4 bits: */
Shifted = RnNum >4;

/* Set Output to (LoNum plus (Shifted modulo Span)): */
Output = LoNum + Shifted%Span;

printf("%d\n", Output);

return 0;
}
Critiques? Comments? Questions? "Ewww, yuck"s? I'm sure there's
ways this could be done better.

I'm especially worried if my technique of shifting the number 4 bits
to the right is going to introduce any non-random skewing.
--
Cheers,
Robbie Hatley
lonewolf aatt well dott com
www dott well dott com slant user slant lonewolf slant
Oct 1 '08 #1
Share this Question
Share on Google+
20 Replies


P: n/a
I've never seen it done that way, and I'd hesitate to use it. It feels
to me like you're simply diminishing an already small source or
pseudo-randomness.
I would agree. I doubt that this will be main() in implementation, so
I'm assuming that you'll prototype it like:

unsigned int my_random(unsigned int, unsigned int);

If that's the case, the link provided by Martien is a much better and
saner approach.

Cheers,
--Tim

Oct 1 '08 #2

P: n/a
"Robbie Hatley" <lo******@well.comwrites:
Martien Verbruggen wrote:
<snip>
>Output = LoNum + (int)(Span * (rand() / (RAND_MAX + 1.0)));

Hmmm. Yes, I think that would work. Avoids fencepost errors
and endpoint probability skews.
I am not sure what you mean about "endpoint probability skews" but if
you are referring to the problems caused by the range being close to
(or at least not very much less than) RAND_MAX it is worth pointing
out that is does not avoid it.
Involves floating-point
arithmetic, though, whereas my method is purely integral.
It is also worth pointing out that it does not guarantee to avoid
fencepost errors. If the floating point implementation is very poor
or RAND_MAX is very large, the result can be outside of the expected
range (by one). This is still how I would do it, but as 64 bit ints
become common you must watch out for (int)(Span * (rand() / (RAND_MAX
+ 1.0))) being equal to (rather than strictly less than) Span.

<snip>
It's the experimental bit-shifting thing I'm less sure of,
though. I just thought of that today. I figured, "if the
least significant bits are non-random, shave off a few and
use the others instead".
The trouble is that you don't really know that the other bits are any
better. It is true that the problem with some very simple PRNGs is in
the bottom bits but you can't be sure that some implementation has not
"fixed" this with an algorithm that just moves the problem up by n
bits.

There is really very little you can do if rand() is bad unless you
know exactly how it is bad. If the quality of the PRNG matters to
your code, the simplest solution is to use your own algorithm that
meets your needs, but I'd bet that almost all modern implementations
now include a reasonable rand() function.

<snip test code>
Pretty good. Never varies more than about 7% from 1000
items per hash bucket, and close to being bell-shaped.
(I added the "X" graph manually after copy-n-pasting
from program output.)
If all you want is good "coverage" then you don't need to worry about
the randomness of the bottom bits. Repeat the above, without the
shift, and with your own really bad PRNG and you will see you get
similarly "good" results. If the bottom bits worry you, you need a
test that detects the problem so that you can see that the problem
goes away when you do the shift.

--
Ben.
Oct 1 '08 #3

P: n/a
"Robbie Hatley" <lo******@well.comwrites:
/* Seed the random-number generator with the time: */
srand(time(0));
As a word of warning, on many implementations time(0) is only
accurate to the nearest second, so if you run the program more
than once a second (which you can even do by hand if you have a
shell with an appropriate form of command history), you will get
the same seed both times within that second.

Your implementation might have a more accurate clock available
through another interface, e.g. gettimeofday() under POSIX.
--
Ben Pfaff
http://benpfaff.org
Oct 1 '08 #4

P: n/a
"Robbie Hatley" <lo******@well.comwrites:
Martien Verbruggen wrote:
>I wouldn't initialise any of these [local vars],
given that they all are assigned values in the program,
further down. This is mainly a style issue.

Matter of habit with me. Not too important in tiny prog,
but in a huge app with many large functions, I like to
see every local var initialized to something (usually 0)
at the top of each function. Prevents a lot of problems
with using uninitialized variables.
[...]

Or it changes the nature of any problems with using uninitialized
variables. Suppose you initialize a variable to 0, which isn't a
logically valid value for that variable, and you forget to assign a
valid value to it before using it. Then you're less likely to get
really bizarre and inconsistent behavior; instead you'll get
consistently incorrect behavior. And the compiler can't warn you
about it, because as far as it knows 0 was exactly the value you
wanted.

There are arguments for and against the habit of initializing all
variables; you should understand both.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Oct 1 '08 #5

P: n/a
Keith Thompson said:

<snip>
Suppose you initialize a variable to 0, which isn't a
logically valid value for that variable, and you forget to assign a
valid value to it before using it. Then you're less likely to get
really bizarre and inconsistent behavior; instead you'll get
consistently incorrect behavior.
Right. And debugging consistently incorrect behaviour is much easier and
quicker than debugging really bizarre and inconsistent behaviour.
And the compiler can't warn you
about it, because as far as it knows 0 was exactly the value you
wanted.
Right. But some compilers don't warn about the use of uninitialised
objects. And those that do are easily fooled, e.g. by foo(&mystruct).
There are arguments for and against the habit of initializing all
variables; you should understand both.
Agreed. But at some point you have to decide which side of the fence you're
on. Sitting on it is very uncomfortable.

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
Oct 1 '08 #6

P: n/a
Ben Pfaff <bl*@cs.stanford.eduwrites:
"Robbie Hatley" <lo******@well.comwrites:
> /* Seed the random-number generator with the time: */
srand(time(0));

As a word of warning, on many implementations time(0) is only
accurate to the nearest second, so if you run the program more
than once a second (which you can even do by hand if you have a
shell with an appropriate form of command history), you will get
the same seed both times within that second.

Your implementation might have a more accurate clock available
through another interface, e.g. gettimeofday() under POSIX.
And if you care about unpredictability (say, if you're generating
pseudo-random passwords), srand(time(0)) is particularly dangerous.
If an attacker can guess when the program was run, and knows the
implementation you're using, he can narrow down the possibilities
considerably; if he knows within an hour, for example, he only has
3600 passwords to test. rand() is specifically required to generate
the same sequence for a given seed; it's not *allowed* to be truly
random. (This assumes time(0) has 1-second resolution, which is
common but not specified by the standard.)

But if all you care about is that the numbers *seem* random, say, for
a game where the outcome isn't terribly important, then srand(time(0))
is probably ok. There are permissible implementations of time() where
this won't work well at all (say, where time_t is a floating-point
number in the range 0.0 to 1.0), but such implementations are rare or
even nonexistent in the real world.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Oct 1 '08 #7

P: n/a
Richard Heathfield wrote:
Keith Thompson said:

<snip>
Suppose you initialize a variable to 0, which isn't a
logically valid value for that variable, and you forget to assign a
valid value to it before using it. Then you're less likely to get
really bizarre and inconsistent behavior; instead you'll get
consistently incorrect behavior.

Right. And debugging consistently incorrect behaviour is much easier and
quicker than debugging really bizarre and inconsistent behaviour.
Bizarre behavior makes the problem easier to notice, but consistency
incorrect behavior is certain very helpful during the debugging
process. However, consistently incorrect behavior is easy to notice
only if it's clearly incorrect. In my experience, it's not unusual for
the behavior when a variable is zero-initialized to be subtly off,
rather than clearly incorrect.

Example: I recently had to debug a problem in code that is currently
my responsibility, but was written a long time ago by someone else. It
has been running for about 8 years now, closing input files many
thousands of times each day. The bug was caused by the fact that if
something goes wrong while initializing the output file, the input
file handle will still be uninitialized at the time the fclose() is
called on it . This causes no problems with our Irix compiler, because
it zero-initializes even automatic variables that are not explicitly
set, and fclose() doesn't seem to mind a null pointer argument. For
unknown reasons, it has caused no problems with any of our Linux
compilers before, even though they don't zero-initialize automatic
variables; my best guess is that it's because fclose() is the very
last thing the program does, so the opportunities for anything to get
messed up are minimal. However, we recently installed mandriva on a
new system, and are testing it out. With the mandriva compiler the
code very consistently fails in a very noticeable fashion.

No - we do not have sufficient spare resources to do a code review of
existing and apparently working code that would be sufficiently
thorough to guarantee catching bugs like this before they cause
problems. We barely have enough resources to identify and patch bugs
like this after they cause problems.
Oct 1 '08 #8

P: n/a
"Ben Pfaff" wrote:
"Robbie Hatley" <lo******@well.comwrites:
/* Seed the random-number generator with the time: */
srand(time(0));

As a word of warning, on many implementations time(0) is only
accurate to the nearest second,
Yes, on every implimentation I've seen, time() returns time in
seconds (to the nearest second) since midnight, Jan 1, 1970.

I wonder what the std says about type time_t? Let me read.....
Interesting. It says "time_t" is "an arithmetic type capable
of representing time". Doesn't specify integral or floating-point.
In every compiler I've worked with so far, though, it's always
typedefed to an integral type, usually unsigned long.

Which isn't so good, beause a 32-bit unsigned long int will run
out of room around February of 2106, at which time, time() will
roll over to 12:00:00AM, Jan 1, 1970 on most computers. :-)
so if you run the program more than once a second (which you
can even do by hand if you have a shell with an appropriate
form of command history), you will get the same seed both
times within that second.
Yes, I did notice that if I ran my random number program several
times real fast, it gave "973" three times in a row, before srand()
got seeded with the next time.
Your implementation might have a more accurate clock available
through another interface, e.g. gettimeofday() under POSIX.
It doesn't have that one, but it has others, such as ftime
(time to nearest millisecond) and uclock (very high resolution
timer with 1193180 ticks per second; returns 64-bit signed integer;
gives 0 on first call, and wraps back to 0 after about 48hrs).

--
Cheers,
Robbie Hatley
lonewolf aatt well dott com
www dott well dott com slant user slant lonewolf slant
Oct 3 '08 #9

P: n/a
Ben Pfaff wrote:
As a word of warning, on many implementations time(0) is only
accurate to the nearest second, so if you run the program more
than once a second (which you can even do by hand if you have a
shell with an appropriate form of command history),
(fx:OT)

Or even without:

prog; prog; prog

--
'It changed the future .. and it changed us.' /Babylon 5/

Hewlett-Packard Limited registered no:
registered office: Cain Road, Bracknell, Berks RG12 1HN 690597 England

Oct 3 '08 #10

P: n/a
Robbie Hatley wrote:
"Ben Pfaff" wrote:
>"Robbie Hatley" <lo******@well.comwrites:
>> /* Seed the random-number generator with the time: */
srand(time(0));
As a word of warning, on many implementations time(0) is only
accurate to the nearest second,

Yes, on every implimentation I've seen, time() returns time in
seconds (to the nearest second) since midnight, Jan 1, 1970.

I wonder what the std says about type time_t? Let me read.....
Interesting. It says "time_t" is "an arithmetic type capable
of representing time". Doesn't specify integral or floating-point.
It is even permissible for it to be an _Imaginary type, which would
require that CLOCKS_PER_SEC has to be imaginary, too. This would have
interesting :-) implications when converting time(0) to int.
Oct 3 '08 #11

P: n/a
Robbie Hatley <lo******@well.comwrote:
>
Yes, on every implimentation I've seen, time() returns time in
seconds (to the nearest second) since midnight, Jan 1, 1970.
[...]
In every compiler I've worked with so far, though, it's always
typedefed to an integral type, usually unsigned long.
FYI, a fairly popular mainframe C implementation uses double for time_t
and originally used Jan 1, 1900 as the epoch (although it has since been
changed for POSIX compatibility).
--
Larry Jones

Wow, how existential can you get? -- Hobbes
Oct 3 '08 #12

P: n/a
"Robbie Hatley" <lo******@well.comwrites:
I needed a quick program called "random" that gives a random
positive integer from n1 to n2. For example, if I type
"random 38, 43", I want the program to print a random member
of the set {38, 39, 40, 41, 42, 43}.

Also, I read in my compiler's documentation the following:

To get a random number in the range 0..N, use rand()%(N+1).
Note that the low bits of the rand's return value are not
very random, so rand()%N for small values of N could be not
enough random. The alternative, but non-ANSI, function
"random()" is better if N is small. See "random()".

To combat the "least significant bits aren't very random" problem,
instead of using a non-ANSI function, I used rand(), but I discarded
the 4 least significant bits and shifted the remaining bits 4 bits to
the right.

So I wrote the following, which seems to work, but maybe there's ways
this could be improved:

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main(int argc, char *argv[])
{
int LoNum = 0;
int HiNum = 0;
int Span = 0;
int RnNum = 0;
int Shifted = 0;
int Output = 0;

/* Seed the random-number generator with the time: */
srand(time(0));

/* Get the low and high numbers: */
LoNum = atoi(argv[1]);
HiNum = atoi(argv[2]);

/* Get the span (add one to avoid fencepost error): */
Span = HiNum - LoNum + 1;

/* Get random number between 0 and 2147483647 inclusive: */
RnNum = rand();

/* Shift RnNum rightward 4 bits: */
Shifted = RnNum >4;

/* Set Output to (LoNum plus (Shifted modulo Span)): */
Output = LoNum + Shifted%Span;

printf("%d\n", Output);

return 0;
}
Critiques? Comments? Questions? "Ewww, yuck"s? I'm sure there's
ways this could be done better.

I'm especially worried if my technique of shifting the number 4 bits
to the right is going to introduce any non-random skewing.
First, realize that you're trying to two distinct (related, but still
distinct) problems, namely: (1) how to use a random number generator
to produce numbers in a particular range; and (2) how to compensate
for a low quality implementation (for example, of rand()).

Assuming you have a RNG of reasonable quality, the answer to (1) is
fairly straightforward as long as you remember to avoid bias of some
choices over others. Usually it's better to do this using integer
arithmetic rather than floating point arithmetic, because it's easier
to get unbiased results in integer. Some code has been posted to do
that; the key idea is to choose a range that can produce an unbiased
result, and generate RN's until you get one inside the range.

The answer to (2) is more complicated if you really are intent on
correcting problems in a low quality RNG. It's easier to find a more
high quality implementation and code it portably yourself. The RNG
that was done in GraphBase (google search for "gb_flip") can be coded
up in 50 to 100 lines.

For specific comments on the code above, there are various style
choices you've made that IMO detract from the code more than they add
to it, but I'll limit myself to one remark, on function composition.
This program would be better if there were an auxiliary function
along these lines:

int
random_up_to( int n ){
...
}

...

t = a + random_up_to( b-a ); /* a <= t && t <= b */
If this is done then the two main parts of the program will be
separated, simpler, and easier to understand. Furthermore, after
you've written 'random_up_to()', there's a good chance you'll
want to stick in your personal C library and re-use it later.

Oct 9 '08 #13

P: n/a
ja*********@verizon.net writes:
Richard Heathfield wrote:
Keith Thompson said:

<snip>
Suppose you initialize a variable to 0, which isn't a
logically valid value for that variable, and you forget to assign a
valid value to it before using it. Then you're less likely to get
really bizarre and inconsistent behavior; instead you'll get
consistently incorrect behavior.
Right. And debugging consistently incorrect behaviour is much easier and
quicker than debugging really bizarre and inconsistent behaviour.

Bizarre behavior makes the problem easier to notice, but consistency
incorrect behavior is certain very helpful during the debugging
process. However, consistently incorrect behavior is easy to notice
only if it's clearly incorrect. In my experience, it's not unusual for
the behavior when a variable is zero-initialized to be subtly off,
rather than clearly incorrect.

[... debugging experience example ...]
Those who practice "initialize every variable just for safety"
may want to consider doing that using a preprocessor macro,
eg,

#define SOME_VALUE =(0)

...

int x SOME_VALUE;

both for documentation purposes, and so that the choice of
default value can be changed (including doing no initialization
at all for such variables).

Obviously, to allow a wider range of choices for initial values,
it might be good to define several different macros, to be used
for different types (probably at least three: pointer, signed,
unsigned).
Oct 9 '08 #14

P: n/a
On 9 Oct, 10:09, Tim Rentsch <t...@alumnus.caltech.eduwrote:
Those who practice "initialize every variable just for safety"
may want to consider doing that using a preprocessor macro,
eg,

* *#define SOME_VALUE =(0)

* *...

* *int *x *SOME_VALUE;

both for documentation purposes, and so that the choice of
default value can be changed (including doing no initialization
at all for such variables).
I'm not a fan of obscure macros. Particularly ones that
make the code look syntactically wrong. It upsets my
inner parser.
Obviously, to allow a wider range of choices for initial values,
it might be good to define several different macros, to be used
for different types (probably at least three: *pointer, signed,
unsigned)
I don't think the usefulness outweighs the obscurity

--
Nick Keighley
Oct 9 '08 #15

P: n/a
Tim Rentsch wrote:
>
Those who practice "initialize every variable just for safety"
may want to consider doing that using a preprocessor macro,
eg,

#define SOME_VALUE =(0)

...

int x SOME_VALUE;
what an abomination.

--
Ian Collins.
Oct 9 '08 #16

P: n/a
Nick Keighley <ni******************@hotmail.comwrites:
On 9 Oct, 10:09, Tim Rentsch <t...@alumnus.caltech.eduwrote:
Those who practice "initialize every variable just for safety"
may want to consider doing that using a preprocessor macro,
eg,

#define SOME_VALUE =(0)

...

int x SOME_VALUE;

both for documentation purposes, and so that the choice of
default value can be changed (including doing no initialization
at all for such variables).

I'm not a fan of obscure macros. Particularly ones that
make the code look syntactically wrong. It upsets my
inner parser.
I'm not a fan of them either. In some cases (certainly not all),
a little ugliness is worth paying to get better test coverage.

It doesn't take a lot of imagination to come up with something
that looks a little nicer:

#define DEFAULT(d) (d) = {0}

int DEFAULT( x );
int DEFAULT( *p );
int DEFAULT( z[10] );

Obviously, to allow a wider range of choices for initial values,
it might be good to define several different macros, to be used
for different types (probably at least three: pointer, signed,
unsigned)

I don't think the usefulness outweighs the obscurity
I'm sure for some projects it does and for other projects it
doesn't. The earlier posting did say "may want to consider",
and that phrasing wasn't accidental.
Oct 10 '08 #17

P: n/a
Ian Collins <ia******@hotmail.comwrites:
Tim Rentsch wrote:

Those who practice "initialize every variable just for safety"
may want to consider doing that using a preprocessor macro,
eg,

#define SOME_VALUE =(0)

...

int x SOME_VALUE;
what an abomination.
Certainly this idea would be better cast as a compiler option or
a #pragma. For those who don't have such tools available, the
choices are either go without or adopt some other method for
providing such a capability. Each choice has its advantages and
disadvantages; the point was to offer another choice, not to say
that this choice is the best in all circumstances. Even if you
don't like this idea, some projects can decide that it offers
them value.
Oct 10 '08 #18

P: n/a
Tim Rentsch wrote:
Ian Collins <ia******@hotmail.comwrites:
>Tim Rentsch wrote:
>>Those who practice "initialize every variable just for safety"
may want to consider doing that using a preprocessor macro,
eg,

#define SOME_VALUE =(0)

...

int x SOME_VALUE;
what an abomination.

Certainly this idea would be better cast as a compiler option or
a #pragma. [...]
"Certainly?" Certainly *not*, unless I've completely misread
your intent. Using the SOME_VALUE macro looks silly to me and I
wouldn't recommend it, but at least it's still C code. Introducing
a compiler option or #pragma to change the rules of the language
("If not initialized explicitly, auto variables are zeroed") means
the code is no longer C, but C-slang. The difference becomes
important when you want to move that code to another system, one
that has a C compiler but no C-slang compiler ...

--
Er*********@sun.com
Oct 10 '08 #19

P: n/a
In article <kf*************@alumnus.caltech.edu>,
Tim Rentsch <tx*@alumnus.caltech.eduwrote:
#define SOME_VALUE =(0)

...

int x SOME_VALUE;
If you're going to do this, I think

#define SOME_VALUE 0
...
int x = SOME_VALUE;

is better. It's clearer to the reader, and to programs (such as
editors) that do half-hearted C parsing.

-- Richard
--
Please remember to mention me / in tapes you leave behind.
Oct 10 '08 #20

P: n/a
ri*****@cogsci.ed.ac.uk (Richard Tobin) writes:
In article <kf*************@alumnus.caltech.edu>,
Tim Rentsch <tx*@alumnus.caltech.eduwrote:
#define SOME_VALUE =(0)

...

int x SOME_VALUE;

If you're going to do this, I think

#define SOME_VALUE 0
...
int x = SOME_VALUE;

is better. It's clearer to the reader, and to programs (such as
editors) that do half-hearted C parsing.
Yes, however it doesn't allow one of the desired behaviors.
The point of the original definition is to permit something
along the lines of

#if INITIALIZING
# define SOME_VALUE =(0)
#else
# define SOME_VALUE
#endif

With the benefit of hindsight the original example probably should
have been written more like this:

#if DETERMINISTIC_TESTING
# define INDETERMINATE(x) (x) = {0}
#else
# define INDETERMINATE(x) (x)
#endif

int INDETERMINATE(i);

Oct 11 '08 #21

This discussion thread is closed

Replies have been disabled for this discussion.