By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
435,360 Members | 2,961 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 435,360 IT Pros & Developers. It's quick & easy.

Randomly selecting an element from an STL collection

P: n/a
I had a need to randomly select an element from an STL collection. It
does not appear that this functionality is provided out-of-the-box
with STL. Here is my crude implementation. I am using advance and
distance in my approach. Is distance guaranteed to return an integral
value? If not, can anyone suggest improvements:

--Song

///// Header File //////

#ifndef _RANDOM_ENTRY_H_
#define _RANDOM_ENTRY_H_

#include <stdlib.h>
#include <sys/time.h>
#include <algorithm>

template<typename iterator>
iterator
random_entry(iterator first, iterator last)
{
int separation = distance(first, last);

if(separation)
{
struct timeval tod; // time-of-day

gettimeofday(&tod, NULL);
srand(tod.tv_sec+tod.tv_usec);

int randOffset = rand()% separation;

advance(first, randOffset);
}

iterator retVal = first;

return retVal;
}

#endif //_RANDOM_ENTRY_H_
///// Driver File /////
#include "random_entry.h"

#include <list>
#include <vector>
#include <map>
#include <string>
#include <iostream>

using namespace std;
main()
{
list<intli;

li.push_back(10);
li.push_back(19);
li.push_back(28);
li.push_back(37);

list<int>::iterator liItor = random_entry(li.begin(), li.end());
if(liItor != li.end())
cout << *liItor << endl;
else
cout << "Empty collection\n";

vector<stringvs;
vs.push_back("Brad Pitt");
vs.push_back("Angelina Jolie");
vs.push_back("Tom Cruise");
vs.push_back("Renee Zellweger");
vs.push_back("Julia Roberts");
vs.push_back("Denzel Washington");
vs.push_back("Will Smith");

vector<string>::iterator vsItor = random_entry(vs.begin(),
vs.end());
if(vsItor != vs.end())
cout << *vsItor << endl;
else
cout << "Empty collection\n";

map<string, stringms;

ms["President"] = "George W Bush";
ms["Vice President"] = "Dick Cheney";
ms["Senate Majority Leader"] = "Harry Reid";
ms["Senate Minority Leader"] = "Mitch Mcconnell ";
ms["Speaker"] = "Nancy Pelosi";

map<string, string>::iterator msItor = random_entry(ms.begin(),
ms.end());
if(msItor != ms.end())
cout << msItor->first << ", " << msItor->second << endl;
else
cout << "Empty collection\n";
}

Jun 27 '07 #1
Share this Question
Share on Google+
9 Replies


P: n/a
Generic Usenet Account wrote:
I had a need to randomly select an element from an STL collection. It
does not appear that this functionality is provided out-of-the-box
with STL. Here is my crude implementation. I am using advance and
distance in my approach. Is distance guaranteed to return an integral
value?
Not sure if it's guaranteed or not, but it's usually 'ptrdiff_t'.
If not, can anyone suggest improvements:
I could only say, make use of iterator_traits<iterator>::difference_type.
>
--Song

///// Header File //////

#ifndef _RANDOM_ENTRY_H_
Do NOT use identifiers that begin with an underscore and a capital letter,
those are *reserved* by the implementation.
#define _RANDOM_ENTRY_H_

#include <stdlib.h>
#include <sys/time.h>
#include <algorithm>

template<typename iterator>
iterator
random_entry(iterator first, iterator last)
{
int separation = distance(first, last);

if(separation)
{
struct timeval tod; // time-of-day

gettimeofday(&tod, NULL);
srand(tod.tv_sec+tod.tv_usec);
It is a BAD IDEA(tm) to seed the random number generator on every
iteration. Seed it once in your program, then let it run freely.
>
int randOffset = rand()% separation;

advance(first, randOffset);
}

iterator retVal = first;

return retVal;
}

#endif //_RANDOM_ENTRY_H_
///// Driver File /////
#include "random_entry.h"

#include <list>
#include <vector>
#include <map>
#include <string>
#include <iostream>

using namespace std;
main()
{
list<intli;

li.push_back(10);
li.push_back(19);
li.push_back(28);
li.push_back(37);

list<int>::iterator liItor = random_entry(li.begin(), li.end());
if(liItor != li.end())
cout << *liItor << endl;
else
cout << "Empty collection\n";

vector<stringvs;
vs.push_back("Brad Pitt");
vs.push_back("Angelina Jolie");
vs.push_back("Tom Cruise");
vs.push_back("Renee Zellweger");
vs.push_back("Julia Roberts");
vs.push_back("Denzel Washington");
vs.push_back("Will Smith");

vector<string>::iterator vsItor = random_entry(vs.begin(),
vs.end());
if(vsItor != vs.end())
cout << *vsItor << endl;
else
cout << "Empty collection\n";

map<string, stringms;

ms["President"] = "George W Bush";
ms["Vice President"] = "Dick Cheney";
ms["Senate Majority Leader"] = "Harry Reid";
ms["Senate Minority Leader"] = "Mitch Mcconnell ";
ms["Speaker"] = "Nancy Pelosi";

map<string, string>::iterator msItor = random_entry(ms.begin(),
ms.end());
if(msItor != ms.end())
cout << msItor->first << ", " << msItor->second << endl;
else
cout << "Empty collection\n";
}
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask
Jun 27 '07 #2

P: n/a
On Jun 26, 5:45 pm, "Victor Bazarov" <v.Abaza...@comAcast.netwrote:
Generic Usenet Account wrote:
I had a need to randomly select an element from an STL collection. It
does not appear that this functionality is provided out-of-the-box
with STL. Here is my crude implementation. I am using advance and
distance in my approach. Is distance guaranteed to return an integral
value?

Not sure if it's guaranteed or not, but it's usually 'ptrdiff_t'.
If not, can anyone suggest improvements:

I could only say, make use of iterator_traits<iterator>::difference_type.
Just to strengthen Victor's suggestion ..

24.4.1 makes the guarantee:
"For every iterator type X for which equality is defined, there is a
corresponding signed integral type called the difference type of the
iterator."

--
Alan Johnson

Jun 27 '07 #3

P: n/a
On Jun 27, 2:45 am, "Victor Bazarov" <v.Abaza...@comAcast.netwrote:
Generic Usenet Account wrote:
I had a need to randomly select an element from an STL collection. It
does not appear that this functionality is provided out-of-the-box
with STL. Here is my crude implementation. I am using advance and
distance in my approach. Is distance guaranteed to return an integral
value?
Not sure if it's guaranteed or not, but it's usually 'ptrdiff_t'.
It's guaranteed for the standard allocator.
If not, can anyone suggest improvements:
I could only say, make use of
iterator_traits<iterator>::difference_type.
I wonder if it's really necessary to bother. He's using rand()
to choose an element, and RAND_MAX can be (and far too often is)
as little as 32767. So if he has more elements that that, his
code isn't going to work. Also, he used the expression
rand() % distance to choose the element. This introduces a bias
toward the smaller elements; the closer distance is to RAND_MAX,
the larger the biais. Unless the number of elements is actually
very small, the biais will be noticeable. So we can conclude
that the actual number of elements will probably not be much
more than 10 or 20. And that a simple int will work just fine.

--
--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Jun 27 '07 #4

P: n/a
On Jun 27, 5:55 am, James Kanze <james.ka...@gmail.comwrote:
I wonder if it's really necessary to bother. He's using rand()
to choose an element, and RAND_MAX can be (and far too often is)
as little as 32767. So if he has more elements that that, his
code isn't going to work. Also, he used the expression
rand() % distance to choose the element. This introduces a bias
toward the smaller elements; the closer distance is to RAND_MAX,
the larger the biais. Unless the number of elements is actually
very small, the biais will be noticeable. So we can conclude
that the actual number of elements will probably not be much
more than 10 or 20. And that a simple int will work just fine.
First of all my thanks to all of you for your excellent feedback ----
Alan, James, Victor and others. I would really appreciate if you
could tell me how I can avoid the "bias towards smaller elements". I
have always used the "modulo division by random number" trick to
restrict my pseudo-random numbers to a certain range. I am keen to
avoid this mistake at some future date. A sample code snippet would
be great. Although I do not expect anywhere close to 32767 entries in
my collection, it is certainly going to be much more than 10 or 20.

Thanks,
Song

Jun 28 '07 #5

P: n/a
Generic Usenet Account wrote:
[..] I would really appreciate if you
could tell me how I can avoid the "bias towards smaller elements". I
have always used the "modulo division by random number" trick to
restrict my pseudo-random numbers to a certain range. I am keen to
avoid this mistake at some future date. A sample code snippet would
be great. Although I do not expect anywhere close to 32767 entries in
my collection, it is certainly going to be much more than 10 or 20.
When I need a random number in a certain range, I use

int rand_n(int lower, int upper)
{
return int(rand() * (upper - lower + 1.0) / (RAND_MAX + 1.0))
+ lower;
}

That essentially subdivides the 0..RAND_MAX range into (upper-lower+1)
subranges and gives you the index of the subrange in which the random
number you get from 'rand' fits, offset by 'lower'. I am sure that
this doesn't work very well for (upper-lower) larger than RAND_MAX/2.

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask
Jun 28 '07 #6

P: n/a
In comp.lang.c++ Generic Usenet Account <us****@sta.samsung.comwrote:
I would really appreciate if you
could tell me how I can avoid the "bias towards smaller elements". I
have always used the "modulo division by random number" trick to
restrict my pseudo-random numbers to a certain range.
The C FAQ has a couple suggestions; see FAQ #13.16:

http://www.c-faq.com/lib/randrange.html

--
Marcus Kwok
Replace 'invalid' with 'net' to reply
Jun 28 '07 #7

P: n/a
On Jun 28, 9:23 pm, Generic Usenet Account <use...@sta.samsung.com>
wrote:
On Jun 27, 5:55 am, James Kanze <james.ka...@gmail.comwrote:
I wonder if it's really necessary to bother. He's using rand()
to choose an element, and RAND_MAX can be (and far too often is)
as little as 32767. So if he has more elements that that, his
code isn't going to work. Also, he used the expression
rand() % distance to choose the element. This introduces a bias
toward the smaller elements; the closer distance is to RAND_MAX,
the larger the biais. Unless the number of elements is actually
very small, the biais will be noticeable. So we can conclude
that the actual number of elements will probably not be much
more than 10 or 20. And that a simple int will work just fine.
I would really appreciate if you
could tell me how I can avoid the "bias towards smaller elements". I
have always used the "modulo division by random number" trick to
restrict my pseudo-random numbers to a certain range.
That's usually sufficient for fairly small values. The bias
towards the smaller elements problems is easily understood,
however, if you consider a perfect random generator with a very
small RAND_MAX. Suppose RAND_MAX is 9, i.e. your generator
generates all of the numbers from 0 to 9, with equal
probability. Now consider what happens if you do a modulo 6.
What are the results for each number generated:

0 % 6 = 0
1 % 6 = 1
2 % 6 = 2
3 % 6 = 3
4 % 6 = 4
5 % 6 = 5
6 % 6 = 0
7 % 6 = 1
8 % 6 = 2
9 % 6 = 3

It's obvious that the values 0-3 each appear 20% of the time,
whereas 4 and 5 only appear 10%.

The problem doesn't occur if RAND_MAX + 1 is a multiple of your
value, so the usual solution is just to throw out any values
above the last multiple. For a random value in the range
0...N-1, I use something like:

unsigned const limit = RAND_MAX+1 - (RAND_MAX+1) % N ;
unsigned result = next() ;
while ( result >= limit ) {
result = next() ;
}
return result % N ;
I am keen to
avoid this mistake at some future date. A sample code snippet would
be great. Although I do not expect anywhere close to 32767 entries in
my collection, it is certainly going to be much more than 10 or 20.
Be careful with rand(). In the past, at least, a lot of
implementations have been very bad... some systematically
alternate even and odd, for example. For anything more than
just playing around, get a good generator with known behavior.
(Boost has quite a few.)

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Jun 29 '07 #8

P: n/a
On Fri, 29 Jun 2007 09:13:21 +0000, James Kanze wrote:
The problem doesn't occur if RAND_MAX + 1 is a multiple of your value,
so the usual solution is just to throw out any values above the last
multiple. For a random value in the range 0...N-1, I use something
like:

unsigned const limit = RAND_MAX+1 - (RAND_MAX+1) % N ;
That is almost guaranteed to fail on many implementations namely those
that have RAND_MAX equal to INT_MAX (e.g. gcc on x86(-64)).

At least make it RAND_MAX+1u. That should work on most implementations.
unsigned result = next() ; while ( result >= limit ) {
result = next() ;
}
return result % N ;
--
Markus Schoder
Jun 30 '07 #9

P: n/a
On Jun 30, 2:42 pm, Markus Schoder <a3vr6dsg-use...@yahoo.dewrote:
On Fri, 29 Jun 2007 09:13:21 +0000, James Kanze wrote:
The problem doesn't occur if RAND_MAX + 1 is a multiple of your value,
so the usual solution is just to throw out any values above the last
multiple. For a random value in the range 0...N-1, I use something
like:
unsigned const limit = RAND_MAX+1 - (RAND_MAX+1) % N ;
That is almost guaranteed to fail on many implementations namely those
that have RAND_MAX equal to INT_MAX (e.g. gcc on x86(-64)).
Good point. I actually copied it from my own implementation of
RandomInt, which is based on my own random number generator,
with known properties, modifying it to use RAND_MAX+1, rather
than Random::max(). (In my case, Random::max() defines the
upper bound of a half open interval, i.e. is exclusive. And my
own generator uses linear congruence, so it isn't be a power of
2.)
At least make it RAND_MAX+1u. That should work on most implementations.
I think I'd cast RAND_MAX to unsigned; that u can easily be
overlooked:-).

--
James Kanze (Gabi Software) email: ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Jul 1 '07 #10

This discussion thread is closed

Replies have been disabled for this discussion.