By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
435,561 Members | 3,265 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 435,561 IT Pros & Developers. It's quick & easy.

Random Subset of Range of Integers

P: n/a
I searched around on the net for a bit, couldn't find anything though.

I would like to find some code for a function where I input A Range Of
Integers
For example: Function( 1, 100 );

And the function will return me an array holding a random subset of integers
in that range of a size that I specify

So the Function would Probabaly look something like this:

std::vector<int> RandomSubsetOfIntergers( int BeginRanger, int EndRanger,
int SubsetSize );
Anyone know where I can find something along these lines?
Also, if you know of any general good random number libratries that are fast
and have an even distribution, let me know as well!

Thanks!
-Chris
Jul 19 '05 #1
Share this Question
Share on Google+
29 Replies


P: n/a
"Chris Dutrow" <ch******@vt.edu> wrote in message
news:vr************@corp.supernews.com...
I searched around on the net for a bit, couldn't find anything though.

I would like to find some code for a function where I input A Range Of
Integers
For example: Function( 1, 100 );

And the function will return me an array holding a random subset of integers in that range of a size that I specify

So the Function would Probabaly look something like this:

std::vector<int> RandomSubsetOfIntergers( int BeginRanger, int EndRanger,
int SubsetSize );


Have you tried to implement this on your own? Just curious, because it is a
trivial problem to solve.

tom

---
Outgoing mail is certified Virus Free.
Checked by AVG anti-virus system (http://www.grisoft.com).
Version: 6.0.537 / Virus Database: 332 - Release Date: 11/6/2003
Jul 19 '05 #2

P: n/a
I thought about it for a while.

The best I could come up with was an O(NlogN) algorithm that would be a pain
for me to code.

So I figured I might as well not re-invent the wheel and try and find some
code written by experts who are likely much smarter than me and worked hard
on an optimal and bugless solution.

-Chris

"Thomas J. Clancy" <tj******@comcast.net> wrote in message
news:r9********************@comcast.com...
"Chris Dutrow" <ch******@vt.edu> wrote in message
news:vr************@corp.supernews.com...
I searched around on the net for a bit, couldn't find anything though.

I would like to find some code for a function where I input A Range Of
Integers
For example: Function( 1, 100 );

And the function will return me an array holding a random subset of integers
in that range of a size that I specify

So the Function would Probabaly look something like this:

std::vector<int> RandomSubsetOfIntergers( int BeginRanger, int EndRanger, int SubsetSize );


Have you tried to implement this on your own? Just curious, because it is

a trivial problem to solve.

tom

---
Outgoing mail is certified Virus Free.
Checked by AVG anti-virus system (http://www.grisoft.com).
Version: 6.0.537 / Virus Database: 332 - Release Date: 11/6/2003

Jul 19 '05 #3

P: n/a
Don't top post in this NG

{reformatted}
"Chris Dutrow" <ch******@vt.edu> wrote in message
news:vr************@corp.supernews.com...
"Thomas J. Clancy" <tj******@comcast.net> wrote in message
news:r9********************@comcast.com...
"Chris Dutrow" <ch******@vt.edu> wrote in message
news:vr************@corp.supernews.com...
I searched around on the net for a bit, couldn't find anything though.

I would like to find some code for a function where I input A Range Of
Integers
For example: Function( 1, 100 );

And the function will return me an array holding a random subset of integers
in that range of a size that I specify

So the Function would Probabaly look something like this:

std::vector<int> RandomSubsetOfIntergers( int BeginRanger, int EndRanger, int SubsetSize );


Have you tried to implement this on your own? Just curious, because it

is a
trivial problem to solve.

tom

---
Outgoing mail is certified Virus Free.
Checked by AVG anti-virus system (http://www.grisoft.com).
Version: 6.0.537 / Virus Database: 332 - Release Date: 11/6/2003

I thought about it for a while.

The best I could come up with was an O(NlogN) algorithm that would be a

pain for me to code.

So I figured I might as well not re-invent the wheel and try and find some
code written by experts who are likely much smarter than me and worked hard on an optimal and bugless solution.

-Chris


O(NlogN)???
Look up in your documentation the function rnd -> it returns a random
integer from 0 to RAND_MAX

Your function becomes trivial now, isnt it?
I just cannot stop wonder, O(NlogN)???
/dan

Jul 19 '05 #4

P: n/a
Chris Dutrow wrote:
I would like to find some code for a function where I input A
Range Of Integers And the function will return me an array
holding a random subset of integers in that range of a size that
I specify


May the numbers repeat?

On a related note, there is an O(n) algorithm to generate a
permutation of 0 .. N-1

The pseudo code is the following:

for i=0 to N-1 do V[i] = i; done // initialize V

for i=0 to N-2 do
j = rand() % (N-i) + i // i <= j < N
swap V[i] and V[j]
done

Jul 19 '05 #5

P: n/a

"Dan Cernat" <ce****@dan.com> wrote in message
news:vr************@corp.supernews.com...
Don't top post in this NG

{reformatted}
"Chris Dutrow" <ch******@vt.edu> wrote in message
news:vr************@corp.supernews.com...
"Thomas J. Clancy" <tj******@comcast.net> wrote in message
news:r9********************@comcast.com...
"Chris Dutrow" <ch******@vt.edu> wrote in message
news:vr************@corp.supernews.com...
> I searched around on the net for a bit, couldn't find anything though. >
> I would like to find some code for a function where I input A Range Of > Integers
> For example: Function( 1, 100 );
>
> And the function will return me an array holding a random subset of
integers
> in that range of a size that I specify
>
> So the Function would Probabaly look something like this:
>
> std::vector<int> RandomSubsetOfIntergers( int BeginRanger, int EndRanger,
> int SubsetSize );
>

Have you tried to implement this on your own? Just curious, because
it is
a
trivial problem to solve.

tom

---
Outgoing mail is certified Virus Free.
Checked by AVG anti-virus system (http://www.grisoft.com).
Version: 6.0.537 / Virus Database: 332 - Release Date: 11/6/2003

I thought about it for a while.

The best I could come up with was an O(NlogN) algorithm that would be a

pain
for me to code.

So I figured I might as well not re-invent the wheel and try and find

some code written by experts who are likely much smarter than me and worked

hard
on an optimal and bugless solution.

-Chris


O(NlogN)???
Look up in your documentation the function rnd -> it returns a random
integer from 0 to RAND_MAX

Your function becomes trivial now, isnt it?
I just cannot stop wonder, O(NlogN)???
/dan


Hmmmm, I must not have explianed well.

I need random subSET of a RANGE of integers

So The function does not get a data structure holding integers from which to
choose. It just gets two integers: a BeginRange and EndRange

Then it needs to give me a data structure, most likely a std::vector that
holds a subSET of integers from that ranger that have been randomly picked.

For the returned integers to be a SET, they must all be unique.

So the algorithm must pick out an integer from the Range and add it to the
vector, then pick out another number from the Range that has not already
been picked.

The first algorithm that came to mind was to just discard the number if it
has already been picked and try to pick again, but this doesn't work very
well when the subset size approaches the size of the range.

The next approach that I thought of would be to increment up the list when
an integer has been picked that has already been picked, so if integer 10
were to be generated by the random number generator, then try 11, if that
has already been picked, try 12 and so on. The problem with this approach
is that it makes numbers close to numbers that have already been picked MORE
likely to be picked. So this algorithm is FAULTY.

So the final algorithm that I thought of was a but more complicated, and not
so easy to code, I will post a description of it by request, but it led me
to desire to find a library that has a function that already does this.

Thanks again for your help guys,
-Chris
Jul 19 '05 #6

P: n/a

"Nudge" <de*****@kma.eu.org> wrote in message
news:3f*********************@news.free.fr...
Chris Dutrow wrote:
I would like to find some code for a function where I input A
Range Of Integers And the function will return me an array
holding a random subset of integers in that range of a size that
I specify


May the numbers repeat?

On a related note, there is an O(n) algorithm to generate a
permutation of 0 .. N-1

The pseudo code is the following:

for i=0 to N-1 do V[i] = i; done // initialize V

for i=0 to N-2 do
j = rand() % (N-i) + i // i <= j < N
swap V[i] and V[j]
done


Cool, I think you are getting what I am saying to some degree, I think
others here assumed I was stupid.

So, since a sebset is returned, the numbers CANNOT repeat and theren lies
the problem.

I also thought of the O(N) algorithms that return a permutation of a an
array or vector, but the PROBLEM with these is that the function does not
receive a vector or array full of integers, it only receives two integers
that specify a range, giving the function a vector full of integers in the
range would require much more memory than needed.

Thanks a lot for considering this!
-Chris
Jul 19 '05 #7

P: n/a
"Chris Dutrow" <ch******@vt.edu> wrote in message
news:vr************@corp.supernews.com...
Hmmmm, I must not have explianed well.

I need random subSET of a RANGE of integers

So The function does not get a data structure holding integers from which to choose. It just gets two integers: a BeginRange and EndRange

Then it needs to give me a data structure, most likely a std::vector that
holds a subSET of integers from that ranger that have been randomly picked.
For the returned integers to be a SET, they must all be unique.


It's just like a card dealer then. Instead of a subset of cards (a player's
hand) from a deck of cards, it is a subset of integers from a range. The
most efficient way I know to do it is to have a deck (an array) containing
the entire set of cards (or numbers) - in sequential order if you like, or
any other order - and to select each card from the array as follows:

int CardDeck::DealCard()
{
// get index of random card
int icard = rand() % nr_cards_remaining;
// get card
int card = deck[icard];
// move last available card into remainder of deck
deck[icard] = deck[nr_cards_remaining-1];
// move dealt card to bottom of remaining cards
deck[nr_cards_remaining-1] = card;
// reduce number of available cards by one
--nr_cards_remaining;
return card;
}

Just call this once for each card in the hand (or value in the subset). Note
that the line
deck[nr_cards_remaining-1] = card;
is only needed if you want to reuse the same deck in future without having
to refill it.

DW

Jul 19 '05 #8

P: n/a

"David White" <no@email.provided> wrote in message
news:9l*****************@nasal.pacific.net.au...
"Chris Dutrow" <ch******@vt.edu> wrote in message
news:vr************@corp.supernews.com...
Hmmmm, I must not have explianed well.

I need random subSET of a RANGE of integers

So The function does not get a data structure holding integers from which
to
choose. It just gets two integers: a BeginRange and EndRange

Then it needs to give me a data structure, most likely a std::vector
that holds a subSET of integers from that ranger that have been randomly

picked.

For the returned integers to be a SET, they must all be unique.


It's just like a card dealer then. Instead of a subset of cards (a

player's hand) from a deck of cards, it is a subset of integers from a range. The
most efficient way I know to do it is to have a deck (an array) containing
the entire set of cards (or numbers) - in sequential order if you like, or
any other order - and to select each card from the array as follows:

int CardDeck::DealCard()
{
// get index of random card
int icard = rand() % nr_cards_remaining;
// get card
int card = deck[icard];
// move last available card into remainder of deck
deck[icard] = deck[nr_cards_remaining-1];
// move dealt card to bottom of remaining cards
deck[nr_cards_remaining-1] = card;
// reduce number of available cards by one
--nr_cards_remaining;
return card;
}

Just call this once for each card in the hand (or value in the subset). Note that the line
deck[nr_cards_remaining-1] = card;
is only needed if you want to reuse the same deck in future without having
to refill it.

DW


This still doesn't solve the problem though, heres why:

// get card
int card = deck[icard];

This assumes a data structure has been given that holds the set from which
the subset is derived from.

In this case, I do not have a data structure holding all the integers from
which to choose. I only have a range: BeginInt and EndInt and a random
subset of unique integers must be chosen from that Range.

Thanks anyway though.
-Chris


Jul 19 '05 #9

P: n/a
"Chris Dutrow" <ch******@vt.edu> wrote in message
news:vr************@corp.supernews.com...
This still doesn't solve the problem though, heres why:

// get card
int card = deck[icard];

This assumes a data structure has been given that holds the set from which
the subset is derived from.


It doesn't assume that anything's been given. It's just an algorithm for
generating a unique subset. You could have a class called RandomSubset that
implements the algorithm I suggested, e.g.,

class RandomIntegerSelector
{
std::vector<int> m_allValues;
int m_nrValuesRemaining;
public:
// constructor fills m_allValues with the full set of values
RandomIntegerSelector(int begin, int end);
int NextRandomInteger();
};

The NextRandomInteger() member does the same as my CardDeck::DealCard()
member.
Your function then uses an object of this class to generate the subset:

std::vector<int> RandomSubsetOfIntegers( int beginRange, int endRange, int
subsetSize )
{
IntegerRandomSubset integerSelector(beginRange, endRange);
std::vector<int> subset;
int nrRemaining = subsetSize;
while(nrRemaining-- > 0)
subset.push_back(integerSelector.NextRandomInteger ());
return subset;
}

DW

Jul 19 '05 #10

P: n/a
David White" <no@email.provided> wrote in message
news:Fj*****************@nasal.pacific.net.au...
You could have a class called RandomSubset
Or even what I actually called it :-).
class RandomIntegerSelector
{


DW

Jul 19 '05 #11

P: n/a
"David White" <no@email.provided> wrote in message
news:Fj*****************@nasal.pacific.net.au...

A second correction...
class RandomIntegerSelector
{
[snip]
}; std::vector<int> RandomSubsetOfIntegers( int beginRange, int endRange, int
subsetSize )
{
IntegerRandomSubset integerSelector(beginRange, endRange);


That is: RandomIntegerSelector integerSelector(beginRange, endRange);

This is what happens when you change your mind at the last moment.

DW

Jul 19 '05 #12

P: n/a
"Chris Dutrow" <ch******@vt.edu> wrote in message
news:vr************@corp.supernews.com...
Also, if you know of any general good random number libratries that are fast and have an even distribution, let me know as well!


Try looking up Mersenne Twister.

DW

Jul 19 '05 #13

P: n/a

"David White" <no@email.provided> wrote in message
news:Fj*****************@nasal.pacific.net.au...
"Chris Dutrow" <ch******@vt.edu> wrote in message
news:vr************@corp.supernews.com...
This still doesn't solve the problem though, heres why:

// get card
int card = deck[icard];

This assumes a data structure has been given that holds the set from which the subset is derived from.
It doesn't assume that anything's been given. It's just an algorithm for
generating a unique subset. You could have a class called RandomSubset

that implements the algorithm I suggested, e.g.,
==============================>

Yeah, you're right, it doesn't assume that.

What I meant though is that the function uses a data structure that holds
the complete set of intergers.

Which is fine for this input range of
1 to 100

But very bad for an input range of
1 to 10000000000

So this implementation is not very versitile because it uses memory that an
optimal algorithm would not need to use.

class RandomIntegerSelector
{
std::vector<int> m_allValues;
==================>
See, this is what I have a problem with.
int m_nrValuesRemaining;
public:
// constructor fills m_allValues with the full set of values
==========================>
And this. What if a subset of length 10 is needed from an integer range of
0 to 10000000000000?
Then this implementation is wildly inappropriate.
RandomIntegerSelector(int begin, int end);
int NextRandomInteger();
};

The NextRandomInteger() member does the same as my CardDeck::DealCard()
member.
Your function then uses an object of this class to generate the subset:

std::vector<int> RandomSubsetOfIntegers( int beginRange, int endRange, int
subsetSize )
{
IntegerRandomSubset integerSelector(beginRange, endRange);
std::vector<int> subset;
int nrRemaining = subsetSize;
while(nrRemaining-- > 0)
subset.push_back(integerSelector.NextRandomInteger ());
return subset;
}

DW


FYI,

what I'm doing here is designing an Artificial Neural Network. A Neuron in
the first layer links to a random number of Neurons in the second layer
between 1 and SecondLayer.size(). So I need a subset of integers that
correspond to the index's of the Neurons in the second layer to link to.
This Neural network could potentially be very large depending on the input
values so I don't want to create a large array of integers each time I need
to decided what each Neuron in Layer1 connects to.

-Chris
Jul 19 '05 #14

P: n/a

"David White" <no@email.provided> wrote in message
news:EC*****************@nasal.pacific.net.au...
"Chris Dutrow" <ch******@vt.edu> wrote in message
news:vr************@corp.supernews.com...
Also, if you know of any general good random number libratries that are

fast
and have an even distribution, let me know as well!


Try looking up Mersenne Twister.

DW


That code looks great, thanks a lot!

-Chris
Jul 19 '05 #15

P: n/a
"Chris Dutrow" <ch******@vt.edu> wrote in message
news:vr************@corp.supernews.com...

"Dan Cernat" <ce****@dan.com> wrote in message
news:vr************@corp.supernews.com...
Don't top post in this NG

{reformatted}
"Chris Dutrow" <ch******@vt.edu> wrote in message
news:vr************@corp.supernews.com...
"Thomas J. Clancy" <tj******@comcast.net> wrote in message
news:r9********************@comcast.com...
> "Chris Dutrow" <ch******@vt.edu> wrote in message
> news:vr************@corp.supernews.com...
> > I searched around on the net for a bit, couldn't find anything though. > >
> > I would like to find some code for a function where I input A Range
Of
> > Integers
> > For example: Function( 1, 100 );
> >
> > And the function will return me an array holding a random subset
of > integers
> > in that range of a size that I specify
> >
> > So the Function would Probabaly look something like this:
> >
> > std::vector<int> RandomSubsetOfIntergers( int BeginRanger, int
EndRanger,
> > int SubsetSize );
> >
>
> Have you tried to implement this on your own? Just curious, because it
is
a
> trivial problem to solve.
>
> tom
>
>
>
>
>
> ---
> Outgoing mail is certified Virus Free.
> Checked by AVG anti-virus system (http://www.grisoft.com).
> Version: 6.0.537 / Virus Database: 332 - Release Date: 11/6/2003
>
>
I thought about it for a while.

The best I could come up with was an O(NlogN) algorithm that would be
a pain
for me to code.

So I figured I might as well not re-invent the wheel and try and find

some code written by experts who are likely much smarter than me and worked

hard
on an optimal and bugless solution.

-Chris


O(NlogN)???
Look up in your documentation the function rnd -> it returns a random
integer from 0 to RAND_MAX

Your function becomes trivial now, isnt it?
I just cannot stop wonder, O(NlogN)???
/dan


Hmmmm, I must not have explianed well.

I need random subSET of a RANGE of integers

So The function does not get a data structure holding integers from which

to choose. It just gets two integers: a BeginRange and EndRange

Then it needs to give me a data structure, most likely a std::vector that
holds a subSET of integers from that ranger that have been randomly picked.


This is simply one loop, a rand() statement, and some push_backs into a
vector.
---
Outgoing mail is certified Virus Free.
Checked by AVG anti-virus system (http://www.grisoft.com).
Version: 6.0.537 / Virus Database: 332 - Release Date: 11/6/2003
Jul 19 '05 #16

P: n/a
>
> Have you tried to implement this on your own? Just curious, because it
is
a
> trivial problem to solve.
>
> tom
>
>
>
>
>
> ---
> Outgoing mail is certified Virus Free.
> Checked by AVG anti-virus system (http://www.grisoft.com).
> Version: 6.0.537 / Virus Database: 332 - Release Date: 11/6/2003
>
>
I thought about it for a while.

The best I could come up with was an O(NlogN) algorithm that would be
a pain
for me to code.

So I figured I might as well not re-invent the wheel and try and find

some code written by experts who are likely much smarter than me and worked

hard
on an optimal and bugless solution.

-Chris


O(NlogN)???
Look up in your documentation the function rnd -> it returns a random
integer from 0 to RAND_MAX

Your function becomes trivial now, isnt it?
I just cannot stop wonder, O(NlogN)???
/dan


Hmmmm, I must not have explianed well.

I need random subSET of a RANGE of integers


Would this not work?

std::vector<int> RandomSubsetOfIntergers( int BeginRanger, int EndRanger,
int SubsetSize )
{
vector<int> r;
for(int i = 0; i < SubsetSize; i++)
{
int j = (rand() % EndRanger) + BeginRanger;
r.push_back(j);
}
return r;
}

Of course it would be better if you passed in an out vector (reference to)
so that you can avoid the implicit copy on return. For example:

void RandomSubsetOfIntergers( int BeginRanger, int EndRanger, int
SubsetSize, std::vector<int>& r);

---
Outgoing mail is certified Virus Free.
Checked by AVG anti-virus system (http://www.grisoft.com).
Version: 6.0.537 / Virus Database: 332 - Release Date: 11/6/2003
Jul 19 '05 #17

P: n/a

"Thomas J. Clancy" <tj******@comcast.net> wrote in message
news:0-********************@comcast.com...
> >
> > Have you tried to implement this on your own? Just curious, because
it
is
> a
> > trivial problem to solve.
> >
> > tom
> >
> >
> >
> >
> >
> > ---
> > Outgoing mail is certified Virus Free.
> > Checked by AVG anti-virus system (http://www.grisoft.com).
> > Version: 6.0.537 / Virus Database: 332 - Release Date: 11/6/2003
> >
> >
> I thought about it for a while.
>
> The best I could come up with was an O(NlogN) algorithm that would
be
a pain
> for me to code.
>
> So I figured I might as well not re-invent the wheel and try and
find some
> code written by experts who are likely much smarter than me and
worked hard
> on an optimal and bugless solution.
>
> -Chris
>

O(NlogN)???
Look up in your documentation the function rnd -> it returns a random
integer from 0 to RAND_MAX

Your function becomes trivial now, isnt it?
I just cannot stop wonder, O(NlogN)???
/dan


Hmmmm, I must not have explianed well.

I need random subSET of a RANGE of integers


Would this not work?

std::vector<int> RandomSubsetOfIntergers( int BeginRanger, int EndRanger,
int SubsetSize )
{
vector<int> r;
for(int i = 0; i < SubsetSize; i++)
{
int j = (rand() % EndRanger) + BeginRanger;
r.push_back(j);
}
return r;
}

Of course it would be better if you passed in an out vector (reference to)
so that you can avoid the implicit copy on return. For example:

void RandomSubsetOfIntergers( int BeginRanger, int EndRanger, int
SubsetSize, std::vector<int>& r);


Nah, that wouldn't work because the in order for the vector to be a subset,
the integers in it must be unique.

In other words, with the above algorithm, there is a possibility of picking
the same number twice, this is not acceptable.

-Chris


---
Outgoing mail is certified Virus Free.
Checked by AVG anti-virus system (http://www.grisoft.com).
Version: 6.0.537 / Virus Database: 332 - Release Date: 11/6/2003

Jul 19 '05 #18

P: n/a

"Chris Dutrow" <ch******@vt.edu> wrote in message
news:vr************@corp.supernews.com...

"Thomas J. Clancy" <tj******@comcast.net> wrote in message
news:0-********************@comcast.com...
> > >
> > > Have you tried to implement this on your own? Just curious, because it
> is
> > a
> > > trivial problem to solve.
> > >
> > > tom
> > >
> > >
> > >
> > >
> > >
> > > ---
> > > Outgoing mail is certified Virus Free.
> > > Checked by AVG anti-virus system (http://www.grisoft.com).
> > > Version: 6.0.537 / Virus Database: 332 - Release Date: 11/6/2003
> > >
> > >
> > I thought about it for a while.
> >
> > The best I could come up with was an O(NlogN) algorithm that would be
a
> pain
> > for me to code.
> >
> > So I figured I might as well not re-invent the wheel and try and find some
> > code written by experts who are likely much smarter than me and worked > hard
> > on an optimal and bugless solution.
> >
> > -Chris
> >
>
> O(NlogN)???
> Look up in your documentation the function rnd -> it returns a random > integer from 0 to RAND_MAX
>
> Your function becomes trivial now, isnt it?
> I just cannot stop wonder, O(NlogN)???
>
>
> /dan
>
>

Hmmmm, I must not have explianed well.

I need random subSET of a RANGE of integers

Would this not work?

std::vector<int> RandomSubsetOfIntergers( int BeginRanger, int

EndRanger, int SubsetSize )
{
vector<int> r;
for(int i = 0; i < SubsetSize; i++)
{
int j = (rand() % EndRanger) + BeginRanger;
r.push_back(j);
}
return r;
}

Of course it would be better if you passed in an out vector (reference to) so that you can avoid the implicit copy on return. For example:

void RandomSubsetOfIntergers( int BeginRanger, int EndRanger, int
SubsetSize, std::vector<int>& r);


Nah, that wouldn't work because the in order for the vector to be a

subset, the integers in it must be unique.

In other words, with the above algorithm, there is a possibility of picking the same number twice, this is not acceptable.

-Chris


To me, a subset doesn't necessary connote a set, although it certainly
contains the word set. I know, I know, what the heck do I mean? Nevermind.
So I suppose the following solution is too slow, then?

std::vector<int> RandomSubsetOfIntergers( int BeginRanger, int EndRanger,
int SubsetSize )
{
std::set<int> s;
for(int i = 0; i < SubsetSize; i++)
{
int j = (rand() % EndRanger) + BeginRanger;
if (s.find(j) != s.end())
{
s.insert(j);
}
else
{
i--;
}
}
std::copy(s.begin(), s.end(), back_inserter<int>(r.begin());
return r;
}

---
Outgoing mail is certified Virus Free.
Checked by AVG anti-virus system (http://www.grisoft.com).
Version: 6.0.537 / Virus Database: 332 - Release Date: 11/6/2003
Jul 19 '05 #19

P: n/a

"Thomas J. Clancy" <tj******@comcast.net> wrote in message
news:yN********************@comcast.com...

"Chris Dutrow" <ch******@vt.edu> wrote in message
news:vr************@corp.supernews.com...

"Thomas J. Clancy" <tj******@comcast.net> wrote in message
news:0-********************@comcast.com...

> > > >
> > > > Have you tried to implement this on your own? Just curious, because
> it
> > is
> > > a
> > > > trivial problem to solve.
> > > >
> > > > tom
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > ---
> > > > Outgoing mail is certified Virus Free.
> > > > Checked by AVG anti-virus system (http://www.grisoft.com).
> > > > Version: 6.0.537 / Virus Database: 332 - Release Date: 11/6/2003 > > > >
> > > >
> > > I thought about it for a while.
> > >
> > > The best I could come up with was an O(NlogN) algorithm that
would
be
a
> > pain
> > > for me to code.
> > >
> > > So I figured I might as well not re-invent the wheel and try and

find
> some
> > > code written by experts who are likely much smarter than me and

worked
> > hard
> > > on an optimal and bugless solution.
> > >
> > > -Chris
> > >
> >
> > O(NlogN)???
> > Look up in your documentation the function rnd -> it returns a

random > > integer from 0 to RAND_MAX
> >
> > Your function becomes trivial now, isnt it?
> > I just cannot stop wonder, O(NlogN)???
> >
> >
> > /dan
> >
> >
>
> Hmmmm, I must not have explianed well.
>
> I need random subSET of a RANGE of integers
>

Would this not work?

std::vector<int> RandomSubsetOfIntergers( int BeginRanger, int EndRanger, int SubsetSize )
{
vector<int> r;
for(int i = 0; i < SubsetSize; i++)
{
int j = (rand() % EndRanger) + BeginRanger;
r.push_back(j);
}
return r;
}

Of course it would be better if you passed in an out vector (reference to) so that you can avoid the implicit copy on return. For example:

void RandomSubsetOfIntergers( int BeginRanger, int EndRanger, int
SubsetSize, std::vector<int>& r);


Nah, that wouldn't work because the in order for the vector to be a

subset,
the integers in it must be unique.

In other words, with the above algorithm, there is a possibility of

picking
the same number twice, this is not acceptable.

-Chris


To me, a subset doesn't necessary connote a set, although it certainly
contains the word set. I know, I know, what the heck do I mean?

Nevermind. So I suppose the following solution is too slow, then?

std::vector<int> RandomSubsetOfIntergers( int BeginRanger, int EndRanger,
int SubsetSize )
{
std::set<int> s;
for(int i = 0; i < SubsetSize; i++)
{
int j = (rand() % EndRanger) + BeginRanger;
if (s.find(j) != s.end())
{
s.insert(j);
}
else
{
i--;
}
}
std::copy(s.begin(), s.end(), back_inserter<int>(r.begin());
return r;
}

Hmmmmm, I may have read this wrong but...

The algorithm does not appear too slow.

If I read it right, it does not return a subset of length "SubsetSize"
because it simply skips the insertion when a duplicate is encountered.
Also, FYI, I beleive that if you insert a key into a std::set data structure
that is already there, it doesn't insert it by defaul so you don't have to
have if statement there to check if it is already in the set:

if (s.find(j) != s.end())

Also, I beleive the correct definition of a subset in this case would be
that no intergers repeat because there are no duplicate integers in the
input set defined by BeginRang and EndRange. But even if I am wrong about
the definition, what I need is a subset of unique integers that fall in the
input range.

I'm about %100 sure that the solution to this problem is NOT trivial.

Though in the end what I'm looking for is not for someone to suggest an
algorithm for me to code, but for someone to suggest a library that I might
refer to that is elegant and well written.

Thanks,
Chris



---
Outgoing mail is certified Virus Free.
Checked by AVG anti-virus system (http://www.grisoft.com).
Version: 6.0.537 / Virus Database: 332 - Release Date: 11/6/2003

Jul 19 '05 #20

P: n/a

"Chris Dutrow" <ch******@vt.edu> wrote in message
news:vr************@corp.supernews.com...
I searched around on the net for a bit, couldn't find anything though.

I would like to find some code for a function where I input A Range Of
Integers
For example: Function( 1, 100 );

And the function will return me an array holding a random subset of integers in that range of a size that I specify

So the Function would Probabaly look something like this:

std::vector<int> RandomSubsetOfIntergers( int BeginRanger, int EndRanger,
int SubsetSize );


Would this be suitable?

#include <iostream>
#include <vector>
#include <set>
using namespace std;
vector<int> randSubset(int BeginRanger, int EndRanger, int SubsetSize)
{
set<int> s;
vector<int> r;
for(int i = 0; i < SubsetSize; i++)
{
int j = (rand() % EndRanger) + BeginRanger;
if (s.find(j) == s.end())
{
s.insert(j);
}
else
{
i--;
}
}
copy(s.begin(), s.end(), back_inserter(r));
return r;
}
---
Outgoing mail is certified Virus Free.
Checked by AVG anti-virus system (http://www.grisoft.com).
Version: 6.0.537 / Virus Database: 332 - Release Date: 11/7/2003
Jul 19 '05 #21

P: n/a
> > std::vector<int> RandomSubsetOfIntergers( int BeginRanger, int
EndRanger,
int SubsetSize )
{
std::set<int> s;
for(int i = 0; i < SubsetSize; i++)
{
int j = (rand() % EndRanger) + BeginRanger;
if (s.find(j) != s.end())
{
s.insert(j);
}
else
{
i--;
}
}
std::copy(s.begin(), s.end(), back_inserter<int>(r.begin());
return r;
}

Hmmmmm, I may have read this wrong but...

The algorithm does not appear too slow.

If I read it right, it does not return a subset of length "SubsetSize"
because it simply skips the insertion when a duplicate is encountered.
Also, FYI, I beleive that if you insert a key into a std::set data

structure that is already there, it doesn't insert it by defaul so you don't have to
have if statement there to check if it is already in the set:

if (s.find(j) != s.end())

Also, I beleive the correct definition of a subset in this case would be
that no intergers repeat because there are no duplicate integers in the
input set defined by BeginRang and EndRange. But even if I am wrong about
the definition, what I need is a subset of unique integers that fall in the input range.

I'm about %100 sure that the solution to this problem is NOT trivial.

Though in the end what I'm looking for is not for someone to suggest an
algorithm for me to code, but for someone to suggest a library that I might refer to that is elegant and well written.

Okay, I don't know what youy mean by NOT trivial. If you run the code I
thay I just entered at top (and with these corrections) it works as
expected. The if statement is necessary here because if the newly random
number DOES exist, I want to repeat the loop, hence the i--, which will
effectively get me, say, 500 unique random numbers between, say, 999 and
1,900,234 (for example). If these are the requirements the code below
works. So I'm not sure what else you really need. You could simply expect
to receive back a set rather than a vector, thus doing away with the
std::copy() at the end.

std::vector<int> RandomSubsetOfIntergers( int BeginRanger, int EndRanger,
int SubsetSize )
{
std::set<int> s;
for(int i = 0; i < SubsetSize; i++)
{
int j = (rand() % EndRanger) + BeginRanger;
if (s.find(j) == s.end())
{
s.insert(j);
}
else
{
i--;
}
}
std::copy(s.begin(), s.end(), back_inserter(r));
return r;
}

---
Outgoing mail is certified Virus Free.
Checked by AVG anti-virus system (http://www.grisoft.com).
Version: 6.0.537 / Virus Database: 332 - Release Date: 11/7/2003
Jul 19 '05 #22

P: n/a
"Chris Dutrow" <ch******@vt.edu> wrote in message
news:vr************@corp.supernews.com...
What I meant though is that the function uses a data structure that holds
the complete set of intergers.

Which is fine for this input range of
1 to 100

But very bad for an input range of
1 to 10000000000
I don't know of a single algorithm that's suitable for generating both a
small subset of a large set and a large subset of a small set. For the
latter case you can use a card dealer. For the former case you could forget
the array and just generate numbers in the required range and use a std::set
to check whether a value has already been used. If so, then generate another
value.
So this implementation is not very versitile because it uses memory that an optimal algorithm would not need to use.
It depends on what you call optimal. The card dealer looks about optimal in
terms of execution time when the subset is a large proportion of the set. As
I see it you have a choice between fast and small for that case. I don't
think you can have both. You could choose which algorithm to use according
to the values passed. If the subset size is more than, say, 10 percent of
the entire set, use the card dealer, otherwise use the std::set.
what I'm doing here is designing an Artificial Neural Network. A Neuron in the first layer links to a random number of Neurons in the second layer
between 1 and SecondLayer.size(). So I need a subset of integers that
correspond to the index's of the Neurons in the second layer to link to.
This Neural network could potentially be very large depending on the input
values so I don't want to create a large array of integers each time I need to decided what each Neuron in Layer1 connects to.


Looks like an interesting project. Maybe it calls for techniques specially
designed for it rather than off-the-shelf solutions.

DW

Jul 19 '05 #23

P: n/a
> > Hmmmmm, I may have read this wrong but...

The algorithm does not appear too slow.

If I read it right, it does not return a subset of length "SubsetSize"
because it simply skips the insertion when a duplicate is encountered.
Also, FYI, I beleive that if you insert a key into a std::set data structure
that is already there, it doesn't insert it by defaul so you don't have to have if statement there to check if it is already in the set:

if (s.find(j) != s.end())

Also, I beleive the correct definition of a subset in this case would be
that no intergers repeat because there are no duplicate integers in the
input set defined by BeginRang and EndRange. But even if I am wrong about the definition, what I need is a subset of unique integers that fall in

the
input range.

I'm about %100 sure that the solution to this problem is NOT trivial.

Though in the end what I'm looking for is not for someone to suggest an
algorithm for me to code, but for someone to suggest a library that I

might
refer to that is elegant and well written.

Okay, I don't know what youy mean by NOT trivial. If you run the code I
thay I just entered at top (and with these corrections) it works as
expected. The if statement is necessary here because if the newly random
number DOES exist, I want to repeat the loop, hence the i--, which will
effectively get me, say, 500 unique random numbers between, say, 999 and
1,900,234 (for example). If these are the requirements the code below
works. So I'm not sure what else you really need. You could simply

expect to receive back a set rather than a vector, thus doing away with the
std::copy() at the end.

std::vector<int> RandomSubsetOfIntergers( int BeginRanger, int EndRanger, int SubsetSize )
{
std::set<int> s;
for(int i = 0; i < SubsetSize; i++)
{
int j = (rand() % EndRanger) + BeginRanger;
if (s.find(j) == s.end())
{
s.insert(j);
}
else
{
i--;
}
}
std::copy(s.begin(), s.end(), back_inserter(r));
return r;
}


My apologies, this is the correct form of the function that will do what you
want.

std::vector<int> RandomSubsetOfIntergers( int BeginRanger, int EndRanger,
int SubsetSize )
{
std::vector<int> r;
std::set<int> s;
for(int i = 0; i < SubsetSize; i++)
{
int j = (rand() % EndRanger) + BeginRanger;
if (s.find(j) == s.end())
{
s.insert(j);
}
else
{
i--;
}
}
std::copy(s.begin(), s.end(), back_inserter(r));
return r;
}

---
Outgoing mail is certified Virus Free.
Checked by AVG anti-virus system (http://www.grisoft.com).
Version: 6.0.537 / Virus Database: 332 - Release Date: 11/7/2003
Jul 19 '05 #24

P: n/a

"Thomas J. Clancy" <tj******@comcast.net> wrote in message
news:56********************@comcast.com...
std::vector<int> RandomSubsetOfIntergers( int BeginRanger, int EndRanger, int SubsetSize )
{
std::set<int> s;
for(int i = 0; i < SubsetSize; i++)
{
int j = (rand() % EndRanger) + BeginRanger;
if (s.find(j) != s.end())
{
s.insert(j);
}
else
{
i--;
}
}
std::copy(s.begin(), s.end(), back_inserter<int>(r.begin());
return r;
}

Hmmmmm, I may have read this wrong but...

The algorithm does not appear too slow.

If I read it right, it does not return a subset of length "SubsetSize"
because it simply skips the insertion when a duplicate is encountered.
Also, FYI, I beleive that if you insert a key into a std::set data

structure
that is already there, it doesn't insert it by defaul so you don't have to have if statement there to check if it is already in the set:

if (s.find(j) != s.end())

Also, I beleive the correct definition of a subset in this case would be
that no intergers repeat because there are no duplicate integers in the
input set defined by BeginRang and EndRange. But even if I am wrong about the definition, what I need is a subset of unique integers that fall in

the
input range.

I'm about %100 sure that the solution to this problem is NOT trivial.

Though in the end what I'm looking for is not for someone to suggest an
algorithm for me to code, but for someone to suggest a library that I

might
refer to that is elegant and well written.

Okay, I don't know what youy mean by NOT trivial. If you run the code I
thay I just entered at top (and with these corrections) it works as
expected. The if statement is necessary here because if the newly random
number DOES exist, I want to repeat the loop, hence the i--, which will
effectively get me, say, 500 unique random numbers between, say, 999 and
1,900,234 (for example). If these are the requirements the code below
works. So I'm not sure what else you really need. You could simply

expect to receive back a set rather than a vector, thus doing away with the
std::copy() at the end.

std::vector<int> RandomSubsetOfIntergers( int BeginRanger, int EndRanger, int SubsetSize )
{
std::set<int> s;
for(int i = 0; i < SubsetSize; i++)
{
int j = (rand() % EndRanger) + BeginRanger;
if (s.find(j) == s.end())
{
s.insert(j);
}
else
{
i--;
}
}
std::copy(s.begin(), s.end(), back_inserter(r));
return r;
}

Aha! I did read it wrong, I looked over it a couple of times but for some
reason by brain interpreted the i-- out as the loop iterator even though it
opviously is not.

I thought of this algorithm and actually described it in a previous post on
this thread.

The problem I had with it is that it will break down givin certain inputs
where the both the range size and the subset size are large and could
theoretically create an infinite loop although this in unlikely.

So I figure it is an unsound algorithm that I would not want to put into a
large program.

-Chris


---
Outgoing mail is certified Virus Free.
Checked by AVG anti-virus system (http://www.grisoft.com).
Version: 6.0.537 / Virus Database: 332 - Release Date: 11/7/2003

Jul 19 '05 #25

P: n/a

"David White" <no@email.provided> wrote in message
news:L1*****************@nasal.pacific.net.au...
"Chris Dutrow" <ch******@vt.edu> wrote in message
news:vr************@corp.supernews.com...
What I meant though is that the function uses a data structure that holds the complete set of intergers.

Which is fine for this input range of
1 to 100

But very bad for an input range of
1 to 10000000000
I don't know of a single algorithm that's suitable for generating both a
small subset of a large set and a large subset of a small set. For the
latter case you can use a card dealer. For the former case you could

forget the array and just generate numbers in the required range and use a std::set to check whether a value has already been used. If so, then generate another value.
======================================>

I usually try to stay away from such algorithms where it makes choices as to
the method to solve it depending on the inputs and such. I don't consider
these "elegent" solutions.

But I think you are right, I think sometimes I don't get near as much
accomplished because I spend too much time looking for the "elegent"
solution. When a more practical solution like you suggest is both easier to
code, easier to undersatnd, and faster than the elegent solution in most if
not all cases.

So this implementation is not very versitile because it uses memory that

an
optimal algorithm would not need to use.


It depends on what you call optimal. The card dealer looks about optimal

in terms of execution time when the subset is a large proportion of the set. As I see it you have a choice between fast and small for that case. I don't
think you can have both. You could choose which algorithm to use according
to the values passed. If the subset size is more than, say, 10 percent of
the entire set, use the card dealer, otherwise use the std::set.
what I'm doing here is designing an Artificial Neural Network. A Neuron in
the first layer links to a random number of Neurons in the second layer
between 1 and SecondLayer.size(). So I need a subset of integers that
correspond to the index's of the Neurons in the second layer to link to.
This Neural network could potentially be very large depending on the input values so I don't want to create a large array of integers each time I

need
to decided what each Neuron in Layer1 connects to.


Looks like an interesting project. Maybe it calls for techniques specially
designed for it rather than off-the-shelf solutions.


Exactly, thats why I was looking for a library. Hoping one of those crazy
smart genious algorithm people into random numbers ran accross this
algorithm at one point and added it to a library.

The best solution I could think up was to have a SetSize variable
initialized to the size of the Input Range.

Then each time a number is selected, subract 1 from that variable so it
denotes the actual size of the set to choose the numbers through. The
random numbers "i" selected would then be i'th number in that set, but in
order to find which number corresponds to the i'th number, a binary search
would be done to find what numbers that are already in the subset come
before the i'th number.

In other words this is how the algorithm would work:

i = RandomNumberInSetRange();

ActualNumberChosen = i + NumbersThatComBeforeThei'thNumber

So thats where I got O(NlogN) from.

But I do have gut feeling that there is some slick O(N) algorithm....

The thing is, if I coded complicated crap like this whenever I ran into the
need for it, I would never finish this freakin program!

But yeah, thanks for your help David, you're the only person who responded
to my thread who I didn't feel like was out to prove they were smarter than
me and I will end up using that random number code that you referred me to.

By the way, is this not the board to post such questions? It seemed like
what I was asking was beyond most of the people here? Or maybe algorithm
questions are not right for this board?

Thanks!
-Chris


DW

Jul 19 '05 #26

P: n/a
"Chris Dutrow" <ch******@vt.edu> wrote in message
news:vr************@corp.supernews.com...
Exactly, thats why I was looking for a library. Hoping one of those crazy
smart genious algorithm people into random numbers ran accross this
algorithm at one point and added it to a library.
I don't use many libraries, but those I have used have tended to keep things
simple. I'm not sure you'll find a library function that has the broad
specifications you are looking for.
The best solution I could think up was to have a SetSize variable
initialized to the size of the Input Range.

Then each time a number is selected, subract 1 from that variable so it
denotes the actual size of the set to choose the numbers through. The
random numbers "i" selected would then be i'th number in that set, but in
order to find which number corresponds to the i'th number, a binary search
would be done to find what numbers that are already in the subset come
before the i'th number.

In other words this is how the algorithm would work:

i = RandomNumberInSetRange();

ActualNumberChosen = i + NumbersThatComBeforeThei'thNumber
Without any detail I can't tell if this is potentially better than anything
suggested so far.
So thats where I got O(NlogN) from.

But I do have gut feeling that there is some slick O(N) algorithm....
I'm not so sure. To be able to handle any range of values and place a subset
of them in a unique sequence every time over a large number of repetitions
seems to me to be calling for explicitly excluding each value that was
previously generated, one way or another. But there are many brilliant
algorithms for doing all kinds of things, so I'm not prepared to say it
can't be done.
The thing is, if I coded complicated crap like this whenever I ran into the need for it, I would never finish this freakin program!

But yeah, thanks for your help David, you're the only person who responded
to my thread who I didn't feel like was out to prove they were smarter than me and I will end up using that random number code that you referred me to.
By the way, is this not the board to post such questions? It seemed like
what I was asking was beyond most of the people here? Or maybe algorithm
questions are not right for this board?


No, they aren't really. The charter of this NG is discussion of the C++
langauge, as defined by the standard, so this thread has been more off-topic
than on. Perhaps we can claim to be barely on-topic by posting plenty of
standard C++ code, but I don't think that would convince most of the
regulars here. There ought to be a newsgroup for algorithms, but I don't
know of one.

DW

Jul 19 '05 #27

P: n/a
David White wrote:
By the way, is this not the board to post such questions? It seemed
like what I was asking was beyond most of the people here? Or maybe
algorithm questions are not right for this board?


No, they aren't really. The charter of this NG is discussion of the
C++ langauge, as defined by the standard, so this thread has been more
off-topic than on. Perhaps we can claim to be barely on-topic by
posting plenty of standard C++ code, but I don't think that would
convince most of the regulars here. There ought to be a newsgroup for
algorithms, but I don't know of one.


Well, the OP might try in an AI newsgroup like comp.ai.neural-nets,
maybe someone there already had a similar problem and solved it. Or a
more general newsgroup like comp.programming.

Jul 19 '05 #28

P: n/a
Chris Dutrow wrote:
I searched around on the net for a bit, couldn't find anything though.

I would like to find some code for a function where I input A Range Of
Integers
For example: Function( 1, 100 );

And the function will return me an array holding a random subset of integers
in that range of a size that I specify

So the Function would Probabaly look something like this:

std::vector<int> RandomSubsetOfIntergers( int BeginRanger, int EndRanger,
int SubsetSize );
Anyone know where I can find something along these lines?
Also, if you know of any general good random number libratries that are fast
and have an even distribution, let me know as well!

One solution is this - this is not cryptogrphically safe but for testing
or monte-carlo algorithms it's more than fine.

Theory:

It is fact that if

a is a positive integer and b is a positive integer and a and b a re
relatively prime (i.e. gcd(a,b)=1) (gcd is greatest common divisor)

then in the series:

p(n+1) = (p(n)+a)%b

Then:

p(n+b) = p(n)
So a simple fast way to generate your array is:

(warning - untested code)

void MakeRandomVector( std::vector<int> & out, int begin, int end )
{

unsigned period = end - begin + 1; // begin .. end (including end).

// pick a random key

unsigned long key;

do {

key = rand() + 1;

} while ( ( key <= period ) || ( gcd( key, period ) != 1 ) );

out.resize( period );

// pick a random starting point;
unsigned int p0 = rand() % period;

std::vector<int>::iterator itr = out.begin();

unsigned int pn = p0;

unsigned count = period;

while ( count )
{
* ( ++ itr ) = pn + begin;

pn = ( pn + key ) % period;

-- count;
}

assert( pn == p0 );
}

Now - to make this cryptograpgically secure, you could a) pick REALLY
large keys and b) you could repeat the procedure within itself

i.e.

where : MakeRandomVector( vecnmx, 0, period-1 );

and

vecn[i] = vecnm1[ vecnm2[ vecnm3[ ... vec1[ i ] ... ] ] ];

( need to make sure that all the keys are prime (or relatively prime) ).

Jul 22 '05 #29

P: n/a
Gianni Mariani wrote:
....
Now - to make this cryptograpgically secure, you could a) pick REALLY
large keys


I gave that some more thought and that's wrong - the size of the key is
irrelevant. The only thing that is important about the key is it's
modulus relative to the period.

Jul 22 '05 #30

This discussion thread is closed

Replies have been disabled for this discussion.