
random.sample with long int items

I need the random.sample functionality where the population grows up to
long int items. Do you know how could I get this same functionality in
another way? thanks in advance.
Jordi

Apr 12 '06 #1
"jordi" <jp******@gmail.com> writes:
I need the random.sample functionality where the population grows up to
long int items. Do you know how could I get this same functionality in
another way? thanks in advance.


Nothing stops you:
>>> from random import sample
>>> a = [n**25 for n in range(6)]
>>> a
[0, 1, 33554432, 847288609443L, 1125899906842624L, 298023223876953125L]
>>> sample(a,2)
[1125899906842624L, 298023223876953125L]
>>> sample(a,2)
[298023223876953125L, 847288609443L]


Is this what you were asking, or did you mean something different?
Apr 12 '06 #2
On Wed, 12 Apr 2006 06:29:01 -0700, jordi wrote:
I need the random.sample functionality where the population grows up to
long int items. Do you know how could I get this same functionality in
another way? thanks in advance.


I'm thinking you might need to find another way to do whatever it is you
are trying to do.

If you can't, you could do something like this:

- you want to choose a small number of items at random from a
population of size N, where N is very large.

e.g. you would do this: random.sample(xrange(10**10), 60)
except it raises an exception.

- divide your population of N items into B bins of size M, where both B and
M are in the range of small integers. Ideally, all your bins will be equal
in size.

e.g.
bins = [xrange(start*10**5, (start+1)*10**5)
        for start in xrange(10**5)]
- then, to take a sample of n items, do something like this:

# bins is the list of B bins;
# each bin has M items, and B*M = N the total population.
result = []
while len(result) < sample_size:
    # choose a random bin
    bin = random.choice(bins)
    # choose a random element of that bin
    selection = random.choice(bin)
    if selecting_with_replacement:
        result.append(selection)
    else:
        # each choice must be unique
        if selection not in result:
            result.append(selection)
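
The binning steps above can also be sketched without building the list of bins, since picking a bin and then an element of it amounts to picking two small integers. This is only a sketch under assumptions that are not in the post: it targets Python 3, the name `sample_by_bins` and its parameters are made up for illustration, the population is taken to be the integers 0 to B*M - 1, and every bin is assumed to hold exactly M items.

```python
import random


def sample_by_bins(num_bins, bin_size, sample_size, with_replacement=False):
    """Pick items from a population of num_bins * bin_size integers.

    Hypothetical helper: the population is 0 .. num_bins*bin_size - 1,
    split into equal bins, mirroring the bins of xrange objects above.
    """
    if not with_replacement and sample_size > num_bins * bin_size:
        raise ValueError("sample larger than population")
    result = []
    seen = set()  # set membership is O(1); 'x in list' is O(n)
    while len(result) < sample_size:
        # choose a random bin, then a random offset inside that bin
        bin_index = random.randrange(num_bins)
        offset = random.randrange(bin_size)
        selection = bin_index * bin_size + offset
        if with_replacement:
            result.append(selection)
        elif selection not in seen:
            # each choice must be unique
            seen.add(selection)
            result.append(selection)
    return result


picked = sample_by_bins(10**5, 10**5, 60)
```

The equal-size assumption matters: with unequal bins, choosing the bin uniformly would over-represent items that sit in the smaller bins.
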
Hope that helps.
--
Steven.

Apr 12 '06 #3
On Wed, 12 Apr 2006 06:44:29 -0700, Paul Rubin wrote:
"jordi" <jp******@gmail.com> writes:
I need the random.sample functionality where the population grows up to
long int items. Do you know how could I get this same functionality in
another way? thanks in advance.


Nothing stops you:
>>> from random import sample
>>> a = [n**25 for n in range(6)]
>>> a
[0, 1, 33554432, 847288609443L, 1125899906842624L, 298023223876953125L]
>>> sample(a,2)
[1125899906842624L, 298023223876953125L]


No, I think he means the size of the list is big enough to need a long
int. Something like xrange(10**10) or even bigger.
>>> random.sample(xrange(10*10), 10)
[96, 45, 90, 52, 57, 72, 94, 73, 79, 97]
>>> random.sample(xrange(10**10), 10)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
OverflowError: long int too large to convert to int
--
Steven.

Apr 12 '06 #4
Steven D'Aprano <st***@REMOVETHIScyber.com.au> writes:
e.g. you would do this: random.sample(xrange(10**10), 60)
except it raises an exception.


For a population that large and a sample that small (less than
sqrt(population size)), the chance of collision is fairly small, so you
can just discard duplicates.
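
To put a rough number on "fairly small", the birthday-problem estimate can be worked out directly. The sketch below is Python 3, and the helper name `collision_probability` is made up for illustration. It computes the chance that k independent uniform draws from N items contain at least one duplicate.

```python
import math


def collision_probability(population_size, draws):
    """Chance that `draws` independent uniform picks contain a duplicate.

    P(no duplicate) = product over i of (1 - i/N); logs are summed to
    avoid rounding error when each factor is very close to 1.
    """
    log_no_collision = sum(
        math.log1p(-i / population_size) for i in range(draws)
    )
    return -math.expm1(log_no_collision)


p_small = collision_probability(10**10, 60)     # the case in this thread
p_sqrt = collision_probability(10**10, 10**5)   # sample size = sqrt(N)
```

For 60 draws from 10**10 the result is about 1.8e-7, close to the shortcut k*(k-1)/(2*N). At exactly sqrt(N) draws it is nearer 0.39, so discarding duplicates is cheapest when the sample is well below sqrt(N), though the result is still a valid sample either way.
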

This relies on Python 2.4's randrange function handling arbitrarily
large ranges, which in turn relies on having getrandbits (new 2.4
feature, thanks Ray) available:

import random
from sets import Set

samp = Set()
while len(samp) < 60:
    samp.add(random.randrange(10**10))
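
For readers on Python 3, where the `sets` module no longer exists, the same loop can be written with the built-in `set`. This is a sketch rather than the poster's code: the function name `sample_large_range` is invented here, and it keeps a list alongside the set so the result comes back in the order drawn.

```python
import random


def sample_large_range(population_size, sample_size):
    """Draw sample_size distinct integers from range(population_size)."""
    if sample_size > population_size:
        raise ValueError("sample larger than population")
    seen = set()
    result = []
    while len(result) < sample_size:
        candidate = random.randrange(population_size)
        if candidate not in seen:  # discard duplicates
            seen.add(candidate)
            result.append(candidate)
    return result


samp = sample_large_range(10**10, 60)
```

Because `randrange` accepts arbitrarily large integers, the same call should also work for much bigger populations such as 10**30.
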

Apr 12 '06 #5
That is just what I need. I didn't think of 'divide and conquer' :(

Thanks a lot!

--
Jordi

Apr 12 '06 #6
