472,958 Members | 1,483 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 472,958 software developers and data experts.

fastest method to choose a random element

Hello,
This is a question for the best method (in terms of performance
only) to choose a random element from a list among those that satisfy
a certain property.

This is the setting: I need to pick from a list a random element
that satisfies a given property. All or none of the elements may have
the property. Most of the time, many of the elements will satisfy the
property, and the property is a bit expensive to evaluate. Chance of
having the property are uniform among elements.

A simple approach is:

import random
def random_pick(a_list,property):
'''Returns a random element from a list that has the property

Returns None if no element has the property
'''
random.shuffle(a_list)
for i in a_list:
if property(i): return i

but that requires to shuffle the list every time.

A second approach, that works if we know that at least one element of
the list has the property, is:

import random
def random_pick(a_list,property):
'''Returns a random element from a list that has the property

Loops forever if no element has the property
'''
while 1:
i=random.choice(a_list)
if property(i): return i

which is more efficient (on average) if many elements of the list have
the property and less efficient if only few elements of the list has
the property (and goes crazy if no element has the property)

Yet another one:

import random
def random_pick(a_list,property):
'''Returns a random element from a list that has the property
'''
b_list=[x for x in a_list if property(x)]
try:
return random.choice(b_list)
finally: return None

but this one checks the property on all the elements, which is no
good.

I don't need strong random numbers, so a simple solution like:
import random
globalRNG=random.Random()

def random_pick(a_list,property):
'''Returns a random element from a list that has the property

Works only if len(a_list)+1 is prime: uses Fermat's little theorem
'''
a=globalRNG(1,len(a_list))
ind=a
for i in xrange(len(a_list)):
x=a_list[a-1]
if property(x):return x
ind*=a

but this works only if len(a_list)+1 is prime!!! Now this one could be
saved if there were an efficient method to find a prime number given
than a number n but not very much greater...

Any other ideas? Thanks everybody
Jan 4 '08 #1
15 2660
On Jan 4, 7:55 pm, c...@mailinator.com wrote:
Hello,
This is a question for the best method (in terms of performance
only) to choose a random element from a list among those that satisfy
a certain property.

This is the setting: I need to pick from a list a random element
that satisfies a given property. All or none of the elements may have
the property. Most of the time, many of the elements will satisfy the
property, and the property is a bit expensive to evaluate. Chance of
having the property are uniform among elements.

A simple approach is:

import random
def random_pick(a_list,property):
'''Returns a random element from a list that has the property

Returns None if no element has the property
'''
random.shuffle(a_list)
for i in a_list:
if property(i): return i

but that requires to shuffle the list every time.

A second approach, that works if we know that at least one element of
the list has the property, is:

import random
def random_pick(a_list,property):
'''Returns a random element from a list that has the property

Loops forever if no element has the property
'''
while 1:
i=random.choice(a_list)
if property(i): return i

which is more efficient (on average) if many elements of the list have
the property and less efficient if only few elements of the list has
the property (and goes crazy if no element has the property)

Yet another one:

import random
def random_pick(a_list,property):
'''Returns a random element from a list that has the property
'''
b_list=[x for x in a_list if property(x)]
try:
return random.choice(b_list)
finally: return None

but this one checks the property on all the elements, which is no
good.

I don't need strong random numbers, so a simple solution like:
import random
globalRNG=random.Random()

def random_pick(a_list,property):
'''Returns a random element from a list that has the property

Works only if len(a_list)+1 is prime: uses Fermat's little theorem
'''
a=globalRNG(1,len(a_list))
ind=a
for i in xrange(len(a_list)):
x=a_list[a-1]
if property(x):return x
ind*=a

but this works only if len(a_list)+1 is prime!!! Now this one could be
saved if there were an efficient method to find a prime number given
than a number n but not very much greater...

Any other ideas? Thanks everybody
Caching might help.

If random_pick is called several times with the same list(s) then
cache the result of
[property(i) for i in a_list] against a_list.

If random_pick is called several times with list(s) whith multiple
instances of list items then cache
property(i) against i for i in a_list .

You could do both.

You might investigate wether property(i) could be implemented using a
faster algorithm, maybe table lookup, or interpolation from initial
table lookup.

- Paddy.
Jan 5 '08 #2
Caching might help.

If random_pick is called several times with the same list(s) then
cache the result of
[property(i) for i in a_list] against a_list.

If random_pick is called several times with list(s) with multiple
instances of list items then cache
property(i) against i for i in a_list .

You could do both.

You might investigate wether property(i) could be implemented using a
faster algorithm, maybe table lookup, or interpolation from initial
table lookup.

- Paddy.
Thanks, Paddy. Those are interesting general comments for the general
problem.
By the way, I noticed two things:

I forgot to write randint in the third approach:
a=globalRNG.randint(1,len(a_list))
The warning "The group you are posting to is a Usenet group. Messages
posted to this group will make your email address visible to anyone on
the Internet." means a person, but not a bot, may see my email
address, so it is safe to use my real address next time...

Jan 5 '08 #3
On Jan 4, 7:55*pm, c...@mailinator.com wrote:
* Hello,
* This is a question for the best method (in terms of performance
only) to choose a random element from a list among those that satisfy
a certain property.

* This is the setting: I need to pick from a list a random element
that satisfies a given property. All or none of the elements may have
the property. Most of the time, many of the elements will satisfy the
property, and the property is a bit expensive to evaluate. Chance of
having the property are uniform among elements.
Here's some code that uses a cached random sorting of the list. It
assumes you'll want more than one random element. It never calls
'prop' on the same element twice, and it's O(n) even when the elements
that pass 'prop' are sparse. I hope this is useful to you!

import random

class RandomPicker(object):
def __init__(self, seq, prop=lambda x:True):
seq = list(seq)
random.shuffle(seq)
# Store with the item whether we've computed prop on it
already.
self.random_list = [(x, False) for x in seq]
self.prop = prop
def pick(self):
for i, (x, p) in enumerate(self.random_list):
if p or self.prop(x):
if not p:
# Record the fact that self.prop passed.
self.random_list[i] = (x, True)
# Chop off the items that prop failed on.
self.random_list = self.random_list[i:]
r = self.random_list
# Instead of shuffling the whole list again, just
insert
# x back in the list randomly. Since the remaining
elements
# are randomly sorted already, this is ok.
n = random.randint(0, len(r) - 1)
r[0], r[n] = r[n], r[0]
return x
# Nothing in the list passes the 'prop' test.
return None

# Example: pick 100 odd integers from 0 to 1000.
a = RandomPicker(xrange(1000), lambda x: x % 2 == 1)
print [a.pick() for i in xrange(100)]

--
Paul Hankin
Jan 5 '08 #4
On Jan 4, 7:55*pm, c...@mailinator.com wrote:
* Hello,
* This is a question for the best method (in terms of performance
only) to choose a random element from a list among those that satisfy
a certain property.

* This is the setting: I need to pick from a list a random element
that satisfies a given property. All or none of the elements may have
the property. Most of the time, many of the elements will satisfy the
property, and the property is a bit expensive to evaluate. Chance of
having the property are uniform among elements.
The generator function below yields an infinite sequence of randomly
picked elements from the list who satisfy the property, or nothing if
the list contains no element satisfying the property. It guarantees
that each time, prop() will either not be called or will be called
just enough to find one single item that satisfies it. The catch is
that you need to have an estimate for the number of items that satisfy
the property in the list.

import random
from itertools import islice, ifilter

def picker(lst, prop, np):
# lst: list of items to pick elements from
# prop: property that picked elements must fulfil
# np: (good estimate of) number of items that
# satisfy the property
random.shuffle(lst)
plst = [] # items we know fulfil prop
for item in ifilter(prop, lst):
# The next random item may be one already yielded
while True:
i = random.randrange(np)
if i >= len(plst): break
yield plst[i]
# Or it may be a new one
plst.append(item)
if len(plst) np: np = len(plst)
yield item
# Now we know all items fulfilling prop
if not plst: return
while True:
yield plst[random.randrange(len(plst))]
def test(picker, n=1000):
res = {}
for val in islice(picker, n):
res[val] = res.get(val, 0) + 1
return res
Example:
>>p = picker(range(20), lambda x: x % 2, 10)
test(p)
{1: 113, 3: 96, 5: 87, 7: 91, 9: 109, 11: 91, 13: 106, 15: 101, 17:
109, 19: 97}

>>p = picker(range(20), lambda x: False, 10)
p.next()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration

I don't know if there is a good idea in there, I'll let you be the
judge :)

--
Arnaud

Jan 5 '08 #5
Hello, Paul and Arnaud.
While I think about your answers: do you think there is any way to
avoid shuffle?
It may take unnecessary long on a long list most of whose elements
have the property.
Jan 5 '08 #6
On Jan 5, 5:07 pm, c...@mailinator.com wrote:
Hello, Paul and Arnaud.
While I think about your answers: do you think there is any way to
avoid shuffle?
It may take unnecessary long on a long list most of whose elements
have the property.
Umm...
You provide nice answers in the case many elements are picked from the
same list.
Any ideas for the case when the picker is called many times on a
program, but never twice with the same list?
Jan 5 '08 #7
Any other ideas?

How about this:

def random_pick(list, property):
L = len(list)
pos = start = random.randrange(L)
while 1:
x = list[pos]
if property(x): return x
pos = (pos + 1) % L
if pos == start:
raise ValueError, "no such item"

Regards,
Martin
Jan 5 '08 #8
On Jan 5, 5:12*pm, "Martin v. Lwis" <mar...@v.loewis.dewrote:
Any other ideas?

How about this:

def random_pick(list, property):
* * L = len(list)
* * pos = start = random.randrange(L)
* * while 1:
* * * * x = list[pos]
* * * * if property(x): return x
* * * * pos = (pos + 1) % L
* * * * if pos == start:
* * * * * *raise ValueError, "no such item"
This might be acceptable for the OP's use, but it's strongly biased
towards values which follow a long stream of things that fail
property.

print [random_pick(range(100), lambda x:x 90)
for i in range(999)].count(91)

929

If it were uniform, it should be around 111.

--
Paul Hankin
Jan 5 '08 #9
On Jan 5, 4:14*pm, c...@mailinator.com wrote:
On Jan 5, 5:07 pm, c...@mailinator.com wrote:
Hello, Paul and Arnaud.
While I think about your answers: do you think there is any way to
avoid shuffle?
It may take unnecessary long on a long list most of whose elements
have the property.

Umm...
You provide nice answers in the case many elements are picked from the
same list.
Any ideas for the case when the picker is called many times on a
program, but never twice with the same list?
Here's a pragmatic optimisation for any algorithm: first test some
elements at random in case you get lucky or most of the elements are
good.

Eg, that optimisation applied to the naive shuffle algorithm.

import random
import itertools

def pick_random_fast(seq, prop):
L = len(seq)
stabs = 5
for i in xrange(stabs):
r = random.randrange(L)
if prop(seq[r]): return seq[r]
random.shuffle(seq)
return itertools.ifilter(prop, seq).next()

I've used 5 'stabs' here. Perhaps it should be a function of L, but I
suppose you can tune it for your data.

--
Paul Hankin
Jan 5 '08 #10
On Sat, 05 Jan 2008 08:14:46 -0800, caca wrote:
On Jan 5, 5:07 pm, c...@mailinator.com wrote:
>Hello, Paul and Arnaud.
While I think about your answers: do you think there is any way to
avoid shuffle?
It may take unnecessary long on a long list most of whose elements have
the property.

Umm...
You provide nice answers in the case many elements are picked from the
same list.
Any ideas for the case when the picker is called many times on a
program, but never twice with the same list?
ISTM the best thing would be to reimplement the shuffle algorithm, but to
stop shuffling as soon as you get a hit. The drawback is that it's a
destructive operation, but that doesn't sound like it's an issue for you.

Here's something for starters:

def random_element_with_property(x,test_property_func) :
for i in xrange(len(x)-1):
j = random.randrange(i+1,len(x))
tmp = x[j]
if test_property_func(tmp):
return tmp
x[j] = x[i]
x[i] = tmp
return None
Then, for example, use it like this:

def odd(i): return i&1
e = random_element_with_property(range(20),odd)

Carl Banks
Jan 5 '08 #11
On Jan 5, 8:14 am, c...@mailinator.com wrote:
On Jan 5, 5:07 pm, c...@mailinator.com wrote:
Hello, Paul and Arnaud.
While I think about your answers: do you think there is any way to
avoid shuffle?
It may take unnecessary long on a long list most of whose elements
have the property.

Umm...
You provide nice answers in the case many elements are picked from the
same list.
Any ideas for the case when the picker is called many times on a
program, but never twice with the same list?

Here's my stab:

from random import randint, seed
from time import time
from sys import stdout
seed(time())

iterations = 0#DEBUG
def pick_random(seq, prop=bool):
temp = list(seq)
global iterations#DEBUG
while temp:
iterations += 1#DEBUG
i = randint(0, len(temp) - 1)
if prop(temp[i]): return temp[i]
else: del temp[i]
def generate_list(length):
l = list()
for x in xrange(length): l.append(randint(0,1) * randint(1,1000))
return l

count = 0
for x in xrange(1000):
count += 1
print pick_random(generate_list(1000)),

print
print "AVERAGE ITERATIONS:", float(iterations) / count


The average number of iterations is 1/p where p is the chance of your
property being true. It's independent of list size! Just remove the
DEBUG lines and it's ready for use.

--Buck
Jan 5 '08 #12
On Jan 5, 9:12 am, "Martin v. Lwis" <mar...@v.loewis.dewrote:
Any other ideas?

How about this:

def random_pick(list, property):
L = len(list)
pos = start = random.randrange(L)
while 1:
x = list[pos]
if property(x): return x
pos = (pos + 1) % L
if pos == start:
raise ValueError, "no such item"

Regards,
Martin
I thought about this, but in the sequence "00012" (and property =
bool) the 1 will be returned four times as often as the 2. Maybe
that's ok...
Jan 5 '08 #13
On Jan 5, 4:16 am, c...@mailinator.com wrote:
The warning "The group you are posting to is a Usenet group. Messages
posted to this group will make your email address visible to anyone on
the Internet." means a person, but not a bot, may see my email
address, so it is safe to use my real address next time...
Wrong. A bot may be able to read and scan all messages sent through
Usenet.
Jan 6 '08 #14
On Jan 5, 6:37 am, Paddy <paddy3...@googlemail.comwrote:
On Jan 4, 7:55 pm, c...@mailinator.com wrote:
Hello,
This is a question for the best method (in terms of performance
only) to choose a random element from a list among those that satisfy
a certain property.
This is the setting: I need to pick from a list a random element
that satisfies a given property. All or none of the elements may have
the property. Most of the time, many of the elements will satisfy the
property, and the property is a bit expensive to evaluate. Chance of
having the property are uniform among elements.
A simple approach is:
import random
def random_pick(a_list,property):
'''Returns a random element from a list that has the property
Returns None if no element has the property
'''
random.shuffle(a_list)
for i in a_list:
if property(i): return i
but that requires to shuffle the list every time.
A second approach, that works if we know that at least one element of
the list has the property, is:
import random
def random_pick(a_list,property):
'''Returns a random element from a list that has the property
Loops forever if no element has the property
'''
while 1:
i=random.choice(a_list)
if property(i): return i
which is more efficient (on average) if many elements of the list have
the property and less efficient if only few elements of the list has
the property (and goes crazy if no element has the property)
Yet another one:
import random
def random_pick(a_list,property):
'''Returns a random element from a list that has the property
'''
b_list=[x for x in a_list if property(x)]
try:
return random.choice(b_list)
finally: return None
but this one checks the property on all the elements, which is no
good.
I don't need strong random numbers, so a simple solution like:
import random
globalRNG=random.Random()
def random_pick(a_list,property):
'''Returns a random element from a list that has the property
Works only if len(a_list)+1 is prime: uses Fermat's little theorem
'''
a=globalRNG(1,len(a_list))
ind=a
for i in xrange(len(a_list)):
x=a_list[a-1]
if property(x):return x
ind*=a
but this works only if len(a_list)+1 is prime!!! Now this one could be
saved if there were an efficient method to find a prime number given
than a number n but not very much greater...
Any other ideas? Thanks everybody

Caching might help.

If random_pick is called several times with the same list(s) then
cache the result of
[property(i) for i in a_list] against a_list.

If random_pick is called several times with list(s) whith multiple
instances of list items then cache
property(i) against i for i in a_list .

You could do both.

You might investigate wether property(i) could be implemented using a
faster algorithm, maybe table lookup, or interpolation from initial
table lookup.

- Paddy.
Here's a caching decorator that you could apply to function property:
http://aspn.activestate.com/ASPN/Coo.../Recipe/498245

- Paddy.
Jan 7 '08 #15
Just for fun, I profiled my answer versus the final answer...
This mailing list is awesome!
PS:ajaksu, I have to leave now, I hope bukzor's answer was enough to
you (at least for the moment)
Jan 7 '08 #16

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

4
by: kingofkolt | last post by:
I have a directory of images, called "random". In it are the following files: 1.gif 2.gif 3.gif 4.gif I use this script to choose a random image and display it:
6
by: lucy | last post by:
Hello comp.lang.js.I need a script to rotate a background image on each newpage load, given the id of the element and a list (array?)of 2 or more image paths ("img1.jpg", "img2.jpg", ...).Actually,...
2
by: vsgdp | last post by:
From what I learned, if you want to do random element insertions and deletions you should use a list. But, with std::vector, if the order of the elements does not matter, couldn't you efficiently...
2
by: Richard Bower | last post by:
Hi, What is the fastest .NET method of opening an XML file (no schema validation) and doing basic XPath Queries? Thanks Richard.
5
by: Gaurav - http://www.gauravcreations.com | last post by:
what is the fastest method to sort and load 1 lakh + strings in a list box from a text file. each string is in a new line -- Gaurav Creations
5
nurulshidanoni
by: nurulshidanoni | last post by:
How to choose one element from each array., Which A(i)+B(i)+C(i)+D(i)+E(i)+F(i) must equal to 10. __________________________________________________ i A B C D E F...
2
by: Marwa Ahmed | last post by:
i have large amount of data- i want to choose from it random one record only every time
1
by: peter perez | last post by:
how do i create a java method that removes a random element and then substitutes it for the last known element in the array.
0
by: lllomh | last post by:
Define the method first this.state = { buttonBackgroundColor: 'green', isBlinking: false, // A new status is added to identify whether the button is blinking or not } autoStart=()=>{
2
by: DJRhino | last post by:
Was curious if anyone else was having this same issue or not.... I was just Up/Down graded to windows 11 and now my access combo boxes are not acting right. With win 10 I could start typing...
0
by: Aliciasmith | last post by:
In an age dominated by smartphones, having a mobile app for your business is no longer an option; it's a necessity. Whether you're a startup or an established enterprise, finding the right mobile app...
0
tracyyun
by: tracyyun | last post by:
Hello everyone, I have a question and would like some advice on network connectivity. I have one computer connected to my router via WiFi, but I have two other computers that I want to be able to...
4
NeoPa
by: NeoPa | last post by:
Hello everyone. I find myself stuck trying to find the VBA way to get Access to create a PDF of the currently-selected (and open) object (Form or Report). I know it can be done by selecting :...
1
by: Teri B | last post by:
Hi, I have created a sub-form Roles. In my course form the user selects the roles assigned to the course. 0ne-to-many. One course many roles. Then I created a report based on the Course form and...
3
by: nia12 | last post by:
Hi there, I am very new to Access so apologies if any of this is obvious/not clear. I am creating a data collection tool for health care employees to complete. It consists of a number of...
0
NeoPa
by: NeoPa | last post by:
Introduction For this article I'll be focusing on the Report (clsReport) class. This simply handles making the calling Form invisible until all of the Reports opened by it have been closed, when it...
2
by: GKJR | last post by:
Does anyone have a recommendation to build a standalone application to replace an Access database? I have my bookkeeping software I developed in Access that I would like to make available to other...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.