473,320 Members | 1,572 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,320 software developers and data experts.

selecting a random item from a set

Quite a few algortihms prominently feature choosing/removing a single random
element from a set at a time. On first sight I can't think of anything better
to achieve this with `sets.Set` than something along the lines of:

for dummy, randomElement in zip(range(randrange(len(s)+1)), s): pass
# possibly followed by
s.remove(randomElement)

Is there a better way? If not, how about something like Set.randompop()?

'as
Jul 18 '05 #1
12 8576
Alexander Schmolck wrote:
Is there a better way? If not, how about something like Set.randompop()?


In principle, random.choice "ought to" work. It doesn't, as it expects
the set to be indexable.

I would not like the Set type to grow random methods :-)

Regards,
Martin

Jul 18 '05 #2
Alexander Schmolck <a.********@gmx.net> writes:
Quite a few algortihms prominently feature choosing/removing a
single random element from a set at a time. On first sight I can't
think of anything better to achieve this with `sets.Set` than
something along the lines of:

for dummy, randomElement in zip(range(randrange(len(s)+1)), s): pass
# possibly followed by
s.remove(randomElement)

Is there a better way? If not, how about something like Set.randompop()?


The classic way to do it goes something like this:

for n, randomElement in enumerate(s):
if random() < (1.0 / (n+1)):
e = randomElement
# possibly followed by
s.remove(randomElement)

Note that you don't need to know the size of the set in advance. You
can use the same method for (e.g.) choosing a random line from a file,
without knowing how many lines the file has, and without having to
read the file twice or store it in memory.

Seeing why this works (unless I made an error) is left as an exercise :).
Jul 18 '05 #3
Paul Rubin <http://ph****@NOSPAM.invalid> writes:
The classic way to do it goes something like this:

for n, randomElement in enumerate(s):
if random() < (1.0 / (n+1)):
e = randomElement
# possibly followed by
s.remove(randomElement)


Oops, meant to say:

for n, e in enumerate(s):
if random() < (1.0 / (n+1)):
randomElement = e
# possibly followed by
s.remove(randomElement)
Jul 18 '05 #4
Paul Rubin wrote:
The classic way to do it goes something like this:

for n, randomElement in enumerate(s):
if random() < (1.0 / (n+1)):
e = randomElement
0 <= random() < 1
1.0/(n+1) == 1 for n == 0, so this will always choose the first element.
# possibly followed by
s.remove(randomElement)

Note that you don't need to know the size of the set in advance. You
can use the same method for (e.g.) choosing a random line from a file,
without knowing how many lines the file has, and without having to
read the file twice or store it in memory.

Seeing why this works (unless I made an error) is left as an exercise :).


Peter

Jul 18 '05 #5
Peter Otten <__*******@web.de> writes:
for n, randomElement in enumerate(s):
if random() < (1.0 / (n+1)):
e = randomElement


0 <= random() < 1
1.0/(n+1) == 1 for n == 0, so this will always choose the first element.


It will always select the first element, but then on some (random)
subsequent passes through the loop, it will replace the selection.
For example, suppose there are two elements. On the first pass, e
gets set to the first element. On the second pass, e gets set to the
second element if random() < 0.5, i.e. 50% of the time. So after the
loop is done, e is either the first or the second element, with equal
probability. With k>2 elements, it works the same way.
Jul 18 '05 #6
Paul Rubin wrote:
Peter Otten <__*******@web.de> writes:
> for n, randomElement in enumerate(s):
> if random() < (1.0 / (n+1)):
> e = randomElement
0 <= random() < 1
1.0/(n+1) == 1 for n == 0, so this will always choose the first element.


It will always select the first element, but then on some (random)
subsequent passes through the loop, it will replace the selection.
For example, suppose there are two elements. On the first pass, e
gets set to the first element. On the second pass, e gets set to the
second element if random() < 0.5, i.e. 50% of the time. So after the
loop is done, e is either the first or the second element, with equal
probability. With k>2 elements, it works the same way.


You are right. I saw the typo and "overcorrected" by adding a break
statement in my mind.
Note that you don't need to know the size of the set in advance. You


That remark in your first post helped the misunderstanding. I managed to
overread the _in_ _advance_ and couldn't figure out a satisfactory break
condition - which now it turns out doesn't exist.

But now I've understood it.

Peter

Jul 18 '05 #7

Paul Rubin <http://ph****@NOSPAM.invalid> writes:
Alexander Schmolck <a.********@gmx.net> writes:
for dummy, randomElement in zip(range(randrange(len(s)+1)), s): pass
# possibly followed by
s.remove(randomElement)

Is there a better way? If not, how about something like Set.randompop()?


The classic way to do it goes something like this:

for n, randomElement in enumerate(s):
if random() < (1.0 / (n+1)):
e = randomElement
# possibly followed by
s.remove(randomElement)

Note that you don't need to know the size of the set in advance.


Nice algorithm, but maybe *only* if you don't know the size in advance (have
constant time len) -- as far as I can see it does more than twice the work of
my original solution.

'as
Jul 18 '05 #8
"Martin v. Loewis" <ma****@v.loewis.de> writes:
Alexander Schmolck wrote:
Is there a better way? If not, how about something like Set.randompop()?
In principle, random.choice "ought to" work.
It doesn't, as it expects the set to be indexable.


Maybe not insensibly -- the fairly generic approach outlined above that will
work for any iterable with len is clearly not desirable as default (because of
its time complexity), and just using it as a fallback won't work either,
because AFACT there is no way for `choice` to find out whether its argument is
a sequence or not.
I would not like the Set type to grow random methods :-)


Well, one ought to do :)

'as
Jul 18 '05 #9
On Wed, Dec 31, 2003 at 02:40:17AM +0000, Alexander Schmolck wrote:
"Martin v. Loewis" <ma****@v.loewis.de> writes:
Alexander Schmolck wrote:
Is there a better way? If not, how about something like Set.randompop()?


In principle, random.choice "ought to" work.
It doesn't, as it expects the set to be indexable.


Maybe not insensibly -- the fairly generic approach outlined above that will
work for any iterable with len is clearly not desirable as default (because of
its time complexity), and just using it as a fallback won't work either,
because AFACT there is no way for `choice` to find out whether its argument is
a sequence or not.

def choice(self, seq):
"""Choose a random element from a non-empty sequence."""
try:
return seq[int(self.random() * len(seq))]
except TypeError:
# Fallback algorithm

Alternatively (not as nice, imho):

def choice(self, seq):
"""Choose a random element from a non-empty sequence."""
typeOrClass = getattr(seq, '__class__', type(seq))
if hasattr(typeOrClass, '__getitem__'):
return seq[int(self.random() * len(seq))]
else:
# Fallback algorithm

Jp

Jul 18 '05 #10
Jp Calderone <ex*****@intarweb.us> writes:
Alexander Schmolck wrote:

Maybe not insensibly -- the fairly generic approach outlined above that will
work for any iterable with len is clearly not desirable as default (because of
its time complexity), and just using it as a fallback won't work either,
because AFACT there is no way for `choice` to find out whether its argument is
a sequence or not.

def choice(self, seq):
"""Choose a random element from a non-empty sequence."""
try:
return seq[int(self.random() * len(seq))]
except TypeError:
# Fallback algorithm


choice(dict(1:"gets this", 2:"gets that", "three":"gets the key"))

'as
Jul 18 '05 #11
On Wed, Dec 31, 2003 at 10:37:13PM +0000, Alexander Schmolck wrote:
Jp Calderone <ex*****@intarweb.us> writes:
> Alexander Schmolck wrote:
Maybe not insensibly -- the fairly generic approach outlined above that will
work for any iterable with len is clearly not desirable as default (because of
its time complexity), and just using it as a fallback won't work either,
because AFACT there is no way for `choice` to find out whether its argument is
a sequence or not.

def choice(self, seq):
"""Choose a random element from a non-empty sequence."""
try:
return seq[int(self.random() * len(seq))]
except TypeError:
# Fallback algorithm


choice(dict(1:"gets this", 2:"gets that", "three":"gets the key"))


This is as broken in the current choice implementation as in the one I
proposed. I don't think it makes any difference to the question at hand.

Jp

Jul 18 '05 #12
Jp Calderone <ex*****@intarweb.us> writes:
choice(dict(1:"gets this", 2:"gets that", "three":"gets the key"))


This is as broken in the current choice implementation as in the one I
proposed. I don't think it makes any difference to the question at hand.


The current implementation will always return values and complain if it
unsuccesfully tries a value between 0 and len(dict). I'd argue this is very
much preferable to always "succeeding" and sometimes returning a key and
sometimes returning a value.

'as

Jul 18 '05 #13

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

21
by: Andreas Lobinger | last post by:
Aloha, i wanted to ask another problem, but as i started to build an example... How to generate (memory and time)-efficient a string containing random characters? I have never worked with...
2
by: VB Programmer | last post by:
I am interested to hear your suggestions on this... I have a table full of survey questions. The questions are individually classified as priority 1, 2 or 3. (Priority 1 means the question...
3
by: larry mckay | last post by:
anyone have the code to select and listview item or row (subitems) after a doubleclick event from a listview. *** Sent via Developersdex http://www.developersdex.com *** Don't just participate...
8
by: Kari Lavikka | last post by:
Hi! I have to select a random row from a table where primary key isn't continuous (some rows have been deleted). Postgres just seems to do something strange with my method. -- -- Use the...
8
by: kp87 | last post by:
I am a little bit stuck .... I want to play a bunch of soundfiles randomly, but i want to give each soundfile a rating (say 0-100) and have the likelihood that the file be chosen be tied to its...
12
by: isaac2004 | last post by:
hi how would you go changing a random number generated into a item drawn from a database with SQL
2
by: Brendon Towle | last post by:
I need to simulate scenarios like the following: "You have a deck of 3 orange cards, 5 yellow cards, and 2 blue cards. You draw a card, replace it, and repeat N times." So, I wrote the...
4
by: darrel | last post by:
I have a DDL list along these lines: item value="1" text="a" item value="2" text="b" item value="3" text="c" item value="2" text="d" item value="2" text="e" item value="1" text="f" item...
15
by: caca | last post by:
Hello, This is a question for the best method (in terms of performance only) to choose a random element from a list among those that satisfy a certain property. This is the setting: I need to...
0
by: ryjfgjl | last post by:
ExcelToDatabase: batch import excel into database automatically...
0
isladogs
by: isladogs | last post by:
The next Access Europe meeting will be on Wednesday 6 Mar 2024 starting at 18:00 UK time (6PM UTC) and finishing at about 19:15 (7.15PM). In this month's session, we are pleased to welcome back...
1
isladogs
by: isladogs | last post by:
The next Access Europe meeting will be on Wednesday 6 Mar 2024 starting at 18:00 UK time (6PM UTC) and finishing at about 19:15 (7.15PM). In this month's session, we are pleased to welcome back...
0
by: jfyes | last post by:
As a hardware engineer, after seeing that CEIWEI recently released a new tool for Modbus RTU Over TCP/UDP filtering and monitoring, I actively went to its official website to take a look. It turned...
0
by: ArrayDB | last post by:
The error message I've encountered is; ERROR:root:Error generating model response: exception: access violation writing 0x0000000000005140, which seems to be indicative of an access violation...
1
by: PapaRatzi | last post by:
Hello, I am teaching myself MS Access forms design and Visual Basic. I've created a table to capture a list of Top 30 singles and forms to capture new entries. The final step is a form (unbound)...
1
by: Defcon1945 | last post by:
I'm trying to learn Python using Pycharm but import shutil doesn't work
0
by: Faith0G | last post by:
I am starting a new it consulting business and it's been a while since I setup a new website. Is wordpress still the best web based software for hosting a 5 page website? The webpages will be...
0
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 3 Apr 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome former...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.