I need to simulate scenarios like the following: "You have a deck of
3 orange cards, 5 yellow cards, and 2 blue cards. You draw a card,
replace it, and repeat N times."
So, I wrote the following code, which works, but it seems quite slow
to me. Can anyone point out some obvious thing that I'm doing
inefficiently? Or, is this more or less as good as it gets?
For reference, my typical numbers look like this:
2 <= len(population) <= 7
4 <= len(mapping) <= 50
10 <= count <= 1000000
B.
<code>
#!/usr/bin/env python
import random
def randomDrawing(count, population):
"""Simulates drawing <countitems from <population>, with
replacement.
population is a list of lists: [[count1, type1], [count2,
type2], ...]
Typical examples:
>>>randomDrawing(100, [[3, 'orange'], [5, 'yellow'], [2, 'blue']])
[[28, 'orange'], [57, 'yellow'], [15, 'blue']]
>>>randomDrawing(100000, [[3, 'orange'], [5, 'yellow'], [2,
'blue']])
[[29923, 'orange'], [50208, 'yellow'], [19869, 'blue']]
"""
res = [[0, item[1]] for item in population]
mapping = []
for i in xrange(len(population)):
mapping.extend([i]*population[i][0])
for i in xrange(count):
index = random.choice(mapping)
res[index][0] += 1
return res
</code>

Brendon Towle, PhD
Cognitive Scientist
+14126902442x127
Carnegie Learning, Inc.
The Cognitive Tutor Company ®
Helping over 375,000 students in 1000 school districts succeed in math. 2 2028
Brendon Towle wrote:
I need to simulate scenarios like the following: "You have a deck of 3
orange cards, 5 yellow cards, and 2 blue cards. You draw a card, replace
it, and repeat N times."
So, I wrote the following code, which works, but it seems quite slow to
me. Can anyone point out some obvious thing that I'm doing
inefficiently? Or, is this more or less as good as it gets?
For reference, my typical numbers look like this:
2 <= len(population) <= 7
4 <= len(mapping) <= 50
10 <= count <= 1000000
B.
<code>
#!/usr/bin/env python
import random
def randomDrawing(count, population):
"""Simulates drawing <countitems from <population>, with replacement.
population is a list of lists: [[count1, type1], [count2, type2], ...]
Typical examples:
>>>randomDrawing(100, [[3, 'orange'], [5, 'yellow'], [2, 'blue']])
[[28, 'orange'], [57, 'yellow'], [15, 'blue']]
>>>randomDrawing(100000, [[3, 'orange'], [5, 'yellow'], [2, 'blue']])
[[29923, 'orange'], [50208, 'yellow'], [19869, 'blue']]
"""
res = [[0, item[1]] for item in population]
mapping = []
for i in xrange(len(population)):
mapping.extend([i]*population[i][0])
for i in xrange(count):
index = random.choice(mapping)
res[index][0] += 1
return res
</code>
Brendon Towle, PhD
Cognitive Scientist
+14126902442x127
Carnegie Learning, Inc.
The Cognitive Tutor Company ®
Helping over 375,000 students in 1000 school districts succeed in math.
Given the data structure, might it not be faster to generate binomial
variates for n1 types, and set nth count = #draws  sum(other
outcomes)? And for a "large" count, could you get by with a normal
approximation? If you *do* feel the need for exact binomial, there are
short, somewhat sluggish routines in Devroye (1986  on the web!, pg
525), and Random_binomial1, compiled by Alan Miller(AMrandom.zip0,
available, xlated, at http://www.esbconsult.com/. Even though they are
relatively slow, generating a few of these would seem to be a lot faster
than your current approach.
Sorry I can't help more  I'm just starting to learn Python, have yet to
even get IDLE going.
Regards,
Dave Braden
Brendon Towle wrote:
I need to simulate scenarios like the following: "You have a deck of
3 orange cards, 5 yellow cards, and 2 blue cards. You draw a card,
replace it, and repeat N times."
So, I wrote the following code, which works, but it seems quite slow
to me. Can anyone point out some obvious thing that I'm doing
inefficiently? Or, is this more or less as good as it gets?
For reference, my typical numbers look like this:
2 <= len(population) <= 7
4 <= len(mapping) <= 50
10 <= count <= 1000000
B.
<code>
#!/usr/bin/env python
import random
def randomDrawing(count, population):
"""Simulates drawing <countitems from <population>, with
replacement.
population is a list of lists: [[count1, type1], [count2,
type2], ...]
Typical examples:
>>>randomDrawing(100, [[3, 'orange'], [5, 'yellow'], [2, 'blue']])
[[28, 'orange'], [57, 'yellow'], [15, 'blue']]
>>>randomDrawing(100000, [[3, 'orange'], [5, 'yellow'], [2,
'blue']])
[[29923, 'orange'], [50208, 'yellow'], [19869, 'blue']]
"""
res = [[0, item[1]] for item in population]
mapping = []
for i in xrange(len(population)):
mapping.extend([i]*population[i][0])
for i in xrange(count):
index = random.choice(mapping)
res[index][0] += 1
return res
</code>

Brendon Towle, PhD
Cognitive Scientist
+14126902442x127
Carnegie Learning, Inc.
The Cognitive Tutor Company ®
Helping over 375,000 students in 1000 school districts succeed in math.
I got nearly a 2x speed up with this variant:
def randomDrawing3(count, population):
res = [[0, item[1]] for item in population]
mapping = []
for i in xrange(len(population)):
mapping.extend([i]*population[i][0])
n = len(mapping)
for i in xrange(count):
index = int(n * random.random())
res[mapping[index]][0] += 1
return res
Apparently "int(n * random.random())" is faster than random.choice()
or random.randint()
sforman@garbage:~ $ python mtimeit s'import random; n=10' 'int(n *
random.random())'
100000 loops, best of 3: 3.22 usec per loop
sforman@garbage:~ $ python mtimeit s'import random; n=10'
'random.randint(1, n)'
100000 loops, best of 3: 13 usec per loop
sforman@garbage:~ $ python mtimeit s'import random; n=range(10)'
'random.choice(n)'
100000 loops, best of 3: 6.07 usec per loop
(Note that the first and last examples produce values 0..9 while the
middle one produces 1..10)
I don't know for sure, but I think the random, uh, spread or whatever
will be the same for random() as for choice(). If it's important, you
should verify that. ;)
Peace,
~Simon This discussion thread is closed Replies have been disabled for this discussion. Similar topics
reply
views
Thread by Constandinos Mavromoustakis 
last post: by

28 posts
views
Thread by Paul Rubin 
last post: by

reply
views
Thread by Constandinos Mavromoustakis 
last post: by

reply
views
Thread by Gus 
last post: by

reply
views
Thread by Karatza Helen 
last post: by

21 posts
views
Thread by Marc Dansereau 
last post: by

4 posts
views
Thread by tshad 
last post: by

6 posts
views
Thread by Intiha 
last post: by

6 posts
views
Thread by Mike Langworthy 
last post: by
          