473,396 Members | 1,816 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,396 software developers and data experts.

randomly generate n of each of two types

I need access to 2*n random choices for two types
subject to a constraint that in the end I have
drawn n of each. I first tried::

def random_types(n,typelist=[True,False]):
types = typelist*n
random.shuffle(types)
for next_type in types:
yield next_type

This works but has some obvious costs.
To avoid those I considered this instead::

def random_types(n,typelist=[True,False]):
type0, type1 = typelist
ct0, ct1 = 0,0
while ct0+ct1<2*n:
if random.random() < ((n-ct0)/(2*n-ct0-ct1)):
next_type = type0
ct0 += 1
else:
next_type = type1
ct1 += 1
yield next_type

Does this seem a good way to go? Comments welcome.

Thank you,
Alan Isaac
Feb 11 '07 #1
9 1669
"Alan Isaac" <ai****@american.eduwrites:
I need access to 2*n random choices for two types
subject to a constraint that in the end I have
drawn n of each. I first tried::
You mean you basically want to generate 2*n bools of which exactly
half are True and half are False? Hmm (untested):

from random import random
def random_types(n):
total = 2*n
trues_needed = n
for i in xrange(total):
if random() < float(trues_needed) / float(total):
trues_needed -= 1
yield True
else:
yield False
total -= 1
Feb 11 '07 #2
Alan Isaac schrieb:
I need access to 2*n random choices for two types
subject to a constraint that in the end I have
drawn n of each. I first tried::

def random_types(n,typelist=[True,False]):
types = typelist*n
random.shuffle(types)
for next_type in types:
yield next_type

This works but has some obvious costs.
To avoid those I considered this instead::

def random_types(n,typelist=[True,False]):
type0, type1 = typelist
ct0, ct1 = 0,0
while ct0+ct1<2*n:
if random.random() < ((n-ct0)/(2*n-ct0-ct1)):
next_type = type0
ct0 += 1
else:
next_type = type1
ct1 += 1
yield next_type

Does this seem a good way to go? Comments welcome.
I think there's any easier way using random's function `shuffle`.
>>from random import shuffle
def randomly(n, types):
.... types *= n
.... shuffle(types)
.... for x in types: yield x
....
>>randomly(1, [True, False])
<generator object at 0x00A62AA8>
>>list(randomly(1, [True, False]))
[True, False]
>>list(randomly(2, [True, False]))
[True, False, True, False]
>>list(randomly(4, [True, False]))
[False, True, False, False, False, True, True, True]

You can even do this with an arbitrary number of choices:
>>list(randomly(2, [True, False, None, NotImplemented]))
[NotImplemented, None, NotImplemented, False, False, None, True, True]

Notice that this method works in-place, ie
>>a = [True, False]
list(randomly(2, a))
[False, False, True, True]
>>a # a changed!
[False, False, True, True]

This can be changed by creating a copy, though.

HTH,
Stargaming
Feb 11 '07 #3
"Stargaming" <st********@gmail.comwrote in message
news:eq***********@ulysses.news.tiscali.de...
... types *= n
... shuffle(types)
This again has the "costs" I referred to:
creating a potentially large sequence,
and shuffling it. (Additionally, shuffle
cannot achieve many of the possible
shuffles of a large list, but I doubt this
is easily evaded.)

So as far as I can tell, this is the same
approach as the original (first) solution.

Cheers,
Alan Isaac
Feb 12 '07 #4
On Mon, 12 Feb 2007 00:57:35 +0000, Alan Isaac wrote:
"Stargaming" <st********@gmail.comwrote in message
news:eq***********@ulysses.news.tiscali.de...
>... types *= n
... shuffle(types)

This again has the "costs" I referred to:
creating a potentially large sequence,
and shuffling it. (Additionally, shuffle
cannot achieve many of the possible
shuffles of a large list, but I doubt this
is easily evaded.)
Whether that second issue is a problem or not really depends on what you
intend doing with the random objects. But notice that Python uses the
Mersenne Twister PRNG, which has a period of 2**19937-1. That's *roughly*
equivalent to the number of permutations of a list with 2080 items.

If you really care about shuffling large lists, you can chain PRNGs. For
instance, you might use shuffle() to generate a random batch of items,
then use a completely different PRNG to shuffle the list again.

As for the first issue:

def random_values(n, valuelist=[True, False]):
for _ in range(n):
values = valuelist[:]
random.shuffle(values)
for item in types:
yield item

Notice that the default objects in the list aren't types, so the name
random_types is inappropriate. The user can pass any objects they like,
not just a list of types like [int, str].

Note also that this code isn't limited to lists of just two objects. You
can pass as many or as few as you like.

It also doesn't build a large list, but any regularities in the shuffle
algorithm will show up in here. Unless you're doing cryptography, that
probably doesn't matter.

If you want to avoid shuffle, here's an alternative:

def random_values(n, valuelist=[True, False]):
N = len(valuelist)
for _ in range(N*n):
yield valuelist[random.randrange(0, N)]


--
Steven D'Aprano

Feb 12 '07 #5
Steven D'Aprano <st***@REMOVEME.cybersource.com.auwrites:
If you want to avoid shuffle, here's an alternative:

def random_values(n, valuelist=[True, False]):
N = len(valuelist)
for _ in range(N*n):
yield valuelist[random.randrange(0, N)]
That is not guaranteed to yield exactly equal numbers of true and false.
Feb 12 '07 #6
On Sun, 11 Feb 2007 22:20:24 -0800, Paul Rubin wrote:
Steven D'Aprano <st***@REMOVEME.cybersource.com.auwrites:
>If you want to avoid shuffle, here's an alternative:

def random_values(n, valuelist=[True, False]):
N = len(valuelist)
for _ in range(N*n):
yield valuelist[random.randrange(0, N)]

That is not guaranteed to yield exactly equal numbers of true and false.
Neither is any random number generator. That's why because it's *random*.

Ah, I see what you mean... you're reminding me that the Original Poster
seems to want a biased set of almost-but-not-quite-randomly chosen
values, so that random_values(1) must return one each of True and False
and never True, True or False, False.

You're right, that's how the O.P. implicitly specified the problem. But it
may not be what he wants.

--
Steven D'Aprano

Feb 12 '07 #7
Steven D'Aprano <st***@REMOVEME.cybersource.com.auwrites:
Ah, I see what you mean... you're reminding me that the Original Poster
seems to want a biased set of almost-but-not-quite-randomly chosen
values, so that random_values(1) must return one each of True and False
and never True, True or False, False.
I thought that was what was specified, "generate n of each of two
types". So if n=10 and the two types are true and false, generate 10
trues and 10 falses. random.shuffle is the most direct way to do it
but I gave another approach that doesn't use auxiliary space except
for a few variables.
Feb 12 '07 #8
>
This again has the "costs" I referred to:
creating a potentially large sequence,
and shuffling it.
I thought I would see if I could do better, so I wrote this:

<code>
import random

def index_types(n, typelist=[True, False]):
numtypes = len(typelist)
total = float(n*numtypes)
counts = [0] * numtypes
while total 0:
index = int(random.random() * numtypes)
if counts[index] < n:
counts[index] += 1
total -= 1
yield typelist[index]

def shuffle_types(n,typelist=[True,False]):
types = typelist*n
random.shuffle(types)
for next_type in types:
yield next_type

def random_types(n,typelist=[True,False]):
type0, type1 = typelist
ct0, ct1 = 0,0
while ct0+ct1<2*n:
if random.random() < ((n-ct0)/(2*n-ct0-ct1)):
next_type = type0
ct0 += 1
else:
next_type = type1
ct1 += 1
yield next_type

def test_index(n):
for x in index_types(n): pass

def test_shuffle(n):
for x in shuffle_types(n): pass

def test_random(n):
for x in shuffle_types(n): pass

if __name__ == '__main__':
from time import sleep
sleep(10)
from timeit import Timer
for function in ['test_index', 'test_shuffle', 'test_random']:
for nvalue in [1000,10000,100000,1000000]:
t = Timer("%s(%d)" % (function, nvalue), "from __main__
import %s" % function)
print function, 'n =', nvalue, ':', t.timeit(number=100)

</code>

Which yielded the following results:

pete@pete-desktop:~$ python test.py
test_index n = 1000 : 0.599834918976
test_index n = 10000 : 5.78634595871
test_index n = 100000 : 56.1441719532
test_index n = 1000000 : 562.577429056
test_shuffle n = 1000 : 0.420483827591
test_shuffle n = 10000 : 4.62663197517
test_shuffle n = 100000 : 46.3557109833
test_shuffle n = 1000000 : 498.563408852
test_random n = 1000 : 0.440869092941
test_random n = 10000 : 4.77765703201
test_random n = 100000 : 47.6845810413
test_random n = 1000000 : 485.233494043

Which shows my function is the slowest (doh!) and your second function
is the fastest. However, shuffle isn't taking much longer to create
and randomize a fairly large list (4.99 secs vs 4.85 secs for your
second function.) In my defense, I wanted to see if I could write the
function to take an arbitrarily long list, rather than limiting it to
2 items.

I also noticed your second function is not working as intended:
>>r = [x for x in test.random_types(10)]
r
[False, False, False, False, False, False, False, False, False, False,
True, True, True, True, True, True, True, True, True, True]

I think it needs a cast to a float:

<code>
if random.random() < (float(n-ct0)/(2*n-ct0-ct1)):
</code>

I was too impatient to wait for the n=1000000 results, so here it is
without them:

pete@pete-desktop:~$ python test.py
test_index n = 1000 : 0.583498001099
test_index n = 10000 : 5.80185317993
test_index n = 100000 : 58.8963599205
test_shuffle n = 1000 : 0.431984901428
test_shuffle n = 10000 : 4.75261592865
test_shuffle n = 100000 : 48.1326880455
test_random n = 1000 : 0.697184085846
test_random n = 10000 : 4.41986989975
test_random n = 100000 : 45.7090520859

Once again, the difference is negligible. My question is, does this
function reduce the "randomness" of the data? Doesn't it move the data
towards an alternating pattern of True, False? It seems to me that
calling shuffle, or perhaps chained PRNG as suggested by Steven
If you really care about shuffling large lists, you can chain PRNGs. For
instance, you might use shuffle() to generate a random batch of items,
then use a completely different PRNG to shuffle the list again.
would produce the most "random" data...

Pete

Feb 12 '07 #9
<pe********@gmail.comwrote in message
news:11**********************@s48g2000cws.googlegr oups.com...
>r = [x for x in test.random_types(10)]
r
[False, False, False, False, False, False, False, False, False, False,
True, True, True, True, True, True, True, True, True, True]

I think it needs a cast to a float:
Mea culpa. I **always** import division from __future__
and tend to assume everyone else does too.

Thanks,
Alan
Feb 12 '07 #10

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

4
by: James | last post by:
Just learning C#. What's the easiest way to assign numbers 1-10 randomly to an array? Basically I just want to take the numbers 1-10 and arrange them randomly in slots 0-9 of an array. Thanks
5
by: Raghuram | last post by:
I want to generate some set of chars which are more than 4 chars randomly. . .. how can i do that
8
by: Bill Rust | last post by:
I've created an "Add Item" wizard for VB.NET 2003 that allows a user to add a specialized class that works with my application framework. In the wizard, the user can select the interfaces they...
0
by: Richard Gregory | last post by:
Hi, I have the wsdl below, for an Axis web service, and when I select Add Web Refernce in Visual Studio the proxy is missing a class representing the returnedElementsType (see reference.cs below...
1
by: antony | last post by:
It run w/o error but no image appears. Please help me. Here si the code I do " <html> <head> <meta http-equiv="Content-Type" content="text/html; charset=windows-1252"> <title>New Page...
1
by: ad | last post by:
From the past posted, I know char c = (char)rnd.Next('a', 'z' + 1); can generate a alpha randomly. But now I want to generate a alpha or a digit randomly. ('0'..'9' or 'a'..'z') How can I...
5
by: John | last post by:
How can I fill an array randomly so it contains a certian range of numbers (1 - 100 for example) ? My Goal is to generate a set of numbers in random order.
2
by: tamaker | last post by:
Is this do-able with ASP / VBscript? -- I have a database with user records (name, photo, etc. etc.) I want to use asp to generate (on the homepage) a series of 4 randomly selected 'user...
2
by: programmerx101 | last post by:
Ok, I'm looking for expert advice on this one. I have a database which keeps going into read_only mode. Sometimes it goes into read_only / single user mode. Once it was taken offline completely....
0
by: ryjfgjl | last post by:
In our work, we often receive Excel tables with data in the same format. If we want to analyze these data, it can be difficult to analyze them because the data is spread across multiple Excel files...
0
by: emmanuelkatto | last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
0
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
0
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...
0
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...
0
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows...
0
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each...
0
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing,...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.