473,789 Members | 3,067 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

Does shuffle() produce uniform result ?

Hi,

I have read the source code of the built-in random module, random.py.
After also reading Wiki article on Knuth Shuffle algorithm, I wonder if
the shuffle method implemented in random.py produces results with modulo
bias.

The reasoning is as follows: Because the method random() only produces
finitely many possible results, we get modulo bias when the number of
possible results is not divisible by the size of the shuffled list.

1. Does shuffle() produce uniform result ?

2. If not, is there a fast and uniform shuffle() available somewhere ?

Thanks !

-tooru honda
Aug 24 '07 #1
25 2650
tooru honda wrote:
Hi,

I have read the source code of the built-in random module, random.py.
After also reading Wiki article on Knuth Shuffle algorithm, I wonder if
the shuffle method implemented in random.py produces results with modulo
bias.

The reasoning is as follows: Because the method random() only produces
finitely many possible results, we get modulo bias when the number of
possible results is not divisible by the size of the shuffled list.

1. Does shuffle() produce uniform result ?
Given the cycle length of the Mersenne twister algorithm that generates
the underlying random numbers, it would have to be a pretty long list to
see a significant bias, don't you think? Have you done any calculations
on the length of list you propose to use?
2. If not, is there a fast and uniform shuffle() available somewhere ?
Frankly I don't think you need to worry.

regards
Steve
--
Steve Holden +1 571 484 6266 +1 800 494 3119
Holden Web LLC/Ltd http://www.holdenweb.com
Skype: holdenweb http://del.icio.us/steve.holden
--------------- Asciimercial ------------------
Get on the web: Blog, lens and tag the Internet
Many services currently offer free registration
----------- Thank You for Reading -------------

Aug 24 '07 #2
tooru honda <to*********@fa st-mail.orgwrites:
I have read the source code of the built-in random module,
random.py. After also reading Wiki article on Knuth Shuffle
algorithm, I wonder if the shuffle method implemented in random.py
produces results with modulo bias.
It doesn't have modulo bias because it doesn't use modulo to produce a
random index; it multiplies the floating point value with the desired
range. I'm not sure if that method produces any measurable bias.
Aug 24 '07 #3
On Aug 24, 8:54 am, Hrvoje Niksic <hnik...@xemacs .orgwrote:
tooru honda <tooru_ho...@fa st-mail.orgwrites:
I have read the source code of the built-in random module,
random.py. After also reading Wiki article on Knuth Shuffle
algorithm, I wonder if the shuffle method implemented in random.py
produces results with modulo bias.

It doesn't have modulo bias because it doesn't use modulo to produce a
random index; it multiplies the floating point value with the desired
range. I'm not sure if that method produces any measurable bias.
It produces exactly the same level of bias as the 'modulo bias'
obtained by reducing a random integer in [0, 2**53). For example,
suppose we're trying to produce an integer x in the range 0 through 6
inclusive. If n is a random variable whose values are uniformly
distributed across range(0, 2**53) then:

x = n % 7

will produce 0, 1, 2 and 3 with probability (2**53//7+1)/2**53, and 4,
5 and 6 with probability (2**53//7)/2**53, while

x = floor((n/2**53)*7)

will produce 0, 1, 3 and 5 with probability (2**53//7+1)/2**53, and 2,
4 and 6 with probability (2**53//7)/2*53.

Either way, you'd have a very hard time detecting such a tiny bias.
At the other end of the scale, if you're trying to produce a value in
[0, 2**53-2] (for example) then it looks worse: with either method,
one of the values occurs exactly twice as often as all of the others.
But since there are now so many values, you'd again have problems
detecting any bias.

Steven Holden wrote:
Frankly I don't think you need to worry.
What he said.

Mark

Aug 25 '07 #4
tooru honda <to*********@fa st-mail.orgwrites:
The reasoning is as follows: Because the method random() only produces
finitely many possible results, we get modulo bias when the number of
possible results is not divisible by the size of the shuffled list.

1. Does shuffle() produce uniform result ?
The nonuniformity is too small to matter. But what is the
application? If you are doing something like implementing online
poker for real money, you shouldn't use the built-in RNG. It is not
designed for what we call adversarial indistinguishab ility from true
randomness. Instead, use the random byte stream available from
os.urandom() and derive your random numbers from that.
Aug 25 '07 #5
On Aug 24, 9:30 pm, Mark Dickinson <dicki...@gmail .comwrote:
x = floor((n/2**53)*7)

will produce 0, 1, 3 and 5 with probability (2**53//7+1)/2**53, and 2,
4 and 6 with probability (2**53//7)/2*53.
Oops---I lied; I forgot to take into account the rounding implicit in
the (n/2**53)*7 multiplication. A bit of experimentation shows that
it's 0, 2, 4 and 6 that occur more often, with 1, 3 and 5 less likely
by a miniscule amount (at least on an IEEE-754 system).

Mark

Aug 25 '07 #6
Hi, First of all, my thanks to all of you who replied.

I am writing a gamble simulation to convince my friend that his "winning
strategy" doesn't work. I use shuffle method from a random.SystemRa ndom
instance to shuffle 8 decks of cards.

As the number of cards is quite small (number of cards is 416), the
nonuniformity doesn't matter as most of you have already said. Just to
avoid argument from my friend, I am considering writing my own randint
and shuffle methods based on os.urandom() though.

-tooru honda
Aug 25 '07 #7
At the end, I think it is worthwhile to implement my own shuffle and
random methods based on os.urandom. Not only does the resulting code
gets rid of the minuscule bias, but the program also runs much faster.

When using random.SystemRa ndom.shuffle, posix.open and posix.close from
calling os.urandom account for almost half of the total execution time
for my program. By implementing my own random and getting a much larger
chunk of random bytes from os.urandom each time, I am able to reduce the
total execution time by half.

-tooru honda

P.S. I use python 2.5.1 on MacOSX 10.4.10 (PowerPC).
Aug 25 '07 #8
tooru honda <to*********@fa st-mail.orgwrote:
At the end, I think it is worthwhile to implement my own shuffle and
random methods based on os.urandom. Not only does the resulting code
gets rid of the minuscule bias, but the program also runs much faster.

When using random.SystemRa ndom.shuffle, posix.open and posix.close from
calling os.urandom account for almost half of the total execution time
for my program. By implementing my own random and getting a much larger
chunk of random bytes from os.urandom each time, I am able to reduce the
total execution time by half.
If I were in your shoes, I would optimize by subclassing
random.SystemRa ndom and overriding the random method to use os.urandom
with some large block size and then parcel it out, instead of the
_urandom(7) that it now uses. E.g., something like:

class SystemBlockRand om(random.Syste mRandom):

def __init__(self):
random.SystemRa ndom.__init__(s elf)
def rand7():
while True:
randata = os.urandom(7*10 24)
for i in xrange(0, 7*1024, 7):
yield long(binascii.h exlify(randata[i:i+7]),16)
self.rand7 = rand7().next

def random(self):
"""Get the next random number in the range [0.0, 1.0)."""
return (self.rand7() >3) * random.RECIP_BP F

(untested code). No need to reimplement anything else, it seems to me.
Alex
Aug 25 '07 #9
By incorporating Alex's code, I got another performance boost of 20%.
It is mostly due to Alex's more efficient implementation of block random
than my own version.

-tooru honda

Below is the code I have now:
from binascii import hexlify
from os import urandom

class rcRandomC(rando m.SystemRandom) :

def __init__(self):
random.SystemRa ndom.__init__(s elf)

def rand2():

while True:
randata = urandom(2*1024)
for i in xrange(0, 2*1024, 2):
yield int(hexlify(ran data[i:i+2]),16) # integer
in [0,65535]

self.rand2_M = rand2().next
# modified from random._randbel ow
def randrange(self, startN,stopN):

"""Choose a random integer from range(startN, stopN).
widthN<=65536

"""

widthN=stopN-startN
left_over_N=655 36%widthN
upper_bound_N= 65535-left_over_N

random_number=s elf.rand2_M()

while random_number>u pper_bound_N:

random_number=s elf.rand2_M()

r = random_number%w idthN

return startN+r

def shuffle(self, x):
"""x, random=random.r andom -shuffle list x in place; return
None.

"""

randrange=self. randrange

for i in reversed(xrange (1, len(x))):
# pick an element in x[:i+1] with which to exchange x[i]
j = randrange(0,i+1 )
x[i], x[j] = x[j], x[i]
Aug 26 '07 #10

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

23
12957
by: JC | last post by:
I am very new to programming and learning on my own. Why do I keep getting duplicate values using this code? I want to shuffle a deck of 52 cards. The logic seems right to me. Randomize For C = 0 To 1000 C1 = Cards(Int(Rnd * 52)) ' returns a number from 0 to 51
24
7304
by: Joerg Schuster | last post by:
Hello, I am looking for a method to "shuffle" the lines of a large file. I have a corpus of sorted and "uniqed" English sentences that has been produced with (1): (1) sort corpus | uniq > corpus.uniq corpus.uniq is 80G large. The fact that every sentence appears only
3
1698
by: gaulle | last post by:
i'm a newbie in c. now i 'm learning it.i read a code,and compile it in dev-c++4.9.9.0.it can be compiled and pass.but there are some warnings.while in TC2.0,no any warning.it's why? #include <stdio.h> #include <time.h> #include <stdlib.h> void shuffle(int);
4
2211
by: Curious | last post by:
Hi, I am writing a class Distribution that inherits from Random class public class Distribution : System.Random { public int Uniform(int min, int max) { return this.Uniform(min, max); }
38
2449
by: JTL | last post by:
I have learnt java before and now begin to learn c++ what puzzle me is that seem that different SDKs(c++builder, vs.net, gcc..) has its own class library we know that in java there are only one uniform class library so that the codes I written in JBuilder can just copy to JCreator or Eclipse and they all run well does c++ have such an uniform class library so that I could just learn
8
25482
by: kiranchahar | last post by:
Hey all, How do I generate random numbers with Uniform distribution Uniform(a,b) using C-programming? I want to generate uniform random numbers which have mean following Uniform(p,q) and also variance as Uniform(s,t)? any suggestion would be really appreciated. Thanks, Kay
5
1523
by: Tobi Hammert | last post by:
i just want to show up a random pic. <? $files = glob("./gfx/zufall/*"); shuffle ($files); if (count($files) < 1) $count = count($files); else $count = 1; for ($i = 0; $i < $count; $i ++)
9
5356
by: Tuxedo | last post by:
I'd like to reorganize the third, fourth, fifth and sixth, as well as any elements thereafter in an array in random order: var a = new Array('first','second','third','fourth','fifth','sixth','etc') In other words, the first, second and third element should remain in position 0, 1 and 2, while the fourth, fifth and sixth, etc. should appear in random order. Can anyone recommend a method to do this?
11
12977
by: whodgson | last post by:
I`m trying to write a shuffle function to shuffle a sorted array with an even number of elements but it won`t print past the first 2 elements as shown below. When i print the temp array it only prints the first 2 elements also. So i presume the loop is is stopping after 2 iterations. Can anyone see my error?. #include<iostream> using namespace std; void print(int a,int n); void shuffle(int a ); int main() {
0
9665
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However, people are often confused as to whether an ONU can Work As a Router. In this blog post, we’ll explore What is ONU, What Is Router, ONU & Router’s main usage, and What is the difference between ONU and Router. Let’s take a closer look ! Part I. Meaning of...
0
9511
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can effortlessly switch the default language on Windows 10 without reinstalling. I'll walk you through it. First, let's disable language synchronization. With a Microsoft account, language settings sync across devices. To prevent any complications,...
0
10199
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that captivates audiences and drives business growth. The Art of Business Website Design Your website is...
1
10139
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows Update option using the Control Panel or Settings app; it automatically checks for updates and installs any it finds, whether you like it or not. For most users, this new feature is actually very convenient. If you want to control the update process,...
0
9983
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each protocol has its own unique characteristics and advantages, but as a user who is planning to build a smart home system, I am a bit confused by the choice of these technologies. I'm particularly interested in Zigbee because I've heard it does some...
0
5417
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols. I succeeded, with both firewalls in the same network. But I'm wondering if it's possible to do the same thing, with 2 Pfsense firewalls...
0
5551
by: adsilva | last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
1
4092
by: 6302768590 | last post by:
Hai team i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated we have to send another system
2
3697
muto222
by: muto222 | last post by:
How can i add a mobile payment intergratation into php mysql website.

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.