473,803 Members | 3,518 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

Does shuffle() produce uniform result ?

Hi,

I have read the source code of the built-in random module, random.py.
After also reading Wiki article on Knuth Shuffle algorithm, I wonder if
the shuffle method implemented in random.py produces results with modulo
bias.

The reasoning is as follows: Because the method random() only produces
finitely many possible results, we get modulo bias when the number of
possible results is not divisible by the size of the shuffled list.

1. Does shuffle() produce uniform result ?

2. If not, is there a fast and uniform shuffle() available somewhere ?

Thanks !

-tooru honda
Aug 24 '07
25 2656
tooru honda <to*********@fa st-mail.orgwrote:
...
def rand2():
while True:
randata = urandom(2*1024)
for i in xrange(0, 2*1024, 2):
yield int(hexlify(ran data[i:i+2]),16) # integer
in [0,65535]
another equivalent possibility, which might probably be faster:

import array
...
def rand2():
while True:
x = array.array("H" )
x.fromstring(ur andom(2*4000))
for i in x: yield i
Alex
Aug 26 '07 #11
On 2007-08-26, tooru honda <to*********@fa st-mail.orgwrote:
By incorporating Alex's code, I got another performance boost of 20%.
It is mostly due to Alex's more efficient implementation of block random
than my own version.
If I understand correctly that you are using urandom as a random
generator I wouldn't trust too much on this performance. Urandom
uses the systemwide entropy-pool. If other programs need this pool
too, your performance can drop spectaculary. If you are using a linux
machine just try to execute "od -x /dev/random" at the same time
as your program.

--
Antoon Pardon
Sep 3 '07 #12
Antoon Pardon <ap*****@forel. vub.ac.bewrites :
If I understand correctly that you are using urandom as a random
generator I wouldn't trust too much on this performance. Urandom
uses the systemwide entropy-pool. If other programs need this pool
too, your performance can drop spectaculary.
No the idea is that once there's enough entropy in the pool to make
one encryption key (say 128 bits), the output of /dev/urandom is
computationally indistinguishab le from random output no matter how
much data you read from it.
Sep 3 '07 #13
On 2007-09-03, Paul Rubin <httpwrote:
Antoon Pardon <ap*****@forel. vub.ac.bewrites :
>If I understand correctly that you are using urandom as a random
generator I wouldn't trust too much on this performance. Urandom
uses the systemwide entropy-pool. If other programs need this pool
too, your performance can drop spectaculary.

No the idea is that once there's enough entropy in the pool to make
one encryption key (say 128 bits), the output of /dev/urandom is
computationally indistinguishab le from random output no matter how
much data you read from it.
If you were talking about /dev/random I would agree. But this is what
the man page on my system says about /dev/urandom

A read from the /dev/urandom device will not block waiting for
more entropy. As a result, if there is not sufficient
entropy in the entropy pool, the returned values are
theoretically vulnerable to a cryptographic attack on the algorithms
used by the driver. Knowledge of how to do this is not available
in the current non-classified literature, but it is the-
oretically possible that such an attack may exist. If this is a
concern in your application, use /dev/random instead.

And reading from /dev/random can block if there is not enough entropy.

--
Antoon Pardon

Sep 4 '07 #14
Antoon Pardon <ap*****@forel. vub.ac.bewrites :
No the idea is that once there's enough entropy in the pool to make
one encryption key (say 128 bits), the output of /dev/urandom is
computationally indistinguishab le from random output no matter how
much data you read from it.

If you were talking about /dev/random I would agree. But this is what
the man page on my system says about /dev/urandom. ...
the returned values are theoretically vulnerable to a
cryptographic attack on the algorithms used by the driver.
Right. The idea is that those attacks don't exist and therefore the
output is computationally indistinguishab le from random. Of course
whether the idea is correct, an unproven conjecture, but it looks
pretty good; certainly finding any problem with the specific
algorithms in urandom would be a significant research discovery and
not likely to affect the application being discussed. Finding a
general attack that couldn't be fixed with some simple tweak would be
a major crypto breakthrough that would probably reshape a lot of
complexity theory. This is unlike the situation with Mersenne
Twister, which was not designed for indistinguishab ility against an
opponent who knew what to look for.

In short, using /dev/random is fairly silly once you know there's
enough entropy in the randomness pool to make a good key. If
/dev/urandom's algorithms are broken then whatever you're doing with
the /dev/random output is probably also broken.
Sep 4 '07 #15
On Mon, 03 Sep 2007 23:42:56 -0700, Paul Rubin wrote:
Antoon Pardon <ap*****@forel. vub.ac.bewrites :
No the idea is that once there's enough entropy in the pool to make
one encryption key (say 128 bits), the output of /dev/urandom is
computationally indistinguishab le from random output no matter how
much data you read from it.

If you were talking about /dev/random I would agree. But this is what
the man page on my system says about /dev/urandom. ...
the returned values are theoretically vulnerable to a
cryptographic attack on the algorithms used by the driver.

Right. The idea is that those attacks don't exist and therefore the
output is computationally indistinguishab le from random.
It is a huge leap from what the man page says, that they don't exist in
the unclassified literature at the time the docs were written, to what
you're saying, that they don't exist.

The man page is clear: there is a possible vulnerability in /dev/urandom.
Any cryptographer worth his salt (pun intended) would be looking to close
that vulnerability BEFORE an attack is made public, and not just wait for
the attack to trickle down from the NSA to the script kiddies. The time
to close the stable door is _before_ the horse gets away.

Of course
whether the idea is correct, an unproven conjecture, but it looks pretty
good; certainly finding any problem with the specific algorithms in
urandom would be a significant research discovery and not likely to
affect the application being discussed.
I agree that this flaw doesn't sound like it will effect the application
being discussed, but a problem has already been found and a solution is
already known: block until there's enough entropy. That's what /dev/
random does.
[snip]
In short, using /dev/random is fairly silly once you know there's enough
entropy in the randomness pool to make a good key. If /dev/urandom's
algorithms are broken then whatever you're doing with the /dev/random
output is probably also broken.
That doesn't follow. Antoon is specifically warning that /dev/urandom is
non-blocking. If you knew there was enough entropy available, you
wouldn't need /dev/random -- but how do you know there's enough entropy?

(I suppose you could look in /proc/sys/kernel/random/entropy_avail.)

For this specific application, it probably doesn't matter -- using /dev/
urandom is surely overkill, and on a single-user Linux desktop you're
unlikely to have vast numbers of applications reading /dev/urandom
without your knowledge. But why not use /dev/random? What's the downside?

--
Steven.
Sep 4 '07 #16
On Tue, 04 Sep 2007 22:01:47 -0700, Paul Rubin wrote:
OK. /dev/random vs /dev/urandom is a perennial topic in sci.crypt and
there are endless long threads about it there, so I tried to give you
the short version, but will give a somewhat longer version here.
Thank you. Your points are taken, in particular:
4) The man page is fairly seriously bogus because it doesn't explain
the real situation with either /dev/urandom or /dev/random.


--
Steven.
Sep 6 '07 #17
In message <13************ *@corp.supernew s.com>, Steven D'Aprano wrote:
Any cryptographer worth his salt (pun intended) would be looking to close
that vulnerability BEFORE an attack is made public, and not just wait for
the attack to trickle down from the NSA to the script kiddies.
Except that the NSA's reputation has taken a dent since they failed to
anticipate the attacks on MD5 and SHA-1.
Sep 9 '07 #18
Lawrence D'Oliveiro <ld*@geek-central.gen.new _zealandwrites:
Except that the NSA's reputation has taken a dent since they failed to
anticipate the attacks on MD5 and SHA-1.
NSA had nothing to do with MD5, and it's to NSA's credit that SHA-1
held up for as long as it did.
Sep 9 '07 #19
Paul Rubin wrote:
Lawrence D'Oliveiro writes:
>Except that the NSA's reputation has taken a dent since they failed to
anticipate the attacks on MD5 and SHA-1.

NSA had nothing to do with MD5, and it's to NSA's credit that SHA-1
held up for as long as it did.
I haven't kept up. Has anyone exhibited a SHA-1 collision?
--
--Bryan
Sep 9 '07 #20

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

23
12961
by: JC | last post by:
I am very new to programming and learning on my own. Why do I keep getting duplicate values using this code? I want to shuffle a deck of 52 cards. The logic seems right to me. Randomize For C = 0 To 1000 C1 = Cards(Int(Rnd * 52)) ' returns a number from 0 to 51
24
7307
by: Joerg Schuster | last post by:
Hello, I am looking for a method to "shuffle" the lines of a large file. I have a corpus of sorted and "uniqed" English sentences that has been produced with (1): (1) sort corpus | uniq > corpus.uniq corpus.uniq is 80G large. The fact that every sentence appears only
3
1698
by: gaulle | last post by:
i'm a newbie in c. now i 'm learning it.i read a code,and compile it in dev-c++4.9.9.0.it can be compiled and pass.but there are some warnings.while in TC2.0,no any warning.it's why? #include <stdio.h> #include <time.h> #include <stdlib.h> void shuffle(int);
4
2211
by: Curious | last post by:
Hi, I am writing a class Distribution that inherits from Random class public class Distribution : System.Random { public int Uniform(int min, int max) { return this.Uniform(min, max); }
38
2450
by: JTL | last post by:
I have learnt java before and now begin to learn c++ what puzzle me is that seem that different SDKs(c++builder, vs.net, gcc..) has its own class library we know that in java there are only one uniform class library so that the codes I written in JBuilder can just copy to JCreator or Eclipse and they all run well does c++ have such an uniform class library so that I could just learn
8
25490
by: kiranchahar | last post by:
Hey all, How do I generate random numbers with Uniform distribution Uniform(a,b) using C-programming? I want to generate uniform random numbers which have mean following Uniform(p,q) and also variance as Uniform(s,t)? any suggestion would be really appreciated. Thanks, Kay
5
1524
by: Tobi Hammert | last post by:
i just want to show up a random pic. <? $files = glob("./gfx/zufall/*"); shuffle ($files); if (count($files) < 1) $count = count($files); else $count = 1; for ($i = 0; $i < $count; $i ++)
9
5356
by: Tuxedo | last post by:
I'd like to reorganize the third, fourth, fifth and sixth, as well as any elements thereafter in an array in random order: var a = new Array('first','second','third','fourth','fifth','sixth','etc') In other words, the first, second and third element should remain in position 0, 1 and 2, while the fourth, fifth and sixth, etc. should appear in random order. Can anyone recommend a method to do this?
11
12978
by: whodgson | last post by:
I`m trying to write a shuffle function to shuffle a sorted array with an even number of elements but it won`t print past the first 2 elements as shown below. When i print the temp array it only prints the first 2 elements also. So i presume the loop is is stopping after 2 iterations. Can anyone see my error?. #include<iostream> using namespace std; void print(int a,int n); void shuffle(int a ); int main() {
0
9699
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However, people are often confused as to whether an ONU can Work As a Router. In this blog post, we’ll explore What is ONU, What Is Router, ONU & Router’s main usage, and What is the difference between ONU and Router. Let’s take a closer look ! Part I. Meaning of...
0
10542
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed. This is as boiled down as I can make it. Here is my compilation command: g++-12 -std=c++20 -Wnarrowing bit_field.cpp Here is the code in...
0
10309
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that captivates audiences and drives business growth. The Art of Business Website Design Your website is...
1
10289
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows Update option using the Control Panel or Settings app; it automatically checks for updates and installs any it finds, whether you like it or not. For most users, this new feature is actually very convenient. If you want to control the update process,...
0
9119
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing, and deployment—without human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then launch it, all on its own.... Now, this would greatly impact the work of software developers. The idea...
1
7600
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules. He will explain when you may want to use classes instead of User Defined Types (UDT). For example, to manage the data in unbound forms. Adolph will...
1
4274
by: 6302768590 | last post by:
Hai team i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated we have to send another system
2
3795
muto222
by: muto222 | last post by:
How can i add a mobile payment intergratation into php mysql website.
3
2968
bsmnconsultancy
by: bsmnconsultancy | last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating effective websites that not only look great but also perform exceptionally well. In this comprehensive...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.