By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
446,187 Members | 1,046 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 446,187 IT Pros & Developers. It's quick & easy.

number generator

P: n/a
I have to generate a list of N random numbers (integer) whose sum is
equal to M. If, for example, I have to generate 5 random numbers whose
sum is 50 a possible solution could be [3, 11, 7, 22, 7]. Is there a
simple pattern or function in Python to accomplish that?

Thanks and regards
Francesco

Mar 9 '07 #1
Share this Question
Share on Google+
73 Replies


P: n/a
"cesco" <fd**********@gmail.comwrites:
I have to generate a list of N random numbers (integer) whose sum is
equal to M. If, for example, I have to generate 5 random numbers whose
sum is 50 a possible solution could be [3, 11, 7, 22, 7]. Is there a
simple pattern or function in Python to accomplish that?
Erm, yes, lots of ways, there are probably further constraints on the
problem, such as the size of the integers, how the lists are supposed
to be distributed, etc. Can you be more specific? Is this for an
application? If it's a homework problem, that's fine, but it's better
in that case for respondents to suggest hints rather than full solutions.
Mar 9 '07 #2

P: n/a
On Mar 9, 3:51 pm, Paul Rubin <http://phr...@NOSPAM.invalidwrote:
"cesco" <fd.calabr...@gmail.comwrites:
I have to generate a list of N random numbers (integer) whose sum is
equal to M. If, for example, I have to generate 5 random numbers whose
sum is 50 a possible solution could be [3, 11, 7, 22, 7]. Is there a
simple pattern or function in Python to accomplish that?

Erm, yes, lots of ways, there are probably further constraints on the
problem, such as the size of the integers, how the lists are supposed
to be distributed, etc. Can you be more specific? Is this for an
application? If it's a homework problem, that's fine, but it's better
in that case for respondents to suggest hints rather than full solutions.
Given two positive integers, N and M with N < M, I have to generate N
positive integers such that sum(N)=M. No more constraints.

Thanks again
Francesco

Mar 9 '07 #3

P: n/a
In <11**********************@j27g2000cwj.googlegroups .com>, cesco wrote:
Given two positive integers, N and M with N < M, I have to generate N
positive integers such that sum(N)=M. No more constraints.
Break it into subproblems. Generate a random number X from a suitable
range and you are left with one number, and the problem to generate (N-1)
random numbers that add up to (M-X). You have to think a little bit about
the "suitable range" part though.

The necessary functions to draw random numbers are in the `random` module.

Ciao,
Marc 'BlackJack' Rintsch
Mar 9 '07 #4

P: n/a
On Mar 9, 4:17 pm, "cesco" <fd.calabr...@gmail.comwrote:
On Mar 9, 3:51 pm, Paul Rubin <http://phr...@NOSPAM.invalidwrote:
"cesco" <fd.calabr...@gmail.comwrites:
I have to generate a list of N random numbers (integer) whose sum is
equal to M. If, for example, I have to generate 5 random numbers whose
sum is 50 a possible solution could be [3, 11, 7, 22, 7]. Is there a
simple pattern or function in Python to accomplish that?
Erm, yes, lots of ways, there are probably further constraints on the
problem, such as the size of the integers, how the lists are supposed
to be distributed, etc. Can you be more specific? Is this for an
application? If it's a homework problem, that's fine, but it's better
in that case for respondents to suggest hints rather than full solutions.

Given two positive integers, N and M with N < M, I have to generate N
positive integers such that sum(N)=M. No more constraints.

Thanks again
Francesco
Suppose you have a fixed telegraph pole at N and a fixed telgraph pole
at M, and you're given 5 more telegraph poles...

Mar 9 '07 #5

P: n/a
On Fri, 2007-03-09 at 07:17 -0800, cesco wrote:
On Mar 9, 3:51 pm, Paul Rubin <http://phr...@NOSPAM.invalidwrote:
"cesco" <fd.calabr...@gmail.comwrites:
I have to generate a list of N random numbers (integer) whose sum is
equal to M. If, for example, I have to generate 5 random numbers whose
sum is 50 a possible solution could be [3, 11, 7, 22, 7]. Is there a
simple pattern or function in Python to accomplish that?
Erm, yes, lots of ways, there are probably further constraints on the
problem, such as the size of the integers, how the lists are supposed
to be distributed, etc. Can you be more specific? Is this for an
application? If it's a homework problem, that's fine, but it's better
in that case for respondents to suggest hints rather than full solutions.

Given two positive integers, N and M with N < M, I have to generate N
positive integers such that sum(N)=M. No more constraints.
Paul still had a point, since the word "positive" is a vital piece of
information that never appeared in your original description.

-Carsten
Mar 9 '07 #6

P: n/a
Carsten Haese <ca*****@uniqsys.comwrites:
Given two positive integers, N and M with N < M, I have to generate N
positive integers such that sum(N)=M. No more constraints.
Paul still had a point, since the word "positive" is a vital piece of
information that never appeared in your original description.
It's maybe even more complicated that way. Does the OP want to
generate each possible partition with equal probability? See

http://en.wikipedia.org/wiki/Partition_number

for more info.
Mar 9 '07 #7

P: n/a
cesco wrote:
On Mar 9, 3:51 pm, Paul Rubin <http://phr...@NOSPAM.invalid>
>"cesco" <fd.calabr...@gmail.comwrites:
>>I have to generate a list of N random numbers (integer) whose
sum is equal to M. If, for example, I have to generate 5 random
numbers whose sum is 50 a possible solution could be [3, 11, 7,
22, 7]. Is there a simple pattern or function in Python to
accomplish that?
Isn't at least one of those numbers depending on the others?
Given two positive integers, N and M with N < M, I have to
generate N positive integers such that sum(N)=M. No more
constraints.
Then why must they be random? div and mod should do it.

Regards,
Björn

--
BOFH excuse #220:

Someone thought The Big Red Button was a light switch.

Mar 9 '07 #8

P: n/a
On Fri, 09 Mar 2007 06:44:01 -0800, cesco wrote:
I have to generate a list of N random numbers (integer) whose sum is
equal to M. If, for example, I have to generate 5 random numbers whose
sum is 50 a possible solution could be [3, 11, 7, 22, 7]. Is there a
simple pattern or function in Python to accomplish that?
No, you'll have to program it yourself.

You might like to Google for the "coin change algorithm" for some hints on
how to accomplish this. It's not the same problem, but it might give you
some ideas on how to solve it.

The way to solve this problem also depends on what you mean by "random
numbers". For example, if the random numbers have to be uniformly
distributed (so that all numbers in the appropriate range are equally
likely to be picked), I think the only way to proceed is with the horribly
inefficient algorithm:

(1) Generate every possible list of N random numbers between 1 and the
maximum value allowed. If the maximum value is (say) 10, there will 10**N
such lists.

(2) Check each list's sum to see if it equals M, and eliminate it if it
doesn't.

That guarantees that the individual random numbers all have the same
probability, but the execution time will explode for large N.
If you relax the requirement that all the random numbers have the same
probability, you can use a heuristic that is biased towards picking
smaller numbers. E.g. something like this:

def make_sum(N, M):
"""Generate a random list of N ints that sum to M."""
# WARNING: untested!

def get_sum(M):
# Returns a list of any length that sums to M.
L = []
while M 0:
n = random.randint(1, M)
L.append(n)
M -= n
return L

L = []
while len(L) != N:
L = get_sum(M)
return L
This isn't a particularly good algorithm, since it will take a LONG time
to return a solution on average, but it should give you some clues towards
solving the problem more efficiently.

Another procedure might be to do something like the following:

We want to make a list of 5 numbers adding up to 50.
The first random number between 1 and 50 might be (say) 13.
So our list will consist of [13] plus a list of 4 numbers adding up to 37.
And so on...

There's a few subtleties which I'll leave to you.

Last but not least, another possible algorithm is to start with a list of
N numbers, regardless of whether or not they add to M, and then adjust
each one up or down by some amount until they sum to the correct value.

--
Steven.

Mar 9 '07 #9

P: n/a
On Mar 9, 7:32 am, Marc 'BlackJack' Rintsch <bj_...@gmx.netwrote:
In <1173453432.893222.308...@j27g2000cwj.googlegroups .com>, cesco wrote:
Given two positive integers, N and M with N < M, I have to generate N
positive integers such that sum(N)=M. No more constraints.

Break it into subproblems. Generate a random number X from a suitable
range and you are left with one number, and the problem to generate (N-1)
random numbers that add up to (M-X).
This approach skews the probabilities. The OP said for example with
N=5 and M=50 that a possible solution is [3, 11, 7, 22, 7]. You're
approach biases the probabilities toward solutions that have a large
entry in the first position.

To make the solutions equi-probable, a simple approach is to
recursively enumerate all possibilities and then choose one of them
with random.choice().
Raymond

Mar 10 '07 #10

P: n/a
In <11*********************@n33g2000cwc.googlegroups. com>, Raymond
Hettinger wrote:
On Mar 9, 7:32 am, Marc 'BlackJack' Rintsch <bj_...@gmx.netwrote:
>In <1173453432.893222.308...@j27g2000cwj.googlegroups .com>, cesco wrote:
Given two positive integers, N and M with N < M, I have to generate N
positive integers such that sum(N)=M. No more constraints.

Break it into subproblems. Generate a random number X from a suitable
range and you are left with one number, and the problem to generate (N-1)
random numbers that add up to (M-X).

This approach skews the probabilities. The OP said for example with
N=5 and M=50 that a possible solution is [3, 11, 7, 22, 7]. You're
approach biases the probabilities toward solutions that have a large
entry in the first position.
I know but he said also "No more constraints". And…
To make the solutions equi-probable, a simple approach is to
recursively enumerate all possibilities and then choose one of them
with random.choice().
…it would be faster than creating all possibilities. :-)

Ciao,
Marc 'BlackJack' Rintsch
Mar 10 '07 #11

P: n/a
Steven D'Aprano wrote:
Last but not least, another possible algorithm is to start with a list of
N numbers, regardless of whether or not they add to M, and then adjust
each one up or down by some amount until they sum to the correct value.
Another possibility is to generate a list of N non-random
numbers that sum to M, and then adjust them up or down
by random amounts. By performing up/down adjustments in
pairs, you can maintain the sum invariant at each step.
So then it's just a matter of how long you want to go
on fiddling with them.

--
Greg
Mar 10 '07 #12

P: n/a
Raymond Hettinger wrote:
To make the solutions equi-probable, a simple approach is to
recursively enumerate all possibilities and then choose one
of them with random.choice().
Or create a list using the biased method, then use .shuffle() to
return another permutation.

Cheers,

--
Klaus Alexander Seistrup
Tv-fri medielicensbetaler
http://klaus.seistrup.dk/
Mar 10 '07 #13

P: n/a
At 07:17 AM 3/9/2007, cesco wrote:
>On Mar 9, 3:51 pm, Paul Rubin <http://phr...@NOSPAM.invalidwrote:
"cesco" <fd.calabr...@gmail.comwrites:
I have to generate a list of N random numbers (integer) whose sum is
equal to M. If, for example, I have to generate 5 random numbers whose
sum is 50 a possible solution could be [3, 11, 7, 22, 7]. Is there a
simple pattern or function in Python to accomplish that?
Erm, yes, lots of ways, there are probably further constraints on the
problem, such as the size of the integers, how the lists are supposed
to be distributed, etc. Can you be more specific? Is this for an
application? If it's a homework problem, that's fine, but it's better
in that case for respondents to suggest hints rather than full solutions.

Given two positive integers, N and M with N < M, I have to generate N
positive integers such that sum(N)=M. No more constraints.
So why not just repeatedly call a function to generate lists of
length N of random integers within the appropriate range (the closed
interval [1,M-N-1]), and return the first list the sum of which is M?
I don't understand what all the discussion is about. Time is not one
of the constraints.

Dick Moores

Mar 10 '07 #14

P: n/a
On Sat, 10 Mar 2007 02:32:21 -0800, Dick Moores wrote:
So why not just repeatedly call a function to generate lists of
length N of random integers within the appropriate range (the closed
interval [1,M-N-1]), and return the first list the sum of which is M?
I don't understand what all the discussion is about. Time is not one
of the constraints.
Time is always a constraint. The sun will expand and destroy the Earth in
a couple of billion years, it would be nice to have a solutions before
then...

*wink*

Seriously, almost all programming problems have two implicit constraints:
it must run as fast as practical, using as little computer resources (e.g.
memory) as practical. Naturally those two constraints are usually in
opposition, which leads to compromise algorithms that run "fast enough"
without using "too much" memory.

--
Steven.

Mar 10 '07 #15

P: n/a
On Fri, 09 Mar 2007 18:41:39 +0000, Dennis Lee Bieber wrote:
On 9 Mar 2007 06:44:01 -0800, "cesco" <fd**********@gmail.comdeclaimed
the following in comp.lang.python:
>I have to generate a list of N random numbers (integer) whose sum is
equal to M. If, for example, I have to generate 5 random numbers whose
sum is 50 a possible solution could be [3, 11, 7, 22, 7]. Is there a
simple pattern or function in Python to accomplish that?
Well... Just that the last number is not random -- it has to equal
(using your limit of 50 and 5 numbers):

fifth = 50 - sum(first, second, third, fourth)
Doesn't mean that it isn't random. After all, the first four numbers are
random, therefore their sum is random. 50 - (something random) is also
random.

In practice, one would deterministically generate the last number needed,
but that last number is unpredictable because the first four numbers are
unpredictable.

Remember that random numbers are not necessarily unconstrained, nor are
they necessarily uniformly distributed. We blithely talk about "random
numbers", but of course there is a constraint that we're selecting from a
finite range. Any finite range of numbers, no matter how big, is a
vanishingly small percentage of all the possible numbers.

--
Steven.

Mar 10 '07 #16

P: n/a
Raymond Hettinger wrote:
To make the solutions equi-probable, a simple approach is to
recursively enumerate all possibilities and then choose one of them
with random.choice().
Maybe it is possible to generate the possibilities by an indexing
function and then use randint to pick one of them. I suppose this is
like the bricks and bins problem this thread was about:

http://groups.google.nl/group/comp.l...82b54fa39b3bad

Except that the bins now have at least 1 brick in them (if we have
positive numbers).

I posted a rather simplistic solution (but working I think) after Steven
Taschuk made some insightful remarks. I believe it is possible to
generate the list of numbers directly instead of permuting a list of '0'
and '1' characters and then finding the positions of the '1' elements.

A.
Mar 10 '07 #17

P: n/a
Steven D'Aprano <st***@REMOVE.THIS.cybersource.com.auwrites:
Doesn't mean that it isn't random. After all, the first four numbers are
random, therefore their sum is random. 50 - (something random) is also
random.
What does it mean for the first 4 numbers to be random? For example,
is 27 random?

By your method, what is the probability of the first number being
higher than 30? What is the probability of the fifth number being
higher than 30? If these probabilities are unequal, can we really say
the sequences are random?
Mar 10 '07 #18

P: n/a
Gerard Flanagan wrote:
On Mar 9, 4:17 pm, "cesco" <fd.calabr...@gmail.comwrote:
>On Mar 9, 3:51 pm, Paul Rubin <http://phr...@NOSPAM.invalidwrote:
>>"cesco" <fd.calabr...@gmail.comwrites:
I have to generate a list of N random numbers (integer) whose sum is
equal to M. If, for example, I have to generate 5 random numbers whose
sum is 50 a possible solution could be [3, 11, 7, 22, 7].

Suppose you have a fixed telegraph pole at N and a fixed telgraph pole
at M, and you're given 5 more telegraph poles...
That's a good one! Three-liner.

Cheers, Mel.

Mar 10 '07 #19

P: n/a
cesco escreveu:
I have to generate a list of N random numbers (integer) whose sum is
equal to M. If, for example, I have to generate 5 random numbers whose
sum is 50 a possible solution could be [3, 11, 7, 22, 7]. Is there a
simple pattern or function in Python to accomplish that?

Thanks and regards
Francesco
May be this is what you want ...
I didn't test it enough ... but seems fine.

from random import randint,shuffle

def GenRInt(m,n):
"""Generate m positive ints whose sum is n"""
# Generate a random list
xl=[randint(1,n-m+1) for i in xrange(m)]
s=sum(xl)
if s==n: return xl
# Adjust each value to make sum==n
xl=[max(x*n/s,1) for x in xl]
s=sum(xl)
if s!=n:
# Compensate for truncations
if s>n:
# s>n is supposed not to occur. Just in case ...
ixs=filter(lambda x: xl[x]>1,range(m))
shuffle(ixs)
for i in ixs[:s-n]: xl[i]-=1
else:
ixs=range(m)
shuffle(ixs)
for i in ixs[:n-s]: xl[i]+=1
return xl

# Test it ...

xl=GenRInt(10,50)
print xl,"->",sum(xl)

print

xl=GenRInt(100,10000)
print xl,"->",sum(xl)

print

xl=GenRInt(1000,1000)
print xl,"->",sum(xl)
Regards.
Paulo
Mar 10 '07 #20

P: n/a
Paulo da Silva <ps********@esotericaX.ptXwrites:
May be this is what you want ...
I didn't test it enough ... but seems fine.
That's way too complicated. Think about Gerald Flanagan's description
of the telegraph poles, and how to implement simply. It is a two liner.
Mar 10 '07 #21

P: n/a

"Anton Vredegoor" <an*************@gmail.comwrote in message
news:es**********@news2.zwoll1.ov.home.nl...
| Raymond Hettinger wrote:
|
| To make the solutions equi-probable, a simple approach is to
| recursively enumerate all possibilities and then choose one of them
| with random.choice().
|
| Maybe it is possible to generate the possibilities by an indexing
| function and then use randint to pick one of them. I suppose this is
| like the bricks and bins problem this thread was about:
|
|
http://groups.google.nl/group/comp.l...82b54fa39b3bad
|
| Except that the bins now have at least 1 brick in them (if we have
| positive numbers).
|
| I posted a rather simplistic solution (but working I think) after Steven
| Taschuk made some insightful remarks. I believe it is possible to
| generate the list of numbers directly instead of permuting a list of '0'
| and '1' characters and then finding the positions of the '1' elements.

Partitioning positive count m into n positive counts that sum to m is a
standard combinatorial problem at least 300 years old. The number of such
partitions, P(m,n) has no known exact formula but can be computed
inductively rather easily. The partitions for m and n can be ranked in
lexicographic order from 0 to P(m,n)-1. Given a rank r in that range, one
can calculate the particular partition that has that rank. So a
equi-probable random count in the range can be turned into a equi-probable
random partition.

This topic is section 3.1 in Combinatorial Algorithms: Generation,
Enumeration, and Search by Kreher and Stinson. The authors reference
several other books as their sources.

I plan to someday rewrite many of their pseudocode algorithms in Python.

Terry Jan Reedy

Mar 10 '07 #22

P: n/a
Terry Reedy wrote:
Partitioning positive count m into n positive counts that sum to m is a
standard combinatorial problem at least 300 years old. The number of such
partitions, P(m,n) has no known exact formula but can be computed
inductively rather easily. The partitions for m and n can be ranked in
lexicographic order from 0 to P(m,n)-1. Given a rank r in that range, one
can calculate the particular partition that has that rank. So a
equi-probable random count in the range can be turned into a equi-probable
random partition.
Yes that was one of my first ideas too. But later on Steven pointed out
that one can view the problem like this:

00010000100010100

That would be [3,4,3,1,2]

where the '1' elements are like dividing shutters that partition the row
of '0'. This means that the problem is reduced to permutations (albeit
unique permutations) which are a lot simpler to compute than partitions.

Ok I'll admit that I succeeded in translating my 'Goldberg' solution to
this case, I can't expect anyone to dust off and read 4 year old threads
anyway :-)

(by the way I'm still convinced that this code can be simplified a lot)

def starters(L):
n,np,R = len(L),1,range(len(L))
bf = [L[:i].count(L[i]) for i in R]
for i in R: np = np*(n-i)/(bf[i]+1)
return [(i,np*L[i:].count(L[i])/n) for i in R if not bf[i]]

def perm(index,L):
remain,n = index,len(L)
res,T = L[:],L[:]
for i in range(n):
for j,k in starters(T):
if remain-k < 0:
res[i] = T.pop(j)
break
remain -= k
return res

def nperm(L):
return reduce(lambda a,b:a+b,[k for j,k in starters(L)])

def bb(i,bricks,bins):
L = [1] * (bins-1) + [0] * (bins-1)
R = [1]
for x in perm(i,L):
if x: R.append(1)
else: R[-1]+=1
return R

def test():
bricks,bins = 7, 4
L = [1] * (bins-1) + [0] * (bins-1)
for i in range(nperm(L)):
print bb(i,bricks,bins)

if __name__=='__main__':
test()

This topic is section 3.1 in Combinatorial Algorithms: Generation,
Enumeration, and Search by Kreher and Stinson. The authors reference
several other books as their sources.
Great book!
I plan to someday rewrite many of their pseudocode algorithms in Python.
That would be my dream job. If only I understood more of them. But I'm
slowly making progress in other areas so that one day I will maybe
reread the book and also understand the second half.

A.
Mar 10 '07 #23

P: n/a
On Sat, 10 Mar 2007 08:29:09 -0800, Paul Rubin wrote:
Steven D'Aprano <st***@REMOVE.THIS.cybersource.com.auwrites:
>Doesn't mean that it isn't random. After all, the first four numbers are
random, therefore their sum is random. 50 - (something random) is also
random.

What does it mean for the first 4 numbers to be random? For example,
is 27 random?
In isolation, no, but 27 could have been chosen at random and hence would
have been unpredictable. That was my point: if we generate four
unpredictable numbers (using random.randint or similar) and sum them, the
sum is also unpredictable, and hence the difference between 50 and that
sum is unpredictable.

By your method, what is the probability of the first number being
higher than 30? What is the probability of the fifth number being
higher than 30? If these probabilities are unequal, can we really say
the sequences are random?
Of course we can! "Uniform probability distribution" is a special case of
random. Most random quantities are far from uniform. The Python random
module includes a couple of non-uniform distributions, including
exponential distribution (random.expovariate) and the famous bell curve
distribution (random.normalvariate).

One common distribution which seems to have been missed is the Poisson
distribution, which applies to (among many, many others) the number of
18th Century Prussian cavalry officers who fell off their horse and broke
a limb, and the number of cars arriving at a petrol station in any hour.

--
Steven.

Mar 11 '07 #24

P: n/a
Anton Vredegoor wrote:
L = [1] * (bins-1) + [0] * (bins-1)
replace these lines in the code by:

L = [1] * (bins-1) + [0] * (bricks-bins)

A.
Mar 11 '07 #25

P: n/a
Steven D'Aprano <st***@REMOVE.THIS.cybersource.com.auwrites:
By your method, what is the probability of the first number being
higher than 30? What is the probability of the fifth number being
higher than 30? If these probabilities are unequal, can we really say
the sequences are random?

Of course we can! "Uniform probability distribution" is a special case of
random. Most random quantities are far from uniform. The Python random
module includes a couple of non-uniform distributions, including
exponential distribution (random.expovariate) and the famous bell curve
distribution (random.normalvariate).
Ehh. Say we call some generator repeatedly and get back the sequence
(3, 3, 3, 3, 3, ...). Would we say this is a random sequence using a
distribution that puts 100% of the probability density at 3? Or would
we just say it isn't random?

For something like this partition problem, describing some generator
as making a random partition should contain some reasonable
description of the distribution. For example, we could say the
generator selects one of the P(n) possible partitions with equal
probability. For another, we could say it uses the "fencepost" method
and the distribution is whatever results from that. Question then is
whether those two distributions are the same.
Mar 11 '07 #26

P: n/a
On Mar 10, 3:16 am, greg <g...@cosc.canterbury.ac.nzwrote:
Another possibility is to generate a list of N non-random
numbers that sum to M, and then adjust them up or down
by random amounts. By performing up/down adjustments in
pairs, you can maintain the sum invariant at each step.
So then it's just a matter of how long you want to go
on fiddling with them.
Taking your comment and running with it...this is pretty much
cheating, and requires that M be evenly divisible by N, and only works
well with smaller N values, and selections are limited to numbers in
the 1 to (M/N)+(M/N) range...but hey; other than that it's perfect,
heh.

import random
N, M = 10, 80
D = M/N
O = [D] * N
C = []
for i in O:
C.append(random.randint(1, D-1))
for i in range(0, len(O), 2):
O[i] -= C[i]
if i == len(O)-1:
O[random.randint(0, i-1)] += C[i]
else:
O[i+1] += C[i]
assert sum(O) == M
assert len(O) == N

Regards,
Jordan

Mar 11 '07 #27

P: n/a
"MonkeeSage" <Mo********@gmail.comwrites:
Taking your comment and running with it...this is pretty much
cheating, and requires that M be evenly divisible by N, and only works
well with smaller N values, and selections are limited to numbers in
the 1 to (M/N)+(M/N) range...but hey; other than that it's perfect, heh.
This still doesn't make terribly much sense in terms of the distribution
you get.

The fencepost method still seems to be simplest:

t = sorted(random.sample(xrange(1,50), 4))
print [(j-i) for i,j in zip([0]+t, t+[50])]
Mar 11 '07 #28

P: n/a
On Mar 10, 6:47 pm, Paul Rubin <http://phr...@NOSPAM.invalidwrote:
The fencepost method still seems to be simplest:

t = sorted(random.sample(xrange(1,50), 4))
print [(j-i) for i,j in zip([0]+t, t+[50])]
Simpler, true, but I don't think it gives any better distribution...

import random
def cheat(n, m):
N, M = n, m
D = M/N
O = [D] * N
C = []
for i in O:
C.append(random.randint(1, D-1))
for i in range(0, len(O), 2):
O[i] -= C[i]
if i == len(O)-1:
O[random.randint(0, i-1)] += C[i]
else:
O[i+1] += C[i]
print 'CHEAT:'
print O

def fence(n, m):
t = sorted(random.sample(xrange(1,m), n-1))
print 'FENCE:'
print [(j-i) for i,j in zip([0]+t, t+[m])]

for i in range(10):
print 'Run: %d:' % (i+1)
cheat(10, 80)
fence(10, 80)
print

Output:

Run: 1:
CHEAT:
[1, 15, 1, 15, 5, 11, 5, 11, 1, 15]
FENCE:
[4, 9, 24, 7, 3, 9, 11, 7, 3, 3]

Run: 2:
CHEAT:
[1, 15, 5, 11, 5, 11, 1, 15, 7, 9]
FENCE:
[15, 12, 13, 7, 1, 4, 5, 6, 4, 13]

Run: 3:
CHEAT:
[1, 15, 3, 13, 3, 13, 2, 14, 7, 9]
FENCE:
[2, 9, 12, 15, 4, 5, 2, 3, 19, 9]

Run: 4:
CHEAT:
[7, 9, 2, 14, 5, 11, 4, 12, 3, 13]
FENCE:
[2, 2, 4, 7, 1, 11, 15, 13, 6, 19]

Run: 5:
CHEAT:
[5, 11, 3, 13, 5, 11, 7, 9, 4, 12]
FENCE:
[2, 4, 11, 10, 13, 16, 2, 18, 1, 3]

Run: 6:
CHEAT:
[5, 11, 3, 13, 2, 14, 6, 10, 1, 15]
FENCE:
[1, 4, 13, 5, 2, 26, 5, 4, 16, 4]

Run: 7:
CHEAT:
[4, 12, 4, 12, 4, 12, 4, 12, 5, 11]
FENCE:
[8, 3, 5, 15, 8, 15, 2, 3, 10, 11]

Run: 8:
CHEAT:
[3, 13, 5, 11, 6, 10, 1, 15, 2, 14]
FENCE:
[25, 15, 2, 5, 2, 10, 6, 1, 9, 5]

Run: 9:
CHEAT:
[6, 10, 3, 13, 2, 14, 1, 15, 5, 11]
FENCE:
[11, 9, 3, 3, 7, 4, 8, 28, 1, 6]

Run: 10:
CHEAT:
[2, 14, 3, 13, 6, 10, 2, 14, 2, 14]
FENCE:
[12, 5, 23, 2, 3, 4, 4, 11, 5, 11]

Granted, I'm just eyeballing it, but they look fairly equal in terms
of distribution.

Regards,
Jordan

Mar 11 '07 #29

P: n/a
Paul Rubin escreveu:
Paulo da Silva <ps********@esotericaX.ptXwrites:
>May be this is what you want ...
I didn't test it enough ... but seems fine.

That's way too complicated. Think about Gerald Flanagan's description
of the telegraph poles, and how to implement simply. It is a two liner.
I hadn't seen that yet. A *nice idea*. Just two lines!!!

I had a similar problem but with floats (%). I solved it using
the previous way in C. For float it is very simple. So I
adapted it for python and ints. That resulted in a lot more
complex code.
Mar 11 '07 #30

P: n/a

"Anton Vredegoor" <an*************@gmail.comwrote in message
news:es**********@news3.zwoll1.ov.home.nl...
| Terry Reedy wrote:
|
| Partitioning positive count m into n positive counts that sum to m is a
| standard combinatorial problem at least 300 years old. The number of
such
| partitions, P(m,n) has no known exact formula but can be computed
| inductively rather easily. The partitions for m and n can be ranked in
| lexicographic order from 0 to P(m,n)-1. Given a rank r in that range,
one
| can calculate the particular partition that has that rank. So a
| equi-probable random count in the range can be turned into a
equi-probable
| random partition.
|
| Yes that was one of my first ideas too. But later on Steven pointed out
| that one can view the problem like this:
|
| 00010000100010100
|
| That would be [3,4,3,1,2]
|
| where the '1' elements are like dividing shutters that partition the row
| of '0'. This means that the problem is reduced to permutations (albeit

If any of the 1s appear at the ends or together, then you would have 0s in
the partition, which is not allowed, as I understood the spec.

| unique permutations) which are a lot simpler to compute than partitions.

I think the simplicity is actually about the same.

Terry Jan Reedy

Mar 11 '07 #31

P: n/a
Terry Reedy wrote:
"Anton Vredegoor" <an*************@gmail.comwrote in message
| Yes that was one of my first ideas too. But later on Steven pointed out
| that one can view the problem like this:
|
| 00010000100010100
|
| That would be [3,4,3,1,2]
|
| where the '1' elements are like dividing shutters that partition the row
| of '0'. This means that the problem is reduced to permutations (albeit

If any of the 1s appear at the ends or together, then you would have 0s in
the partition, which is not allowed, as I understood the spec.
Yes, I was writing about the bricks and bins problem from 4 years ago
which is very similar.
| unique permutations) which are a lot simpler to compute than partitions.

I think the simplicity is actually about the same.
Probably yes. It's like the difference between Pascal's triangle and the
partition numbers' triangle. Anyway, Paul Rubin's idea in this same
thread stimulated me to simplify my code a lot. It's rather late here so
I hope I haven't slipped up again.

def comb(i,n,k):
for j in range(k,0,-1):
while noverk(n,j) i :
n -= 1
i -= noverk(n,j)
yield n

def noverk(n,k):
return reduce(lambda a,b: a*(n-b)/(b+1),range(k),1)

def bb(i,bricks,bins):
L = [j+1 for j in comb(i,bricks,bins-1)]
return [(i-j) for i,j in zip([bricks]+L,L+[0])]

def test():
bricks, bins = 6,4
n = noverk(bricks-1,bins-1)
for i in range(n):
print bb(i,bricks,bins)

if __name__=='__main__':
test()

A.

Mar 11 '07 #32

P: n/a
On Mar 9, 10:44 pm, "cesco" <fd.calabr...@gmail.comwrote:
I have to generate a list of N random numbers (integer) whose sum is
equal to M. If, for example, I have to generate 5 random numbers whose
sum is 50 a possible solution could be [3, 11, 7, 22, 7]. Is there a
simple pattern or function in Python to accomplish that?

Thanks and regards
Francesco

import random

def genRandList(n,left):
L = []
if n left:
return L

while n:
L.append(int(random.random()*(left-n+1))+1)
n -= 1;
left -= L[-1]

return L

Mar 11 '07 #33

P: n/a
On Sat, 10 Mar 2007 16:02:56 -0800, Paul Rubin wrote:
Steven D'Aprano <st***@REMOVE.THIS.cybersource.com.auwrites:
By your method, what is the probability of the first number being
higher than 30? What is the probability of the fifth number being
higher than 30? If these probabilities are unequal, can we really say
the sequences are random?

Of course we can! "Uniform probability distribution" is a special case of
random. Most random quantities are far from uniform. The Python random
module includes a couple of non-uniform distributions, including
exponential distribution (random.expovariate) and the famous bell curve
distribution (random.normalvariate).

Ehh. Say we call some generator repeatedly and get back the sequence
(3, 3, 3, 3, 3, ...). Would we say this is a random sequence using a
distribution that puts 100% of the probability density at 3? Or would
we just say it isn't random?
Oh! You know that bit where I said that every imaginable sequence of
integers was random? I take it back.

Oh wait. I never said anything of the sort. Not every sequence of ints is
a random sequence.

Although... your question is deeper than it appears at first glance. In
any truly random sequence, you should expect to find repeated values. If
you wait long enough, you should get a sequence of 3,3,3... for any number
of threes you like. (Or, naturally, any other integer.) If you wait long
enough, you should get a billion threes.

--
Steven.

Mar 11 '07 #34

P: n/a
On Sat, 10 Mar 2007 17:19:38 -0800, MonkeeSage wrote:
On Mar 10, 6:47 pm, Paul Rubin <http://phr...@NOSPAM.invalidwrote:
>The fencepost method still seems to be simplest:

t = sorted(random.sample(xrange(1,50), 4))
print [(j-i) for i,j in zip([0]+t, t+[50])]

Simpler, true, but I don't think it gives any better distribution...
[snip]
Granted, I'm just eyeballing it, but they look fairly equal in terms
of distribution.
It's easy enough to see whether the fencepost method gives a uniform
distribution.
def fence(n, m):
t = [0] + sorted(random.sample(xrange(1, m), n-1)) + [m]
L = [(j-i) for i,j in zip(t[:-1], t[1:])]
assert sum(L) == m
assert len(L) == n
return L
def collate(count, n, m):
bins = {}
for _ in xrange(count):
L = fence(n, m)
for x in L:
bins[x] = 1 + bins.get(x, 0)
return bins
collate(1000, 10, 80)

gives me the following sample:

{1: 1148, 2: 1070, 3: 869, 4: 822, 5: 712,
6: 633, 7: 589, 8: 514, 9: 471,
10: 406, 11: 335, 12: 305, 13: 308, 14: 242,
15: 232, 16: 190, 17: 172, 18: 132, 19: 132,
20: 124, 21: 87, 22: 91, 23: 72, 24: 50,
25: 48, 26: 45, 27: 33, 28: 29, 29: 22,
30: 19, 31: 20, 32: 12, 33: 12, 34: 11,
35: 11, 36: 5, 37: 9, 38: 2, 39: 4, 40: 5,
42: 1, 43: 3, 45: 2, 49: 1}

Clearly the distribution isn't remotely close to uniform.

To compare to the "cheat" method, calculate the mean and standard
deviation of this sample, and compare to those from the other method.
(Code left as an exercise for the reader.) When I do that, I get a mean
and standard deviation of 8 and 6.74 for the fencepost method, and 8 and
4.5 for the "cheat" method. That implies they are different distributions.

--
Steven.

Mar 11 '07 #35

P: n/a
On Mar 10, 11:26 pm, Steven D'Aprano
<s...@REMOVE.THIS.cybersource.com.auwrote:
To compare to the "cheat" method, calculate the mean and standard
deviation of this sample, and compare to those from the other method.
I belieive you (mainly because I'm too lazy to write the sieve,
hehe). ;)

Regards,
Jordan

Mar 11 '07 #36

P: n/a
MonkeeSage wrote:
this ... requires that M be evenly divisible by N,
No, it doesn't -- I never said the numbers had
to be *equal*.
and only works well with smaller N values,
Why?
and selections are limited to numbers in the
1 to (M/N)+(M/N) range
I don't understand what you mean by that. Note
that the adjustments don't have to be restricted
to *adjacent* numbers -- you can pick any pair
of numbers and transfer an amount from one to
the other, as long as neither of them goes
below 1, and you can perform as many adjustments
as you like. So *any* sequence of numbers that
sums to M is a possible output from some
algorithm of this kind.

As for the distribution, the OP said there were
"no other restrictions", so it seems that the
distribution doesn't matter. Actually, if you
take that at face value, the numbers don't
even have to be *random* at all... or,
equivalently, they can have a very skewed
distribution. :-)

--
Greg
Mar 11 '07 #37

P: n/a
On Mar 11, 2:16 am, greg <g...@cosc.canterbury.ac.nzwrote:
MonkeeSage wrote:
this ... requires that M be evenly divisible by N,

No, it doesn't -- I never said the numbers had
to be *equal*.
Sorry for not being clear. I was refering to my specific
implementation of the algorithm, not the generic design pattern.

Regards,
Jordan

Mar 11 '07 #38

P: n/a

"cesco" <fd**********@gmail.comha scritto nel messaggio
news:11**********************@c51g2000cwc.googlegr oups.com...
>I have to generate a list of N random numbers (integer) whose sum is
equal to M. If, for example, I have to generate 5 random numbers whose
sum is 50 a possible solution could be [3, 11, 7, 22, 7]. Is there a
simple pattern or function in Python to accomplish that?

Thanks and regards
Francesco

You can initialize a list to [1, 1, 1, 1, 1], and generate 45 random
integers between 1 and 5, and every time a number is generated, increase the
Nth number in the list by one.

Not all distinct lists will have the same chance of occurring, e.g. [46, 1,
1, 1, 1] will be much less likely than [10, 10, 10, 10, 10]. Depending on
what you need these numbers for, it can be a good thing or a bad thing.

--Army1987
Mar 11 '07 #39

P: n/a
Army1987 <pl********@for.itwrote:
"cesco" <fd**********@gmail.comha scritto nel messaggio
news:11**********************@c51g2000cwc.googlegr oups.com...
I have to generate a list of N random numbers (integer) whose sum is
equal to M. If, for example, I have to generate 5 random numbers whose
sum is 50 a possible solution could be [3, 11, 7, 22, 7]. Is there a
simple pattern or function in Python to accomplish that?

Thanks and regards
Francesco


You can initialize a list to [1, 1, 1, 1, 1], and generate 45 random
integers between 1 and 5, and every time a number is generated, increase the
Nth number in the list by one.

Not all distinct lists will have the same chance of occurring, e.g. [46, 1,
1, 1, 1] will be much less likely than [10, 10, 10, 10, 10]. Depending on
what you need these numbers for, it can be a good thing or a bad thing.
And a1-liner way to get the numbers (net of the mandatory +1 for each)
is:

map([random.randrange(5) for i in xrange(45)].count, xrange(5))

i.e., this gives 5 integers (each between 0 and 45 included) summing to
45 -- add 1 to each of them to get the desired result.

Without any specification regarding the distributions required for the
"5 random numbers" it's really impossible to say whether these are
better or worse than other proposed solutions.
Alex
Mar 11 '07 #40

P: n/a
al***@mac.com (Alex Martelli) wrote:
Without any specification regarding the distributions required for the
"5 random numbers" it's really impossible to say whether these are
better or worse than other proposed solutions.
FWIW, I decided it would be fun to see what kind of implementation I
could come up test driven and avoiding reading the thread or background
references too much first. So starting from:

import unittest

def partition(total, n, min):
return [50]

class PartitionTests(unittest.TestCase):
def testsinglevalue(self):
self.assertEqual([50], partition(50, 1, 1))

if __name__=='__main__':
unittest.main()

I eventually worked my way through 15 revisions to the code below.

The tests were added in the order you see them below. The commented out
function is the one I had arrived at before I added the first
distribution test which triggered a major refactor (although the code
ended up remarkably similar): if you uncomment it all but the
distribution tests pass.

I don't really like the arbitrary limits for the distribution tests, but
I'm not sure how else to test that sort of thing. And as Alex said,
without knowing what distribution the OP wanted the definition I chose
to use is completely arbitrary.

----- partition.py -------
import unittest, collections
from random import randint, sample

def partition(total, n, min):
maxtotal = total - n*(min-1)
posts = sorted(sample(xrange(1, maxtotal), n-1))
return [ (b-a)+min-1 for (a,b) in zip([0]+posts, posts+[maxtotal]) ]

# def partition(total, n, min):
# maxtotal = total - (n*min)
# sums = sorted(randint(0, maxtotal) for i in range(n-1))
# return [(b-a)+min for (a,b) in zip([0]+sums, sums+[maxtotal])]

class PartitionTests(unittest.TestCase):
def testsinglevalue(self):
self.assertEqual([50], partition(50, 1, 1))

def testnvalues(self):
self.assertEqual([1]*5, partition(5, 5, 1))

def testnminusone(self):
self.assertEqual([1]*4+[2], sorted(partition(6, 5, 1)))

def testnminusone2(self):
# Check we get all possible results eventually
expected = set([(1,1,2), (1,2,1), (2,1,1)])
for i in range(100):
got = tuple(partition(4,3,1))
if got in expected:
expected.remove(got)
if not len(expected):
break
self.assertEqual(len(expected), 0)

def testdistribution(self):
# Make sure we get each of 3 possible outcomes roughly
# equally often.
actual = collections.defaultdict(int)
for i in range(1000):
actual[tuple(partition(4,3,1))] += 1
counts = actual.values()
assert (min(counts) 250)
assert (max(counts) < 400)

def testdistribution2(self):
# More arbitrary limits for the distribution. 10 possible
# outcomes this time.
actual = collections.defaultdict(int)
ntries = 10000
for i in range(ntries):
actual[tuple(partition(6,3,1))] += 1
counts = actual.values()
assert (min(counts) 900)
assert (max(counts) < 1100)

def testmintwo(self):
self.assertEqual([2]*50, partition(100,50,2))

def testminzero(self):
self.assertEqual([0]*20, partition(0,20,0))

def testcantdoit(self):
self.assertRaises(ValueError, partition, 100, 51, 2)

if __name__=='__main__':
unittest.main()

--------------------------
Mar 12 '07 #41

P: n/a
Paul Rubin <httpwrote:
The fencepost method still seems to be simplest:

t = sorted(random.sample(xrange(1,50), 4))
print [(j-i) for i,j in zip([0]+t, t+[50])]
Mmm, nice.

Here is another effort which is easier to reason about the
distribution produced but not as efficient.

def real(N, M):
while 1:
t = [ random.random() for i in range(N) ]
factor = M / sum(t)
t = [ int(round(x * factor)) for x in t]
if sum(t) == M:
break
print "again"
assert len(t) == N
assert sum(t) == M
return t

It goes round the while loop on average 0.5 times.

If 0 isn't required then just test for it and go around the loop again
if found. That of course skews the distribution in difficult to
calculate ways!

--
Nick Craig-Wood <ni**@craig-wood.com-- http://www.craig-wood.com/nick
Mar 12 '07 #42

P: n/a
On Sat, 2007-03-10 at 22:27 -0500, Terry Reedy wrote:
"Anton Vredegoor" <an*************@gmail.comwrote in message
news:es**********@news3.zwoll1.ov.home.nl...
| Terry Reedy wrote:
|
| Partitioning positive count m into n positive counts that sum to m is a
| standard combinatorial problem at least 300 years old. The number of
such
| partitions, P(m,n) has no known exact formula [...]
| [...] Steven pointed out that one can view the problem like this:
|
| 00010000100010100
|
| That would be [3,4,3,1,2]
|
| where the '1' elements are like dividing shutters that partition the row
| of '0'. This means that the problem is reduced to permutations (albeit

If any of the 1s appear at the ends or together, then you would have 0s in
the partition, which is not allowed, as I understood the spec.
Correct, the OP's spec doesn't allow 0s, but the problem can be easily
transformed back and forth between positive partitions and non-negative
partitions. In order to partition M into N positive numbers, partition
(M-N) into N non-negative numbers and increase each part by 1.

There must be some other constraint on what P(M,N) means, or I just
solved a 300 year old problem.

-Carsten
Mar 12 '07 #43

P: n/a
"Nick Craig-Wood" <ni**@craig-wood.comwrote:
Paul Rubin <httpwrote:
The fencepost method still seems to be simplest:

t = sorted(random.sample(xrange(1,50), 4))
print [(j-i) for i,j in zip([0]+t, t+[50])]

Mmm, nice.

Here is another effort which is easier to reason about the
distribution produced but not as efficient.

def real(N, M):
while 1:
t = [ random.random() for i in range(N) ]
factor = M / sum(t)
t = [ int(round(x * factor)) for x in t]
if sum(t) == M:
break
print "again"
assert len(t) == N
assert sum(t) == M
return t

It goes round the while loop on average 0.5 times.

If 0 isn't required then just test for it and go around the loop again
if found. That of course skews the distribution in difficult to
calculate ways!
I have been wondering about the following as this thread unrolled:

Is it possible to devise a test that can distinguish between sets
of:

- five random numbers that add to 50, and
- four random numbers and a fudge number that add to 50?

My stats are way too small and rusty to attempt to answer
the question, but it seems intuitively a very difficult thing.

- Hendrik

Mar 13 '07 #44

P: n/a
At 06:38 AM 3/10/2007, Steven D'Aprano wrote:
>On Sat, 10 Mar 2007 02:32:21 -0800, Dick Moores wrote:
So why not just repeatedly call a function to generate lists of
length N of random integers within the appropriate range (the closed
interval [1,M-N-1]), and return the first list the sum of which is M?
I don't understand what all the discussion is about. Time is not one
of the constraints.

Time is always a constraint. The sun will expand and destroy the Earth in
a couple of billion years, it would be nice to have a solutions before
then...

*wink*

Seriously, almost all programming problems have two implicit constraints:
it must run as fast as practical, using as little computer resources (e.g.
memory) as practical. Naturally those two constraints are usually in
opposition, which leads to compromise algorithms that run "fast enough"
without using "too much" memory.
OK, points well-taken.

The problem posed by the OP is "Given two positive integers, N and M
with N < M, I have to generate N
positive integers such that sum(N)=M. No more constraints."

But let's say there is one more constraint--that for each n of the N
positive integers, there must be an equal chance for n to be any of
the integers between 1 and M-N+1, inclusive. Thus for M == 50 and N
== 5, the generated list of 5 should be as likely to be [1,46,1,1,1]
as [10,10,10,10,10] or [14, 2, 7, 1, 26].

Wouldn't sumRndInt() be THE solution?:
================================================== ===============
def sumRndInt(M, N):
import random
while True:
lst = []
for x in range(N):
n = random.randint(1,M-N+1)
lst.append(n)
if sum(lst) == M:
return lst

if __name__ == '__main__':

N = 5
M = 50

lst = sumRndInt(M, N)

print "N is %d, M is %d, lst is %s, sum(lst) is %d" % (N, M,
lst, sum(lst))
================================================== ============

I hope I don't seem querulous--I really want to know.

Thanks,

Dick Moores

Mar 13 '07 #45

P: n/a
Dick Moores <rd*@rcblue.comwrote:
But let's say there is one more constraint--that for each n of the N
positive integers, there must be an equal chance for n to be any of
the integers between 1 and M-N+1, inclusive. Thus for M == 50 and N
== 5, the generated list of 5 should be as likely to be [1,46,1,1,1]
as [10,10,10,10,10] or [14, 2, 7, 1, 26].
I don't think what you wrote actually works. Any combination including a 46
must also have four 1s, so the digit 1 has to be at least 4 times as likely
to appear as the digit 46, and probably a lot more likely than that.

On the other hand, making sure that each combination occurs equally often
(as your example might imply) is doable but still leaves the question
whether the order of the numbers matters: are [1,46,1,1,1] and [1,1,46,1,1]
the same or different combinations?

The sorted/sample/xrange solution seems to give an even distribution or
ordered combinations, I don't know what algorithm (short of enumerating all
possible solutions and choosing one at random) gives an even distribution
when the ordering doesn't matter.
Mar 13 '07 #46

P: n/a
"Terry Reedy" <tj*****@udel.eduwrites:
P(m,n) has no known exact formula but can be computed
inductively rather easily. The partitions for m and n can be ranked in
lexicographic order from 0 to P(m,n)-1. Given a rank r in that range, one
can calculate the particular partition that has that rank. So a
equi-probable random count in the range can be turned into a equi-probable
random partition.
I don't see an efficient way to do that, but will try to look sometime
for the book you cited.

Here is a brute force generation approach (not fully tested since
I'm using Python 2.3):

from itertools import imap

# yield all partitions of n having length k, with many duplications
def _partitions(n,k):
if k==0: return
for i in xrange(1,n-k+1):
for p in partitions(n-i, k-1):
yield (i,)+p

def unique_partitions(n,k):
return sorted(set(tuple(sorted(p)) for p in _partitions(n,k)))

p50_5 = unique_partitions(50,5)
assert len(p50_5) = 2611

Note the use of a generator expression in unique_partitions to
enumerate the non-unique partitions without remembering them all in
memory. "set" strips out the duplicates and remembers only the unique
ones. The above takes a few seconds to run on (50,5) on my 1.2 ghz laptop.
Mar 13 '07 #47

P: n/a
En Tue, 13 Mar 2007 03:20:49 -0300, Hendrik van Rooyen
<ma**@microcorp.co.zaescribió:
Is it possible to devise a test that can distinguish between sets
of:

- five random numbers that add to 50, and
- four random numbers and a fudge number that add to 50?

My stats are way too small and rusty to attempt to answer
the question, but it seems intuitively a very difficult thing.
You can't have two different sets with four equal numbers - it's not a
very difficult thing, it's impossible to distinguish because they're
identical!
Given 4 numbers in the set, the 5th is uniquely determined. By example:
12, 3, 10, 18 *must* end with 7. The 5th number is not "random". Any
randomness analysis should include only the first 4 numbers (or any other
set of 4).
In other words, there are only 4 degrees of freedom. In the fence analogy,
you only have to choose where to place 4 poles; the 5th is fixed at the
end.

--
Gabriel Genellina

Mar 13 '07 #48

P: n/a
Gabriel Genellina wrote:
The 5th number is not "random".
More precisely, the fifth number is not *independent*
of the others. You can't have five independent random
numbers that sum to 50; only four independent numbers
plus a dependent one.

--
Greg
Mar 14 '07 #49

P: n/a
On Tue, 13 Mar 2007 08:20:49 +0200, Hendrik van Rooyen wrote:
Is it possible to devise a test that can distinguish between sets
of:

- five random numbers that add to 50, and
- four random numbers and a fudge number that add to 50?

My stats are way too small and rusty to attempt to answer
the question, but it seems intuitively a very difficult thing.
Easy-peasey. Just collate the individual numbers from the lists, and graph
their frequencies. You will get quite different distributions. Both are
"random", but in different ways.

Here is some code to do it.

import random

def method1(total, count):
"""SLOW way of generating random lists."""
L = []
while sum(L) != total:
L = [random.randint(1, total) for i in range(count)]
assert sum(L) == total
assert len(L) == count
return L

def method2(total, count):
if total <= 0 or count <= 0:
return []
elif count == 1:
return [total]
else:
n = random.randint(1, total-count)
L = [n] + method2(total-n, count-1)
assert sum(L) == total
assert len(L) == count
return L
def collate_numbers(method, count):
bins = {}
for i in xrange(count):
L = method(50, 5)
for n in L:
bins[n] = bins.get(n, 0) + 1
return bins
I'll leave graphing the results as a exercise for the reader, but here's a
sample I prepared earlier:

>>d1 # bins from method1()
{1: 208, 2: 207, 3: 158, 4: 167, 5: 162, 6: 146, 7: 142, 8: 115,
9: 114, 10: 113, 11: 94, 12: 98, 13: 92, 14: 68, 15: 60, 16: 70,
17: 58, 18: 56, 19: 57, 20: 44, 21: 25, 22: 36, 23: 27, 24: 34,
25: 23, 26: 18, 27: 18, 28: 13, 29: 13, 30: 8, 31: 13, 32: 17,
33: 7, 34: 1, 35: 5, 36: 4, 37: 3, 38: 2, 39: 1, 40: 2, 44: 1}
>>d2 # bins from method2()
{1: 370, 2: 387, 3: 222, 4: 190, 5: 129, 6: 106, 7: 95, 8: 84,
9: 64, 10: 61, 11: 47, 12: 56, 13: 45, 14: 35, 15: 39, 16: 26,
17: 34, 18: 25, 19: 36, 20: 34, 21: 30, 22: 25, 23: 20, 24: 23,
25: 18, 26: 19, 27: 24, 28: 15, 29: 17, 30: 21, 31: 14, 32: 12,
33: 16, 34: 16, 35: 14, 36: 14, 37: 15, 38: 11, 39: 19, 40: 13,
41: 14, 42: 16, 43: 8, 44: 11, 45: 10}
Even from the raw numbers, the differences are obvious. A bar graph makes
it even clearer.
--
Steven D'Aprano

Mar 14 '07 #50

73 Replies

This discussion thread is closed

Replies have been disabled for this discussion.