Is there a better way to simulate randomly choosing from a weighted set?

Here is my problem: suppose there are, say, five events with these
probabilities:

event1 0.7
event2 0.1
event3 0.1
event4 0.05
event5 0.05

Note that the sum of the probabilities is 1.0. I would like a function that
simulates these events and returns an int to indicate which event
occurred: the function should statistically return 1 about 70% of the
time, 2 about 10% of the time, and so on.

I have figured out a way to do this, but I suspect my way is
suboptimal.

I build a vector of five elements that looks like:

{ 0.05, 0.05+0.05, 0.05+0.05+0.1, 0.05+0.05+0.1+0.1,
0.05+0.05+0.1+0.1+0.7 }
= { 0.05, 0.1, 0.2, 0.3, 1.0 }

I then generate a random float in the interval 0.0 ... 1.0, and if the
random float is in the range 0 to 0.05, I return event 5, and if the
random float is in the range 0.05-0.1, I return event 4, and so on.
(Actually, I should test for event 1 first since it is most common, but
I'm too lazy to re-type my example vector above.)
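
The lookup described above can be sketched as follows. This is only an illustration under stated assumptions: the names make_cumulative and pick_event are hypothetical, the table is built in natural order (event 1 first) instead of the smallest-first order typed above, and the caller supplies the uniform random number u in [0, 1). A binary search replaces the chain of range tests, which only starts to matter when there are many events.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Build the running totals once: {0.7, 0.1, 0.1, ...} -> {0.7, 0.8, 0.9, ...}.
std::vector<double> make_cumulative(const std::vector<double>& prob)
{
    std::vector<double> cum(prob.size());
    std::partial_sum(prob.begin(), prob.end(), cum.begin());
    return cum;
}

// Map u in [0, 1) to a 1-based event number.
// upper_bound returns the first running total strictly greater than u.
std::size_t pick_event(const std::vector<double>& cum, double u)
{
    assert(!cum.empty());
    std::size_t i = static_cast<std::size_t>(
        std::upper_bound(cum.begin(), cum.end(), u) - cum.begin());
    // Rounding can leave the last total slightly below 1.0; clamp to the last event.
    if (i == cum.size())
        i = cum.size() - 1;
    return i + 1;
}
```

Building the table costs O(n) and each lookup costs O(log n), so this pays off only when the same probabilities are reused for many draws.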

For my real problem, I have to deal with many different cases where the
number of events to consider constantly varies, and I suspect there has
to be a better way than building a vector to represent the different
ranges a random variable can fall in and then seeing which range it
falls in.

So is there a better way?

Feb 27 '06 #1

You don't have to precompute the vector at all. Just modify your
algorithm to compute the ranges on the fly. Like this:

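#include <stdio.h>
#include <stdlib.h>

// Returns the 0-based index of the chosen event.
int event(double const* prob, unsigned prob_len)
{
    // Scale to [0, 1) so that r can never reach the final sum of 1.0.
    double r = rand() / (RAND_MAX + 1.0);
    double p = 0;
    for(unsigned i = 0; i < prob_len; ++i)
    {
        p += prob[i]; // running cumulative probability
        if(r < p)
            return i;
    }
    // Rounding can leave the sum slightly below 1.0; fall back to the last event.
    return prob_len ? (int)prob_len - 1 : -1;
}

int main()
{
    double const prob[] = { .7, .1, .1, .05, .05 };
    for(unsigned n = 1000000; n--;)
        printf("%d\n", event(prob, sizeof(prob) / sizeof(*prob)));
}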

$ ./exp | awk '/0/{++n0} /1/{++n1} /2/{++n2} /3/{++n3} /4/{++n4} END {
print n0, n1, n2, n3, n4 }'
700612 99770 99752 50075 49791
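
For comparison, C++11 (which postdates this thread) added std::discrete_distribution to <random>, and it performs weighted selection directly. The following is a minimal sketch of the same counting experiment, assuming a C++11 compiler. The function name count_events and its seed parameter are illustrative and do not come from the thread.

```cpp
#include <array>
#include <cstddef>
#include <random>

// Draw `draws` samples and count how often each of the five events occurs.
// Index 0 corresponds to event1, index 4 to event5.
std::array<unsigned long, 5> count_events(unsigned long draws, unsigned seed)
{
    std::mt19937 gen(seed);
    // The weights need not sum to 1; the distribution normalizes them.
    std::discrete_distribution<int> dist{0.7, 0.1, 0.1, 0.05, 0.05};
    std::array<unsigned long, 5> counts{};
    for (unsigned long n = 0; n < draws; ++n)
        ++counts[static_cast<std::size_t>(dist(gen))];
    return counts;
}
```

Calling count_events(1000000, 42) should give counts close to the proportions in the run above, although the exact numbers depend on the standard library implementation.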

Feb 27 '06 #2

a) The Alias Method of Walker (Google it or find it in TAOCP). This choice
is good if you have to draw many times from the same probability set.

b) Google the archive of this newsgroup for Anglewyrm's hat container. This
solution is good when the probabilities change frequently.
Best

Kai-Uwe Bux
Feb 27 '06 #3
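
The reply only names the alias method, so here is a rough sketch of one common formulation of it (Vose's table construction), offered as an illustration and not as the poster's code. The names AliasTable, build_alias and draw are hypothetical, and the caller is assumed to supply two independent uniform numbers u1 and u2 in [0, 1). The hat container from option b) is not shown because the thread gives no details about it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Column i yields its own event with probability cut[i], otherwise alias[i].
struct AliasTable {
    std::vector<double> cut;
    std::vector<std::size_t> alias;
};

// Build the table in O(n). `prob` must be non-empty and sum to 1.
AliasTable build_alias(const std::vector<double>& prob)
{
    const std::size_t n = prob.size();
    assert(n > 0);
    AliasTable t;
    t.cut.assign(n, 1.0);
    t.alias.resize(n);
    std::vector<double> scaled(n);
    std::vector<std::size_t> small, large;
    for (std::size_t i = 0; i < n; ++i) {
        t.alias[i] = i;
        scaled[i] = prob[i] * static_cast<double>(n); // average height becomes 1
        (scaled[i] < 1.0 ? small : large).push_back(i);
    }
    // Top up each under-full column from an over-full one.
    while (!small.empty() && !large.empty()) {
        const std::size_t s = small.back();
        small.pop_back();
        const std::size_t l = large.back();
        t.cut[s] = scaled[s];
        t.alias[s] = l;
        scaled[l] -= 1.0 - scaled[s];
        if (scaled[l] < 1.0) {
            large.pop_back();
            small.push_back(l);
        }
    }
    // Columns left over are full up to rounding, so their cut stays at 1.0.
    return t;
}

// Draw in O(1): u1 picks a column, u2 decides between the column and its alias.
// Returns a 1-based event number.
std::size_t draw(const AliasTable& t, double u1, double u2)
{
    const std::size_t n = t.cut.size();
    std::size_t col = static_cast<std::size_t>(u1 * static_cast<double>(n));
    if (col >= n)
        col = n - 1; // guard in case u1 * n rounds up to n
    return (u2 < t.cut[col] ? col : t.alias[col]) + 1;
}
```

The table costs O(n) to build and every draw after that is constant time, which is why it suits the case of many draws from a fixed set of probabilities. When the probabilities change, the table has to be rebuilt.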
