473,811 Members | 2,771 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

Probabilistic Data Structure

Hi All!,

Does anyone know a link to a website or know the best way to create a data
structure that will pick random elements from itself with assigned
probabilities?

For example, say I created a bag of 3 marbles(red, green, and blue) and the
red marble had a 50% change of being picked while the green and blue marbles
both had 25% chance of being picked. they should be picked randomly, but
picked in such a way that the red marble will be picked half the time and
the other 2 a quarter of the time. What is the best way to do this
algorithmically ?

Thanks,
David
or*******@yahoo .com


Jul 17 '05 #1
12 3136
David Turner wrote:
Hi All!,

Does anyone know a link to a website or know the best way to create a data
structure that will pick random elements from itself with assigned
probabilities?

For example, say I created a bag of 3 marbles(red, green, and blue) and the
red marble had a 50% change of being picked while the green and blue marbles
both had 25% chance of being picked. they should be picked randomly, but
picked in such a way that the red marble will be picked half the time and
the other 2 a quarter of the time. What is the best way to do this
algorithmically ?


The way I would do it, for your example, would be to create an array of
4 elements, with 0 and 1 = 'red', 2 = 'green' and 3 = 'blue', then
randomly select elements 0 to 3 evenly.

SteveE

Jul 17 '05 #2
David Turner wrote:
Hi All!,

Does anyone know a link to a website or know the best way to create a data
structure that will pick random elements from itself with assigned
probabilities?

For example, say I created a bag of 3 marbles(red, green, and blue) and the
red marble had a 50% change of being picked while the green and blue marbles
both had 25% chance of being picked. they should be picked randomly, but
picked in such a way that the red marble will be picked half the time and
the other 2 a quarter of the time. What is the best way to do this
algorithmically ?

Thanks,
David
or*******@yahoo .com


If you only need divisions of two, you might be able to cook up somthing
using a binary tree.

Jul 17 '05 #3
"David Turner" <or*******@yaho o.com> wrote in message news:<7H******* ************@ne ws4.srv.hcvlny. cv.net>...
Hi All!,

Does anyone know a link to a website or know the best way to create a data
structure that will pick random elements from itself with assigned
probabilities?

For example, say I created a bag of 3 marbles(red, green, and blue) and the
red marble had a 50% change of being picked while the green and blue marbles
both had 25% chance of being picked. they should be picked randomly, but
picked in such a way that the red marble will be picked half the time and
the other 2 a quarter of the time. What is the best way to do this
algorithmically ?

Thanks,
David
or*******@yahoo .com


hi there
u can maintain array of nodes with their respective probabilities & an
array to maintain how many times the node was chosen till time & then
generate a random number & check if the no of times > no of times it
should have been chosen then generate another no else chose that node
this is it
regards
amey
Jul 17 '05 #4
How about if you define a set
of intervals that subdivide the
interval [0,1) in proportion to
the probabilities you desire. Then
generate a random number whose
value selects one of the intervals.
e.g. item 1 is 50 percent so interval is [0,0.5)
item 2 is 25 % so [0.5,0.75)
item 3 is 25% so [0.75,1)
random number is 0.67 so item 2 is selected.

Data structure is an array containing the lower and upper
bounds of each interval. Index of the array is the item.

"Amey Samant" <am*****@yahoo. com> wrote in message
news:66******** *************** ***@posting.goo gle.com...
"David Turner" <or*******@yaho o.com> wrote in message

news:<7H******* ************@ne ws4.srv.hcvlny. cv.net>...
Hi All!,

Does anyone know a link to a website or know the best way to create a data structure that will pick random elements from itself with assigned
probabilities?

For example, say I created a bag of 3 marbles(red, green, and blue) and the red marble had a 50% change of being picked while the green and blue marbles both had 25% chance of being picked. they should be picked randomly, but
picked in such a way that the red marble will be picked half the time and the other 2 a quarter of the time. What is the best way to do this
algorithmically ?

Thanks,
David
or*******@yahoo .com


hi there
u can maintain array of nodes with their respective probabilities & an
array to maintain how many times the node was chosen till time & then
generate a random number & check if the no of times > no of times it
should have been chosen then generate another no else chose that node
this is it
regards
amey

Jul 17 '05 #5
<snip>
The way I would do it, for your example, would be to create an array
of
4 elements, with 0 and 1 = 'red', 2 = 'green' and 3 = 'blue', then
randomly select elements 0 to 3 evenly.

SteveE
</snip>

hi steve
this wont work.
consider if there are around 100 elements
& lets say an element 'red' has 10% prob & you have placed it in 0-9

in this case ... do you guarantee that once the 0-9 numbers have
already been selected randomly once, they are not selected before
other numbers are selected randomly
i mean random generator does not guarantee that the number selected
once is not selected again before all numbers are exhausted
thus in your case though you have provided only space for 10 for 'red'
it might be selected more than 10 times in 100 experiments thus
violating its given prob

hope i made it clear

regards
amey
Jul 17 '05 #6
>
hi steve
this wont work.
consider if there are around 100 elements
& lets say an element 'red' has 10% prob & you have placed it in 0-9

in this case ... do you guarantee that once the 0-9 numbers have
already been selected randomly once, they are not selected before
other numbers are selected randomly
i mean random generator does not guarantee that the number selected
once is not selected again before all numbers are exhausted
thus in your case though you have provided only space for 10 for 'red'
it might be selected more than 10 times in 100 experiments thus
violating its given prob


OK, it does rely on the random number generator producing an even spread.

If you need guarantee an even distribution then what about shuffling the
100 elements randomly and selecting incrementally, repeating the shuffle
every 100 elements?

SteveE

Jul 17 '05 #7
SteveE <ea****@btinter net.com_> wrote in message news:<Ai******* **************@ news-text.cableinet. net>...
OK, it does rely on the random number generator producing an even spread.

If you need guarantee an even distribution then what about shuffling the
100 elements randomly and selecting incrementally, repeating the shuffle
every 100 elements?

SteveE


hi StevE
see you agreed that random number generation would cause problem in
first case then again in your answer you are talking about shuffling
elements RANDOMLY :d how would you solve the second RANDOM case then
;)
neways
another issues with this could be COST
if there are 1000 elements , shuffling 1000 elements for every element
would make your program much slower as the complexity increases
TREMENDOUSLY with this logic

regards
amey
Jul 17 '05 #8
> see you agreed that random number generation would cause problem in
first case then again in your answer you are talking about shuffling
elements RANDOMLY :d how would you solve the second RANDOM case then
;)
There will always be restriction of psuedo-random numbers never trully
being random (unless you use some exotic external generator) when using
computers. This method will guarantee an even distribution over the
number of iterations required, you just need to ensure you shuffle
sufficiently to give the 'appearance' of being genuinely random.
neways
another issues with this could be COST
if there are 1000 elements , shuffling 1000 elements for every element
would make your program much slower as the complexity increases
TREMENDOUSLY with this logic


No, I don't think the complexity changes at all. The actual data
retrieval would be very fast, but you would pay a time cost when
shuffling depending on the distribution, number of elements required and
iterations used. Then again, shuffling data around is what computers are
all about ;)

I think it all boils down to the exact requirements and constraints. For
instance, the original poster's example would only require 4 elements.
You could even factor this up to 40000 elements to defer the shuffling,
depending on the frequency of calls, memory available, etc.

There is another beneficial side-effect of using this method, which is
that the time taken for data retrieval and shuffling is constant (as
much as you can be in Java), whereas a 'hit and miss' approach, where
values are rejected until within the required distribution, is entirely
unpredictable time-wise (well, pseudo-unpredictable at least :)

SteveE

Jul 17 '05 #9
hi Steve
its great to discuss & clear things up ....
There will always be restriction of psuedo-random numbers never trully
being random (unless you use some exotic external generator) when using
computers. This method will guarantee an even distribution over the
number of iterations required, you just need to ensure you shuffle
sufficiently to give the 'appearance' of being genuinely random.
i tried this code

import java.util.*;
class TestRandom
{
public static void main(String[] args)
{
Random ran =new Random(10);
for(int i=0;i<10;i++)
System.out.prin tln(i+" - "+ran.nextInt(1 0));
}
}

this should generate 10 random numbers. i goes from 0 - 9 & random
number bounds are 0 - 9 as well .

& the results were as follows
OUTPUT :

0 - 3
1 - 0
2 - 3
3 - 0
4 - 6
5 - 6
6 - 7
7 - 8
8 - 1
9 - 4

is this evenly distributed ?
how would you shuffle such that you always get HIT or at least try to
maximize HITs ?
No, I don't think the complexity changes at all. The actual data
retrieval would be very fast, but you would pay a time cost when
shuffling depending on the distribution, number of elements required and
iterations used. Then again, shuffling data around is what computers are
all about ;)
swapping has always been an expensive task especiaaly when it comes
inside nested loop.
here is a joke (my own creation) "swapping = three variables , three
instructions to swap two values " ;)

I think it all boils down to the exact requirements and constraints. For
instance, the original poster's example would only require 4 elements.
You could even factor this up to 40000 elements to defer the shuffling,
depending on the frequency of calls, memory available, etc.


no, i agree that requirements play role but what when number of values
is known only at runtime?
even if you know that there are 4 elements today, do you assume that
there will not be more than 100 elements in your next version/upgrade
;) ?
this way your entire logic would change with change in quantity which
may not be desirable.
when the no. of elements are FIXED then it is ok to choose by this
way.

regards
amey
Jul 17 '05 #10

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

3
2379
by: Mike Jones | last post by:
need help with data structures.Looking for ways to start, sample code, anything Program description: Design and implement a Visual C++ .NET program that inserts values into a data structure and then removes them as described below.
2
6469
by: yee young han | last post by:
I need a fast data structure and algorithm like below condition. (1) this data structure contain only 10,000 data entry. (2) data structure's one entry is like below typedef struct _DataEntry_ { char szInput; char szOutput; int iSum;
6
2975
by: Amy | last post by:
Hi there: I'm trying to link two large data sets collected from two different sources for epidemiological research. These two data sets share a few common fields such as Social Insurance Number, Name (partically), Gender, etc. Due to incompleteness or errors of some data records, I guess we have to use probabilistic linkage to find matched records. Is there any academic/commercial software available for probabilistic linkage? Or some...
11
3195
by: theshowmecanuck | last post by:
As a matter of academic interest only, is there a way to programmatically list the 'c' data types? I am not looking for detail, just if it is possible, and what function could be used to accomplish it. For example: int main void() { while there are more data types { print next data type; }
3
8904
by: Kiran B. | last post by:
Hi, I am new to .net. I have two Data Structure Type ... Sturcture A and Structure B. Structure A Public Fname as String Public LastName as String Public City as String Public Zip as String End Structure
11
4148
by: Macca | last post by:
Hi, I'm writing an application that will pass a large amount of data between classes/functions. In C++ it was more efficient to send a pointer to the object, e.g structure rather than passing the actual structure itself. Is this true of C# also?
3
2169
by: aurora | last post by:
This is an entry I just added to ASPN. It is a somewhat novel technique I have employed quite successfully in my code. I repost it here for more explosure and discussions. http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/475158 wy ------------------------------------------------------------------------
29
4276
by: zoltan | last post by:
Hi, The scenario is like this : struct ns_rr { const u_char* rdata; }; The rdata field contains some fields such as :
30
3410
by: Charles Law | last post by:
Here's one that should probably have the sub-heading "I'm sure I asked this once before, but ...". Two users are both looking at the same data, from a database. One user changes the data and commits it. How does the other user get the updated view without polling for changes? Is there some sort of callback mechanism that can be set up on the dataset or connection? TIA
0
9728
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However, people are often confused as to whether an ONU can Work As a Router. In this blog post, we’ll explore What is ONU, What Is Router, ONU & Router’s main usage, and What is the difference between ONU and Router. Let’s take a closer look ! Part I. Meaning of...
0
9605
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can effortlessly switch the default language on Windows 10 without reinstalling. I'll walk you through it. First, let's disable language synchronization. With a Microsoft account, language settings sync across devices. To prevent any complications,...
0
10648
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed. This is as boiled down as I can make it. Here is my compilation command: g++-12 -std=c++20 -Wnarrowing bit_field.cpp Here is the code in...
1
10402
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows Update option using the Control Panel or Settings app; it automatically checks for updates and installs any it finds, whether you like it or not. For most users, this new feature is actually very convenient. If you want to control the update process,...
0
9205
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing, and deployment—without human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then launch it, all on its own.... Now, this would greatly impact the work of software developers. The idea...
0
6890
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one. At the time of converting from word file to html my equations which are in the word document file was convert into image. Globals.ThisAddIn.Application.ActiveDocument.Select();...
0
5554
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols. I succeeded, with both firewalls in the same network. But I'm wondering if it's possible to do the same thing, with 2 Pfsense firewalls...
0
5692
by: adsilva | last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
2
3867
muto222
by: muto222 | last post by:
How can i add a mobile payment intergratation into php mysql website.

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.