473,383 Members | 1,798 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,383 software developers and data experts.

Efficient generation of random data subsets


I have a database table that contains a set of numbers (set X). I need to be
able to generate/export a random set of survey numbers (set C) from set A
with IDs
that aren't in set B. Where set A is a subset of numbers from set X that
match the filters for the survey, and set B contains the set of number IDs
for numbers that have already been exported for the survey.

The idea is to repeatedly generate a random sample of non-duplicate numbers
from set X filtering on the survey templates.

My database comprises of the following tables:

Table Numbers
ID INT PK
Number VARCHAR

Table Templates
ID INT PK
Template VARCHAR

Table Survey
ID INT PK
Name VARCHAR
Description VARCHAR
Started DATETIME

Table Filters
ID INT PK
SurveyID INT FK (Survey:ID)
TemplateID INT FK (Templates:ID)

Table NumberExportedSurveyJoin
ID INT PK
NumberID INT FK (Numbers:ID)
SurveyID INT FK (Survey:ID)
Exported DATETIME
This is the SQL query that I'm using to generate set C (the random set of
numbers) :

SELECT * FROM Numbers WHERE Number LIKE 'filter1%' OR Number LIKE 'filter2%'
AND Numbers.ID NOT IN (SELECT Numbers.ID FROM Numbers LEFT JOIN
NumberExportedSurveyJoin ON Numbers.ID = NumberExportedSurveyJoin.NumberID
WHERE NumberExportedSurveyJoin.SurveyID = S) ORDER BY RAND() LIMIT N;
The problem is that MySQL takes a long time to generate set C even for
moderate sizes of set A and set B. As the number of entries in table
NumberExportedSurveyJoin (set B) increases so does the time to generate the
next
random set of numbers. Is there a better/more efficient approach to
generating set C ?

Thanks,

Tony


Jul 20 '05 #1
0 1444

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

9
by: Maurice LING | last post by:
Hi, Is there any UML tools that is able to take UML and generate Python codes? Cheers Maurice
4
by: mescaline | last post by:
hi, i'm new to C++ could anyone refer me to a good site / good examples of random numbers? in particular including: 1) the commnds to obtain normally and exponenetially distributed r...
10
by: Ioannis Vranos | last post by:
I want to create some random numbers for encryption purposes, and i wonder if the following scheme makes it more hard to guess the underneath number generation pattern, than the plain use of...
13
by: Roy Gourgi | last post by:
Hi, How do I invoke the random number generator that was suggested by a few people. Ideally, what I would like to do is to instantiate the random no. generator with a seed value that does not...
4
by: Dimos | last post by:
Hello All, I need some help with random number generation. What I need exactly is: To create a few thousand numbers, decimal and integers, between 5 and 90, and then to export them as a...
22
by: gagan.singh.arora | last post by:
Hi there. I want to generate random numbers with a given probability, say 80% even and 20% odd. Is it possible to implement such an algorithm in C?
6
by: py_genetic | last post by:
Hi, I'm looking to generate x alphabetic strings in a list size x. This is exactly the same output that the unix command "split" generates as default file name output when splitting large...
4
by: Patrick | last post by:
Hi, I want to write a programs that checks if a set of numbers in a list obey a condition, the problem is that i have say "n" numbers and i need to check all subsets of the n numbers for the...
3
by: pierrej | last post by:
I was wondering if anyone could possibly assist me please. I'm going to come out and say I am not a C# dev and have limited knowledge on the subject but I am starting to learn it. I being...
1
by: CloudSolutions | last post by:
Introduction: For many beginners and individual users, requiring a credit card and email registration may pose a barrier when starting to use cloud servers. However, some cloud server providers now...
0
by: ryjfgjl | last post by:
In our work, we often need to import Excel data into databases (such as MySQL, SQL Server, Oracle) for data analysis and processing. Usually, we use database tools like Navicat or the Excel import...
0
by: taylorcarr | last post by:
A Canon printer is a smart device known for being advanced, efficient, and reliable. It is designed for home, office, and hybrid workspace use and can also be used for a variety of purposes. However,...
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
0
by: aa123db | last post by:
Variable and constants Use var or let for variables and const fror constants. Var foo ='bar'; Let foo ='bar';const baz ='bar'; Functions function $name$ ($parameters$) { } ...
0
by: ryjfgjl | last post by:
If we have dozens or hundreds of excel to import into the database, if we use the excel import function provided by database editors such as navicat, it will be extremely tedious and time-consuming...
0
by: ryjfgjl | last post by:
In our work, we often receive Excel tables with data in the same format. If we want to analyze these data, it can be difficult to analyze them because the data is spread across multiple Excel files...
0
by: emmanuelkatto | last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.