473,396 Members | 1,838 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,396 software developers and data experts.

How do I NTH Select in SQL????

ja
Hello,

I have a table that contains approx 2 million records. I want to
query 2,000 random records.

Example: 500,000 names divided by 20,000 samples = 25. The N is 25,
meaning samples would be sent to every 25th name on the list.

this is what I am using to create a random selection

order by substing(f1,4,1) + substing(f2,3,1) + substing(f3,3,1) +
substing(f4,2,1)

This method takes a very long time.

Any suggestions????
TIA
ja
Feb 9 '06 #1
7 5332
Hi ja,

To randomize your result set, you can order by newID(), like this...
SELECT TOP 25 * FROM sysdatabases
ORDER BY newID()

Feb 9 '06 #2
Hi ja,

To randomize your result set, you can order by newID(), like this...
SELECT TOP 25 * FROM sysdatabases
ORDER BY newID()

Feb 9 '06 #3
If your table has a numeric identifier, use the modulus (%) operator.

Using your example of 500,000 rows divided by 20,000 samples = 25, use
WHERE Identifier % 25 = 0
..

HTH. :)

Feb 9 '06 #4
If your table has a numeric identifier, use the modulus (%) operator.

Using your example of 500,000 rows divided by 20,000 samples = 25, use
WHERE Identifier % 25 = 0
..

HTH. :)

Feb 9 '06 #5
Hi Ja

Well First lets create some data. I'll create a table of Social Security
Numbers (SSN) and Names (Name 1, Name 2, etc).

Create Table People(
SSN varchar(11) not null Primary Key,
Name varchar(50))

declare @counter int
declare @f1 int
declare @f2 int
declare @f3 int
declare @c int
declare @SSN varchar(11)

set nocount on

set @Counter = 1
while @Counter <= 2000000
begin
set @F1 = 100 + cast((rand() * 900) as int)
set @F2 = cast((rand() * 100) as int)
set @F3 = cast((rand() * 10000) as int)
set @SSN = dbo.ZeroFill(@f1, 3) + '-' + dbo.ZeroFill(@f2, 2) + '-' +
dbo.ZeroFill(@f3, 4)
set @C = (Select count(*) from People where SSN = @SSN)
if @C = 0
begin
insert People(SSN, Name) values(@SSN, 'Name ' + cast(@counter as
varchar(7)))
Set @Counter = @Counter + 1
end
end

--Run Script to Here to Create DATA

Create Function ZeroFill(@num as int, @digits as int)
returns Varchar(4)
as
begin
declare @Tmp as varchar(4)
declare @n as int
set @tmp = cast(@num as varchar(4))
while len(@tmp) < @digits
begin
Set @tmp = '0' + @tmp
end
return @tmp
end
---
The Proc SelectSamplePeople will return a random list of People from the
People table.

Create Proc SelectSamplePeople(@SampleSize int)
as

set nocount on

Create table #TempPeople(
RowID int not null identity(1,1) Primary Key,
SSN varchar(11) not null)

--Order by NewId() to get People in Random Order
declare @SQL varchar(2000)
set @SQL = 'Insert #TempPeople(SSN) Select top ' + cast(@SampleSize * 25 as
varchar(12)) + ' SSN from People order by NewID()'

Exec(@SQL)

--Use t.RowID % 25 = 0 to get the 25th person from the Random List

Select p.SSN, p.Name
from People p
inner Join #TempPeople t
on p.SSN = t.SSN
where t.RowID % 25 = 0
--End of Procedure

So Now

SelectSamplePeople 20000

Will get 20000 Random People from the list.
If you forget about the every 25th Name idea, this next proc will get a
random Sample of the same size and it runs somewhat faster and uses less
temporary space.

Create Proc SelectSamplePeople1(@SampleSize int)
as

set nocount on

declare @SQL varchar(2000)
set @SQL = 'Select top ' + cast(@SampleSize as varchar(12)) + ' SSN, Name
from People order by NewID()'

Exec(@SQL)

--End of Proc

SelectSamplePeople1 20000

Will get 20000 Random People from the list

--
-Dick Christoph
"ja" <jo******@sbcglobal.net> wrote in message
news:sj********************************@4ax.com...
Hello,

I have a table that contains approx 2 million records. I want to
query 2,000 random records.

Example: 500,000 names divided by 20,000 samples = 25. The N is 25,
meaning samples would be sent to every 25th name on the list.

this is what I am using to create a random selection

order by substing(f1,4,1) + substing(f2,3,1) + substing(f3,3,1) +
substing(f4,2,1)

This method takes a very long time.

Any suggestions????
TIA
ja

Feb 14 '06 #6
Do this in SAS or SPSS or whatever your statistical package is. They
usually have a well-defined sampling routine that will not have
problems with the distribution of the sample data.

SQL is not always the answer.

Feb 15 '06 #7
yet another approach is to use rand():

create view wrapped_rand_view
as
select rand( ) as random_value
go
create function wrapped_rand()
returns float
as
begin
declare @f float
set @f = (select random_value from wrapped_rand_view)
return @f
end

select c.customer, dbo.wrapped_rand() wrapped_rand
into #t
from customers c

select top 1 percent c.* from customers c join #t on
c.customer=#t.customer
order by wrapped_rand

Feb 15 '06 #8

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

1
by: JT | last post by:
I have an input form for which I've created a "matrix" for user input. Basically, the user chooses a radio button and then through javascript, a select box is displayed to define a value for that...
3
by: Memduh Durmaz | last post by:
Hi, I'm using DB2 UDB 7.2. Also I'm doing some tests on SQL Server 2000 for some statements to use efectively. I didn't find any solution on Sql Server about WITH ... SELECT structure of DB2. ...
4
by: point | last post by:
Hello there... I'm a PHP programmer and starting to learn JS... I have a following problem.... I have 3 select boxes! one is hotel one is destination and one is country... if someone...
1
by: Carl Wu | last post by:
Hi all, I am newcomer in HTML, Javascript, I want to create two select controls S1, S2. There are 3 options: ALL, A, B in S1; When select A in S1, It let you select A1, A2 in S2,
4
by: Elroyskimms | last post by:
Using SQL 2000... tblCustomer: CustomerID int CompanyName varchar(20) HasRetailStores bit HasWholesaleStores bit HasOtherStores bit tblInvoiceMessages:
4
by: bobsawyer | last post by:
I've been building a series of SELECT lists that are populated dynamically using HTTPRequest. Things are going pretty well, and I've got the whole thing working flawlessly in Mozilla/Firebird....
10
by: serge | last post by:
Using "SELECT * " is a bad practice even when using a VIEW instead of a table? I have some stored procedures that are identical with the difference of one statement in the WHERE clause. If I...
1
by: serena.delossantos | last post by:
Trying to insert into a history table. Some columns will come from parameters sent to the store procedure. Other columns will be filled with a separate select statement. I've tried storing the...
6
by: Apaxe | last post by:
In the database i have a table with this information: key_id =1 key_desc =43+34+22+12 I want sum the values in key_desc. Something like: SELECT key_desc FROM table But the result of...
0
by: Gordon Padwick | last post by:
A form contains controls, one or more of which can be other forms. A form that contains another form is known as a main form. A form contained by a main form is known as a subform. A subform itself...
0
by: emmanuelkatto | last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
1
by: nemocccc | last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...
0
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
0
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
0
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...
0
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...
0
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.