By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
434,677 Members | 1,076 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 434,677 IT Pros & Developers. It's quick & easy.

Find Missing data from table

P: n/a
Here is an issue that has me stumped for the past few days. I have a
table called MerchTran. Among various columns, the relevant columns for
this issue are:

FileDate datetime
, SourceTable varchar(25)

SQL statement:
SELECT DISTINCT
FileDate
, SourceTable
FROM
MerchTran
ORDER BY
FileDate
, SourceTable

Data looks like this:
FileDate DataSource
-----------------------------------
2005-02-13 00:00:00.000 S1
2005-02-13 00:00:00.000 S2
2005-02-13 00:00:00.000 S3
2005-02-14 00:00:00.000 S1
2005-02-14 00:00:00.000 S2
2005-02-14 00:00:00.000 S3
2005-02-15 00:00:00.000 S2
2005-02-15 00:00:00.000 S3
2005-02-16 00:00:00.000 S1
2005-02-16 00:00:00.000 S2
2005-02-16 00:00:00.000 S3
2005-02-17 00:00:00.000 S1
2005-02-17 00:00:00.000 S2
2005-02-18 00:00:00.000 S1
2005-02-18 00:00:00.000 S2
2005-02-18 00:00:00.000 S3
2005-02-19 00:00:00.000 S1
2005-02-19 00:00:00.000 S3

We run a daily process that inserts data in to this table everyday for
all 3 sources S1, S2, S3

Notice how some data is missing indicating the import process for that
particular source failed.
Example: Missing record
2005-02-15 00:00:00.000 S1
2005-02-17 00:00:00.000 S3
2005-02-19 00:00:00.000 S2

Can someone please help me with a SQL Statement that will return me the
3 missing records as above.

Thanks in advance for all your help!

DBA in distress!
Vishal

Jul 23 '05 #1
Share this Question
Share on Google+
4 Replies


P: n/a
[posted and mailed, please reply in news]

SQLJunkie (vs******@gmail.com) writes:
Here is an issue that has me stumped for the past few days. I have a
table called MerchTran. Among various columns, the relevant columns for
this issue are:
...
We run a daily process that inserts data in to this table everyday for
all 3 sources S1, S2, S3

Notice how some data is missing indicating the import process for that
particular source failed.
Example: Missing record
2005-02-15 00:00:00.000 S1
2005-02-17 00:00:00.000 S3
2005-02-19 00:00:00.000 S2

Can someone please help me with a SQL Statement that will return me the
3 missing records as above.


The below assumes that daily means daily and not only Monday to Friday.
It will catch if a date is missing entirely, although if the first or
last day is missing.
CREATE TABLE demo (date datetime NOT NULL,
src char(2) NOT NULL,
CONSTRAINT pk_demo PRIMARY KEY (date, src))
go
-- Creates a numbers table in a somewhat casual way that
-- is not guaranteed to work. The MAXDOP turns of parallelism
-- to improve our chances.
SELECT TOP 8000 n = identity(int, 1, 1)
INTO numbers
FROM Northwind..Orders a
CROSS JOIN Northwind..Orders b
OPTION (MAXDOP 1)
go
INSERT demo (date, src)
SELECT '2005-02-13 00:00:00.000', 'S1' UNION
SELECT '2005-02-13 00:00:00.000', 'S2' UNION
SELECT '2005-02-13 00:00:00.000', 'S3' UNION
SELECT '2005-02-14 00:00:00.000', 'S1' UNION
SELECT '2005-02-14 00:00:00.000', 'S2' UNION
SELECT '2005-02-14 00:00:00.000', 'S3' UNION
SELECT '2005-02-15 00:00:00.000', 'S2' UNION
SELECT '2005-02-15 00:00:00.000', 'S3' UNION
SELECT '2005-02-16 00:00:00.000', 'S1' UNION
SELECT '2005-02-16 00:00:00.000', 'S2' UNION
SELECT '2005-02-16 00:00:00.000', 'S3' UNION
SELECT '2005-02-17 00:00:00.000', 'S1' UNION
SELECT '2005-02-17 00:00:00.000', 'S2' UNION
SELECT '2005-02-18 00:00:00.000', 'S1' UNION
SELECT '2005-02-18 00:00:00.000', 'S2' UNION
SELECT '2005-02-18 00:00:00.000', 'S3' UNION
SELECT '2005-02-19 00:00:00.000', 'S1' UNION
SELECT '2005-02-19 00:00:00.000', 'S3'
go
SELECT *
FROM (SELECT date = dateadd(DAY, n - 1, mindate)
FROM numbers n
CROSS JOIN (SELECT mindate = MIN(date),
maxdate = MAX(date)
FROM demo) d
WHERE n BETWEEN 1 AND datediff(DAY, mindate, maxdate) + 1) AS a
CROSS JOIN (SELECT src = 'S1' UNION SELECT 'S2' UNION SELECT 'S3') AS s
WHERE NOT EXISTS (SELECT *
FROM demo
WHERE demo.date = a.date
AND demo.src = s.src)
go
DROP TABLE numbers
DROP TABLE demo
--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Jul 23 '05 #2

P: n/a
Thanks for the quick response Erland. I see what you are trying to do,
but I have data in the table starting 2003. Would be a lot of work if I
have to insert 3 records in table numbers for everyday (including
saturday and sunday) starting 2003. But I will try this nonetheless.

Thanks!

Vishal

Jul 23 '05 #3

P: n/a
SQLJunkie (vs******@gmail.com) writes:
Thanks for the quick response Erland. I see what you are trying to do,
but I have data in the table starting 2003. Would be a lot of work if I
have to insert 3 records in table numbers for everyday (including
saturday and sunday) starting 2003. But I will try this nonetheless.


Numbers need to have as many rows as there days in the timespan.

In fact, it does have to be a numbers table. I used a numbers, because
1) it was easier to compose one on the fly.
2) a numbers table is applicable other problems as well.
However, in our system we don't have a numbers - but we have a dates table,
and here is how we fill it in a safe way:
CREATE TABLE dates (
thedate aba_date NOT NULL,
CONSTRAINT pk_dates PRIMARY KEY (thedate)
)

-- Get a temptable with numbers. This is a cheap, but not 100% reliable.
-- Whence the query hint and all the checks.
SELECT TOP 80001 n = IDENTITY(int, 0, 1)
INTO #numbers
FROM sysobjects o1
CROSS JOIN sysobjects o2
CROSS JOIN sysobjects o3
CROSS JOIN sysobjects o4
OPTION (MAXDOP 1)
go
-- Make sure we have unique numbers.
CREATE UNIQUE CLUSTERED INDEX num_ix ON #numbers (n)
go
-- Verify that table does not have gaps.
IF (SELECT COUNT(*) FROM #numbers) = 80001 AND
(SELECT MIN(n) FROM #numbers) = 0 AND
(SELECT MAX(n) FROM #numbers) = 80000
BEGIN
DECLARE @msg varchar(255)

-- Insert the dates:
INSERT dates (thedate)
SELECT dateadd(DAY, n, '19800101')
FROM #numbers
WHERE dateadd(DAY, n, '19800101') < '21500101'

SELECT @msg = 'Inserted ' + ltrim(str(@@rowcount)) + ' rows into
#numbers'
PRINT @msg
END
ELSE
RAISERROR('#numbers is not contiguos from 0 to 80001!', 16, -1)
go
DROP TABLE #numbers


--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Jul 23 '05 #4

P: n/a
CREATE TABLE Readings
(collection_date DATETIME NOT NULL,
source CHAR(2) NOT NULL
CHECK(source IN ('S1', 'S2', 'S3')),
PRIMARY KEY (collection_date, source));

This is a quick way to see what you do have on the days with missing
data:

CREATE VIEW Shortdates(collection_date, source)
AS
SELECT collection_date, source
FROM Readings
GROUP BY collection_date
HAVING COUNT(*) < 3;

We can now use this view or make it into a derived table. We will also
another table, which probably exists already.

CREATE TABLE Sources (sources CHAR(2) NOT NULL PRIMARY KEY, ..);

SELECT R1.collection_date, S1.source
FROM Readings AS R1,
Sources AS S1
WHERE S1.source
NOT IN (SELECT R2.source
FROM Shortdates AS R2 -- or use a derived table
WHERE R1.collection_date
= R2.collection_date);

-- Untested

Jul 23 '05 #5

This discussion thread is closed

Replies have been disabled for this discussion.