By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
446,386 Members | 1,817 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 446,386 IT Pros & Developers. It's quick & easy.

Can an update statement be used for interpolating missing data?

P: n/a
Here is a small sample of data from a table of about 500 rows
(Using MSSqlserver 2000)

EntryTime Speed Gross Net
------------------ ----- -----
21:09:13.310 0 0 0
21:09:19.370 9000 NULL NULL
21:09:21.310 NULL 95 NULL
21:10:12.380 9000 NULL NULL
21:10:24.310 NULL 253 NULL
21:11:24.370 8000 NULL NULL
21:11:27.310 NULL 410 NULL
21:11:51.320 NULL 438 NULL
21:11:51.490 NULL NULL 10

After the first row, every row has only one value of the three.
I would like to replace all the NULL values with calculated
interpolations.

I can do it w/ cursors or while loops.
I could do it w/ VB (I think)

Can this be done w/ an Update statement using self joins?
What would be the best way?

The value for speed can increase or decrease over time, but can never
be < 0

Net is always less than gross, and neither can go below 0.

TIA for any helpful suggestions.

Thanks,
BM

Jul 23 '05 #1
Share this Question
Share on Google+
9 Replies


P: n/a
It's not really clear how you want to calculate the new values, but
perhaps you can look at CASE and COALESCE in Books Online. If this
doesn't help, then you should post some more information about how you
want to calculate the new values.

Simon

Jul 23 '05 #2

P: n/a
I expect it will be possible with an UPDATE and a join/subquery.

UPDATE YourTable
SET speed =
(SELECT ...
FROM YourTable
WHERE entrytime < YourTable.entrytime ...)
WHERE speed IS NULL

If you need a complete solution then explain the calculation, show the
result you want and post DDL for the table. Also, it's best to post sample
data as INSERT statements so that others can more easily test out possible
solutions. That way you'll get accurate and useful answers more quickly.
See:
http://www.aspfaq.com/etiquette.asp?id=5006

--
David Portas
SQL Server MVP
--
Jul 23 '05 #3

P: n/a
Thank you greatly for the FAQ link. I learned a lot just reading it.

The table:
if exists (select * from dbo.sysobjects where id =
object_id(N'[tblProfileTemp]') and OBJECTPROPERTY(id, N'IsUserTable') =
1)
drop table [tblProfileTemp]
GO

if not exists (select * from dbo.sysobjects where id =
object_id(N'[dbo].[tblProfileTemp]') and OBJECTPROPERTY(id,
N'IsUserTable') = 1)
BEGIN
CREATE TABLE [tblProfileTempX] (
--[Item] [int] IDENTITY (1, 1) NOT NULL ,
[Item] [int] NOT NULL ,
[EntryTime] [datetime] NULL ,
[RunTime] [numeric](12, 3) NULL ,
[Speed] [int] NULL ,
[gross] [int] NULL ,
[net] [int] NULL
) ON [PRIMARY]
END

The insert statements (the code to generate this was an education in
itself).
(10 rows should be enough) I commented out the identity contraint so
inserting w/o the column list would be possible.
INSERT INTO [tblProfileTempx] VALUES(1,'Jun 21 2005
9:09:13:310PM',0.000,0,0,0)
INSERT INTO [tblProfileTempx] VALUES(2,'Jun 21 2005
9:09:19:370PM',6.060,9000,NULL,NULL)
INSERT INTO [tblProfileTempx] VALUES(3,'Jun 21 2005
9:09:21:310PM',8.000,NULL,95,NULL)
INSERT INTO [tblProfileTempx] VALUES(4,'Jun 21 2005
9:10:12:380PM',59.070,9000,NULL,NULL)
INSERT INTO [tblProfileTempx] VALUES(5,'Jun 21 2005
9:10:24:310PM',71.000,NULL,253,NULL)
INSERT INTO [tblProfileTempx] VALUES(6,'Jun 21 2005
9:11:24:370PM',131.060,8000,NULL,NULL)
INSERT INTO [tblProfileTempx] VALUES(7,'Jun 21 2005
9:11:27:310PM',134.000,NULL,410,NULL)
INSERT INTO [tblProfileTempx] VALUES(8,'Jun 21 2005
9:11:51:320PM',158.010,NULL,438,NULL)
INSERT INTO [tblProfileTempx] VALUES(9,'Jun 21 2005
9:11:51:490PM',158.180,0,NULL,NULL)
INSERT INTO [tblProfileTempx] VALUES(10,'Jun 21 2005
9:13:51:310PM',278.000,NULL,446,NULL)

Explanation of data:
The data represents the output of a running press. Each data element
is recorded at EntryTime. RunTime represents the time elapsed since
the start, and is expressed in seconds. Gross is number of copies
printed. Net is number of copies not rejected automatically by various
defect detectors.
Desired Result:
Example:
Gross for item 1 is 0
Gross for item 2 is null
Gross for item 3 is 95

I need to replace the null in item 2 with a value that represents the
gross count for that time, assuming a constant press speed. It will
not necesarily be constant, but the error will be slight.

The formula for that value will be:

Gross2 = Gross1 + ((Gross3 - Gross1) * ((RunTime2-RunTime1) / (RunTime3
- RunTime1)))

Similar interpolations will be calculated for Net and Speed.

It gets harder where there are two or more nulls between known values.

I was working along the lines of:

Update t1
Set t1.Gross = t0.Gross + ((t2.Gross = t0.Gross) *
((t1.runtime-t0.runtime)/(t2.runtime-t0.runtime)))

from tblProfileTempX t1 inner join tblProfileTempX t0 on t0.item =
t1.item
inner join tblProfileTempX t2 on t2.item = t0.item

where t1.gross is null
and t0.EntryTime = (select Max(EntryTime) from tblProfileTempX
where gross is not null and item < t1.item)
and t2.EntryTime = (select Min(EntryTime) from tblProfileTempX
where gross is not null and item > t1.item)

I've reduce the errors to the following:
Server: Msg 170, Level 15, State 1, Line 2
Line 2: Incorrect syntax near '='.
Server: Msg 156, Level 15, State 1, Line 9
Incorrect syntax near the keyword 'and'.

Thanks in advance for your time and effort, and apologies for the group
etiquette breach,

Regards,
BM

David Portas wrote:
I expect it will be possible with an UPDATE and a join/subquery.

UPDATE YourTable
SET speed =
(SELECT ...
FROM YourTable
WHERE entrytime < YourTable.entrytime ...)
WHERE speed IS NULL

If you need a complete solution then explain the calculation, show the
result you want and post DDL for the table. Also, it's best to post sample
data as INSERT statements so that others can more easily test out possible
solutions. That way you'll get accurate and useful answers more quickly.
See:
http://www.aspfaq.com/etiquette.asp?id=5006

--
David Portas
SQL Server MVP
--


Jul 23 '05 #4

P: n/a
Groan:
Tired eyes mistook an = sign for a - sign in line two.
With that fixed, it runs, but:

0 rwos affected

Jul 23 '05 #5

P: n/a
[posted and mailed, please reply in news]

Vorpal (br****@gmail.com) writes:
Here is a small sample of data from a table of about 500 rows
(Using MSSqlserver 2000)

EntryTime Speed Gross Net
------------------ ----- -----
21:09:13.310 0 0 0
21:09:19.370 9000 NULL NULL
21:09:21.310 NULL 95 NULL
21:10:12.380 9000 NULL NULL
21:10:24.310 NULL 253 NULL
21:11:24.370 8000 NULL NULL
21:11:27.310 NULL 410 NULL
21:11:51.320 NULL 438 NULL
21:11:51.490 NULL NULL 10

After the first row, every row has only one value of the three.
I would like to replace all the NULL values with calculated
interpolations.

I can do it w/ cursors or while loops.
I could do it w/ VB (I think)

Can this be done w/ an Update statement using self joins?


Not "an", but a couple. In the below script I get the data into a temp
table with an IDENTITY column, which has a consecutive number. I then
find the next and previous row with a non-NULL value for speed, for those
rows that have a NULL value. Once I have these pointers I can make the
interpolation. There is no extrapolation for the NULL values at the end.

The number of UPDATE statements could be reduced if you have three
sets of pointer columns, but I'm not sure that is worth the pain.

The script does not include handling of Net. That is left as an exercise
to the reader. :-)

CREATE TABLE tbl (entrytime datetime NOT NULL PRIMARY KEY,
speed int NULL,
gross int NULL,
net int NULL)
go
INSERT tbl(entrytime, speed, gross, net)
SELECT '21:09:13.310', 0, 0, 0 UNION
SELECT '21:09:19.370', 9000, NULL, NULL UNION
SELECT '21:09:21.310', NULL, 95, NULL UNION
SELECT '21:10:12.380', 9000, NULL, NULL UNION
SELECT '21:10:24.310', NULL, 253, NULL UNION
SELECT '21:11:24.370', 8000, NULL, NULL UNION
SELECT '21:11:27.310', NULL, 410, NULL UNION
SELECT '21:11:51.320', NULL, 438, NULL UNION
SELECT '21:11:51.490', NULL, NULL, 10
go
CREATE TABLE #temp (ident int IDENTITY UNIQUE,
entrytime datetime NOT NULL PRIMARY KEY,
speed int NULL,
gross int NULL,
net int NULL,
prevval int NULL,
nextval int NULL)

INSERT #temp(entrytime, speed, gross, net)
SELECT entrytime, speed, gross, net
FROM tbl
ORDER BY entrytime

UPDATE #temp
SET prevval = (SELECT MAX(t2.ident)
FROM #temp t2
WHERE t2.ident < t.ident
AND t2.speed IS NOT NULL)
FROM #temp t
WHERE t.speed IS NULL

UPDATE #temp
SET nextval = (SELECT MIN(t2.ident)
FROM #temp t2
WHERE t2.ident > t.ident
AND t2.speed IS NOT NULL)
FROM #temp t
WHERE t.speed IS NULL

UPDATE t
SET speed = p.speed +
1E0 * (n.speed - p.speed) * (t.ident - t.prevval) /
(t.nextval - t.prevval)
FROM #temp t
JOIN #temp p ON t.prevval = p.ident
JOIN #temp n ON t.nextval = n.ident
WHERE t.speed IS NULL

UPDATE #temp
SET prevval = NULL, nextval = NULL
UPDATE #temp
SET prevval = (SELECT MAX(t2.ident)
FROM #temp t2
WHERE t2.ident < t.ident
AND t2.gross IS NOT NULL)
FROM #temp t
WHERE t.gross IS NULL

UPDATE #temp
SET nextval = (SELECT MIN(t2.ident)
FROM #temp t2
WHERE t2.ident > t.ident
AND t2.gross IS NOT NULL)
FROM #temp t
WHERE t.gross IS NULL

UPDATE t
SET gross = p.gross +
1E0 * (n.gross - p.gross) * (t.ident - t.prevval) /
(t.nextval - t.prevval)
FROM #temp t
JOIN #temp p ON t.prevval = p.ident
JOIN #temp n ON t.nextval = n.ident
WHERE t.gross IS NULL

UPDATE #temp
SET prevval = NULL, nextval = NULL

go
UPDATE tbl
SET speed = t.speed,
gross = t.gross,
net = t.net
FROM tbl
JOIN #temp t ON tbl.entrytime = t.entrytime
go
SELECT * FROM #temp
SELECT * FROM tbl ORDER BY entrytime
go
DROP TABLE tbl
DROP TABLE #temp


--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Jul 23 '05 #6

P: n/a
Erland Sommarskog (es****@sommarskog.se) writes:
UPDATE t
SET speed = p.speed +
1E0 * (n.speed - p.speed) * (t.ident - t.prevval) /
(t.nextval - t.prevval)
FROM #temp t
JOIN #temp p ON t.prevval = p.ident
JOIN #temp n ON t.nextval = n.ident
WHERE t.speed IS NULL


So I did not consider time. This might be better:

UPDATE t
SET speed = p.speed +
1E0 * (n.speed - p.speed) *
datediff(ms, p.entrytime, t.entrytime) /
datediff(ms, p.entrytime, n.entrytime)
FROM #temp t
JOIN #temp p ON t.prevval = p.ident
JOIN #temp n ON t.nextval = n.ident
WHERE t.speed IS NULL


--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Jul 23 '05 #7

P: n/a
I have settled on the following for now:
1. First, if the final value is a null, set it to the maximum value
for that field.
This is necessary so that the intervening values can be calculated.
This update does that:
update t1
set Gross = t0.gross
from tblProfiletemp t1, tblprofiletemp t0
where t1.gross is null
and t1.entrytime = (select max(entrytime) from tblprofiletemp)
and t0.gross = (select max(gross) from tblprofiletemp)

Once that is done, then the following update statement fills in all
intervening values with the correct value:
Update t1
set Gross=t0.Gross + ((t2.Gross - t0.Gross) *
((t1.runtime-t0.runtime)/(t2.runtime-t0.runtime)))
from tblProfiletemp t0 ,tblProfiletemp t1,tblProfiletemp t2
where t1.gross is null
and t0.entrytime = (select Max(EntryTime) from tblProfiletemp where
item < t1.item and gross is not null)
and t2.EntryTime = (select Min(EntryTime) from tblProfiletemp where
item > t1.item and gross is not null)

The reason I could not get the update statement to work before is that
I was erroneously creating self joins.

The above will work properly on the columns where the data always
increases. It may need some modification for the speed columns.

Thanks for all the input.
BM

Jul 23 '05 #8

P: n/a
Don't rely on the IDENTITY column to drive the sequence. IDENTITY is
only supposed to be an arbitrary key. EntryTime should be a better way
to do it:

UPDATE tblProfileTemp
SET gross =
(SELECT T0.gross +
((T2.gross - T0.gross) *
((tblProfileTemp.runtime-T0.runtime)/(T2.runtime-T0.runtime)))
FROM tblProfileTemp AS T0,
tblProfileTemp AS T2
WHERE T0.entrytime =
(SELECT MAX(entrytime)
FROM tblProfileTemp AS T
WHERE entrytime < tblProfileTemp.entrytime
AND gross IS NOT NULL)
AND T2.entrytime =
(SELECT MIN(entrytime)
FROM tblProfileTemp AS T
WHERE entrytime > tblProfileTemp.entrytime
AND gross IS NOT NULL))
WHERE gross IS NULL

Thanks for posting the DDL and sample. It helped.

--
David Portas
SQL Server MVP
--

Jul 23 '05 #9

P: n/a
For the speed table, I took a different approach.
Once the values for gross copy count have been inserted, then the speed
can be calculated backwards as
Speed = (Gross - Gross0)/(RunTime - RunTime0).

I changed the runtime so it is recorded in seconds, rather than
minutes, and round the speed value to the neares 100.
These two changes give a smoother graph.

Where a speed needs to be calculated from two gross values recorded
very short times apart, then the speed sometimes appears anomalously
high or low. (usually high).

The results of this are now in testing, and I'll see what feedback from
the users is before making further changes.

Thanks tremendously to all who helped.

Jul 23 '05 #10

This discussion thread is closed

Replies have been disabled for this discussion.