I have a query that does a self join following by correlated subquery. I want to be able to select from "this row" or the latest row before "this row", based on the "rn" column, where column1 is populated. Its currently working using a correlated subquery as follows: - select
-
a.id,
-
coalesce(a.typeName, b.typeName) as typeName
-
FROM table1 a, table1 b
-
where a.id = b.id AND b.rn = (
-
select max(rn) from table1 where id = a.id and rn <= a.rn and column1 is not null
-
)
-
Its a bit slow though since its needs to execute the inner query for each row in the outer query. I have seen examples of converting this to a join but cant seem to get it to work, as I always need to reference the outer query for
part.
Can anyone help converting this to a non-correlated query?
thanks
OK I think I have this sorted now. The below query has gone from 17 minutes to 22 seconds on my machine: -
-- Fills in the blanks of the items table by joining each item change record
-
-- to the previous valid record for the same item, to allow retrival of last "real" price change
-
select * from (
-
select
-
a.id,
-
coalesce(a.priceChange, b.priceChange) as priceChange,
-
ROW_NUMBER() OVER (PARTITION BY a.id, a.aMonth ORDER BY b.rn desc) as rn1
-
FROM items a, items b
-
where
-
a.id = b.id AND -- Only want records that relate to same item
-
b.rn <= a.rn and -- Previous or same (ensures we get items with no change at all) records only
-
b.priceChange IS NOT NULL -- Previous record must be a "real" record
-
) tt where rn1 = 1 -- Get the latest "real" record that relates to same item. Latest is determined by the
-
-- row number column being 1 (since we ordered the partition DESC)
-
The correlated subquery has been replaced by the use of the analytical ROW_NUMBER() function. I do a self join to get all items before the "current" item, where there is a valid priceChange. The use of <= ensures that if there is no price change at all for a particular item, it will be joined to itself.
As part of the column selection, I assign each record a row number by partitioning the result set on the item id and month, ordering it so that the latest change record will be given a value of 1. This makes it possible to select the latest value using rn1 = 1 rather than selecting the max value.
There has been a huge performance increase using this strategy. It again confirms that correlated subqueries should be looked at if there is a query performance issue.
Thanks for your help, hopefully someone else will benefit from this in the future.
11 3262 NeoPa 32,556
Expert Mod 16PB
Perhaps it may be easier to understand what you're looking for if you described it in words.
OK, I used sample tables/columns in first example. There scenario is I have a table of item changes over time, basically ID, priceChange and a calculated row number. Some items may not change in a particular month and so I have a situation where I end up with a table like - ID priceChange Month rn
-
1 100 March 2010 1
-
1 null June 2010 2
-
1 200 July 2010 3
-
2 520 March 2010 1
-
2 null June 2010 2
-
I want to fill in the null amount for each item by selecting the previous "real" item price for that particular item. Hence I want to end up with - ID priceChange Month rn
-
1 100 March 2010 1
-
1 100 June 2010 2
-
1 200 July 2010 3
-
2 520 March 2010 1
-
2 520 June 2010 2
-
To do this, I do a join based on the item id being the same and select the record with largest rn value that has same ID. -
select
-
a.id,
-
coalesce(a.priceChange, b.priceChange) as priceChange
-
FROM items a, items b
-
where a.id = b.id AND b.rn = (
-
select max(rn) from items where id = a.id and rn <= a.rn
-
)
-
It works fine, just a bit slow, due to the correlated subquery. I have seen some examples of how these can be replaced with JOINs but I cant get it to fit my problem
Is that any clearer?
NeoPa 32,556
Expert Mod 16PB
Yes. Certainly.
Try this : - SELECT [ID]
-
, COALESCE(tI.Price, sI.PrevPrice) AS [Price]
-
, [Month]
-
, [RN]
-
FROM [Items] AS tI
-
LEFT OUTER JOIN
-
(
-
SELECT [Price] AS [PrevPrice]
-
FROM [Items]
-
INNER JOIN
-
(
-
SELECT [ID]
-
, MAX([RN]) AS MaxRN
-
FROM [Items]
-
GROUP BY [ID]
-
) AS sIi
-
ON tI.ID = sIi.ID
-
AND tI.RN > sIi.RN
-
) AS sI
-
ON tI.ID = sI.ID
First (logically - not in the SQL) find the RN of the previous record (The sub-sub-query sIi), then find the [Price] that matches that record (The sub-qury sI). Lastly, match this record (if found - hence LEFT OUTER JOIN) with the original and coalesce in the normal way.
Let us know how this works out for you.
Welcome to Bytes!
That doesnt run. The ON clause of the INNER JOIN is incorrect it appears: - ON tI.ID = sIi.ID AND tI.RN > sIi.RN
, since you are referring to an alias (tI) that isnt available at that point:
The multi-part identifier "tI.ID" could not be bound
NeoPa 32,556
Expert Mod 16PB
I'll try for a deeper look at this tomorrow. Sorry to get it wrong (at least so my initial tests would indicate).
Having most of my SQL experience in Jet (Access) I do sometimes lead people amiss (unintentionally of course). Preliminary tests indicate that T-SQL doesn't handle this as well as Jet does, which is something of a surprise. Normally the reverse is more likely. It may just be that I have the vocabulary wrong and it is supported but just using a different approach.
Jeez dont be apologising, any help is great!
Tweak this for now while am working on your sample data...
Good Luck!!!
~~ CK
Here you go...
Your data... -
declare @YourTable as table
-
(
-
ID int,
-
priceChange money,
-
cMonth varchar(50),
-
rn int
-
)
-
insert into @YourTable
-
select 1,100, 'March 2010', 1
-
union all
-
select 1,null,'June 2010', 2
-
union all
-
select 1,200, 'July 2010', 3
-
union all
-
select 2,520, 'March 2010', 1
-
union all
-
select 2,null,'June 2010', 2
-
-
Your SQL -
select y1.*, LastValidPrice = isnull(y1.pricechange,y2.pricechange)
-
from @YourTable y1
-
left join @YourTable y2 on y1.id = y2.id and y1.rn = y2.rn + 1
-
Happy Coding!!!
~~ CK
Hey,
Thanks for the input. However, I think my sample data wasnt quite complete. Its not always valid to use
y1.rn = y2.rn + 1, the previous "real" value may be a few rows back. This is why I used the max() function, rather than what you did. A complete sample data would be - ID priceChange Month rn
-
1 100 March 2010 1
-
1 null April 2010 2
-
1 null May 2010 3
-
1 null June 2010 4
-
1 200 July 2010 5
-
2 520 March 2010 1
-
2 null April 2010 2
-
2 null May 2010 3
-
2 300 June 2010 4
-
In this case the value to be filled in for item 1 in April, May and June would come from March i.e. 100. Therefore I cant rely on it being always the previous row.
Any further ideas?
From tweaking your link, the following seems to work as required on the "updated" sample date: -
select id, priceChange,
-
coalesce((
-
select priceChange
-
from YourTable b
-
where a.rn > b.rn and a.id = b.id and b.priceChange is not null and a.priceChange is null order by b.rn desc limit 1 ),
-
a.priceChange
-
) tmp,
-
cMonth
-
from YourTable a order by id, rn
-
Is this also not a correlated subquery, which will have similar performance issues?
OK I think I have this sorted now. The below query has gone from 17 minutes to 22 seconds on my machine: -
-- Fills in the blanks of the items table by joining each item change record
-
-- to the previous valid record for the same item, to allow retrival of last "real" price change
-
select * from (
-
select
-
a.id,
-
coalesce(a.priceChange, b.priceChange) as priceChange,
-
ROW_NUMBER() OVER (PARTITION BY a.id, a.aMonth ORDER BY b.rn desc) as rn1
-
FROM items a, items b
-
where
-
a.id = b.id AND -- Only want records that relate to same item
-
b.rn <= a.rn and -- Previous or same (ensures we get items with no change at all) records only
-
b.priceChange IS NOT NULL -- Previous record must be a "real" record
-
) tt where rn1 = 1 -- Get the latest "real" record that relates to same item. Latest is determined by the
-
-- row number column being 1 (since we ordered the partition DESC)
-
The correlated subquery has been replaced by the use of the analytical ROW_NUMBER() function. I do a self join to get all items before the "current" item, where there is a valid priceChange. The use of <= ensures that if there is no price change at all for a particular item, it will be joined to itself.
As part of the column selection, I assign each record a row number by partitioning the result set on the item id and month, ordering it so that the latest change record will be given a value of 1. This makes it possible to select the latest value using rn1 = 1 rather than selecting the max value.
There has been a huge performance increase using this strategy. It again confirms that correlated subqueries should be looked at if there is a query performance issue.
Thanks for your help, hopefully someone else will benefit from this in the future.
Sign in to post your reply or Sign up for a free account.
Similar topics
by: Jason |
last post by:
Hello All,
I have a SQL Query with multiple correlated Subqueries in it. When it
gets executed it runs rather slow due to the size of the QT table.
Does anybody have any suggestions how to alter...
|
by: Venkata C |
last post by:
Hi!
Does anyone here know of a way to goad DB2 into converting a correlated
subquery to a non-correlated one? Does DB2 ever do such a conversion?
We have a query of the form
SELECT .. FROM A...
|
by: Mike MacSween |
last post by:
tblProductions one to many to tblEvents
tblEvents contains StartDate
I want a report where the data are grouped by tblProductions.ProdID, and
sorted by the earliest date in each Production.
...
|
by: mike |
last post by:
I'm using postgresl 7.3.2 and have a query that executes very slowly.
There are 2 tables: Item and LogEvent. ItemID (an int4) is the primary key
of Item, and is also a field in LogEvent. Some...
|
by: flamingo |
last post by:
I have a query that works just fine, but because I need to use it in Business Objects and I have to use a prompt for the date, I need the initial query for the count as a correlated subquery in the...
|
by: flamingo |
last post by:
I have a query that works just fine, but because I need to use it in Business Objects and I have to use a prompt for the date, I need the initial query for the count as a correlated subquery in the...
|
by: sql_server_user |
last post by:
Hi,
I have a history table with about 400 million rows, with a unique
composite nonclustered index on two columns (object id and time period)
which is used for most of the queries into the...
|
by: steven.fafel |
last post by:
I am running 2 versions of a correlated subquery. The two version
differ slightly in design but differ tremendously in performance....if
anyone can answer this, you would be awesome.
The "bad"...
|
by: AccessHunter |
last post by:
Please Help....
I have a query with 2 tables, LegalFile and Cases.
LegalFile has the field CaseID(Number Field) and Cases has CaseNbr(Text Field).
I am trying to find entries that are in...
|
by: hergele |
last post by:
Hello all.. I am trying to convert an application which is working on Oracle to postgreSql..
I've seen a query in code something like this...
select *
from table1 a1 , table1 a2, table2 a3, ...
|
by: nemocccc |
last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
|
by: marktang |
last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
|
by: Hystou |
last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
|
by: jinu1996 |
last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...
|
by: Hystou |
last post by:
Overview:
Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows...
|
by: tracyyun |
last post by:
Dear forum friends,
With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each...
|
by: agi2029 |
last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing,...
|
by: isladogs |
last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM).
In this session, we are pleased to welcome a new...
|
by: conductexam |
last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and...
| |