473,836 Members | 1,586 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

Recursive Query

Bob
Hi there,

Need a little help with a certain query that's causing a lot of acid
in my stomach...

Have a table that stores sales measures for a given client. The sales
measures are stored per year and there could be multiple sales
measures every year per client. There is another field called last
update date. If there are multiple sales measures then need to select
the one that's been entered last based on this field. Also, if there
is no sales measure data for current year then I need to return the
last year's data for which sales measure has been entered. For
example: if client #1 has sales measure value of $200 for 1999 and
nothing since, then I need to return $200 for any year following 1999.

So the query would look something like this:

SELECT client_name, sm_dollars FROM <tables>

Based on the DDL at the bottom I would expect to get back: c1, 100;
c2, 200

The way I am doing it now is with correlated subqueries (3 to be
exact) that each do an aggregate and join back to the original table.
It works, but it is notoriously slow. SQL Server is scanning the
index and does a merge join which in a large query takes %95 of the
time. Here is the part of the query plan for it:

| | | | | | |--Merge
Join(Inner Join, MANY-TO-MANY
MERGE:([sales_measure].[client_id])=([sales_measure].[client_id]),RESIDUAL:(([sales_measure].[client_id]=[sales_measure].[client_id]
AND [sales_measure].[tax_year]=[sales_measure].[tax_year]) AND
[Expr1013]=[sales_measure].[last_update_dat e]))
| | | | | | |--Stream
Aggregate(GROUP BY:([sales_measure].[client_id],
[sales_measure].[tax_year])
DEFINE:([Expr1013]=MAX([sales_measure].[last_update_dat e])))
| | | | | | | |--Merge
Join(Inner Join, MERGE:([sales_measure].[client_id],
[Expr1010])=([sales_measure].[client_id], [sales_measure].[tax_year]),
RESIDUAL:([sales_measure].[client_id]=[sales_measure].[client_id] AND
[sales_measure].[tax_year]=[Expr1010]))
| | | | | | |
|--Stream Aggregate(GROUP BY:([sales_measure].[client_id])
DEFINE:([Expr1010]=MAX([sales_measure].[tax_year])))
| | | | | | | |
|--Index Scan(OBJECT:([stars_perftest].[dbo].[sales_measure].[sales_measure_i dx1]),
ORDERED FORWARD)
| | | | | | |
|--Index Scan(OBJECT:([stars_perftest].[dbo].[sales_measure].[sales_measure_i dx1]),
ORDERED FORWARD)
| | | | | | |--Index
Scan(OBJECT:([stars_perftest].[dbo].[sales_measure].[sales_measure_i dx1]),
ORDERED FORWARD)
There are two indexes on sales measure table:

sales_measure_p k - sales_measure_i d (primary key) clustered
sales_measure_i dx1 - client_id, tax_year, last_update_dat e, sm_dollars

sales_measure table has 800,000 rows in it.

Here is the rest of the DDL:

IF OBJECT_ID('dbo. client') IS NOT NULL
DROP TABLE dbo.client
GO
create table dbo.client (
client_id int identity primary key
, client_name varchar(100) NOT NULL)
GO
IF OBJECT_ID('dbo. sales_measure') IS NOT NULL
DROP TABLE dbo.sales_measu re
GO
create table dbo.sales_measu re(
sales_measure_i d int identity primary key
, client_id int NOT NULL
, tax_year smallint NOT NULL
, sm_dollars money NOT NULL
, last_update_dat e datetime NOT NULL)
GO
CREATE INDEX sales_measure_i dx1 ON sales_measure (client_id, tax_year,
last_update_dat e, sm_dollars)
GO
INSERT dbo.client(clie nt_name)
SELECT 'c1' UNION SELECT 'c2' UNION SELECT 'c3'
GO
INSERT dbo.sales_measu re(client_id, tax_year, sm_dollars,
last_update_dat e)
SELECT 1, 2004, 100, '1/4/2004'
UNION
SELECT 2, 2003, 100, '1/3/2004'
UNION
SELECT 2, 2004, 150, '1/4/2004'
UNION
SELECT 2, 2004, 200, '1/5/2004'
The view that I use to calculate sales measures:

CREATE VIEW sales_measure_v w AS
SELECT sm.*
FROM sales_measure sm
INNER JOIN (SELECT sm2.client_id, sm2.tax_year,
MAX(sm2.last_up date_date) as last_update_dat e
FROM sales_measure sm2
INNER JOIN (SELECT sm4.client_id, MAX(sm4.tax_yea r)
as tax_year
FROM sales_measure sm4 GROUP BY
sm4.client_id) sm3
on sm3.client_id = sm2.client_id
and sm3.tax_year = sm2.tax_year
GROUP BY sm2.client_id, sm2.tax_year ) sm1
ON sm.client_id = sm1.client_id AND
sm.tax_year = sm1.tax_year AND
sm.last_update_ date = sm1.last_update _date

Any advice on how to tame this would be appreciated. Also, any advice
on the indexes would help as well.

Thanks

Bob
Jul 20 '05
14 9303
Bob
Hi,

First, let me say that your time and replies are greatly appreciated.

Erland Sommarskog <so****@algonet .se> wrote in message news:<Xn******* *************** @127.0.0.1>...
Bob (bo*********@ya hoo.com) writes:
Thanks for your reply. Your suggestion with indexes did not help.
The query plan did not change for the better with the suggested
indexes in place.
But at least you should get slightly better performance with sm_dollars
out from the index.


Wouldn't SQL Server have to do a bookmark lookup if I take the dollars
out of the index? My thought was to cover the query.
I have looked at little more at your case, and I don't really have any
good suggestions right now. Query tuning when you don't have access
to the data is kind of difficult. (It can be difficult even if you have
access!)

Your query looks good; it is basically the kind of query you would write
for a situation like this one. What you could do is create a new table,
in which client_id is the primary key, and which would only hold the
most recent rows. You would maintain this table through a trigger. The
query would be faster, but updates would take a toll.
We are looking at denormalizing the table and possibly adding a flag
that marks the "current" sales measure for each client. This is an
alternative to what you are suggesting.
While you did post CREATE TABLE scripts and the query and pieces of the
query plan, I notice that the query-plan extract mentions an object which
was not in the script. Maybe you could post a script and the complete
query plan that goes with it. Could you put the plan in an zip file or
something, so that the lines does not get broken.
Actually, the script is an extreme simplification of the actual
objects involved. The top level query is about three pages long and
involves two dozen tables. The problem is that the query is not
particularly selective and potentially could return a large number of
rows. The reason I felt compelled to post this is that in the large
query 95% of the time is spend executing this particular "small" part
of the query. If I take sales measure out of the query it comes back
in 1 second. With sales measure in, it takes 17 seconds.
Also, if you could give some indications of the volume and the distribution
that would be helpful.


There are ~ 700K clients.
~ 470K clients have sales measures, most (288K) have one, hand full
have 9 (max)

Thanks

Bob
Jul 20 '05 #11
Bob (bo*********@ya hoo.com) writes:
Wouldn't SQL Server have to do a bookmark lookup if I take the dollars
out of the index? My thought was to cover the query.
True. On the other hand it makes the index broader.
We are looking at denormalizing the table and possibly adding a flag
that marks the "current" sales measure for each client. This is an
alternative to what you are suggesting.
Yes, I was thinking along those lines too, but I figured that I should
stop at one suggestion, and to get your thoughts going.
Actually, the script is an extreme simplification of the actual
objects involved. The top level query is about three pages long and
involves two dozen tables. The problem is that the query is not
particularly selective and potentially could return a large number of
rows. The reason I felt compelled to post this is that in the large
query 95% of the time is spend executing this particular "small" part
of the query. If I take sales measure out of the query it comes back
in 1 second. With sales measure in, it takes 17 seconds.
But if you run this extracted query, it still runs slow? Or is it the
interaction between this part and the rest of the query that is causing
problems?
There are ~ 700K clients.
~ 470K clients have sales measures, most (288K) have one, hand full
have 9 (max)


No small numbers, but alarmingly high.

--
Erland Sommarskog, SQL Server MVP, so****@algonet. se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Jul 20 '05 #12


jc*******@earth link.net says...

Since you cross-posted in the general database newsgroup, I gave you
Standard SQL-92 code (well, the at-sign -- aka "petite escarot" --
should have been colon).

Eh? The at sign (@) is called "robasse" in French - where did you get
this "escarot" word from? My dictionary doesn't even have it!
Paul...
--
plinehan y_a_h_o_o and d_o_t com
C++ Builder 5 SP1, Interbase 6.0.1.6 IBX 5.04 W2K Pro
Please do not top-post.

"XML avoids the fundamental question of what we should do,
by focusing entirely on how we should do it."

quote from http://www.metatorial.com
Jul 20 '05 #13
A long time ago, in a galaxy far, far away, Paul <pa**@not.a.cha nce.ie> wrote:
jc*******@earth link.net says...
Since you cross-posted in the general database newsgroup, I gave you
Standard SQL-92 code (well, the at-sign -- aka "petite escarot" --
should have been colon).

Eh? The at sign (@) is called "robasse" in French - where did you get
this "escarot" word from? My dictionary doesn't even have it!


If it had been spelled "escargot," I'd buy that the "@" looks somewhat
like a snail...
--
wm(X,Y):-write(X),write( '@'),write(Y). wm('cbbrowne',' cbbrowne.com').
http://www.ntlug.org/~cbbrowne/x.html
"Now they show you how detergents take out bloodstains, a pretty
violent image there. I think if you've got a T-shirt with a bloodstain
all over it, maybe laundry isn't your biggest problem. Maybe you
should get rid of the body before you do the wash." --Jerry Seinfeld
Jul 20 '05 #14


cb******@acm.or g says...

Paul <pa**@not.a.cha nce.ie> wrote:

Eh? The at sign (@) is called "robasse" in French - where did you get
this "escarot" word from? My dictionary doesn't even have it!
If it had been spelled "escargot," I'd buy that the "@" looks somewhat
like a snail...

Putain, mais oui (Doh). J'avais oublié que M. Celko aime s'amuser de
temps en temps!
Paul...
--
plinehan y_a_h_o_o and d_o_t com
C++ Builder 5 SP1, Interbase 6.0.1.6 IBX 5.04 W2K Pro
Please do not top-post.

"XML avoids the fundamental question of what we should do,
by focusing entirely on how we should do it."

quote from http://www.metatorial.com
Jul 20 '05 #15

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

2
3135
by: replace-this-with-my-name | last post by:
Hi. How do I return a string containing an entire menu-tree from a recursive function? Here is my current recursive function: function page_tree( $_i ){ //Call global mysql connection variable
2
7034
by: Perttu Pulkkinen | last post by:
I need to find toplevel image categories and a) number of images directly in them and b) number of subcategories directly in those topcategories. In toplevel image categories "icat_parent_id IS NULL". Below is also current erraneous draft but it has ambigous filed name problem. CREATE TABLE icategories ( icat_id int(11) NOT NULL auto_increment, icat_parent_id int(11) default NULL, icat_name char(100) default NULL, PRIMARY KEY ...
4
9276
by: Rodusa | last post by:
I am having problem to apply updates into this function below. I tried using cursor for updates, etc. but no success. Sql server keeps telling me that I cannot execute insert or update from inside a function and it gives me an option that I could write an extended stored procedure, but I don't have a clue of how to do it. To quickly fix the problem the only solution left in my case is to convert this recursive function into one recursive...
2
2927
by: Steven Burn | last post by:
..:: The Specs: MS Access 2000 (host charges extra for SQL/MySQL) MS Windows Server 2003 (prod) / MS XP SP1 (dev) ..:: The setup: The database has been setup with two tables; tblDownloads
3
4178
by: Vincenzino | last post by:
Hi, I have some problem in using SQL3 recursive queries on DB2 database system (8.1 and 8.2 UDB). I need to compute the transitive closure of a (possibly) ciclic graph using SQL3 on DB2. I MUST use pure SQL3 and I cannot use cursors, indexes and the like. The graph is represented by a set of arcs with the table "arc" having two attributes "a1" and "a2" for source and target nodes resp. The SQL3 query I'm using is the following:
3
2547
by: Dennis Haney | last post by:
Hi As far as I can tell, the pull_up_IN_clauses does not optimize recursively. Am I totally misguided here? Index: plan/subselect.c =================================================================== RCS file: /projects/cvsroot/pgsql-server/src/backend/optimizer/plan/subselect.c,v retrieving revision 1.85
3
3067
by: NatRoger | last post by:
Hi, We are using DB2 V7 (MVS) in our shop, which dont have the luxury of the 'WITH clause' - CTE support to accomplish recursive queries. Here is my problem - Table1 From-ID To-ID A1 M1 - Member A1's id changes to M1 M1 D1 - M1 changes to D1
2
5551
by: Jim Devenish | last post by:
I have a table named StockItems with field: StockItemID QuantityInStock I am creating assemblies of stock items and have another table named StockItemAssemblies with fields: StockItemAssemblyID AssemblyID StockItemID Quantity
3
2216
by: srinit | last post by:
hi i have a table like col1 col2 col3 col4 35 R 0 0 36 W 1 35 37 R 0 0 38 W 2 35,36 I am giving col1 value 35 as input
8
8320
by: Pivot_Tables | last post by:
Hi, I have created a recursive SQL Query in DB2 and it works fine until some point in the tree where the data gets into infinite loop. Below are some sample data from my relationship table. Relationship Table PARENT FIELD CHILD FIELD AAA BBB AAA CCC
0
10843
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed. This is as boiled down as I can make it. Here is my compilation command: g++-12 -std=c++20 -Wnarrowing bit_field.cpp Here is the code in...
0
10546
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that captivates audiences and drives business growth. The Art of Business Website Design Your website is...
0
10254
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each protocol has its own unique characteristics and advantages, but as a user who is planning to build a smart home system, I am a bit confused by the choice of these technologies. I'm particularly interested in Zigbee because I've heard it does some...
0
9371
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing, and deployment—without human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then launch it, all on its own.... Now, this would greatly impact the work of software developers. The idea...
1
7790
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules. He will explain when you may want to use classes instead of User Defined Types (UDT). For example, to manage the data in unbound forms. Adolph will...
0
6978
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one. At the time of converting from word file to html my equations which are in the word document file was convert into image. Globals.ThisAddIn.Application.ActiveDocument.Select();...
1
4448
by: 6302768590 | last post by:
Hai team i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated we have to send another system
2
4015
muto222
by: muto222 | last post by:
How can i add a mobile payment intergratation into php mysql website.
3
3112
bsmnconsultancy
by: bsmnconsultancy | last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating effective websites that not only look great but also perform exceptionally well. In this comprehensive...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.