By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
440,822 Members | 729 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 440,822 IT Pros & Developers. It's quick & easy.

Optimize my stored procedure

P: n/a
I'm having some problems to optimize my stored procedure (select
statement with joins)

What I'm trying to do is calculate total work.

My situation:

I have 3 tables I'm using
-Input (char destA, char destB, int work)where I have all input data
(moves from destA to destB) around 3000 rows

-Map"distancemap"(int pointA, int pointB, int distance) for all
possible locations and their distances (distances between pointA and
pointB) around 250^2= 70000 roews

-PointMap"ldm"(int pointA, char destA, char type) where I have
location pointA for destA and type means if it's from destA or to
destA
around 250 rows

Instead of using large Map table i have a fuction that calculates
distance between each 2 points.

My problem:

I need to calculate total work = moves*distance for each of 250
points. For example I take 1 row from input table, then find points
for "destA from" and "destB to" and multiply input.moves*map.distance.
Everything works fine and ittakes about a second but when I look at
the estimation execution plan one of my joins is returning over 800K
rows and another over 4M rows.
I need hel to optimize my stored procedure to the MAXperformance.

Thank You
here is my stored procedure:

SET NOCOUNT ON

--inbound
update dbo.ldm set workdist =X.suma from dbo.ldm
inner join
(
select i.inbound as inb, sum(i.moves*d.distance) as suma from
dbo.input i, dbo.distancemap d
inner join
(
select inbound, doorid as did from dbo.ldm, dbo.input
where dest = inbound and doortype = 'I' group by inbound, doorid
)F on F.did = d.fromid

inner join
(
select outbound, doorid as did from dbo.ldm, dbo.input
where dest = outbound and doortype = 'O' group by outbound, doorid
)T on T.did = d.toid

where f.inbound=i.inbound and t.outbound = i.outbound
group by i.inbound
)X on (X.inb= dest and doortype = 'I')
OPTION(MAXDOP 1)

--outbound
update dbo.ldm set workdist =X.suma from dbo.ldm
inner join
(
select i.outbound as inb, sum(i.moves*d.distance) as suma from
dbo.input i, dbo.distancemap d
inner join
(
select inbound, doorid as did from dbo.ldm, dbo.input
where dest = inbound and doortype = 'I' group by inbound, doorid
)F on F.did = d.fromid

inner join
(
select outbound, doorid as did from dbo.ldm, dbo.input
where dest = outbound and doortype = 'O' group by outbound, doorid
)T on T.did = d.toid

where f.inbound=i.inbound and t.outbound = i.outbound
group by i.outbound

)X on (X.inb= dest and doortype = 'O')
OPTION(MAXDOP 1)
Jul 20 '05 #1
Share this Question
Share on Google+
4 Replies


P: n/a
mjuricek (mj******@yahoo.com) writes:
I need to calculate total work = moves*distance for each of 250
points. For example I take 1 row from input table, then find points
for "destA from" and "destB to" and multiply input.moves*map.distance.
Everything works fine and ittakes about a second but when I look at
the estimation execution plan one of my joins is returning over 800K
rows and another over 4M rows.


I would not bother too much about the estimated execution plan. Estimates
may be right - or they may not be. Looking at the plan to see what it
does, may give a better idea of what it does. And the only way to find
out whether it is a good plan or not, is to run the query.

Anyway, it is very difficult to give suggestions, because there are
columns in the query that I can't see in the your brief table description.
And since you don't prefix all columns with aliases, I can't tell to
which table they belong.

The minimum to post for a query like this is to include CREATE TABLE and
CREATE INDEX statements for the table. Sample data may be difficult,
since you cannot post 4MB of data. Then again, you could make it available
for download (provided that you can produce a sample that does not
contain confidential information). But without data to test on, it
is difficult to be sure that suggestions for rewrites or changed
indexes will have any effect.

--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Jul 20 '05 #2

P: n/a
Hi,

I'm sorry I didn't realize hen I was writing that post that I forgot to
include "quite important" part Below there ill be create table and
create index scripts for all 3 tables.

if exists (select * from dbo.sysobjects where id =
object_id(N'[dbo].[ldm]') and OBJECTPROPERTY(id, N'IsUserTable') = 1)
drop table [dbo].[ldm]
GO

CREATE TABLE [dbo].[ldm] (
[DoorID] [smallint] NOT NULL ,
[Dest] [char] (4) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[DoorType] [char] (1) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[Segment] [char] (1) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[Workdist] [bigint] NULL
) ON [PRIMARY]
GO

CREATE UNIQUE CLUSTERED INDEX [IX_ldm] ON [dbo].[ldm]([DoorID]) ON
[PRIMARY]
GO

/****** Object: Table [dbo].[DistanceMap] Script Date: 11/5/2004
9:21:32 AM ******/
if exists (select * from dbo.sysobjects where id =
object_id(N'[dbo].[DistanceMap]') and OBJECTPROPERTY(id, N'IsUserTable')
= 1)
drop table [dbo].[DistanceMap]
GO

/****** Object: Table [dbo].[DistanceMap] Script Date: 11/5/2004
9:21:32 AM ******/
CREATE TABLE [dbo].[DistanceMap] (
[FromID] [int] NOT NULL ,
[ToID] [int] NOT NULL ,
[Distance] [bigint] NULL ,
[DistanceCopy] [int] NULL ,
[Weight] [int] NULL ,
[Connected] [bit] NULL ,
[Comment] [varchar] (80) COLLATE SQL_Latin1_General_CP1_CI_AS NULL
) ON [PRIMARY]
GO

CREATE UNIQUE CLUSTERED INDEX [IX_DistanceMap] ON
[dbo].[DistanceMap]([FromID], [ToID]) ON [PRIMARY]
GO

/****** Object: Table [dbo].[Input] Script Date: 11/5/2004 9:22:26
AM ******/
if exists (select * from dbo.sysobjects where id =
object_id(N'[dbo].[Input]') and OBJECTPROPERTY(id, N'IsUserTable') = 1)
drop table [dbo].[Input]
GO

/****** Object: Table [dbo].[Input] Script Date: 11/5/2004 9:22:26
AM ******/
CREATE TABLE [dbo].[Input] (
[Index] [int] NOT NULL ,
[Inbound] [char] (4) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,
[Outbound] [char] (4) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,
[Moves] [bigint] NOT NULL
) ON [PRIMARY]
GO

ALTER TABLE [dbo].[Input] WITH NOCHECK ADD
CONSTRAINT [PK_Input_Inbound] PRIMARY KEY CLUSTERED
(
[Index],
[Inbound]
) ON [PRIMARY]
GO

CREATE UNIQUE INDEX [IX_Input_Outbound] ON [dbo].[Input]([Index],
[Outbound]) ON [PRIMARY]
GO

I'm not very experienced with sql yet. I've been using it for about a
month. So I'm pretty sure there are still ways to optimize it. I was
using estimated plan to see what is going on and it helped me to get rid
of some overhead. My execution sped up from 1 in 40 sec to around
170/min.
I blocked second CPU, changed type so convert func didn't has to be
called, reindex and so on... But it still could be done faster.
As far as sample data. I'm not sure how can I do that. Those are just
sample data anyway they have no meaning but for me, so i'm not worried
about releasing confidential data, but how can i send it?

Thank You
Martin Juricek


*** Sent via Developersdex http://www.developersdex.com ***
Don't just participate in USENET...get rewarded for it!
Jul 20 '05 #3

P: n/a
[posted and mailed, please reply in news]

Martin Juricek (mj******@yahoo.com) writes:
I'm sorry I didn't realize hen I was writing that post that I forgot to
include "quite important" part Below there ill be create table and
create index scripts for all 3 tables.


Thanks for the tables.

It is still a very difficult task to suggest improvements, because the
query is complex, and I have very little understanding of the business
requiresments.

Nevertheless, I rewrote the UPDATE statements to use only ANSI JOIN, and
that gave me:

UPDATE ldm
SET workdist = X.suma
FROM ldm l
JOIN (SELECT i.outbound as inb, sum(i.moves*d.distance) as suma
FROM input i
JOIN (SELECT i.inbound, l.doorid
FROM ldm l
JOIN input i ON l.dest = i.inbound
WHERE l.doortype = 'I'
GROUP BY i.inbound, l.doorid) f ON f.inbound = i.inbound
JOIN (SELECT i.outbound, l.doorid
FROM ldm l
JOIn input i ON l.dest = i.outbound
WHERE l.doortype = 'O'
GROUP BY i.outbound, l.doorid) t
ON t.outbound = i.outbound
JOIN distancemap d ON f.doorid = d.fromid
AND t.doorid = d.toid
GROUP BY i.outbound) X ON X.inb = l.dest
WHERE l.doortype = 'O'

When I looked closer at this, I found the inner derived tables to be funky.
You have:

(SELECT i.inbound, l.doorid
FROM ldm l
JOIN input i ON l.dest = i.inbound
WHERE l.doortype = 'I'
GROUP BY i.inbound, l.doorid)

There is a GROUP BY, but there is no aggregate. Basically this is the
same as:

(SELECT DISTINCT i.inbound, l.doorid
FROM ldm l
JOIN input i ON l.dest = i.inbound
WHERE l.doortype = 'I')

Next funny thing is that this derived table is joined to input, so you
have a self-join from input to input that gives you a doorid. So I wrote
this simplified query:

UPDATE ldm
SET workdist = X.suma
FROM ldm l
JOIN (SELECT i.inbound AS inb, SUM(i.moves*d.distance) AS suma
FROM input i
JOIN ldm L1 ON L1.dest = i.inbound
JOIN ldm L2 ON L2.dest = i.outbound
JOIN distancemap d ON L1.doorid = d.fromid
AND L2.doorid = d.toid
WHERE L1.doortype = 'I'
AND L2.doortype = 'O'
GROUP BY i.inbound) X
ON X.inb = l.dest
WHERE l.doortype = 'I'

I have no idea whether this produces the correct result, since I don't
know whether that DISTINCT matters, and in such case what effect it has.

But it is possible that you from my observation of the seemingly
redudant inclusion of the input table that you are able to achieve
the correct simplification of the query.

It might also be that the entire table design needs an overhaul. As I said,
I know too little about this, to say anything for sure.

--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Jul 20 '05 #4

P: n/a
hi again, I'm soory a didn't reply here, but before I I got
nofitication from groups I got your email, and I just replied. I don't
know if u got my 2 files, that I sent u. Those files had those 2
stored procedures, that I'm running. I don't know what should I say
about my project for ppl to better understand what I'm doing. I'm open
to any questions. What I'm trying to do is to minimize work. There are
still no other conditions, this is for test purposes only. I was given
2 tables Input and doormap. Doormap has coordinates of each door
(talking about rectangle). From this table I created table that
carries distances for each pair fromdoor - todoor. My outputtable is
supposed to have ideal destination, based on my input table. I'm using
very simple algorithm (similar to brute force) I'm trying to "shuffle"
the doors and compute the totals in each step. You already know how
that part works. IT WORK VERY GOOD. Thanks.
The problem is the duration. For this specific example I have 226
active doors (doors that have destination) out of 264 doors available.
To try every single combination is impossible. So I'm trying to speed
it up. I used a "quality filter" that will commit only changes that
fall throught the filter. And still I've ben running this on and off
for over a week, but it's still slow. So I wanted to optimize it as
much as I can. I drop from 38 sec/1 step to 226/min (thanks to u). But
I feel there are still ways it can be optimized.
Jul 20 '05 #5

This discussion thread is closed

Replies have been disabled for this discussion.