By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
424,853 Members | 1,034 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 424,853 IT Pros & Developers. It's quick & easy.

Complex query (compare records in same table / join a 2nd table)

P: 19
Question for all you SQL gurus out there. I have a (seemingly) complex query to write, and since
I'm just learning SQL server, am not sure how to go about it.

Tables:
t_trans
t_account

All tables are bound by an SSN column ("SSN").

t_trans has the bulk of the data I need, and includes the following data:
ssn, trans_date, trans_amount, trans_code
111-11-1111, 8/9/1999, $350.00, 10
111-11-1111, 12/31/1999, $25.00, 30

t_account has the following data:
ssn, close_date
111-11-1111, 8/20/1999

I need a query to pull records from t_trans using the following criteria:

1. Show all t_trans records that are trans_code 30, ONLY if the 30 occurs after a 10,
2. Trans code 30 must include positive dollars,
3. Trans code 30 trans_date must be greater than the close_date from t_account

I'm having trouble doing this since criteria 1 above is comparing records in the same table. In other words, if there's a single SSN with four t_trans entries, one 10 and three 30's, only the 30's will show, and only if those 30's have positive dollars, and only if those 30's have trans_dates that occur after the account close date.

I'm so confuzzled :(
Jan 3 '08 #1
Share this Question
Share on Google+
5 Replies


camel
P: 55
Going to have a stab at this off top of head, but no warranties implied !

The main problem is the "must be a 10 before the 30 issue". I would get round that basically as below, by putting all the records of possible interest in a temp table with an auto generated Row ID. You can then use this Row ID to identify the records that meet the "10 followed by 30 rule" as below.

So, put your t_trans records in a temp table. Note I have just written "etc", you need to define all the columns of interest in temp table and ins statement

CREATE TABLE #Transactions (RowID int IDENTITY(1,1), SSN etc)
INSERT #Transactions(SSN etc)
SELECT SSN etc
FROM t_trans -- Probably want to limit rows added by trans_date

Now you should find something like the following picks out records you want

SELECT T *
FROM #Transactions T
JOIN ( SELECT RowID, SSN
FROM #Transactions
WHERE trans_code = 10) AS D
ON T.SSN = D.SSN
AND (T.RowID - 1) = D.RowID -- This is the key SARG to handle the rule
JOIN t_account A
ON T.SSN = A.SSN
WHERE T.trans_Code = 30
AND T.trans_date > A.close_date
AND T.trans_amount > 0
Jan 3 '08 #2

P: 19
The records I have to sort through in t_trans total over 15 million. I need to sorth through ALL those records to find the trans_code 10 and 30's, and those alone total over 6 million.

I'm hesitant to use a temp table for that since I'm not sure how much resources it will use.
Jan 3 '08 #3

P: 19
Well I was able to whittle it down to about 36,000 records to review, and it seems to be working. Thanks for your help :)
Jan 3 '08 #4

camel
P: 55
You are right, its not an option to use a temp table with this many records. Is it possible to add an identity column to the main table itself ? Without a RowID style column or a timestamp on each row I don't know any way you can use a check on value of previous record within a set based statement in the way your requirement implies is needed (and as I have illustrated). If not you have to use a cursor based approach that really will annihilate performance.

Perhaps you could mediate the performance hit by working through several subsets of rows, be that by vertically partitioning the main table or using queries targeted by trans_date. Resource wise you are going to have serious issues going after so many rows in memory in a non-set based fashion

I really wonder why you have so many active records to process ? To carry out a report of this nature on so many records sounds more like a data warehouse scenario where of course you can optimise by building a cube to suit the kind of question you need to ask. Does not strike me you would want this overhead on a production server supporting running applications at the same time....
Jan 3 '08 #5

camel
P: 55
Missed your previous post reporting some success whilst responding your reply worrying over number of records ! Glad the original reply was helpful and you know have some more ideas for approaching the problem in the future perhaps.
Jan 3 '08 #6

Post your reply

Sign in to post your reply or Sign up for a free account.