473,399 Members | 3,038 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,399 software developers and data experts.

Extracting and joining header from denormalized table

Hello,

I am currently working on a monthly load process with a datamart. I
originally designed the tables in a normalized fashion with the idea
that I would denormalize as needed once I got an idea of what the
performance was like. There were some performance problems, so the
decision was made to denormalize. Now the users are happy with the
query performance, but loading the tables is much more difficult.
Specifically...

There were two main tables, a header table and a line item table. These
have been combined into one table. For my loads I still receive them as
separate files though. The problem is that I might receive a line item
for a header that began two months ago. When this happens I don't get a
header record in the current month's file - I just get the record in
the line items file. So now I have to join the header and line item
tables in my staging database to get the denormalized rows, but I also
may have to get header information from my main denormalized table
(~150 million rows). For simplicity I will only include the primary
keys and one other column to represent the rest of the row below. The
tables are actually very wide.

Staging database:

CREATE TABLE dbo.Claims (
CLM_ID BIGINT NOT NULL,
CLM_DATA VARCHAR(100) NULL )

CREATE TABLE dbo.Claim_Lines (
CLM_ID BIGINT NOT NULL,
LN_NO SMALLINT NOT NULL,
CLM_LN_DATA VARCHAR(100) NULL )

Target database:

CREATE TABLE dbo.Target (
CLM_ID BIGINT NOT NULL,
LN_NO SMALLINT NOT NULL,
CLM_DATA VARCHAR(100) NULL,
CLM_LN_DATA VARCHAR(100) NULL )

I can either pull back all of the necessary header rows from the target
table to the claims table and then do one insert using a join between
claims and claim lines into the target table OR I can do one insert
with a join between claims and claim lines and then a second insert
with a join between claim lines and target for those lines that weren't
already added.

Some things that I've tried:

INSERT INTO Staging.dbo.Claims (CLM_ID, CLM_DATA)
SELECT DISTINCT T.CLM_ID, T.CLM_DATA
FROM Staging.dbo.Claim_Lines CL
LEFT OUTER JOIN Staging.dbo.Claims C ON C.CLM_ID = CL.CLM_ID
INNER JOIN Target.dbo.Target T ON T.CLM_ID = CL.CLM_ID
WHERE C.CLM_ID IS NULL

INSERT INTO Staging.dbo.Claims (CLM_ID, CLM_DATA)
SELECT T.CLM_ID, T.CLM_DATA
FROM Staging.dbo.Claim_Lines CL
LEFT OUTER JOIN Staging.dbo.Claims C ON C.CLM_ID = CL.CLM_ID
INNER JOIN Target.dbo.Target T ON T.CLM_ID = CL.CLM_ID
WHERE C.CLM_ID IS NULL
GROUP BY T.CLM_ID, T.CLM_DATA

INSERT INTO Staging.dbo.Claims (CLM_ID, CLM_DATA)
SELECT DISTINCT T.CLM_ID, T.CLM_DATA
FROM Target.dbo.Target T
INNER JOIN (SELECT CL.CLM_ID
FROM Staging.dbo.Claim_Lines CL
LEFT OUTER JOIN Staging.dbo.Claims C ON C.CLM_ID =
CL.CLM_ID
WHERE C.CLM_ID IS NULL) SQ ON SQ.CLM_ID = T.CLM_ID

I've also used EXISTS and IN in various queries. No matter which method
I use, the query plans tend to want to do a clustered index scan on the
target table (actually a partitioned view partitioned by year). The
number of headers that were in the target but not the header file this
month was about 42K out of 1M.

So.... any other ideas on how I can set up a query to get the distinct
headers from the denormalized table? Right now I'm considering using
worktables if I can't figure anything else out, but I don't know if
that will really help me much either.

I'm not looking for a definitive answer here, just some ideas that I
can try.

Thanks,
-Tom.

Nov 30 '05 #1
2 1964
Thomas R. Hummel (to********@hotmail.com) writes:
I've also used EXISTS and IN in various queries. No matter which method
I use, the query plans tend to want to do a clustered index scan on the
target table (actually a partitioned view partitioned by year). The
number of headers that were in the target but not the header file this
month was about 42K out of 1M.


Did you try using an index hint?

But wait, you can't force an index on the view, as the view does not
have any indexes. And maybe there lies the problem. What happens if
you instead join directly to the underlying tables?
--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx
Dec 1 '05 #2
Thanks for the suggestion Erland. I would have to join to a union of
the tables or perform multiple insert statements using NOT EXISTS (or
equivalent LEFT JOIN logic) to make sure that I only add each row once.
I will try to set this up and run some tests on it, but in the meantime
I found another solution that is adequate. I've generated a table
using:

SELECT CLM_ID, MIN(LN_NO)
FROM Target.dbo.Target
GROUP BY CLM_ID

Using this table I created a view on the full set of UNION'd tables
joined to this so that it only returns one row per claim (the one with
the minimum LN_NO). At least for what I was doing this gave performance
that was acceptable until I have time to further optimize it.

Thanks,
-Tom.

Dec 2 '05 #3

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

4
by: Job Lot | last post by:
Is there anyway of Joining two or more DataTable with similar structure? I have three DataTables with following structures Data, AmountB/F, Repayments, InterestCharged and AmountC/F i want...
1
by: Cognizance | last post by:
Hi gang, I'm an ASP developer by trade, but I've had to create client side scripts with JavaScript many times in the past. Simple things, like validating form elements and such. Now I've been...
2
by: James | last post by:
Can anyone please shed some light on the following... I have a framework that uses dynamically created tables, named using an incremental "attribute set ID", as follows: attrdata_1 attrdata_2...
3
by: Alfred | last post by:
Hi I would like to extract only 15 records at a time from the backend in alfabetic order. Click on a button and then the next 15. Reason data must come over a 56k modem. The data is not...
2
by: Dirtyweeker | last post by:
Hi, I have a database which records fitness test results of pupils. There are the usual name fields and then a series of fields holding results, e.g. field BP1 and field BP2; each of these...
1
by: estark | last post by:
Hello, Does anyone know the advantages and disadvantages to denormalized databases? And perhaps, why a denormalized database would be chosen over a normalized database? Thank you! Elizabeth
4
by: Debbiedo | last post by:
My software program outputs an XML Driving Directions file that I need to input into an Access table (although if need be I can import a dbf or xls) so that I can relate one of the fields...
13
by: lenygold via DBMonster.com | last post by:
I found this problem on ORACLE board. 2 input TABLES: Items Id ItemName 1 Phone 2 Table 3 Lamp 4 TV 5 Stereo
2
by: Steve Holden | last post by:
Fredrik Lundh wrote: Thanks so much, Fredrik. The reason I asked is because I found the specification completely opaque ... regards Steve -- Steve Holden +1 571 484 6266 +1 800 494...
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
0
by: emmanuelkatto | last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel
1
by: nemocccc | last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...
0
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
0
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
0
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...
0
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each...
0
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.