473,767 Members | 2,131 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

INNER JOIN using surrogate ID, or [Date] BETWEEN?

{CREATE TABLEs and INSERTs follow...}

Gents,

I have a main table that is in ONE-MANY with many other tables. For example, if
the main table is named A, there are these realtionships:

A-->B
A-->C
A-->D
A-->E

With one field in Common (Person). The tables B, C, D and E are History tables,
with Start and End dates. Each person has a Program history (table B, ie), an
Experience history (table C, ie), and so on...many differernt types of
histories, and it may grow from here....table F, G, etc.

The included CREATE TABLEs and INSERTs contain tables A, B and C.

The problem: Each tblCase (table A) record has a date. When joining all of the
history tables to tblCase on Person, obviously you get a cross-product of each
history unless you specify a WHERE clause that extracts one single record from
each of the histories (duh...that's the point...to extract a single record from
each history, because there can only be one value in effect at the time of the
Case.)

QUESTION: From a performance standpoint, would it behoove me to maintain the
surrogate ***HistoryID from each history table in tblCase, or, assuming the
indexes are set up properly, would a WHERE condition for each history be
sufficient? For example, the following select works as expected:

SELECT CasePerson, CaseDate, ProCode, ExpYear
FROM tblExperienceHi story INNER JOIN (tblCase INNER JOIN tblProgramHisto ry ON
tblCase.CasePer son = tblProgramHisto ry.ProPerson) ON tblCase.CasePer son =
tblExperienceHi story.ExpPerson
WHERE CaseDate BETWEEN ProStartDate and ProEndDate
AND CaseDate BETWEEN ExpStartDate and ExpEndDate

It extracts the single record from each history for each person for each case.
But I'm afraid of performace with such a scenario.

Instead, I could store each ***HistoryID in the table tblCase, and then just
join on that...no WHERE needed. But the trade-off is that I'd have to build
processes to maintain that. ("Hey, when you insert a record into tblCase, make
sure to go get each HistoryID from the History tables!" or "If the user changes
the date ranges in one of histories, make sure to update tblCase to match the
new historyID!")

Maybe a clustered index on each ***History table on Person/StartDate combined
with the WHERE clause should perform as well as a real JOIN on surrogate
integers.

It seems cheesey to have to resort to surrogate IDs...but the performance
increase might be worth it. Also, if I go that route, whenever I add a new
history table, I'd have to change the design of tblCase AND any SPs that
reference it. With the WHERE solution, I'd only have to change the SPs.

Comments are welcome! (tblCase grows at 250,000 records per year; the history
tables will increase about 1000 records per year)

DCMFAN

CREATE TABLE [dbo].[tblCase] (
[CaseID] [char] (5) CONSTRAINT [PK_tblCase] PRIMARY KEY CLUSTERED NOT NULL ,
[CaseDate] [smalldatetime] NOT NULL ,
[CasePerson] [char] (5) NOT NULL
) ON [PRIMARY]
GO

CREATE TABLE [dbo].[tblExperienceHi story] (
[ExperienceHistI D] [int] IDENTITY (1, 1) NOT NULL ,
[ExpPerson] [char] (5) NOT NULL ,
[ExpStartDate] [smalldatetime] NOT NULL ,
[ExpEndDate] [smalldatetime] NOT NULL ,
[ExpYear] [int] NOT NULL
) ON [PRIMARY]
GO

CREATE TABLE [dbo].[tblProgramHisto ry] (
[ProgramHistID] [int] IDENTITY (1, 1) NOT NULL ,
[ProPerson] [char] (5) NOT NULL ,
[ProStartDate] [smalldatetime] NOT NULL ,
[ProEndDate] [smalldatetime] NOT NULL ,
[ProCode] [int] NOT NULL
) ON [PRIMARY]
GO

INSERT INTO [tblCase]([CaseID], [CaseDate], [CasePerson])
VALUES('12345', '3/1/03', '00000')

INSERT INTO [tblCase]([CaseID], [CaseDate], [CasePerson])
VALUES('A1G34', '4/23/03', '00001')

INSERT INTO [tblExperienceHi story]([ExpPerson], [ExpStartDate], [ExpEndDate],
[ExpYear])
VALUES('00000', '1/1/03', '5/19/03', 1)

INSERT INTO [tblExperienceHi story]([ExpPerson], [ExpStartDate], [ExpEndDate],
[ExpYear])
VALUES('00000', '5/20/03', '12/31/03', 2)

INSERT INTO [tblExperienceHi story]([ExpPerson], [ExpStartDate], [ExpEndDate],
[ExpYear])
VALUES('00001', '4/20/03', '11/1/03', 0)

INSERT INTO [tblProgramHisto ry]([ProPerson], [ProStartDate], [ProEndDate],
[ProCode])
VALUES( '00000', '2/1/03', '9/30/03', '55555')

INSERT INTO [tblProgramHisto ry]([ProPerson], [ProStartDate], [ProEndDate],
[ProCode])
VALUES( '00000', '10/1/03', '5/1/04', '55555')

INSERT INTO [tblProgramHisto ry]([ProPerson], [ProStartDate], [ProEndDate],
[ProCode])
VALUES( '00001', '1/1/03', '12/31/03', '55555')

Jul 20 '05 #1
4 8106
DCM Fan (dc****@aol.com SPNOAM) writes:
With one field in Common (Person). The tables B, C, D and E are History
tables, with Start and End dates. Each person has a Program history
(table B, ie), an Experience history (table C, ie), and so on...many
differernt types of histories, and it may grow from here....table F, G,
etc.


Permit to try to understand the business rules here. Your data model
looks a little bit funny.

The way your table stands, one person can be on several cases? And these
histories are not related to the cases? That is, person 1234 can have
an experience of ZYX from 2002-12-12 to 2003-03-12. Can he also have
an experience of UHG from 2003-02-02 to 2003-08-07, or must intervals
not overlap?

If person 1234 has a case on 2003-02-27, then the experience record
that includes this date should be listed. But if the case is on
2003-08-10, and there is no experience record for this date, the
case should not be listed at all?

Maybe rather than having sparse table with start_date and end_date, you
should have history tables like:

CREATE TABLE experience_hist ory
(person char(5) NOT NULL,
date smalldatetime NOT NULL,
exp_code char(2) NULL,
CONSRAINT pk_exphist PRIMARY KEY (person, date))

The tables will be bigger, and you will have to find a way to fill
them up, but they will be very easy to use, and queries like the
one you had will be swift. And there will not be any surrogate keys.
--
Erland Sommarskog, SQL Server MVP, so****@algonet. se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Jul 20 '05 #2
<<The way your table stands, one person can be on several cases? >>

Certainly...a Case record is created every day of the year for every person.

<<And these histories are not related to the cases>>

Of course they are...sort of...on person! The histories have nothing to do with
the cases, other than the fact that the Person is stored in each. And try not
to think of Cases as "Cases in litigation."

Technically, you're correct in that the Cases are not related to teh history
tables in a RI sense...the true structure has ONE-MANY from a Person table to
each if the histories and the Case table. But the bottom line is that each Case
data record can only map back to one of the MANY history records for each
PERSON, based on Person(ID) and tblCase.DATE

Each Person has a "program history," an "experience history" and a few others.
Each of those histories has Effectvie dates: From date1 to date2, so-and-so is
in program X; they can't overlap. The key need is to extract the ONE record in
each history table at the time of the Case record.

<<But if the case is on
2003-08-10, and there is no experience record for this date, the
case should not be listed at all?>>

That's right! And that should come out in the Audit section of the
application...s howing a "gap" in the history for the Case date.

<<
CREATE TABLE experience_hist ory
(person char(5) NOT NULL,
date smalldatetime NOT NULL,
exp_code char(2) NULL,
CONSRAINT pk_exphist PRIMARY KEY (person, date))


Are you suggesting a record for every date? That is, if a person is in exp_code
XX all year there should be 365 records in there for 2003?? That would
certainly ease my main concern (speed), but add a bunch of application code to
maintain them. Hmmm...how would I even show the user what the intervals are
based on that data? I would need some other SET-based grouping mechanism to
keep each interval unique:

CREATE TABLE experience_hist ory
(person char(5) NOT NULL,
date smalldatetime NOT NULL,
exp_code char(2) NULL,
IntervalSet (type) NOT NULL
CONSRAINT pk_exphist PRIMARY KEY (person, date))

That way, I could use a GROUP BY to return the list of intervals back to the
application:

SELECT MIN([date]),MAX([date]), MAX(exp_code)
FROM experience_hist ory
GROUP BY IntervalSet
ORDER BY 1

I'll ponder that one for a little while...THANKS!

DCMFAN
Jul 20 '05 #3
DCM Fan (dc****@aol.com SPNOAM) writes:
Are you suggesting a record for every date? That is, if a person is in
exp_code XX all year there should be 365 records in there for 2003??
Yes.

This is an extreme denormalization , but simplifies programming. We did
this in our application for currency prices. The table grew large, and
we later had limit which currency pairs we keep prices for. Overall
programming was grossly simplified.
That would certainly ease my main concern (speed), but add a bunch of
application code to maintain them.
A bunch? Nah, some code. The good thing is that you only need that code
in one place, whereas you can benefit from for each new report you need.
Hmmm...how would I even show the user what the intervals are based on
that data? I would need some other SET-based grouping mechanism to keep
each interval unique:

CREATE TABLE experience_hist ory
(person char(5) NOT NULL,
date smalldatetime NOT NULL,
exp_code char(2) NULL,
IntervalSet (type) NOT NULL
CONSRAINT pk_exphist PRIMARY KEY (person, date))

That way, I could use a GROUP BY to return the list of intervals back to
the application:

SELECT MIN([date]),MAX([date]), MAX(exp_code)
FROM experience_hist ory
GROUP BY IntervalSet
ORDER BY 1


Yes, if a person can be in the same code on two distinct intervals. But
maybe rather a running number for each interval the person is in the
exp_code.

SELECT person, exp_code, MIN(date), MAX(date)
FROM experience_hist ory
GROUP BY person, exp_code, exp_code_no
ORDER BY person, 3

--
Erland Sommarskog, SQL Server MVP, so****@algonet. se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Jul 20 '05 #4
<<
Yes, if a person can be in the same code on two distinct intervals. But
maybe rather a running number for each interval the person is in the
exp_code.

SELECT person, exp_code, MIN(date), MAX(date)
FROM experience_hist ory
GROUP BY person, exp_code, exp_code_no
ORDER BY person, 3


Right on. Thanks again.
Jul 20 '05 #5

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

3
6416
by: Prem | last post by:
Hi, I am having many problems with inner join. my first problem is : 1) I want to know the precedance while evaluating query with multiple joins. eg. select Employees.FirstName, Employees.LastName, TerritoryID, Employees.EmployeeID, RegionID, ProductID from Employees
112
10357
by: Andy | last post by:
Hi All! We are doing new development for SQL Server 2000 and also moving from SQL 7.0 to SQL Server 2000. What are cons and pros for using IDENTITY property as PK in SQL SERVER 2000? Please, share your experience in using IDENTITY as PK .
2
2672
by: psuaudi | last post by:
I have a main query that I would like to call two different subqueries. In MS Access, I usually just save the two subqueries as separate queries which are then called by a third separate and main query. However, I'd like to put them all into one SQL command. Is this possible? Here are the queries: -This query calls the other two queries below- SELECT ., ., Format(((.Date)-(.Date)),"Fixed") AS , .Type FROM INNER JOIN ON (. = .) AND...
3
2362
by: chaitanya02 | last post by:
I have problem joining four tables using Inner Join in ACCESS DB, I want to join 4 tables: 1, 2, 3 , 4 ....and query data in all these to see, i get a right url associated. sql1-account id, account name sql2-account id, email id sql3-account id, contract number, maintenance start date, maintenance expiration date, download url, contact, maintenance status sq4-account name, user id, pwd first- i will check if the USER NAME and PWD I...
12
13191
by: Chamnap | last post by:
Hello, everyone I have one question about the standard join and inner join, which one is faster and more reliable? Can you recommend me to use? Please, explain me... Thanks Chamnap
0
1243
by: Ironr4ge | last post by:
Hi everyone, I was wondering whether is was possible to select only the filtered records of one form with Recordsource = "C" and open a diffrent form with recordsource "L" . The way I was thinking accompleshing this was by using the following innerjoin statement.. and code to open form... ----------------------------------------------------------------------- Private Sub cmdFilter_Click() Dim strWhere As String 'The...
0
1285
by: stanlew | last post by:
Happy New Year everyone! I'm new to both T-SQL and this forum. I'm currently doing an internship and my first task was to create a small program which will send an email detailing the sales of the previous day versus monthly targets and sales. Most of the parts were figured out and eveything was done in Visual Studio. The gist of the code was written in one large chunk of SQL code, as below: SELECT derivedtbl_1.family AS 'Family',...
11
19972
by: YZXIA | last post by:
Is there any difference between explicit inner join and implicit inner join Example of an explicit inner join: SELECT * FROM employee INNER JOIN department ON employee.DepartmentID = department.DepartmentID
1
5419
by: ewokspy | last post by:
I have 2 tables that I need to use an INNER JOIN on. tblEMPLOYEES has FIRSTNAME, LASTNAME, EMPILID, DATETRAINED, DEPT tblMAIN has EMPILID, OBSERVATIONDATE In tblMAIN, the EMPILID is not the primary key, and each ID can have several dates. I only want the last date each ID has an Observation, and then to join that with the tblEMPLOYEES data and essentially get a recordset of each employee, and their last observation date. I was playing...
0
10169
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed. This is as boiled down as I can make it. Here is my compilation command: g++-12 -std=c++20 -Wnarrowing bit_field.cpp Here is the code in...
0
10013
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that captivates audiences and drives business growth. The Art of Business Website Design Your website is...
0
9841
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each protocol has its own unique characteristics and advantages, but as a user who is planning to build a smart home system, I am a bit confused by the choice of these technologies. I'm particularly interested in Zigbee because I've heard it does some...
0
8838
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing, and deployment—without human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then launch it, all on its own.... Now, this would greatly impact the work of software developers. The idea...
1
7383
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules. He will explain when you may want to use classes instead of User Defined Types (UDT). For example, to manage the data in unbound forms. Adolph will...
0
6655
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one. At the time of converting from word file to html my equations which are in the word document file was convert into image. Globals.ThisAddIn.Application.ActiveDocument.Select();...
1
3930
by: 6302768590 | last post by:
Hai team i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated we have to send another system
2
3533
muto222
by: muto222 | last post by:
How can i add a mobile payment intergratation into php mysql website.
3
2807
bsmnconsultancy
by: bsmnconsultancy | last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating effective websites that not only look great but also perform exceptionally well. In this comprehensive...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.