473,750 Members | 2,447 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

Joins: strategy and how-to approach

My SQL acumen stems from just a couple courses, and everything since from
the trenches. Fun + angst over time.

I'm needing some advice on joins. Though I understand the basics, I'm
having problems abstracting from instances where it's easy to think about
discrete key values (. . . and studentid = 1234) to entire sets of users,
with the joins doing their work.

For example, currently I'm going nuts trying to return dates for which
attendance has not been taken for students, but should have been. Students
have active and inactive periods of enrollment in our schools, so we have a
history table of when they were active and inactive -- as well as two more
tables that layer other bounds on eligible dates (what range of dates fall
within a given school's term? What of holidays and staff institute days?).
I also have a populated calendar table, and a table where students are
identified. Finally, there's a site history table which is a REAL pain in
the butt for me to think about.
CREATE TABLE Student (
StudentID int IDENTITY(1,1) NOT NULL,
CurrentStatus varchar(2) NOT NULL)

CREATE TABLE Calendar (
Dateid int NOT NULL,
Date datetime NULL,
Workday bit NULL )

CREATE TABLE DailyAttendance (
StudentID int NOT NULL,
AttendanceDate datetime NOT NULL,
SiteID varchar(6) NOT NULL,
Attend_Status varchar(2) NOT NULL)

(the last field is, e.g., present or absent)

CREATE TABLE StudentActivity History (
StudentID int NOT NULL,
StatusStartDate datetime NOT NULL,
StatusEndDate datetime NULL,
Activity_Status varchar(2) NULL,
StudentStatusHi storyID int IDENTITY(1,1) NOT NULL)

(the activity_status is either A or I; the important records in this table
are the 'A' records. A student's most recent status record always has an
end date of '12/31/9999 12:00:00 AM', whether that's an A or I record. No
dates not between start/end dates of students' A records would need
attendance taken. students may have many periods of activity -- A records
-- as well as many inactive periods.)

CREATE TABLE SiteTerms (
SiteID varchar(6) NOT NULL,
Term varchar(3) NOT NULL,
StartOfTerm datetime NOT NULL,
Quarter varchar(2) NOT NULL,
SchoolYear varchar(9) NOT NULL,
EndOfTerm datetime NOT NULL)

(different schools vary their term start and end dates. No dates not
between term start and end dates would need attendance taken by students
assigned to and active in that school during that period.)

CREATE TABLE SiteExceptionDa ys (
SiteID varchar(6) NOT NULL,
SchoolDayStartT ime datetime NOT NULL,
SchoolDayEndTim e datetime NOT NULL,
SchoolDayType varchar(2) NOT NULL)

(there are two kinds of days -- partial attendance, and no attendance. In
short, if the type of day is "N" no attendance needs to be taken for
students assigned to that school and active on that day)

CREATE TABLE StudentSiteHist ory (
StudentID int NOT NULL,
SiteStartDate datetime NOT NULL,
SiteID varchar(6) NOT NULL,
SiteEndDate datetime NULL,
StudentSiteHist oryID int IDENTITY(1,1) NOT NULL)

(Pain. The attendance table tells which site a student was assigned when
attendance was taken. To find which school a student was a assigned to on
days attendance was NOT taken, this table's implicated 'cause it's the only
way of connecting everything else together)

Dangitall, I know this can be done but I've beat my head against the wall.
Due diligence has gotten me a headache and a hankerin' for whiskey, and I'm
not much of a drinker. Is there anyone in the group for whom this kind of
thing is a no-brainer? I'd just as soon get some tips on how to approach
this kind of thing, and figure it out myself with some guidance.

Any takers? Gotta run, dang I'm late for something.

TIA

--

Scott
Oct 5 '05 #1
4 1425
On Wed, 5 Oct 2005 17:40:58 -0500, Scott Marquardt wrote:
My SQL acumen stems from just a couple courses, and everything since from
the trenches. Fun + angst over time.

I'm needing some advice on joins. Though I understand the basics, I'm
having problems abstracting from instances where it's easy to think about
discrete key values (. . . and studentid = 1234) to entire sets of users,
with the joins doing their work.

For example, currently I'm going nuts trying to return dates for which
attendance has not been taken for students, but should have been. Students
have active and inactive periods of enrollment in our schools, so we have a
history table of when they were active and inactive -- as well as two more
tables that layer other bounds on eligible dates (what range of dates fall
within a given school's term? What of holidays and staff institute days?).
I also have a populated calendar table, and a table where students are
identified. Finally, there's a site history table which is a REAL pain in
the butt for me to think about.
(snip CREATE TABLE statements)
Dangitall, I know this can be done but I've beat my head against the wall.
Due diligence has gotten me a headache and a hankerin' for whiskey, and I'm
not much of a drinker. Is there anyone in the group for whom this kind of
thing is a no-brainer?


Hi Scott,

Are you sure that the headache is not caused by the whiskey? Or by
banging your head against the wall? (Next time, leave more room between
you and the wall when you're head-banging. Or switch from heavy metal to
something more placid - Mozart, perhaps?)

Anyway, I don't think this is a no-brainer for anyone. The number of
tables involved and the way they are related make this a tough one. And
the absence of constraints in the DDL, of INSERT statements with sample
data and expected output didn't help either (hint, hint).

Here's my attempt. It's untested. If it doesn't work as expected, then
please read www.aspfaq.com/5006 and follow the instructions carefully.

SELECT s.StudentID, c.Date
FROM Student AS s
INNER JOIN StudentActivity History AS sah
ON sah.StudentID = s.StudentID
AND sah.Activity_St atus = 'A'
INNER JOIN Calendar AS c
ON c.Date BETWEEN sah.StatusStart Date AND sah.StatusEndDa te
INNER JOIN StudentSiteHist ory AS ssh
ON ssh.StudentID = s.StudentID
AND c.Date BETWEEN ssh.SiteStartDa te AND ssh.SiteEndDate
INNER JOIN SiteTerms AS st
ON st.SiteID = ssh.SiteID
AND c.Date BETWEEN st.StartOfTerm AND st.EndOfTerm
WHERE NOT EXISTS
(SELECT *
FROM DailyAttendance AS da
WHERE da.StudentID = s.StudentID
AND da.AttendanceDa te = c.Date)
AND NOT EXISTS
(SELECT *
FROM SiteExceptionDa ys AS sed
WHERE sed.SiteID = ssh.SiteID
AND sed.SchoolDaySt artTime >= c.Date
AND sed.SchoolDaySt artTime < DATEADD(day, 1, c.Date)
AND sed.SchoolDayTy pe = 'N')

(Note: You can also convert the NOT EXISTS subqueries to LEFT OUTER
JOINS. Use this generic approach:

FROM a WHERE NOT EXISTS (SELECT * FROM b WHERE b.x = a.x)
is equivalent to
FROM a LEFT OUTER JOIN b ON b.x = a.x WHERE b.x IS NULL

Best, Hugo
--

(Remove _NO_ and _SPAM_ to get my e-mail address)
Oct 7 '05 #2
> I'm [..] trying to return dates for which attendance has not been taken for students, but should have been.

First, let's try to get the dates where attendance should have been
taken. For this step, we are only considering the StudentSiteHist ory
table, the student's CurrentStatus (I'm assuming 'A' for active) and
the working days from the Calendar table:

SELECT SSH.StudentID, SSH.SiteID, C.Date
FROM StudentSiteHist ory SSH
INNER JOIN Student S ON S.StudentID=SSH .StudentID
INNER JOIN Calendar C
ON C.Date BETWEEN SSH.SiteStartDa te AND SSH.SiteEndDate
WHERE S.CurrentStatus ='A' AND C.Workday<>0

Now, let's add the SiteTerms, SiteExceptionDa ys and
StudentActivity History tables:

SELECT SSH.StudentID, SSH.SiteID, C.Date
FROM StudentSiteHist ory SSH
INNER JOIN Student S ON S.StudentID=SSH .StudentID
INNER JOIN Calendar C
ON C.Date BETWEEN SSH.SiteStartDa te AND SSH.SiteEndDate
WHERE S.CurrentStatus ='A' AND C.Workday<>0
AND EXISTS (
SELECT * FROM SiteTerms ST
WHERE ST.SiteID=SSH.S iteID
AND C.Date BETWEEN ST.StartOfTerm AND ST.EndOfTerm
) AND NOT EXISTS (
SELECT * FROM SiteExceptionDa ys SED
WHERE SED.SiteID=SSH. SiteID AND SED.SchoolDayTy pe='N'
AND SED.SchoolDaySt artTime>=C.Date
AND SED.SchoolDaySt artTime<C.Date+ 1
) AND EXISTS (
SELECT * FROM StudentActivity History SAH
WHERE SAH.StudentID=S SH.StudentID AND SAH.Activity_St atus='A'
AND C.Date BETWEEN SAH.StatusStart Date AND SAH.StatusEndDa te
)

I have used subqueries to add these conditions, mainly because they are
more intuitive. I believe I could have used INNER JOIN-s instead of
EXISTS subqueries (adding a DISTINCT keyword or a GROUP BY), but
performance would have been worse.

In the SiteExceptionDa ys table, I'm assuming that the
SchoolDayStartT ime and SchoolDayEndTim e would store the date and the
time (unlike all the other datetime columns, which store only the date,
with a time of 0:00:00). Also I'm assuming that SchoolDayStartT ime and
SchoolDayEndTim e are always in the same day (for each row). This should
be verified with a CHECK CONSTRAINT (and it would have been useful if
you also provided all the constraints for these tables: primary keys,
foreign keys, unique constraints, check constraints).

Now, we only need to filter the days which have no attendance. Of
course, we can use a subquery (like above), but this time let's use a
LEFT JOIN:

SELECT X.StudentID, X.SiteID, X.Date
FROM (
SELECT SSH.StudentID, SSH.SiteID, C.Date
FROM StudentSiteHist ory SSH
INNER JOIN Student S ON S.StudentID=SSH .StudentID
INNER JOIN Calendar C
ON C.Date BETWEEN SSH.SiteStartDa te AND SSH.SiteEndDate
WHERE S.CurrentStatus ='A' AND C.Workday<>0
AND EXISTS (
SELECT * FROM SiteTerms ST
WHERE ST.SiteID=SSH.S iteID
AND C.Date BETWEEN ST.StartOfTerm AND ST.EndOfTerm
) AND NOT EXISTS (
SELECT * FROM SiteExceptionDa ys SED
WHERE SED.SiteID=SSH. SiteID AND SED.SchoolDayTy pe='N'
AND SED.SchoolDaySt artTime>=C.Date
AND SED.SchoolDaySt artTime<C.Date+ 1
) AND EXISTS (
SELECT * FROM StudentActivity History SAH
WHERE SAH.StudentID=S SH.StudentID AND SAH.Activity_St atus='A'
AND C.Date BETWEEN SAH.StatusStart Date AND SAH.StatusEndDa te
)
) X LEFT JOIN (
SELECT StudentID, SiteID, AttendanceDate
FROM DailyAttendance
WHERE Attend_Status=' P'
) Y
ON X.StudentID=Y.S tudentID
AND X.SiteID=Y.Site ID
AND X.Date=Y.Attend anceDate
WHERE Y.StudentID IS NULL

After I wrote these queries, I also saw Hugo's response and I was
amazed (again) how similar our queries are.

Razvan

Oct 8 '05 #3
You have tables without keys, use BIT and IDENTITY. Can you explain
what the heck a date_id would mean? Why do you think that a date is
not unique in itself? You also keep saying "fields" and
"records" so we know that you are designing a file system and not
an RDBMS.
A student's most recent status record [sic] always has an end date of '12/31/9999 12:00:00 AM' <<


This is a really bad idea for a lot of reasons. Ignoring the use of
"record", a status is an attribute and not an entity. Cleaning up just
a little, we have:

CREATE TABLE Calendar -- 10-20 years is usually enough
(cal_date DATETIME NOT NULL PRIMARY KEY,
date_type CHAR(1) NOT NULL);

CREATE TABLE Students
(student_id INTEGER NOT NULL PRIMARY KEY,
student_status CHAR(2) NOT NULL);

CREATE TABLE DailyAttendance
(student_id INTEGER NOT NULL
REFERENCES Students(studen t_id),
attendance_date DATETIME NOT NULL,
site_id VARCHAR(6) NOT NULL,
attend_code CHAR(2) NOT NULL);

History tables are keyed with the entity and the start date of an
event, not with IDENTITY unless you meant to destroy data integrity.

CREATE TABLE StudentActivity History
(student_id INTEGER NOT NULL,
status_startdat e DATETIME NOT NULL,
status_enddate DATETIME,
activity_status CHAR(2) NOT NULL,
PRIMARY KEY (student_id, status_startdat e))

CREATE TABLE StudentSiteHist ory
(student_id INTEGER NOT NULL,
site_id VARCHAR(6) NOT NULL,
site_startdate DATETIME NOT NULL,
site_enddate DATETIME,
PRIMARY KEY (student_id, site_startdate) );

Why is there not a general table like this instead?

CREATE TABLE StudentHistory
(student_id INTEGER NOT NULL,
startdate DATETIME DEFAULT CURRENT_TIMESTA MP NOT NULL,
enddate DATETIME, -- null means current
CHECK (startdate < enddate),
activity_code CHAR(2) NOT NULL
CHECK (activity_code IN (..)),
site_id VARCHAR(6) NOT NULL
REFERENCES Sites(site_id)
ON UPDATE CASCADE,
PRIMARY KEY (student_id, startdate));

Now your question is easy -- you have an event anchored in both time
and space, like it should be. And the DailyAttendance table is
redundant; the current day has (enddate IS NULL).

Oct 8 '05 #4
--CELKO-- opined thusly on Oct 8:
You have tables without keys, use BIT and IDENTITY. Can you explain
what the heck a date_id would mean?


Probably, but I won't bother. What I posted is what I inherited, not what I
designed. That said, with respect to the remaining information I'd probably
do little better myself -- or possibly worse. Bear in mind my opening
disclaimer in the post.

I can't very well turn the rudder on a rather intractable ship (the app is
pretty embedded just now, so I'm not at liberty to make the changes you
suggest). I'll try the other respondents' ideas with what I actually have,
and (for the second time in several weeks) post my progress/result. Thanks
to all.

Just now, though, I go a-bed. Dang I'm tired.

- Scott
Oct 9 '05 #5

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

3
3479
by: Ralph Freshour | last post by:
I am having a hard time with joins - my following code displays: ..member_name .gender instead of the actual data - I've been reading through my PHP and MySQL manuals - the MySQL manual tells me how to form the syntax but it is always shown in MySQL interactive mode and not using PHP code so I have to try and figure it out in PHP (I've already opened mysql and selected the database):
2
3109
by: raulgz | last post by:
Hi I work ith sql server 2000 and i need know the diferent of joins in format not ansi ( with * ) and joins in format ansi ( with 'outher join on' ). Two format work equal ???
3
1882
by: jakelake | last post by:
Help....I have a DB I'm working with that I know doesn't work with the ANSI-92 JOIN SYNTAX....I'm not sure how much this limits my ability to deal with the following situation, so I'm soliciting the help of a guru....I apologize for the lack of scripted table structure, but this database is embedded in an application that I have no true schema for. I have a crude diagram of the tables and some of the relationships, but I've managed to have...
4
2048
by: michaelnewport | last post by:
Greetings, I like to write my inner joins as below, but someone at work tells me its not as 'performant' as using the 'inner join' statement. Is this true ? Is there a better way to write it ? thanks
0
2675
by: DG | last post by:
Can all "joins" be mimicked via nested selects?
3
1659
by: orekinbck | last post by:
Hi There What are the best value practice exam engines for 70-316? How does this strategy sound: * First iteration - Go through Kalani's training guide + MSDN + http://www.codeclinic.com/70-316skills.htm + Any other online resources * Second Iteration - Go through the ExamCram book and questions * Third Iteration - Take a practice exam or two
1
2489
by: Nick | last post by:
Hi, I read somewhere recently that the strategy pattern could be used instead of using switch command on an enum (for example). How would this work? A small example would be great. Thanks, Nick
1
1393
by: prileep | last post by:
I have two tables Users and UserLogin. Here i will use two methods of table design. and which query will return me the result more fast. The table size will be large that it may contain records in lakhs. Method 1: Tables: Users ( UserID varchar(20) primary key, PassWord varchar(20)) UsersLogin(UserID varchar(20),LoginDate DateTime) Query: Select Users.UserID,UsersLogin.LoginDate from Users U inner join UsersLogin UL
1
10620
by: rneel | last post by:
i have two tables a1 and b1. they have a common column which is not having any constraints... they have null values... when i use a join between the two columns the null values are neglected... how do i include the null values in the joins... need help r -neel
0
1004
by: satishreddy | last post by:
how many types of joins are there in oracle
0
8839
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can effortlessly switch the default language on Windows 10 without reinstalling. I'll walk you through it. First, let's disable language synchronization. With a Microsoft account, language settings sync across devices. To prevent any complications,...
0
9584
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed. This is as boiled down as I can make it. Here is my compilation command: g++-12 -std=c++20 -Wnarrowing bit_field.cpp Here is the code in...
0
9397
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that captivates audiences and drives business growth. The Art of Business Website Design Your website is...
0
8264
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing, and deployment—without human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then launch it, all on its own.... Now, this would greatly impact the work of software developers. The idea...
1
6810
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules. He will explain when you may want to use classes instead of User Defined Types (UDT). For example, to manage the data in unbound forms. Adolph will...
0
6081
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one. At the time of converting from word file to html my equations which are in the word document file was convert into image. Globals.ThisAddIn.Application.ActiveDocument.Select();...
1
3327
by: 6302768590 | last post by:
Hai team i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated we have to send another system
2
2807
muto222
by: muto222 | last post by:
How can i add a mobile payment intergratation into php mysql website.
3
2226
bsmnconsultancy
by: bsmnconsultancy | last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating effective websites that not only look great but also perform exceptionally well. In this comprehensive...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.