somewhat painful join code - Microsoft SQL Server

jason

Hello.

Newbie on SQL and suffering through this.

I have two tables created as such:

drop table table1;
go
drop table table2;
go
create table table1(
name varchar(10),
code1 integer,
code2 integer,
code3 integer,
code4 integer,
code5 integer
);
go
create table table2(
code integer,
descript varchar(50)
);
go
INSERT INTO table1 VALUES ('mary',1,7,8,9 ,13)
INSERT INTO table1 VALUES ('mary',1,7,8,9 ,13)
INSERT INTO table1 VALUES ('mary',1,7,7,7 ,13)
INSERT INTO table1 VALUES ('mary',1,7,8,9 ,13)
INSERT INTO table1 VALUES ('joe',1,7,8,9, 3)
INSERT INTO table1 VALUES ('bob',1,7,8,9, 3)
INSERT INTO table1 VALUES ('larry',22,17, 18,19,113)
INSERT INTO table1 VALUES ('mary',1,3,2,9 ,13)
INSERT INTO table2 VALUES (1,'code1')
INSERT INTO table2 VALUES (3,'code3')
INSERT INTO table2 VALUES (7,'code7')
go

Table1 will have duplicate name entries and code1 - code5 could
contain any range of numeric codes, including dups in the same row or
other rows.

Table2 sorted and indexed unique by code contains only the codes I'm
interested in reporting.

I need to produce report:

name code count (of names that contain at least 1 code in Table2)

I'll go back to suffering through it now, but I thought I'd post since
I'm not exactly sure where to begin.

Thank you for any help or information!

My above email is no longer active, so please post to the list so that
everyone can benefit and your contribution lives on.

Jul 20 '05 #1

Subscribe Reply

2433

David Portas

Here's the query:

SELECT T1.name, COUNT(*)
FROM Table2 AS T2
JOIN
(SELECT name, code1
FROM Table1
UNION ALL
SELECT name, code2
FROM Table1
UNION ALL
SELECT name, code3
FROM Table1
UNION ALL
SELECT name, code4
FROM Table1
UNION ALL
SELECT name, code5
FROM Table1) AS T1(name,code)
ON T1.code = T2.code
GROUP BY T1.name

But your tables need a complete redesign. I'm afraid there's so much wrong
here and without knowing more about the situation you are modelling it's
difficult to know where to start or even whether the above solution is
meaningful. If you're new to relational databases then read up on database
design and Normalization, or get whoever's responsible to fix the design at
the earliest opportunity.

Hope this helps.

--
David Portas
------------
Please reply only to the newsgroup
--

Jul 20 '05 #2

jason

Hello David, thanks for response.

These two tables where created from flat files just to process them in
sql2000. I am new to SQL. It was not my intention to insure data
integrity or set constraints with the tables. Prior to your response,
I wrote the following and did a quick spot check on the output and it
appears to have produced what I wanted. I'm sure there's a good
explanation - but why and how is the result different from yours?

SELECT Table1.name, Table2.code, count(*) FROM Table1
INNER JOIN Table2 ON table1.code1=ta ble2.code
or table1.code2 = table2.code or table1.code3=ta ble2.code
or table1.code4=ta ble2.code or table1.code5=ta ble2.code
group by table1.name,tab le2.code
Thanks again!

Jul 20 '05 #3

David Portas

My query counts the number of occurrences of each code. Your query counts
the number of rows on which a code appears. If the same code appears
multiple times on the same row then your query will only count it once
whereas mine will count it multiple times.

--
David Portas
------------
Please reply only to the newsgroup
--

Jul 20 '05 #4

--CELKO--

>> Newbie on SQL and suffering through this. <<

Let me do a cut & paste on a trick you should consider using on this
kind of crappy denormalized data; sorting it within the row.

=========
Another trick is the Bose-Nelson sort ("A Sorting Problem" by R. C.
Bose and R. J. Nelson; Journal of the ACM, vol. 9 pages 282-296),
which I had written about in DR. DOBB'S JOURNAL back in 1985. This is
a recursive procedure that takes an integer and then generates swap
pairs for a vector of that size. A swap pair is a pair of position
numbers from 1 to (n) in the vector which need to be exchanged if they
are out of order. This swap pairs are also related to Sorting
Networks in the literature (see THE ART OF COMPUTER PROGRAMMING by
Donald Knuth, vol 3).

You are probably thinking that this method is a bit weak because the
results are only good for sorting a fixed number of items. But a
table only has a fixed number of columns, so that is not a problem in
denormalized SQL.

You can set up a sorting network that will sort five items, with the
minimal number of exchanges, nine swaps, like this:

swap (c1, c2);
swap (c4, c5);
swap (c3, c5);
swap (c3, c4);
swap (c1, c4);
swap (c1, c3);
swap (c2, c5);
swap (c2, c4);
swap (c2, c3);

You might want to deal yourself a hand of five playing cards in one
suit to see how it works. Put the cards face down on the table and
pick up the pairs, swapping them if required, then turn over the row
to see that it is in sorted order when you are done.

In theory, the minimum number of swaps needed to sort (n) items is
CEILING (log2 (n!)) and as (n) increases, this approaches
O(n*log2(n)). The Computer Science majors will remember that "Big O"
expression as the expected performance of the best sorting algorithms,
such as Quicksort. The Bose-Nelson method is very good for small
values of (n). If (n < 9) then it is perfect, actually. But as
things get bigger, Bose-Nelson approaches O(n ^ 1.585). In English,
this method is good for a fixed size list of 16 or fewer items and
goes to hell after that.

You can write a version of the Bose-Nelson procedure which will output
the SQL code for a given value of (n). The obvious direct way to do a
swap() is to write a chain of UPDATE statements. Remember that in
SQL, the SET clause assignments happen in parallel, so you can easily
write a SET clause that exchanges the two items when are out of order.
Using the above swap chain, we get this block of code:

BEGIN ATOMIC
-- swap (c1, c2);
UPDATE Foobar
SET c1 = c2, c2 = c1
WHERE c1 > c2;

-- swap (c4, c5);
UPDATE Foobar
SET c4 = c5, c5 = c4
WHERE c4 > c5;

-- swap (c3, c5);
UPDATE Foobar
SET c3 = c5, c5 = c3
WHERE c3 > c5;

-- swap (c3, c4);
UPDATE Foobar
SET c3 = c4, c4 = c3
WHERE c3 > c4;

-- swap (c1, c4);
UPDATE Foobar
SET c1 = c4, c4 = c1
WHERE c1 > c4;

-- swap (c1, c3);
UPDATE Foobar
SET c1 = c3, c3 = c1
WHERE c1 > c3;

-- swap (c2, c5);
UPDATE Foobar
SET c2 = c5, c5 = c2
WHERE c2 > c5;

-- swap (c2, c4);
UPDATE Foobar
SET c2 = c4, c4 = c2
WHERE c2 > c4;

-- swap (c2, c3);
UPDATE Foobar
SET c2 = c3, c3 = c2
WHERE c2 > c3;

END;

This fully portable, standard SQL code and it can be machine
generated. But that parallelism is useful. It is worthwhile to
combine some of the UPDATE statements. But you have to be careful not
to change the effective sequence of the swap operations.

If you look at the first two UPDATE statements, you can see that they
do not overlap. This means you could roll them into one statement
like this:

-- swap (c1, c2) AND swap (c4, c5);
UPDATE Foobar
SET c1 = CASE WHEN c1 <= c2 THEN c1 ELSE c2 END,
c2 = CASE WHEN c1 <= c2 THEN c2 ELSE c1 END,
c4 = CASE WHEN c4 <= c5 THEN c4 ELSE c5 END,
c5 = CASE WHEN c4 <= c5 THEN c5 ELSE c4 END
WHERE c4 > c5 OR c1 > c2;

The advantage of doing this is that you have to execute only one
UPDATE statement and not two. Updating a table, even on non-key
columns, usually locks the table and prevents other users from getting
to the data. If you could roll the statements into one single UPDATE,
you would have the best of all possible worlds, but I doubt that the
code would be easy to read.

=============== ==
I also cannot stand the useless table names, so let me change them as
if human beings reading the code mattered.
-- since it has no key, it is not a table!! Not 1NF either!
CREATE TABLE PeopleSkills
(name VARCHAR(10) NOT NULL,
code1 INTEGER NOT NULL,
code2 INTEGER NOT NULL,
code3 INTEGER NOT NULL,
code4 INTEGER NOT NULL,
code5 INTEGER NOT NULL);

DELETE FROM PeopleSkills;
INSERT INTO PeopleSkills VALUES ('mary', 1, 7, 8, 9, 13);
INSERT INTO PeopleSkills VALUES ('mary', 1, 7, 8, 9, 13);
INSERT INTO PeopleSkills VALUES ('mary', 1, 7, 7, 7, 13);
INSERT INTO PeopleSkills VALUES ('mary', 1, 7, 8, 9, 13);
INSERT INTO PeopleSkills VALUES ('joe', 1, 7, 8, 9, 3);
INSERT INTO PeopleSkills VALUES ('bob', 1, 7, 8, 9, 3);
INSERT INTO PeopleSkills VALUES ('larry', 22, 17, 18, 19, 113); -- 0
target codes
INSERT INTO PeopleSkills VALUES ('mary', 1, 3, 2, 9, 13);
INSERT INTO PeopleSkills VALUES ('melvin', 1, 3, 2, 9, 13); -- 2
target codes
INSERT INTO PeopleSkills VALUES ('irving', 1, 8, 2, 9, 13); -- 1
target codes

CREATE TABLE TargetCodes
(code INTEGER NOT NULL PRIMARY KEY,
descript VARCHAR(50) NOT NULL);
INSERT INTO TargetCodes VALUES (1, 'code1')
INSERT INTO TargetCodes VALUES (3, 'code3')
INSERT INTO TargetCodes VALUES (7, 'code7')

I need to produce report:

name code count (of names that contain at least 1 code in Table2)<<

This will give you just the names:

SELECT DISTINCT name
FROM PeopleSkills AS P1
WHERE code1 IN (SELECT code FROM TargetCodes)
OR code2 IN (SELECT code FROM TargetCodes)
OR code3 IN (SELECT code FROM TargetCodes)
OR code4 IN (SELECT code FROM TargetCodes)
OR code5 IN (SELECT code FROM TargetCodes);

This modification will shown you which codes each person has, with 1/0
flags. This has aneat trick with little-used SUM(DISTINCT)
construction, but you have to know what the target codes are in
advance.

SELECT name,
SUM(DISTINCT CASE WHEN 1 IN (code1, code2, code3,code4,cod e5)
THEN 1 ELSE 0 END) AS code1,
SUM(DISTINCT CASE WHEN 3 IN (code1, code2, code3,code4,cod e5)
THEN 1 ELSE 0 END) AS code3,
SUM(DISTINCT CASE WHEN 7 IN (code1, code2, code3,code4,cod e5)
THEN 1 ELSE 0 END) AS code7
FROM PeopleSkills AS P1
WHERE code1 IN (SELECT code FROM TargetCodes)
OR code2 IN (SELECT code FROM TargetCodes)
OR code3 IN (SELECT code FROM TargetCodes)
OR code4 IN (SELECT code FROM TargetCodes)
OR code5 IN (SELECT code FROM TargetCodes)
GROUP BY name;

But I think you wanted a count of how many of the code appear in each
name, which is pretty easy:

SELECT P1.name, COUNT(DISTINCT T1.code)
FROM PeopleSkills AS P1,
TargetCodes AS T1
WHERE T1.code IN (code1, code2, code3, code4, code5)
GROUP BY name;

I would do it this way, to get the zeroes:
SELECT P1.name, COUNT(DISTINCT T1.code)
FROM PeopleSkills AS P1
LEFT OUTER JOIN
TargetCodes AS T1
ON T1.code IN (code1, code2, code3, code4, code5)
GROUP BY name;

Jul 20 '05 #5

Similar topics

2448

Can you join a union

by: Tomas | last post by:

I would like to know if and how it is possible to join the *result* of a union with another table (without first doing individual joins to each part of the result that then will be combined into a union) ?? For example, I would like to do such a join, if possible, to avoid doing something like this: select tableA.col1 as c1 , tableC.col2, tableC.col3, tableC.col4, tableC.col5, ..... from tableA inner join tableC on tableA.col1 =...

MySQL Database

4972

OUTER JOIN problem

by: Matt | last post by:

Hello I have to tables ar and arb, ar holds articles and a swedish description, arb holds descriptions in other languages. I want to retreive all articles that match a criteria from ar and also display their corresponding entries in arb, but if there is NO entry in arb I still want it to show up as NULL or something, so that I can get the attention that there IS no language associated with that article.

Microsoft SQL Server

23100

LEFT OUTER JOIN's possible in DB2?

by: Ian Boyd | last post by:

i know nothing about DB2, but i'm sure this must be possible. i'm trying to get a client to create a view (which it turns out is called a "Logical" in DB2). The query needs a LEFT OUTER JOIN, but he doesn't know how to do that, or even if he can, and i don't have to time to learn DB2 from scratch right now. The following SQL Query is a trimmed sample of the full View (i.e. Logical) definition - and i would create it on an SQL based...

DB2 Database

5570

SQL92 Join and DB2

by: dunleav1 | last post by:

I have an application that uses the old join syntax instead of the SQL92 standards join syntax. I need to justify changing the code to the new standard. Is there any performance issue related to using the old syntax that are documented? Are there any other issues that I can use to justify a code upgrade?

DB2 Database

2387

Problem with Inner Join

by: Anila | last post by:

Hi Friends, My problem with Inner join is ... first i joined two tables and i got the result. after that iam trying to join one more table its giving syn tax error in JOIN condition. Here is the Query

Oracle Database

2714

LEFT JOIN Optimization

by: polycom | last post by:

Assistance needed to optimize this query SELECT SD.content_id AS Id, SD.title AS Title, CT.name AS Contenttype, PV.content_id AS SponsorId, PV.display_name AS Sponsor, CONCAT_WS("", NT.title, " - ", N.pubdate, " ") AS ListGroup,

MySQL Database

4608

left outer join/inner join statement

by: teneesh | last post by:

Here I have a code for a view that has been created by a developer on my team. I am trying to use the very same code to create a view for a different formid/quesid. But I cannot figure out how this one starts and ends. can someone please help. here's the code from the developer. SELECT a.EvalRecNo, w1.q1, w2.q2, w3.q3, w4.q4, w5.q5, w6.comment FROM (SELECT DISTINCT u.EvalRecNo FROM dbo.UData AS u...

Microsoft SQL Server

2631

Which is better(in terms of both memory & time): join or subquery

by: theGeek | last post by:

I always wonder which one of join or subquery should I be using to solve a particuar problem so that I get the correct result set in minimum time. It usually happens that I can write a query quickly using subqueries (I think I happen to think that way more easily than join way :)) but I see others making use of joins all the time. May be that they can think of a query more easily using joins. I am saying that because I don't usually see much of...

Microsoft SQL Server

4511

INNER JOIN & IS NOT NULL probs

by: MATTXtwo | last post by:

I have this store procedure to select data from table with join like this...SELECT tblPeribadi.Personel_No, tblPeribadi.Nama,tblCompany.Keterangan as Company_Code, tblPeribadi.Jawatan, tblPeribadi.Taraf_Jawatan, tblGroup.Keterangan AS Kumpulan,tblPeribadi.Gred,tblBusiness_Area.Keterangan AS Business_Area,tblCost_Center.Keterangan AS Kod_Pusat_Kos, tblPeribadi.IC_Baru, tblPeribadi.IC_Lama, ...

Microsoft SQL Server

8836

Changing the language in Windows 10

by: Hystou | last post by:

Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can effortlessly switch the default language on Windows 10 without reinstalling. I'll walk you through it. First, let's disable language synchronization. With a Microsoft account, language settings sync across devices. To prevent any complications,...

Windows Server

9575

Problem With Comparison Operator <=> in G++

by: Oralloy | last post by:

Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed. This is as boiled down as I can make it. Here is my compilation command: g++-12 -std=c++20 -Wnarrowing bit_field.cpp Here is the code in...

C / C++

9394

Maximizing Business Potential: The Nexus of Website Design and Digital Marketing

by: jinu1996 | last post by:

In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that captivates audiences and drives business growth. The Art of Business Website Design Your website is...

Online Marketing

9256

Discussion: How does Zigbee compare with other wireless protocols in smart home applications?

by: tracyyun | last post by:

Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each protocol has its own unique characteristics and advantages, but as a user who is planning to build a smart home system, I am a bit confused by the choice of these technologies. I'm particularly interested in Zigbee because I've heard it does some...

General

6803

Access Europe - Using VBA to create a class based on a table - Wed 1 May

by: isladogs | last post by:

The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules. He will explain when you may want to use classes instead of User Defined Types (UDT). For example, to manage the data in unbound forms. Adolph will...

Microsoft Access / VBA

6080

Couldn’t get equations in html when convert word .docx file to html file in C#.

by: conductexam | last post by:

I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one. At the time of converting from word file to html my equations which are in the word document file was convert into image. Globals.ThisAddIn.Application.ActiveDocument.Select();...

C# / C Sharp

4712

Trying to create a lan-to-lan vpn between two differents networks

by: TSSRALBI | last post by:

Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols. I succeeded, with both firewalls in the same network. But I'm wondering if it's possible to do the same thing, with 2 Pfsense firewalls...

Networking - Hardware / Configuration

4885

Windows Forms - .Net 8.0

by: adsilva | last post by:

A Windows Forms form does not have the event Unload, like VB6. What one acts like?

Visual Basic .NET

2223

Comprehensive Guide to Website Development in Toronto: Expert Insights from BSMN Consultancy

by: bsmnconsultancy | last post by:

In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating effective websites that not only look great but also perform exceptionally well. In this comprehensive...

General