This is called a relational division. Here is my usual "cut & paste"

about it. You might want to get a copy of SQL FOR SMARTIES for your

desk:

Relational division is one of the eight basic operations in Codd's

relational algebra. The idea is that a divisor table is used to

partition a dividend table and produce a quotient or results table.

The quotient table is made up of those values of one column for which a

second column had all of the values in the divisor.

This is easier to explain with an example. We have a table of pilots

and the planes they can fly (dividend); we have a table of planes in

the hangar (divisor); we want the names of the pilots who can fly every

plane (quotient) in the hangar. To get this result, we divide the

PilotSkills table by the planes in the hangar.

CREATE TABLE PilotSkills

(pilot CHAR(15) NOT NULL,

plane CHAR(15) NOT NULL,

PRIMARY KEY (pilot, plane));

PilotSkills

pilot plane

=============== ==========

'Celko' 'Piper Cub'

'Higgins' 'B-52 Bomber'

'Higgins' 'F-14 Fighter'

'Higgins' 'Piper Cub'

'Jones' 'B-52 Bomber'

'Jones' 'F-14 Fighter'

'Smith' 'B-1 Bomber'

'Smith' 'B-52 Bomber'

'Smith' 'F-14 Fighter'

'Wilson' 'B-1 Bomber'

'Wilson' 'B-52 Bomber'

'Wilson' 'F-14 Fighter'

'Wilson' 'F-17 Fighter'

CREATE TABLE Hangar

(plane CHAR(15) NOT NULL PRIMARY KEY);

Hangar

plane

=============

'B-1 Bomber'

'B-52 Bomber'

'F-14 Fighter'

PilotSkills DIVIDED BY Hangar

pilot

=============== ==============

'Smith'

'Wilson'

In this example, Smith and Wilson are the two pilots who can fly

everything in the hangar. Notice that Higgins and Celko know how to

fly a Piper Cub, but we don't have one right now. In Codd's original

definition of relational division, having more rows than are called for

is not a problem.

The important characteristic of a relational division is that the CROSS

JOIN (Cartesian product) of the divisor and the quotient produces a

valid subset of rows from the dividend. This is where the name comes

from, since the CROSS JOIN acts like a multiplication operator.

Division with a Remainder

There are two kinds of relational division. Division with a remainder

allows the dividend table to have more values than the divisor, which

was Codd's original definition. For example, if a pilot can fly more

planes than just those we have in the hangar, this is fine with us.

The query can be written in SQL-89 as

SELECT DISTINCT pilot

FROM PilotSkills AS PS1

WHERE NOT EXISTS

(SELECT *

FROM Hangar

WHERE NOT EXISTS

(SELECT *

FROM PilotSkills AS PS2

WHERE (PS1.pilot = PS2.pilot)

AND (PS2.plane = Hangar.plane))) ;

The quickest way to explain what is happening in this query is to

imagine an old World War II movie where a cocky pilot has just walked

into the hangar, looked over the fleet, and announced, "There ain't no

plane in this hangar that I can't fly!" We are finding the pilots for

whom there does not exist a plane in the hangar for which they have no

skills. The use of the NOT EXISTS() predicates is for speed. Most SQL

systems will look up a value in an index rather than scan the whole

table. The SELECT * clause lets the query optimizer choose the column

to use when looking for the index.

This query for relational division was made popular by Chris Date in

his textbooks, but it is not the only method nor always the fastest.

Another version of the division can be written so as to avoid three

levels of nesting. While it is not original with me, I have made it

popular in my books.

SELECT PS1.pilot

FROM PilotSkills AS PS1, Hangar AS H1

WHERE PS1.plane = H1.plane

GROUP BY PS1.pilot

HAVING COUNT(PS1.plane ) = (SELECT COUNT(plane) FROM Hangar);

There is a serious difference in the two methods. Burn down the

hangar, so that the divisor is empty. Because of the NOT EXISTS()

predicates in Date's query, all pilots are returned from a division by

an empty set. Because of the COUNT() functions in my query, no pilots

are returned from a division by an empty set.

In the sixth edition of his book, INTRODUCTION TO DATABASE SYSTEMS

(Addison-Wesley; 1995 ;ISBN 0-201-82458-2), Chris Date defined another

operator (DIVIDEBY ... PER) which produces the same results as my

query, but with more complexity.

Exact Division

The second kind of relational division is exact relational division.

The dividend table must match exactly to the values of the divisor

without any extra values.

SELECT PS1.pilot

FROM PilotSkills AS PS1

LEFT OUTER JOIN

Hangar AS H1

ON PS1.plane = H1.plane

GROUP BY PS1.pilot

HAVING COUNT(PS1.plane ) = (SELECT COUNT(plane) FROM Hangar)

AND COUNT(H1.plane) = (SELECT COUNT(plane) FROM Hangar);

This says that a pilot must have the same number of certificates as

there planes in the hangar and these certificates all match to a plane

in the hangar, not something else. The "something else" is shown by a

created NULL from the LEFT OUTER JOIN.

Please do not make the mistake of trying to reduce the HAVING clause

with a little algebra to:

HAVING COUNT(PS1.plane ) = COUNT(H1.plane)

because it does not work; it will tell you that the hangar has (n)

planes in it and the pilot is certified for (n) planes, but not that

those two sets of planes are equal to each other.

Note on Performance

The nested EXISTS() predicates version of relational division was made

popular by Chris Date's textbooks, while the author is associated with

popularizing the COUNT(*) version of relational division. The Winter

1996 edition of DB2 ON-LINE MAGAZINE

(

http://www.db2mag.com/96011ar:htm) had an article entitled "Powerful

SQL:Beyond the Basics" by Sheryl Larsen which gave the results of

testing both methods. Her conclusion for DB2 was that the nested

EXISTS() version is better when the quotient has less than 25% of the

dividend table's rows and the COUNT(*) version is better when the

quotient is more than 25% of the dividend table.