I am doing a contract job for a client with
some existing messy data.
I am trying to find a solution to a problem of
many-to-many mappings between two tables in
a database. See below for my contrived example.
Tables a and b each have simple primary keys.
They also have 2 columns that are related to
each other. In other words, a1 and b1 carry the
same "key" type of data. Same thing with a2 and
b2. I need to locate all of the "groups" of records
in both table a and table b where there is
more than one record in both tables. Man, this is
even difficult to state...
DROP DATABASE IF EXISTS many;
CREATE DATABASE many;
USE many;
CREATE TABLE a
(
id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
a1 INT NOT NULL,
INDEX(a1),
a2 INT NOT NULL,
INDEX(a2)
);
CREATE TABLE b
(
id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
b1 INT NOT NULL,
INDEX(a1),
b2 INT NOT NULL,
INDEX(a2)
);
INSERT INTO a (a1,a2) VALUES (1,8);
INSERT INTO a (a1,a2) VALUES (1,2);
INSERT INTO a (a1,a2) VALUES (2,1);
INSERT INTO a (a1,a2) VALUES (1,8);
INSERT INTO a (a1,a2) VALUES (1,1);
INSERT INTO a (a1,a2) VALUES (1,1);
INSERT INTO a (a1,a2) VALUES (4,4);
INSERT INTO a (a1,a2) VALUES (1,2);
INSERT INTO a (a1,a2) VALUES (1,2);
INSERT INTO b (b1,b2) VALUES (1,2);
INSERT INTO b (b1,b2) VALUES (2,1);
INSERT INTO b (b1,b2) VALUES (1,8);
INSERT INTO b (b1,b2) VALUES (1,8);
INSERT INTO b (b1,b2) VALUES (3,3);
INSERT INTO b (b1,b2) VALUES (1,2);
OK, so in my contrived example, I would like to
retrieve groups as follows:
group: table a: #1, #4; table b: #3, #4;
group: table a: #2, #8, #9; table b: #1, #6;
Hopefully, you follow my notation.
Is there a way to do this in straight SQL? If not,
how would you do it? The tables are huge, with 10's
of millions of records in each table.
Thanks in advance.
Dean Hoover