473,320 Members | 1,719 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,320 software developers and data experts.

Is this possible with SQL?

Hello,

I need to somehow relate certain records together in a table and I am
wondering if this is possible in SQL at all. Consider the table
definition below. I need to find all pairs that have associations
whether through id1 or id2. For the sample data set, the result should
be two groupings: (1, 3, 5) and (2,4). The result could be in any
format (rows and/or columns) as long as there is some indication of the
groupings so I can print out a report.
Here is the sql to create a test case with a small sample of the data:

create table pairs (id1 integer, id2 integer);

insert into pairs values (1, 1);
insert into pairs values (1, 5);

insert into pairs values (3, 3);
insert into pairs values (3, 5);
insert into pairs values (5, 1);

insert into pairs values (5, 3);
insert into pairs values (5, 5);

insert into pairs values (2, 4);

Nov 19 '06 #1
12 1814

de*******@yahoo.com wrote:
Hello,

I need to somehow relate certain records together in a table and I am
wondering if this is possible in SQL at all. Consider the table
definition below. I need to find all pairs that have associations
whether through id1 or id2. For the sample data set, the result should
be two groupings: (1, 3, 5) and (2,4). The result could be in any
format (rows and/or columns) as long as there is some indication of the
groupings so I can print out a report.
In general I would say no, this is not *practically* possible in SQL.
What you are describing is an unrestricted graph and sql does not
handle this very well. See for example this article:

* Maintaining the transitive closure of graphs in SQL.
Guozhu Dong, Leonid Libkin, Jianwen Su and Limsoon Wong.
Int. Journal of Information Technology, 5 (1999), 46-78.

http://citeseer.ist.psu.edu/cache/pa...aintaining.pdf

Especially chapter 4: Transitive closure of arbitrary directed graphs
If you could restrict your graph somehow, there might be easier
solutions lurking around. Trees for example are quite easy to handle,
and there are a number of well known methods.

/Lennart

Nov 19 '06 #2
>I need to somehow relate certain records [sic] together in a table and I am wondering if this is possible in SQL at all. Consider the table definition below. I need to find all pairs that have associations whether through id1 or id2. <<

Can I assume that your DDL should have been:

CREATE TABLE Pairs
(id1 INTEGER NOT NULL,
id2 INTEGER NOT NULL,
PRIMARY KEY(id1, id2));

Can you define what you mean by "association" ? I sse that {1,3,5} all
are in a group with (id1 = 5) , but since you do not have a (2,2) pair,
how did you get {2,4} ?

Nov 20 '06 #3
Lennart wrote:
In general I would say no, this is not *practically* possible in SQL.
What you are describing is an unrestricted graph and sql does not
handle this very well. See for example this article:

* Maintaining the transitive closure of graphs in SQL.
Guozhu Dong, Leonid Libkin, Jianwen Su and Limsoon Wong.
Int. Journal of Information Technology, 5 (1999), 46-78.

http://citeseer.ist.psu.edu/cache/pa...aintaining.pdf

Especially chapter 4: Transitive closure of arbitrary directed graphs

Thanks a lot for the reference. I am just reading it now and will see
what assumptions we can make about the data to simplify the solution.

Nov 20 '06 #4
--CELKO-- wrote:
I need to somehow relate certain records [sic] together in a table and I am wondering if this is possible in SQL at all. Consider the table definition below. I need to find all pairs that have associations whether through id1 or id2. <<

Can I assume that your DDL should have been:

CREATE TABLE Pairs
(id1 INTEGER NOT NULL,
id2 INTEGER NOT NULL,
PRIMARY KEY(id1, id2));

Yes, that is right. I did not include the constraints for sake of
brevity.

Can you define what you mean by "association" ? I sse that {1,3,5} all
are in a group with (id1 = 5) , but since you do not have a (2,2) pair,
how did you get {2,4} ?
Good observation! This is my mistake. The full set should also
include records (2,2), (4,2) and (4,4).

As a general rule, each record is always associated to itself. What I
mean by association is based on a pretty loose definition: if there is
a record with (x, y) in the table, then x is associated to y, and y is
associated to x. The challenge is to identify all the chains of
associations in the whole table.

Thanks!

Nov 20 '06 #5

de*******@yahoo.com wrote:
Hello,

I need to somehow relate certain records together in a table and I am
wondering if this is possible in SQL at all. Consider the table
definition below. I need to find all pairs that have associations
whether through id1 or id2. For the sample data set, the result should
be two groupings: (1, 3, 5) and (2,4). The result could be in any
format (rows and/or columns) as long as there is some indication of the
groupings so I can print out a report.
Here is the sql to create a test case with a small sample of the data:

create table pairs (id1 integer, id2 integer);

insert into pairs values (1, 1);
insert into pairs values (1, 5);

insert into pairs values (3, 3);
insert into pairs values (3, 5);
insert into pairs values (5, 1);

insert into pairs values (5, 3);
insert into pairs values (5, 5);

insert into pairs values (2, 4);
Isn't this just a recursive query?

B.

Nov 20 '06 #6

de*******@yahoo.com wrote:
--CELKO-- wrote:
[...]
Can you define what you mean by "association" ? I sse that {1,3,5} all
are in a group with (id1 = 5) , but since you do not have a (2,2) pair,
how did you get {2,4} ?

Good observation! This is my mistake. The full set should also
include records (2,2), (4,2) and (4,4).
Why is that necessary? If the graph is undirected why not store each
edge once, and remove the loops (x,x)? With your sample data that would
leave us with:

insert into pairs values (1, 5);
insert into pairs values (3, 5);
insert into pairs values (2, 4);

A constraint like CHECK ( id1 < id2 ) could be of help.

By storing both directions and the loops you will have to verify that
for each (x,y) you also have (x,x) (y,y) and (y,x).

Just my 2 euro thoughts

/Lennart

[...]

Nov 20 '06 #7

Lennart wrote:
[...]
A constraint like CHECK ( id1 < id2 ) could be of help.
Note that it doesnt prevent one from adding, say (1,3) to Pairs

/Lennart

Nov 20 '06 #8
Can we assume that you can get rid of identity-type pairs, e.g., {1,1},
as well as one of any reciprocals, e.g., remove either {3,5} or {5,3}?

While you're mulling that over, you might want to take a look at a
posting of mine in this forum titled "Trees, recursion, and grouping,"
in which I was asking in essence the same question.

Some very helpful queries were proferred, as was a clarification by Joe
that I was not dealing with a tree, but rather with a forest, as you
are (after de-duping as I suggest above, you are left w/ two
independent root nodes: 1 and 2).

In a follow-up to this posting, I will post the query I ended up with
(as soon as I track it down :-).

--Jeff

de*******@yahoo.com wrote:
Hello,

I need to somehow relate certain records together in a table and I am
wondering if this is possible in SQL at all. Consider the table
definition below. I need to find all pairs that have associations
whether through id1 or id2. For the sample data set, the result should
be two groupings: (1, 3, 5) and (2,4). The result could be in any
format (rows and/or columns) as long as there is some indication of the
groupings so I can print out a report.
Here is the sql to create a test case with a small sample of the data:

create table pairs (id1 integer, id2 integer);

insert into pairs values (1, 1);
insert into pairs values (1, 5);

insert into pairs values (3, 3);
insert into pairs values (3, 5);
insert into pairs values (5, 1);

insert into pairs values (5, 3);
insert into pairs values (5, 5);

insert into pairs values (2, 4);
Nov 21 '06 #9
Great paper--thanks for sharing it!

--Jeff

Lennart wrote:
de*******@yahoo.com wrote:
Hello,

I need to somehow relate certain records together in a table and I am
wondering if this is possible in SQL at all. Consider the table
definition below. I need to find all pairs that have associations
whether through id1 or id2. For the sample data set, the result should
be two groupings: (1, 3, 5) and (2,4). The result could be in any
format (rows and/or columns) as long as there is some indication of the
groupings so I can print out a report.

In general I would say no, this is not *practically* possible in SQL.
What you are describing is an unrestricted graph and sql does not
handle this very well. See for example this article:

* Maintaining the transitive closure of graphs in SQL.
Guozhu Dong, Leonid Libkin, Jianwen Su and Limsoon Wong.
Int. Journal of Information Technology, 5 (1999), 46-78.

http://citeseer.ist.psu.edu/cache/pa...aintaining.pdf

Especially chapter 4: Transitive closure of arbitrary directed graphs
If you could restrict your graph somehow, there might be easier
solutions lurking around. Trees for example are quite easy to handle,
and there are a number of well known methods.

/Lennart
Nov 21 '06 #10
OK--here's the gist of what I came up with (with much help from those
who replied to my own posting):

WITH
PAIRS(ID_A, ID_B) AS
(
VALUES
(1,5),
(5,3),
(2,4)
),
PAIR_TREE(LEVEL, ID_A, ID_B)
AS
(
SELECT
DISTINCT 1,
PRS1.ID_A,
PRS1.ID_A
FROM
PAIRS PRS1
WHERE
NOT EXISTS
(
SELECT
1
FROM
PAIRS PRS2
WHERE
PRS2.ID_B = PRS1.ID_A
)
UNION ALL
SELECT
PT.LEVEL + 1,
PT.ID_A,
PRS.ID_B
FROM
PAIRS PRS,
PAIR_TREE PT
WHERE
PRS.ID_A = PT.ID_B
)
SELECT
DENSE_RANK() OVER(ORDER BY PT.ID_A),
CASE
LEVEL
WHEN
1
THEN
PT.ID_A
ELSE
PT.ID_B
END
FROM
PAIR_TREE PT
ORDER BY
PT.ID_A

Note that I actually had to abandon this whole effort, as my forests
were not mutually exclusive with respect to nodes. In other words,
where you have two separate trees--{1,3,5} and {2,4}, I might have had
{1,3,5}, {2,4}, and {4,5}.

Note also that you might get into trouble with infinite recursion if
you have something like {3,5}, {5,1}, {5,3}, as 3 points to 5 which
points to 3 which points to.... You get the idea. The way around this
is usually to prime your recursive query with a starting value (or
hard-code a level/boundary), but I'm not sure you can do this as it
looks like you want your query to explore all trees on its own.

I'm no discrete math expert, but bottome-line is you should be OK if 1)
your trees don't share nodes and 2) your leaf nodes doint point back to
branches or the root (i.e., your graph is acyclic).

This is really a fascinating area of query writing. Thanks for posing
your question, and best of luck with your query.

--Jeff

de*******@yahoo.com wrote:
Hello,

I need to somehow relate certain records together in a table and I am
wondering if this is possible in SQL at all. Consider the table
definition below. I need to find all pairs that have associations
whether through id1 or id2. For the sample data set, the result should
be two groupings: (1, 3, 5) and (2,4). The result could be in any
format (rows and/or columns) as long as there is some indication of the
groupings so I can print out a report.
Here is the sql to create a test case with a small sample of the data:

create table pairs (id1 integer, id2 integer);

insert into pairs values (1, 1);
insert into pairs values (1, 5);

insert into pairs values (3, 3);
insert into pairs values (3, 5);
insert into pairs values (5, 1);

insert into pairs values (5, 3);
insert into pairs values (5, 5);

insert into pairs values (2, 4);
Nov 21 '06 #11
jefftyzzer wrote:
OK--here's the gist of what I came up with (with much help from those
who replied to my own posting):

WITH
PAIRS(ID_A, ID_B) AS
(
VALUES
(1,5),
(5,3),
(2,4)
),
PAIR_TREE(LEVEL, ID_A, ID_B)
AS
(
SELECT
DISTINCT 1,
PRS1.ID_A,
PRS1.ID_A
FROM
PAIRS PRS1
WHERE
NOT EXISTS
(
SELECT
1
FROM
PAIRS PRS2
WHERE
PRS2.ID_B = PRS1.ID_A
)
UNION ALL
SELECT
PT.LEVEL + 1,
PT.ID_A,
PRS.ID_B
FROM
PAIRS PRS,
PAIR_TREE PT
WHERE
PRS.ID_A = PT.ID_B
)
SELECT
DENSE_RANK() OVER(ORDER BY PT.ID_A),
CASE
LEVEL
WHEN
1
THEN
PT.ID_A
ELSE
PT.ID_B
END
FROM
PAIR_TREE PT
ORDER BY
PT.ID_A

Note that I actually had to abandon this whole effort, as my forests
were not mutually exclusive with respect to nodes. In other words,
where you have two separate trees--{1,3,5} and {2,4}, I might have had
{1,3,5}, {2,4}, and {4,5}.

Note also that you might get into trouble with infinite recursion if
you have something like {3,5}, {5,1}, {5,3}, as 3 points to 5 which
points to 3 which points to.... You get the idea. The way around this
is usually to prime your recursive query with a starting value (or
hard-code a level/boundary), but I'm not sure you can do this as it
looks like you want your query to explore all trees on its own.

I'm no discrete math expert, but bottome-line is you should be OK if 1)
your trees don't share nodes and 2) your leaf nodes doint point back to
branches or the root (i.e., your graph is acyclic).

This is really a fascinating area of query writing. Thanks for posing
your question, and best of luck with your query.

--Jeff
If you search, you'll see that Knut had a suggestion about storing the
path to detect recursion.

B.

Nov 21 '06 #12
>As a general rule, each record is always associated to itself. What I mean by association is based on a pretty loose definition: if there is a record [sic] with (x, y) in the table, then x is associated to y, and y is associated to x. The challenge is to identify all the chains of associations in the whole table. <<

Look up Cogito at www.cogitoinc.com. They make a graph analytic tool
for this kind of problem. I did part of a white paper for them in
which I used SQL to solve the "Six degrees of Kevin Bacon" problem.
Yes, it can be done; but much like making a size 26 thong, it shoudl
not be attempted with real data.

Nov 21 '06 #13

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

4
by: Julia Briggs | last post by:
I am struggling to create a PHP function that would take a specified image (JPG, GIF or PNG) from a link, and resize it down to a thumbnail so it will always fit in a 200x250 space. I am hoping...
36
by: rbt | last post by:
Say I have a list that has 3 letters in it: I want to print all the possible 4 digit combinations of those 3 letters: 4^3 = 64 aaaa
20
by: CHIN | last post by:
Hi all.. here s my problem ( maybe some of you saw me on other groups, but i cant find the solution !! ) I have to upload a file to an external site, so, i made a .vbs file , that logins to...
7
by: Andrzej | last post by:
Is it possible to call a function which name is given by a string? Let assume that I created a program which call some functions for example void f1(void), void f2(void), void f3(void). ...
2
by: Bhupesh Naik | last post by:
This is a query regarding my problem to make a spell and grammar check possible in text area of a web page. We have aspx pages which are used to construct letters. The browser based screens...
1
by: AAA | last post by:
hi, I'll explain fastly the program that i'm doing.. the computer asks me to enter the cardinal of a set X ( called "dimX" type integer)where X is a table of one dimension and then to fill it...
25
by: Piotr Nowak | last post by:
Hi, Say i have a server process which listens for some changes in database. When a change occurs i want to refresh my page in browser by notyfinig it. I do not want to refresh my page i.e....
4
by: RSH | last post by:
Okay my math skills aren't waht they used to be... With that being said what Im trying to do is create a matrix that given x number of columns, and y number of possible values i want to generate...
7
by: Robert S. | last post by:
Searching some time now for documents on this but still did not find anything about it: Is it possible to replace the entry screen of MS Office Access 2007 - that one presenting that default...
14
by: bjorklund.emil | last post by:
Hello pythonistas. I'm a newbie to pretty much both programming and Python. I have a task that involves writing a test script for every possible combination of preference settings for a software...
0
by: DolphinDB | last post by:
Tired of spending countless mintues downsampling your data? Look no further! In this article, you’ll learn how to efficiently downsample 6.48 billion high-frequency records to 61 million...
0
by: ryjfgjl | last post by:
ExcelToDatabase: batch import excel into database automatically...
0
isladogs
by: isladogs | last post by:
The next Access Europe meeting will be on Wednesday 6 Mar 2024 starting at 18:00 UK time (6PM UTC) and finishing at about 19:15 (7.15PM). In this month's session, we are pleased to welcome back...
1
isladogs
by: isladogs | last post by:
The next Access Europe meeting will be on Wednesday 6 Mar 2024 starting at 18:00 UK time (6PM UTC) and finishing at about 19:15 (7.15PM). In this month's session, we are pleased to welcome back...
0
by: jfyes | last post by:
As a hardware engineer, after seeing that CEIWEI recently released a new tool for Modbus RTU Over TCP/UDP filtering and monitoring, I actively went to its official website to take a look. It turned...
1
by: Defcon1945 | last post by:
I'm trying to learn Python using Pycharm but import shutil doesn't work
1
by: Shællîpôpï 09 | last post by:
If u are using a keypad phone, how do u turn on JavaScript, to access features like WhatsApp, Facebook, Instagram....
0
by: af34tf | last post by:
Hi Guys, I have a domain whose name is BytesLimited.com, and I want to sell it. Does anyone know about platforms that allow me to list my domain in auction for free. Thank you
0
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 3 Apr 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome former...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.