469,625 Members | 1,687 Online

# transitive closure of a graph

Transitive closure (TC) of a graph is

( select tail, head from Edges
union all
select e.tail, ee.head from Edges e, TransClosedEdges ee
)
select distinct * from TransClosedEdges

except that this query might never terminate. I was surprised to learn
from Graeme Birchall's DB2 cookbook that there is no really
satisfactory solution to this problem. Graeme suggests building a path
and searching for repeated node entry in the path. Is there a more
satisfactory approach that doesn't require string parsing? I understand
that the problem of finding loops is undecideable in general case, but,
cmon, we are taking about very narrowly scoped transitive closure
problem here.

BTW, TC with goofy oracle syntax is just 3 lines

select connect_by_root(tail), tail
from Edges
connect by nocycle prior head = tail

with cycles being automatically taken care of.

Nov 12 '05 #1
9 2519

Mikito Harakiri wrote:
Transitive closure (TC) of a graph is

( select tail, head from Edges
union all
select e.tail, ee.head from Edges e, TransClosedEdges ee
)
select distinct * from TransClosedEdges

except that this query might never terminate.

Oops, it's actually quite obvious

( select tail, head from Edges
union all
select e.tail, ee.head from Edges e, TransClosedEdges ee
)
select distinct * from TransClosedEdges

Graeme handles more general case, hence the complication.

Nov 12 '05 #2
Mikito Harakiri wrote:
Mikito Harakiri wrote:
Transitive closure (TC) of a graph is

( select tail, head from Edges
union all
select e.tail, ee.head from Edges e, TransClosedEdges ee
)
select distinct * from TransClosedEdges

except that this query might never terminate.

Oops, it's actually quite obvious

( select tail, head from Edges
union all
select e.tail, ee.head from Edges e, TransClosedEdges ee
)
select distinct * from TransClosedEdges

Graeme handles more general case, hence the complication.

There will be a new developerWorks article on mapping CONnECT BY to DB2
tomorrow (if all goes well).

Cheers
Serge
--
Serge Rielau
DB2 SQL Compiler Development
IBM Toronto Lab
Nov 12 '05 #3

Serge Rielau wrote:
Mikito Harakiri wrote:
Mikito Harakiri wrote:
Transitive closure (TC) of a graph is

( select tail, head from Edges
union all
select e.tail, ee.head from Edges e, TransClosedEdges ee
)
select distinct * from TransClosedEdges

except that this query might never terminate.

Oops, it's actually quite obvious

( select tail, head from Edges
union all
select e.tail, ee.head from Edges e, TransClosedEdges ee
)
select distinct * from TransClosedEdges

Graeme handles more general case, hence the complication.

There will be a new developerWorks article on mapping CONnECT BY to DB2
tomorrow (if all goes well).

There is July 2003 article by Torsten Steinbach. I guess your article
would address 10g features, cycle handling first of all. Looking

Interestingly, that recursive "with" was brought up in past flame wars
"A" vs. "B". (As if technical excellence really matters:-). It's
amaising what difference a tiny standard uncomplience makes -- I mean
"union all" instead of "union". Suddenly, transitive closure for graph
with cycles is not expressible anymore.

Nov 12 '05 #4
Mikito Harakiri wrote:
Transitive closure (TC) of a graph is

( select tail, head from Edges
union all
select e.tail, ee.head from Edges e, TransClosedEdges ee
)
select distinct * from TransClosedEdges

except that this query might never terminate. I was surprised to learn
from Graeme Birchall's DB2 cookbook that there is no really
satisfactory solution to this problem. Graeme suggests building a path
and searching for repeated node entry in the path. Is there a more
satisfactory approach that doesn't require string parsing? I understand
that the problem of finding loops is undecideable in general case, but,
cmon, we are taking about very narrowly scoped transitive closure
problem here.

BTW, TC with goofy oracle syntax is just 3 lines

select connect_by_root(tail), tail
from Edges
connect by nocycle prior head = tail

with cycles being automatically taken care of.

I wonder if the non-termination problem is a result of the sql syntax
rather than the basic relationships.

I don't know sql well enough to express an alternative solution but it
seems to me that the basic 'parts explosion' problem, if we ignore part
counts and part levels, doesn't require recursion.

Ie., if the starting table can be viewed as a set of pairs of nodes then
each row represents an arc between adjacent nodes. Say this this
table's columns are named A and B, where A and B stand, respectively,
say for 'part' and 'sub-part'. Ignoring issues of optimization we could
take the product of this table and a copy of itself where A and B have
been renamed to C and D. This table might be theoretically big but it
would be finite. It would contain a row for every possible pair of arcs
in the graph. I would think we need only to select pairs of arcs that
are connected, either because the original table says they are or
because the tail of the LHS arc matches the head of the RHS arc. I'd
think this would give all sub-parts of every part because there could be
no possible arc pair that wasn't in the product.

The simple closure of the original table would be the union of the rows
where B = C and the rows where {B,C} match the original {A,B} rows with
the original column names now being represented by the A and D columns.
That whole mess unioned with the original table would give all the
derived pairs.

That's how I worked some small examples out on paper, even ones that had
loops and multiple graphs. I imagine if filtering was applied as the
product was built, then a large cross-product intermediate result could
be avoided. Also, simple (uncompounded) part counts would be possible,
but not levels as far as I can see. Maybe compound counts and levels
could be produced from the simple closure, after the fact.

cheers,
paul c.

Nov 12 '05 #5
Mikito Harakiri wrote:
Serge Rielau wrote:
Mikito Harakiri wrote:
Mikito Harakiri wrote:
Transitive closure (TC) of a graph is

( select tail, head from Edges
union all
select e.tail, ee.head from Edges e, TransClosedEdges ee
)
select distinct * from TransClosedEdges

except that this query might never terminate.
Oops, it's actually quite obvious

( select tail, head from Edges
union all
select e.tail, ee.head from Edges e, TransClosedEdges ee
)
select distinct * from TransClosedEdges

Graeme handles more general case, hence the complication.

There will be a new developerWorks article on mapping CONnECT BY to DB2
tomorrow (if all goes well).

There is July 2003 article by Torsten Steinbach. I guess your article
would address 10g features, cycle handling first of all. Looking

Interestingly, that recursive "with" was brought up in past flame wars
"A" vs. "B". (As if technical excellence really matters:-). It's
amaising what difference a tiny standard uncomplience makes -- I mean
"union all" instead of "union". Suddenly, transitive closure for graph
with cycles is not expressible anymore.

LEVEL pseudo column
The CONNECT_BY_ROOT expression
The SYS_CONNECT_BY_PATH() procedure
ORDER SIBLINGS BY expression
The NOCYCLE keyword
CONNECT_BY_ISCYCLE
CONNECT_BY_ISLEAF

Unless I missed something it's a complete and generic mapping up to
Oracle 10gR2.

Cheers
Serge
--
Serge Rielau
DB2 SQL Compiler Development
IBM Toronto Lab
Nov 12 '05 #6

Serge Rielau wrote:
Mikito Harakiri wrote:
Serge Rielau wrote:
Mikito Harakiri wrote:

Mikito Harakiri wrote:
>Transitive closure (TC) of a graph is
>
>( select tail, head from Edges
> union all
> select e.tail, ee.head from Edges e, TransClosedEdges ee
>)
>select distinct * from TransClosedEdges
>
>except that this query might never terminate.
Oops, it's actually quite obvious

( select tail, head from Edges
union all
select e.tail, ee.head from Edges e, TransClosedEdges ee
)
select distinct * from TransClosedEdges

Graeme handles more general case, hence the complication.
There will be a new developerWorks article on mapping CONnECT BY to DB2
tomorrow (if all goes well).

There is July 2003 article by Torsten Steinbach. I guess your article
would address 10g features, cycle handling first of all. Looking

Interestingly, that recursive "with" was brought up in past flame wars
"A" vs. "B". (As if technical excellence really matters:-). It's
amaising what difference a tiny standard uncomplience makes -- I mean
"union all" instead of "union". Suddenly, transitive closure for graph
with cycles is not expressible anymore.

LEVEL pseudo column
The CONNECT_BY_ROOT expression
The SYS_CONNECT_BY_PATH() procedure
ORDER SIBLINGS BY expression
The NOCYCLE keyword
CONNECT_BY_ISCYCLE
CONNECT_BY_ISLEAF

Unless I missed something it's a complete and generic mapping up to
Oracle 10gR2.

How do you implement depth first traversal without a UDF ?

Cheers
Serge
--
Serge Rielau
DB2 SQL Compiler Development
IBM Toronto Lab

Nov 12 '05 #7
vc wrote:
Serge Rielau wrote:
Mikito Harakiri wrote:
Serge Rielau wrote:
Mikito Harakiri wrote:
>Mikito Harakiri wrote:
>
>
>
>>Transitive closure (TC) of a graph is
>>
>>( select tail, head from Edges
>>union all
>>select e.tail, ee.head from Edges e, TransClosedEdges ee
>>)
>>select distinct * from TransClosedEdges
>>
>>except that this query might never terminate.
>
>
>Oops, it's actually quite obvious
>
>( select tail, head from Edges
> where tail <> head --*******
> union all
> select e.tail, ee.head from Edges e, TransClosedEdges ee
> and e.tail <> ee.head --*******
>)
>select distinct * from TransClosedEdges
>
>Graeme handles more general case, hence the complication.
>

There will be a new developerWorks article on mapping CONnECT BY to DB2
tomorrow (if all goes well).
There is July 2003 article by Torsten Steinbach. I guess your article
would address 10g features, cycle handling first of all. Looking

Interestingly, that recursive "with" was brought up in past flame wars
"A" vs. "B". (As if technical excellence really matters:-). It's
amaising what difference a tiny standard uncomplience makes -- I mean
"union all" instead of "union". Suddenly, transitive closure for graph
with cycles is not expressible anymore.

LEVEL pseudo column
The CONNECT_BY_ROOT expression
The SYS_CONNECT_BY_PATH() procedure
ORDER SIBLINGS BY expression
The NOCYCLE keyword
CONNECT_BY_ISCYCLE
CONNECT_BY_ISLEAF

Unless I missed something it's a complete and generic mapping up to
Oracle 10gR2.

How do you implement depth first traversal without a UDF ?

Where did I say I excluded UDFs? I like UDF :-)
Cheers
Serge

--
Serge Rielau
DB2 SQL Compiler Development
IBM Toronto Lab
Nov 12 '05 #8

Serge Rielau wrote:
Mikito Harakiri wrote:
Mikito Harakiri wrote:
Transitive closure (TC) of a graph is

( select tail, head from Edges
union all
select e.tail, ee.head from Edges e, TransClosedEdges ee
)
select distinct * from TransClosedEdges

except that this query might never terminate.

Oops, it's actually quite obvious

( select tail, head from Edges
union all
select e.tail, ee.head from Edges e, TransClosedEdges ee
)
select distinct * from TransClosedEdges

Graeme handles more general case, hence the complication.

There will be a new developerWorks article on mapping CONnECT BY to DB2
tomorrow (if all goes well).

See your article. Unfortunately, understanding code in C is beyond my
capabilities. What is "binary-encoded path"? Suppose we have

Name Sal
------------ ---
KING 100
--> SMITH 50
-----> JONES 20
-----> JAMES 30

What is the binary-encoded path for JAMES node is?

Nov 12 '05 #9
Mikito Harakiri wrote:
Serge Rielau wrote:
Mikito Harakiri wrote:
Mikito Harakiri wrote:
Transitive closure (TC) of a graph is

( select tail, head from Edges
union all
select e.tail, ee.head from Edges e, TransClosedEdges ee
)
select distinct * from TransClosedEdges

except that this query might never terminate.
Oops, it's actually quite obvious

( select tail, head from Edges
union all
select e.tail, ee.head from Edges e, TransClosedEdges ee
)
select distinct * from TransClosedEdges

Graeme handles more general case, hence the complication.

There will be a new developerWorks article on mapping CONnECT BY to DB2
tomorrow (if all goes well).

See your article. Unfortunately, understanding code in C is beyond my
capabilities. What is "binary-encoded path"? Suppose we have

Name Sal
------------ ---
KING 100
--> SMITH 50
-----> JONES 20
-----> JAMES 30

What is the binary-encoded path for JAMES node is?

Teh binary encoded path is a concatenation of INTEGERs.
So in "SOURCE" I number the input with ROW_NUMBER() OVER().
Assuming the rows get numbered 1. KING, 2. SMITH, 3. JONES, 4. James.
With 256 Bytes / 4 Bytes Integers => 64 Levels
So James is be:
0x000000010000000200000004
King is:
0x000000010

Cheers
Serge
--
Serge Rielau
DB2 SQL Compiler Development
IBM Toronto Lab
Nov 12 '05 #10

### This discussion thread is closed

Replies have been disabled for this discussion.