Nested Coalescing possible in SQL?

First of all, I apologize if coalescing is not the right term to
describe my problem. I have a tree where each node has the same set of
attributes (is the same entity) but child nodes should inherit
attribute values from parent node.

for example, say I have the following table:

(nodeId int, color varchar, phone varchar) with four rows

5, "GREEN", "555-1212"
7, NULL, "777-5555"
8, NULL, NULL
9, "BLUE", NULL

In addition there is a tree structure that specifies that node 5 is
the parent of node 7, and 7 is the parent of nodes 8 and 9. I know
there are many ways to make trees in SQL but as a simple example let's
say the tree is:

id, parentid
8, 7
9, 7
7, 5

Thus in this case, node 7 inherits the value "GREEN" from node 5 for
attribute "color", but provides its own value "777-5555" for attribute
"phone". Node 8, in turn, inherits "GREEN" for "color" from node 7
(really from node 5 since 7 did not specify its own) and "777-5555"
for "phone" from node 7. Node 9 provides its own value for "color" and
inherits the one for "phone" from Node 7.

So the runtime values in the application are:

Node 5: "GREEN", "555-1212"
Node 7: "GREEN", "777-5555"
Node 8: "GREEN", "777-5555"
Node 9: "BLUE", "777-5555"

Question 1: Is there a single SQL statement that for a given node can
replace the NULLs with inherited values from the parent node?

Question 2: Is there a better way to structure such data in SQL as to
make answer to question 1 possible?

I would restate the problem as follows:

In a nested structure child nodes inherit values from parent nodes _by
reference_ or specify their own. "By reference" is the key word here.
If it wasn't for that you could just duplicate the necessary values
from the parent entity upon creation.

Thanks!

- Jeff
Jul 20 '05 #1

"Jeff Lanfield" <jl***********@yahoo.com> wrote in message
news:23**************************@posting.google.c om...
[...]

It looks easy: find the closest node in the chain of ancestors whose
property is not NULL.
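
A minimal sketch of that idea outside the database, assuming the example rows are held in two plain Python dictionaries (the names PARENT, DATA and inherited are made up for illustration). It climbs the parent chain and stops at the first node that sets the attribute:

```python
# Hypothetical in-memory copy of the example data from the question.
PARENT = {7: 5, 8: 7, 9: 7}          # child -> parent; node 5 is the root
DATA = {
    5: {"color": "GREEN", "phone": "555-1212"},
    7: {"color": None, "phone": "777-5555"},
    8: {"color": None, "phone": None},
    9: {"color": "BLUE", "phone": None},
}


def inherited(node, attr):
    """Return attr from the node itself or from its closest ancestor that sets it."""
    current = node
    while current is not None:
        value = DATA[current][attr]
        if value is not None:
            return value
        current = PARENT.get(current)  # becomes None once we step past the root
    return None


for node in sorted(DATA):
    print(node, inherited(node, "color"), inherited(node, "phone"))
```

The hard part of the thread is doing the same walk inside a single SQL statement, where the length of the chain is not known in advance.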
Jul 20 '05 #2
jl***********@yahoo.com (Jeff Lanfield) wrote in message news:<23**************************@posting.google. com>...
[...]

If you would like to do it in a single query, you will have to extend
your tree or, if your db supports it, use recursion. There are
several ways of extending your tree: nested sets, transitive closure,
etc. If you google comp.database and comp.database.theory you will
find several threads regarding this.

Assuming you can "calculate" the set of ancestors for any given node,
define the suspects as "ancestors with property p". The property you
are looking for can be found in the suspect with the largest depth*.

*depth = number of ancestors
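
For a database that does support recursion, the idea can be sketched roughly as follows. This is only an illustration run through Python's sqlite3 module: the single table node with a parentid column, the CTE name chain and the distance column are assumptions, not anything from the thread. Picking the ancestor with the smallest distance from the node is equivalent to picking the suspect with the largest depth.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE node (nodeid INTEGER PRIMARY KEY, parentid INTEGER,
                       color TEXT, phone TEXT);
    INSERT INTO node VALUES (5, NULL, 'GREEN', '555-1212'),
                            (7, 5,    NULL,    '777-5555'),
                            (8, 7,    NULL,    NULL),
                            (9, 7,    'BLUE',  NULL);
""")

RESOLVE = """
WITH RECURSIVE chain (nodeid, ancestorid, distance) AS (
    SELECT nodeid, nodeid, 0 FROM node            -- every node starts at itself
    UNION ALL
    SELECT c.nodeid, n.parentid, c.distance + 1   -- then climbs one parent at a time
    FROM chain c JOIN node n ON n.nodeid = c.ancestorid
    WHERE n.parentid IS NOT NULL
)
SELECT n.nodeid,
       (SELECT a.color FROM chain c JOIN node a ON a.nodeid = c.ancestorid
         WHERE c.nodeid = n.nodeid AND a.color IS NOT NULL
         ORDER BY c.distance LIMIT 1) AS color,
       (SELECT a.phone FROM chain c JOIN node a ON a.nodeid = c.ancestorid
         WHERE c.nodeid = n.nodeid AND a.phone IS NOT NULL
         ORDER BY c.distance LIMIT 1) AS phone
FROM node n
ORDER BY n.nodeid
"""

rows = conn.execute(RESOLVE).fetchall()
for row in rows:
    print(row)
```

Nothing is copied into the child rows, so a later change to a parent shows up the next time the query runs.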
HTH
/Lennart
Jul 20 '05 #3
>> Question 2: Is there a better way to structure such data in SQL as
to make answer to question 1 possible? <<

Here is the link on Amazon.com for my new book on "Trees & Hierarchies
in SQL"

http://www.amazon.com/exec/obidos/tg...roduct-details

The classic scenario calls for a root class with all the common
attributes and then specialized sub-classes under it. As an example,
let's take the class of Vehicles and find an industry standard
identifier (VIN), and add two mutually exclusive sub-classes, Sport
utility vehicles and sedans ('SUV', 'SED').

CREATE TABLE Vehicles
(vin CHAR(17) NOT NULL PRIMARY KEY,
vehicle_type CHAR(3) NOT NULL
CHECK(vehicle_type IN ('SUV', 'SED')),
UNIQUE (vin, vehicle_type),
..);

Notice the overlapping candidate keys. I then use a compound candidate
key (vin, vehicle_type) and a constraint in each sub-class table to
assure that the vehicle_type is locked and agrees with the Vehicles
table. Add some DRI actions and you are done:

CREATE TABLE SUV
(vin CHAR(17) NOT NULL PRIMARY KEY,
vehicle_type CHAR(3) DEFAULT 'SUV' NOT NULL
CHECK(vehicle_type = 'SUV'),
UNIQUE (vin, vehicle_type),
FOREIGN KEY (vin, vehicle_type)
REFERENCES Vehicles(vin, vehicle_type)
ON UPDATE CASCADE
ON DELETE CASCADE,
..);

CREATE TABLE Sedans
(vin CHAR(17) NOT NULL PRIMARY KEY,
vehicle_type CHAR(3) DEFAULT 'SED' NOT NULL
CHECK(vehicle_type = 'SED'),
UNIQUE (vin, vehicle_type),
FOREIGN KEY (vin, vehicle_type)
REFERENCES Vehicles(vin, vehicle_type)
ON UPDATE CASCADE
ON DELETE CASCADE,
..);

I can continue to build a hierarchy like this. For example, if I had
a Sedans table that broke down into two-door and four-door sedans, I
could use a schema like this:

CREATE TABLE Sedans
(vin CHAR(17) NOT NULL PRIMARY KEY,
vehicle_type CHAR(3) DEFAULT 'SED' NOT NULL
CHECK(vehicle_type IN ('2DR', '4DR', 'SED')),
UNIQUE (vin, vehicle_type),
FOREIGN KEY (vin, vehicle_type)
REFERENCES Vehicles(vin, vehicle_type)
ON UPDATE CASCADE
ON DELETE CASCADE,
..);

CREATE TABLE TwoDoor
(vin CHAR(17) NOT NULL PRIMARY KEY,
vehicle_type CHAR(3) DEFAULT '2DR' NOT NULL
CHECK(vehicle_type = '2DR'),
UNIQUE (vin, vehicle_type),
FOREIGN KEY (vin, vehicle_type)
REFERENCES Sedans(vin, vehicle_type)
ON UPDATE CASCADE
ON DELETE CASCADE,
..);

CREATE TABLE FourDoor
(vin CHAR(17) NOT NULL PRIMARY KEY,
vehicle_type CHAR(3) DEFAULT '4DR' NOT NULL
CHECK(vehicle_type = '4DR'),
UNIQUE (vin, vehicle_type),
FOREIGN KEY (vin, vehicle_type)
REFERENCES Sedans (vin, vehicle_type)
ON UPDATE CASCADE
ON DELETE CASCADE,
..);

The idea is to build a chain of identifiers and types in a UNIQUE()
constraint that go up the tree when you use a REFERENCES constraint.
Obviously, you can do variants of this trick to get different class
structures.
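
To see what the compound key buys you, here is a small sketch using Python's sqlite3 module, with only the Vehicles and SUV tables and two made-up VINs ('VIN-A', 'VIN-B'). It assumes SQLite's foreign key enforcement, which has to be switched on explicitly; other products may report the violation differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite leaves foreign keys off by default
conn.executescript("""
    CREATE TABLE Vehicles (
        vin CHAR(17) NOT NULL PRIMARY KEY,
        vehicle_type CHAR(3) NOT NULL CHECK (vehicle_type IN ('SUV', 'SED')),
        UNIQUE (vin, vehicle_type));
    CREATE TABLE SUV (
        vin CHAR(17) NOT NULL PRIMARY KEY,
        vehicle_type CHAR(3) DEFAULT 'SUV' NOT NULL CHECK (vehicle_type = 'SUV'),
        FOREIGN KEY (vin, vehicle_type)
            REFERENCES Vehicles (vin, vehicle_type)
            ON UPDATE CASCADE ON DELETE CASCADE);
    INSERT INTO Vehicles VALUES ('VIN-A', 'SUV'), ('VIN-B', 'SED');
""")


def try_insert_suv(vin):
    """Try to add a sub-class row; report whether the constraints allowed it."""
    try:
        conn.execute("INSERT INTO SUV (vin) VALUES (?)", (vin,))
        return True
    except sqlite3.IntegrityError:
        return False


ok_suv = try_insert_suv("VIN-A")     # the Vehicles row says 'SUV', so this matches
ok_sedan = try_insert_suv("VIN-B")   # the Vehicles row says 'SED', so the key pair is missing
print(ok_suv, ok_sedan)
```

The sub-class row can only exist when the (vin, vehicle_type) pair it carries is present in the parent table, which is what keeps a sedan out of the SUV table.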

If an entity doesn't have to be exclusively one subtype, you play with
the root of the class hierarchy:

CREATE TABLE Vehicles
(vin CHAR(17) NOT NULL,
vehicle_type CHAR(3) NOT NULL
CHECK(vehicle_type IN ('SUV', 'SED')),
PRIMARY KEY (vin, vehicle_type),
..);

Now start hiding all this stuff in VIEWs immediately and add an
INSTEAD OF trigger to those VIEWs.
Jul 20 '05 #4
jl***********@yahoo.com (Jeff Lanfield) wrote in message news:<23**************************@posting.google. com>...

[...]

So say the tree is specified like this:

(nodeId int, parentId int, color varchar, phone varchar)

5, 0, "GREEN", "555-1212"
7, 5, NULL, "777-5555"
8, 7, NULL, NULL
9, 7, "BLUE", NULL

That is: node 7 inherits values from 5. Nodes 8,9 inherit values from
7. Node 5 is the top level node.

I want to run a query that would give the following result set:

select nodeId, color, phone from ...

5,"GREEN", "555-1212"
7,"GREEN", "777-5555"
8,"GREEN", "777-5555"
9,"BLUE", "777-5555"

Can such a query be constructed?

If you don't have a maximum depth of your tree (and cannot use
recursion), then no. The reason is that you cannot express queries
like "gimmie the ancestors of x". What you need to do is to inform
your system that 5 is an ancestor of 9. There are several ways of
doing this. Troels Arvin has a page with links to articles on the
subject:

http://troels.arvin.dk/db/rdbms/links/#hierarchical

1. Nested set: google for Celko and nested. There are also variants of
this.
2. Store the path in each node. In your case something like:

5,'',"GREEN", "555-1212"
7,'5',"GREEN", "777-5555"
8,'5.7',"GREEN", "777-5555"
9,'5.7',"BLUE", "777-5555"

In your case I don't think this is an option

3. Store a separate ancestor relation. In your case

create table ancestor (
nodeid int not null
references tree,
ancestorid int not null
references tree,
primary key (nodeid, ancestorid)
);

insert into ancestor values (7,5), (8,7), (8,5), (9,7), (9,5);

The main drawback is that you have to maintain this relation as you
modify your tree. I have some notes on how that can be achieved:

http://fungus.teststation.com/~jon/t...eeHandling.htm
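
One blunt way to keep the relation in step with the tree is to rebuild it from the parent pointers after a change. The sketch below does that with a recursive query run through Python's sqlite3 module; it is not the method from the notes linked above, the function name rebuild_ancestors is made up, and it assumes the root points at itself as in the DDL further down.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tree (nodeid INTEGER PRIMARY KEY, parent INTEGER NOT NULL);
    CREATE TABLE ancestor (nodeid INTEGER NOT NULL, ancestorid INTEGER NOT NULL,
                           PRIMARY KEY (nodeid, ancestorid));
    INSERT INTO tree VALUES (5, 5), (7, 5), (8, 7), (9, 7);  -- the root points at itself
""")


def rebuild_ancestors(conn):
    """Recompute the whole ancestor table from the parent pointers."""
    conn.execute("DELETE FROM ancestor")
    conn.execute("""
        WITH RECURSIVE up (nodeid, ancestorid) AS (
            SELECT nodeid, parent FROM tree WHERE parent <> nodeid
            UNION
            SELECT u.nodeid, t.parent
            FROM up u JOIN tree t ON t.nodeid = u.ancestorid
            WHERE t.parent <> t.nodeid
        )
        INSERT INTO ancestor (nodeid, ancestorid)
        SELECT nodeid, ancestorid FROM up
    """)


rebuild_ancestors(conn)
pairs = conn.execute(
    "SELECT nodeid, ancestorid FROM ancestor ORDER BY nodeid, ancestorid"
).fetchall()
print(pairs)
```

A full rebuild is simple but touches every row; incremental maintenance on insert, delete and move is cheaper for large trees and is presumably what the linked notes cover.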

Anyhow, once you have a way of asking for ancestors for a given node,
you can do what you want in a single query, namely:

find the ancestor at the largest depth with property p

Assuming the following ddl

create table tree (
nodeid int not null primary key,
parent int not null
references tree
on delete restrict
);

create table data (
nodeid int not null primary key
references tree
on delete cascade,
color varchar(10),
phone varchar(10)
);

insert into tree values (5,5), (7,5), (8,7), (9,7);
insert into data values (5, 'GREEN', '555-1212'), (7, NULL, '777-5555'),
(8, NULL, NULL), (9, 'BLUE', NULL);

create table ancestor (
nodeid int not null
references tree,
ancestorid int not null
references tree,
primary key (nodeid, ancestorid)
);

insert into ancestor values (7,5), (8,7), (8,5), (9,7), (9,5);

You could use a query like:

with suspects (nodeid, suspectid, depth) as (
    select
        a.nodeid, a.ancestorid,
        (select count(1) from ancestor where nodeid = a.ancestorid) as depth
    from ancestor a, data d
    where a.ancestorid = d.nodeid
    and d.color is not null
    union all
    select
        d.nodeid, d.nodeid,
        (select count(1) from ancestor where nodeid = d.nodeid) as depth
    from data d
    where d.color is not null
    and not exists (
        select 1 from ancestor where ancestorid = d.nodeid
    )
)
select s.nodeid, d.color from suspects s, data d
where d.nodeid = s.suspectid
and depth = (select max(depth) from suspects where nodeid = s.nodeid)

NODEID COLOR
----------- ----------
7 GREEN
8 GREEN
9 BLUE

3 record(s) selected.

As you can see node 5 is missing, but I'll leave that for you ;-)
HTH
/Lennart

Jul 20 '05 #5
Thanks --CELKO--, this is a very useful suggestion and I will keep it
in mind. However, it is designed to handle a hierarchy of types (e.g.
representing a class hierarchy). My case is slightly different: I don't
have types and subtypes, I have only one table representing one
entity. The inheritance of values is in the following sense: if an
entity instance (one row) does not specify a value for one of its
fields it should inherit the values from its nearest parent that does
have a value for this field. Note that the immediate parent may have a
NULL in that field too, so you'd have to go to the parent's parent and so
on till you find a non-NULL value for the field in question. I think I
should be able to use COALESCE to do it somehow but I can't come up
with a *single* query.
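
When the tree has a known maximum depth, the COALESCE idea can be written directly as a chain of self-joins, one per level. The sketch below assumes the parentId layout from the earlier message, a maximum of three levels, and a table named node (a made-up name); it is run through Python's sqlite3 module only so that it can be tried out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE node (nodeId INTEGER PRIMARY KEY, parentId INTEGER,
                       color TEXT, phone TEXT);
    INSERT INTO node VALUES (5, 0, 'GREEN', '555-1212'),
                            (7, 5, NULL,    '777-5555'),
                            (8, 7, NULL,    NULL),
                            (9, 7, 'BLUE',  NULL);
""")

# n is the node, p1 its parent, p2 its grandparent.
# COALESCE returns the first non-NULL argument, so the nearest value wins.
NESTED_COALESCE = """
SELECT n.nodeId,
       COALESCE(n.color, p1.color, p2.color) AS color,
       COALESCE(n.phone, p1.phone, p2.phone) AS phone
FROM node n
LEFT JOIN node p1 ON p1.nodeId = n.parentId
LEFT JOIN node p2 ON p2.nodeId = p1.parentId
ORDER BY n.nodeId
"""

rows = conn.execute(NESTED_COALESCE).fetchall()
for row in rows:
    print(row)
```

Each extra level of depth needs one more LEFT JOIN and one more COALESCE argument, which is why this only works when the depth is bounded.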

So the exact example is this. I have a tree structure of values, there
is only one type of entity. Per Lennart's suggestion I'm keeping the
full path pre-computed for the sake of simplicity. There is only this
one table:

(nodeId int, parent_path varchar, color varchar, phone varchar)

5, "0", "GREEN", "555-1212"
7, "0,5", NULL, "777-5555"
8, "0,5,7", NULL, NULL
9, "0,5,7", "BLUE", NULL

I want to run a query that would give the following result set:

select nodeId, color, phone from ...

5,"GREEN", "555-1212"
7,"GREEN", "777-5555"
8,"GREEN", "777-5555"
9,"BLUE", "777-5555"

Is it possible to have such a query?

- Jeff


Jul 20 '05 #6
Thanks Lennart. I think your third suggestion might do the trick. If
you have a chance to answer I'm curious:

1) Why do you think storing the path (suggestion #2) is not an option?
Or if it is option, why is it a bad idea compared to storing a
separate ancestor relation? (aside from the usual normal form
reasoning)

2) How would you structure the query if you stored the path
(suggestion #2).

Thanks!

- Jeff

Jul 20 '05 #7
le*****@kommunicera.umea.se (Lennart Jonsson) wrote in message news:<6d************************@posting.google.co m>...
jl***********@yahoo.com (Jeff Lanfield) wrote in message news:<23**************************@posting.google. com>...


[...]

Hmmm, I must have been just a little bit tired there ;-) It should read
(the difference is in the second part of the union):

with suspects (nodeid, suspectid, depth) as (
    select
        a.nodeid, a.ancestorid,
        (select count(1) from ancestor where nodeid = a.ancestorid) as depth
    from ancestor a, data d
    where a.ancestorid = d.nodeid
    and d.color is not null
    union all
    select
        t.nodeid, t.nodeid,
        (select count(1) from ancestor where nodeid = t.nodeid) as depth
    from data d, tree t
    where d.nodeid = t.nodeid
    and d.color is not null
)
select s.nodeid, d.color from suspects s, data d
where d.nodeid = s.suspectid
and depth = (select max(depth) from suspects where nodeid = s.nodeid)
;
NODEID COLOR
----------- ----------
5 GREEN
7 GREEN
8 GREEN
9 BLUE
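
The corrected query only resolves color. A sketch of how the same query might be reused for phone as well, run through Python's sqlite3 module rather than DB2: the helper name resolve and the column whitelist are assumptions added for illustration, and the column name is substituted into the SQL text because column names cannot be bound as parameters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tree (nodeid INTEGER PRIMARY KEY, parent INTEGER NOT NULL);
    CREATE TABLE data (nodeid INTEGER PRIMARY KEY, color TEXT, phone TEXT);
    CREATE TABLE ancestor (nodeid INTEGER NOT NULL, ancestorid INTEGER NOT NULL,
                           PRIMARY KEY (nodeid, ancestorid));
    INSERT INTO tree VALUES (5,5), (7,5), (8,7), (9,7);
    INSERT INTO data VALUES (5,'GREEN','555-1212'), (7,NULL,'777-5555'),
                            (8,NULL,NULL), (9,'BLUE',NULL);
    INSERT INTO ancestor VALUES (7,5), (8,7), (8,5), (9,7), (9,5);
""")

QUERY = """
WITH suspects (nodeid, suspectid, depth) AS (
    SELECT a.nodeid, a.ancestorid,
           (SELECT COUNT(1) FROM ancestor WHERE nodeid = a.ancestorid)
    FROM ancestor a, data d
    WHERE a.ancestorid = d.nodeid AND d.{col} IS NOT NULL
    UNION ALL
    SELECT t.nodeid, t.nodeid,
           (SELECT COUNT(1) FROM ancestor WHERE nodeid = t.nodeid)
    FROM data d, tree t
    WHERE d.nodeid = t.nodeid AND d.{col} IS NOT NULL
)
SELECT s.nodeid, d.{col}
FROM suspects s, data d
WHERE d.nodeid = s.suspectid
  AND s.depth = (SELECT MAX(depth) FROM suspects WHERE nodeid = s.nodeid)
ORDER BY s.nodeid
"""


def resolve(col):
    """Run the suspects query for one attribute column."""
    if col not in ("color", "phone"):   # only known column names reach the SQL text
        raise ValueError(col)
    return conn.execute(QUERY.format(col=col)).fetchall()


print(resolve("color"))
print(resolve("phone"))
```

Getting both attributes in one result set would then be a join of the two results on nodeid.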
/Lennart
Jul 20 '05 #8
jl***********@yahoo.com (Jeff Lanfield) wrote in message news:<23*************************@posting.google.c om>...
Thanks Lennart. I think your third suggestion might do the trick. If
you have a chance to answer I'm curious:

1) Why do you think storing the path (suggestion #2) is not an option?
Or if it is option, why is it a bad idea compared to storing a
separate ancestor relation? (aside from the usual normal form
reasoning)

Since you're mainly concerned with retrieving ancestors (not
investigating subtrees), I think it is a bit clumsy. Assume that you
have a node x with path 1.2.3.56.89.112 and want to investigate which
nodes in the path have property p. How would you do that? Subtree
queries are much easier since you can query: where path like '1.2.4.6.%'

2) How would you structure the query if you stored the path
(suggestion #2).

I would figure out a way to retrieve ancestors and then proceed the same
way.

If I were in your shoes I would encapsulate the ancestor stuff in
views or table functions, and then use those in my queries. If you
decide to go for another approach, you rewrite the functions/views and
can continue to use the queries. Simple example using table functions
in db2:
-- ancestor table approach

CREATE FUNCTION SUBTREE (ID INTEGER)
RETURNS TABLE (NODEID INTEGER)
LANGUAGE SQL
READS SQL DATA
NO EXTERNAL ACTION
DETERMINISTIC
RETURN
select nodeid from ancestor where ancestorid = ID
@

CREATE FUNCTION SUBTREE_SELF (ID INTEGER)
RETURNS TABLE (NODEID INTEGER)
LANGUAGE SQL
READS SQL DATA
NO EXTERNAL ACTION
DETERMINISTIC
RETURN
select s.nodeid from table(subtree(ID)) as s
union all
values (ID)
@
-- changed my mind, decided to go for the path approach instead

drop ...

CREATE FUNCTION SUBTREE (ID INTEGER)
RETURNS TABLE (NODEID INTEGER)
LANGUAGE SQL
READS SQL DATA
NO EXTERNAL ACTION
DETERMINISTIC
RETURN
select nodeid from tree where path like (ID || '.%')

etc

The same goes for depth

As a bonus it will simplify your original query. I.e

with suspects (nodeid, ancestorid, depth) as (
select
ps.nodeid, ps.ancestorid, depth_func(ps.nodeid) as depth
from data d, tree t, table(path_self(t.nodeid)) ps
where ps.ancestorid = d.nodeid and d.color is not null
)
select ...
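
The same encapsulation can be tried with a plain view where table functions are not available. The sketch below uses SQLite through Python's sqlite3 module and borrows the name path_self for a view (an assumption; in the query above it is a DB2 table function). The calling query stays the same while the view is first backed by the ancestor table and then by a recursive walk over the parent pointers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tree (nodeid INTEGER PRIMARY KEY, parent INTEGER NOT NULL);
    CREATE TABLE ancestor (nodeid INTEGER NOT NULL, ancestorid INTEGER NOT NULL,
                           PRIMARY KEY (nodeid, ancestorid));
    INSERT INTO tree VALUES (5,5), (7,5), (8,7), (9,7);
    INSERT INTO ancestor VALUES (7,5), (8,7), (8,5), (9,7), (9,5);

    -- Implementation 1: read the stored ancestor table, plus the node itself.
    CREATE VIEW path_self AS
        SELECT nodeid, ancestorid FROM ancestor
        UNION ALL
        SELECT nodeid, nodeid FROM tree;
""")

# The caller only knows about the view, not how it is built.
CALLER = "SELECT ancestorid FROM path_self WHERE nodeid = ? ORDER BY ancestorid"
before = conn.execute(CALLER, (8,)).fetchall()

conn.executescript("""
    DROP VIEW path_self;
    -- Implementation 2: derive the same rows from the parent pointers.
    -- UNION drops the repeated row produced by the self-referencing root,
    -- which is what ends the recursion.
    CREATE VIEW path_self AS
        WITH RECURSIVE up (nodeid, ancestorid) AS (
            SELECT nodeid, nodeid FROM tree
            UNION
            SELECT u.nodeid, t.parent
            FROM up u JOIN tree t ON t.nodeid = u.ancestorid
        )
        SELECT nodeid, ancestorid FROM up;
""")
after = conn.execute(CALLER, (8,)).fetchall()
print(before, after)
```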

HTH
/Lennart
Jul 20 '05 #9
Jan
For example, in Oracle 9i you can do the following.

Create the demo data:

CREATE TABLE T
(
CHILD_ID NUMBER,
PARENT_ID NUMBER,
ATTR1 VARCHAR2(10),
ATTR2 VARCHAR2(10)
);

INSERT INTO T VALUES (1, NULL, 'A', '000');
INSERT INTO T VALUES (2,1, NULL, '111');
INSERT INTO T VALUES (3,1, 'B', NULL);
INSERT INTO T VALUES (4,2, NULL, NULL);
INSERT INTO T VALUES (5, 2, 'C', '999');
INSERT INTO T VALUES (6, 5, NULL, NULL);

and the query is (it is rather ugly with INSTR/SUBSTR, but maybe faster
than an inline query per attribute with another CONNECT BY):

------------------
SELECT child_id,PARENT_ID,attr1,attr2,tree_path,
RTRIM(SUBSTR(all_attr1,INSTR(all_attr1,'/',-1,2)+1),'/')
inherit_attr1,
RTRIM(SUBSTR(all_attr2,INSTR(all_attr2,'/',-1,2)+1),'/')
inherit_attr2
FROM

(SELECT child_id,parent_id,attr1,attr2,
SYS_CONNECT_BY_PATH(TO_CHAR(child_id), '/') tree_path,
CASE
WHEN attr1 IS NOT NULL THEN '/'||attr1||'/'
ELSE REPLACE('/'||SYS_CONNECT_BY_PATH(attr1,'/'),'//','/')
END all_attr1,

CASE
WHEN attr2 IS NOT NULL THEN '/'||attr2||'/'
ELSE REPLACE('/'||SYS_CONNECT_BY_PATH(attr2,'/'),'//','/')
END all_attr2

FROM T
START WITH parent_id IS NULL
CONNECT BY parent_id=PRIOR child_id) v
--------------------------------------------------------
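
The query walks the tree from the root down and carries the attribute values along the path. Stripped of the Oracle string handling, the same top-down idea looks roughly like this in plain Python, using the demo rows above; the function name inherit_top_down is made up and this is an illustration of the traversal, not a translation of the query.

```python
# Demo rows from the post: (child_id, parent_id, attr1, attr2).
ROWS = [
    (1, None, "A", "000"),
    (2, 1, None, "111"),
    (3, 1, "B", None),
    (4, 2, None, None),
    (5, 2, "C", "999"),
    (6, 5, None, None),
]


def inherit_top_down(rows):
    """Walk from the root down, handing each child its parent's resolved values."""
    children = {}
    for child_id, parent_id, attr1, attr2 in rows:
        children.setdefault(parent_id, []).append((child_id, attr1, attr2))

    resolved = {}
    stack = [(None, None, None)]   # (parent id, inherited attr1, inherited attr2)
    while stack:
        parent_id, up1, up2 = stack.pop()
        for child_id, attr1, attr2 in children.get(parent_id, []):
            own1 = attr1 if attr1 is not None else up1
            own2 = attr2 if attr2 is not None else up2
            resolved[child_id] = (own1, own2)
            stack.append((child_id, own1, own2))
    return resolved


result = inherit_top_down(ROWS)
for child_id in sorted(result):
    print(child_id, result[child_id])
```

Because each child receives values that are already resolved, runs of several NULL ancestors in a row need no special handling.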

Jul 20 '05 #10

"Lennart Jonsson" <le*****@kommunicera.umea.se> wrote in message
news:6d**************************@posting.google.c om...
jl***********@yahoo.com (Jeff Lanfield) wrote in message news:<23*************************@posting.google.c om>...

[...]

Since you're mainly concerned with retrieving ancestors (not
investigating subtrees), I think it is a bit clumsy. Assume that you
have a node x with path 1.2.3.56.89.112 and want to investigate which
nodes in the path have property p. How would you do that? Subtree
queries are much easier since you can query: where path like '1.2.4.6.%'


A small procedural part, a table function that returns the set of
ancestor materialized path encodings

1.2.3.56.89
1.2.3.56
1.2.3
1.2
1

for the input encoding 1.2.3.56.89.112, could be plugged into the query
returning all the ancestors. This idea translated into the nested intervals
encoding (if you prefer to work with numbers instead of string parsing) is
implemented at the end of
http://arxiv.org/html/cs.DB/0401014

BTW, http://arxiv.org/abs/cs.DB/0402051 is significantly rewritten.
Essentially, it is Euclidean algorithm (the mother of all algorithms?) that
is employed in the numerical counterpart of the Materialized Path.
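
The string-parsing version of that table function is short. A sketch in Python, assuming a dot-separated path and a made-up function name ancestor_paths:

```python
def ancestor_paths(path, sep="."):
    """Return every proper prefix of a materialized path, nearest ancestor first."""
    parts = path.split(sep)
    return [sep.join(parts[:i]) for i in range(len(parts) - 1, 0, -1)]


for encoding in ancestor_paths("1.2.3.56.89.112"):
    print(encoding)
```

Inside the database the returned encodings would be joined back to the node table on the path column to fetch the ancestor rows.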


Jul 20 '05 #11
le*****@kommunicera.umea.se (Lennart Jonsson) wrote in message news:<6d**************************@posting.google. com>...
le*****@kommunicera.umea.se (Lennart Jonsson) wrote in message news:<6d************************@posting.google.co m>...
jl***********@yahoo.com (Jeff Lanfield) wrote in message news:<23**************************@posting.google. com>...


[...]

Hmmm, I must have been just a little bit tired there ;-) Should read
(diff in snd part of union):


and as you probably discovered already, it can be further simplified as

[...]
union all
select
d.nodeid, d.nodeid,
(select count(1) from ancestor where nodeid = d.nodeid) as depth
from data d
where d.color is not null
)

[...]

/L
Jul 20 '05 #12
>> if an entity instance (one row) does not specify a value for one of
its field [sic]s it should inherit the values from its nearest parent
that does have a value for this field [sic]. <<

Rows are not records; fields are not columns; tables are not files.
It drives me nuts to see people screw up the terms and therefore their
mental model of how SQL works. Let's put this into a simplified
nested set model which has more NULL-able columns than the payroll
system of a major auto company:

CREATE TABLE Nodes
(node_id INTEGER NOT NULL PRIMARY KEY,
col_1 CHAR(5),
col_2 CHAR(5),
..);

CREATE TABLE Tree
(node_id INTEGER NOT NULL UNIQUE
REFERENCES Nodes(node_id),
lft INTEGER NOT NULL,
rgt INTEGER NOT NULL,
PRIMARY KEY (lft, rgt),
<< other tree constraints here >>
..);

Now update all the rows where any column has a NULL. You will have to
run this update once for every level in the tree at most.

UPDATE Nodes
SET col_1
= COALESCE(Nodes.col_1, -- keep non-NULL
(SELECT B.col_1 -- find parent
FROM Tree AS E
LEFT OUTER JOIN
Tree AS B
ON B.lft
= (SELECT MAX(lft)
FROM Tree AS S
WHERE E.lft > S.lft
AND E.lft < S.rgt)),
col_2
= COALESCE(Nodes.col_2,
(SELECT B.col_2
FROM ..)),

Etc.
WHERE (col_1 || col_2 || ..||col_n) IS NULL;

This probably scared you. The WHERE clause uses the fact that NULLs
propagate; we don't care *which* column is NULL, so why look for
useless details? That left outer join scalar subquery is how to get
the immediate superiors (B= boss, E= employee) in a hierarchy. The
COALESCE() will retain an existing non-NULL value.
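
A runnable approximation of the repeated update, using SQLite through Python's sqlite3 module and the four-node example from the start of the thread. Several things here are assumptions rather than part of the post: the lft/rgt numbering, the extra join to Nodes (aliased P) to read the parent's column, the correlation on node_id, and the loop that stops once a pass fills nothing new.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Nodes (node_id INTEGER NOT NULL PRIMARY KEY,
                        col_1 CHAR(10), col_2 CHAR(10));
    CREATE TABLE Tree (node_id INTEGER NOT NULL UNIQUE REFERENCES Nodes(node_id),
                       lft INTEGER NOT NULL, rgt INTEGER NOT NULL,
                       PRIMARY KEY (lft, rgt));
    INSERT INTO Nodes VALUES (5,'GREEN','555-1212'), (7,NULL,'777-5555'),
                             (8,NULL,NULL), (9,'BLUE',NULL);
    -- Nested-set numbering for 5 -> 7 -> (8, 9).
    INSERT INTO Tree VALUES (5,1,8), (7,2,7), (8,3,4), (9,5,6);
""")

# Scalar subquery: the value held by the immediate parent of the row being updated.
PARENT_VALUE = """
    (SELECT P.{col}
       FROM Tree AS E
       JOIN Tree AS B
         ON B.lft = (SELECT MAX(S.lft) FROM Tree AS S
                      WHERE S.lft < E.lft AND S.rgt > E.rgt)
       JOIN Nodes AS P ON P.node_id = B.node_id
      WHERE E.node_id = Nodes.node_id)
"""
UPDATE = ("UPDATE Nodes SET "
          "col_1 = COALESCE(col_1, " + PARENT_VALUE.format(col="col_1") + "), "
          "col_2 = COALESCE(col_2, " + PARENT_VALUE.format(col="col_2") + ") "
          "WHERE col_1 IS NULL OR col_2 IS NULL")


def null_cells():
    """Count the NULL attribute cells that are still left."""
    return conn.execute(
        "SELECT COALESCE(SUM((col_1 IS NULL) + (col_2 IS NULL)), 0) FROM Nodes"
    ).fetchone()[0]


passes = 0
while null_cells() > 0:
    before = null_cells()
    conn.execute(UPDATE)
    passes += 1
    if null_cells() == before:   # nothing left that a parent can supply
        break

rows = conn.execute("SELECT * FROM Nodes ORDER BY node_id").fetchall()
for row in rows:
    print(row)
```

Note that this writes the inherited values into the child rows, so a later change to a parent is not picked up until the affected cells are reset to NULL and the update is run again.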
Jul 20 '05 #13
