473,412 Members | 2,067 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,412 software developers and data experts.

Find root node in tree structure

Hey!

Can anyone give me a hint, how this problem is best implemented:
I have a table of users (see below), where every user has one
"superior user" (= parent node), this should be a fully unambigous
tree structure. The root node can have whatever value you prefer, I
suppose NULL would be good for a start. What I want to do is finding
the way from an arbitrary node in the tree.

Example:

ROOT
|
----------
A B
---------
C D
-------
E F G
The query should return C, B, ROOT if i give it the input of G for
example.

This is my users table:

users
=====================
user_id (primary key)
name
superior_id

Do you think there's any possibility to do this without firing several
queries? I mean I could send a query for each 'get-parent' lookup, but
that seams to be rather inefficient on large trees, no?

Thank you very much
Phil
Jul 23 '05 #1
12 6942
pillepop2003 wrote:
What I want to do is finding
the way from an arbitrary node in the tree.


This is a frequent question for SQL.
I usually suggest implementing a "path enumeration" table
to store each ancestor-descendant path in your tree.
Do this instead of storing simply the superiod_id for each user.

See also:
http://groups-beta.google.com/group/...51ff05edbbe71/
http://groups-beta.google.com/group/...b6d55c69e32ff/

Regards,
Bill K.
Jul 23 '05 #2
pillepop2003 wrote:
Hey!

Can anyone give me a hint, how this problem is best implemented:
I have a table of users (see below), where every user has one
"superior user" (= parent node), this should be a fully unambigous
tree structure. The root node can have whatever value you prefer, I
suppose NULL would be good for a start. What I want to do is finding
the way from an arbitrary node in the tree.

Example:

ROOT
|
----------
A B
---------
C D
-------
E F G
The query should return C, B, ROOT if i give it the input of G for
example.

This is my users table:

users
=====================
user_id (primary key)
name
superior_id

Do you think there's any possibility to do this without firing several
queries? I mean I could send a query for each 'get-parent' lookup, but
that seams to be rather inefficient on large trees, no?

Thank you very much
Phil


Phil,

The structure you are using is an adjacent list which has limitations, as
you've found already, a better model is that of nested sets. These may
appear more complicated at first however if you take the time to understand
it you will be amazed by it's power and it makes queries such as the one
above a piece of cake...

http://www.intelligententerprise.com...questid=327517

--
Thanks

Murph
Jul 23 '05 #3
It depends on how large a table you have and how it is used. Is it
prone to heavy updating?

I built a mechanism to store forests of trees, tied to web pages. My
needs involved slight denormalization, and I wanted to do my recursive
logic in PHP -- to avoid costly recursive queries whenever possible.

It worked like this:

page_id
element_id
parent_id (another element_id)
element_order (1-N, for the number of elements on the page)
gen_level (generational level)
{node data fields}

With this structure I'm actually able to do quite a bit of logic in PHP,
and avoid costly recursive queries a number of times. Also, I'm using
mySQL 3.23 which lacks subselects, so that informed my design as well.
Paul Bramscher

pillepop2003 wrote:
Hey!

Can anyone give me a hint, how this problem is best implemented:
I have a table of users (see below), where every user has one
"superior user" (= parent node), this should be a fully unambigous
tree structure. The root node can have whatever value you prefer, I
suppose NULL would be good for a start. What I want to do is finding
the way from an arbitrary node in the tree.

Example:

ROOT
|
----------
A B
---------
C D
-------
E F G
The query should return C, B, ROOT if i give it the input of G for
example.

This is my users table:

users
=====================
user_id (primary key)
name
superior_id

Do you think there's any possibility to do this without firing several
queries? I mean I could send a query for each 'get-parent' lookup, but
that seams to be rather inefficient on large trees, no?

Thank you very much
Phil

Jul 23 '05 #4
Paul Bramscher wrote:
It depends on how large a table you have and how it is used. Is it
prone to heavy updating?

I built a mechanism to store forests of trees, tied to web pages. My
needs involved slight denormalization, and I wanted to do my recursive
logic in PHP -- to avoid costly recursive queries whenever possible.

It worked like this:

page_id
element_id
parent_id (another element_id)
element_order (1-N, for the number of elements on the page)
gen_level (generational level)
{node data fields}

With this structure I'm actually able to do quite a bit of logic in PHP,
and avoid costly recursive queries a number of times. Also, I'm using
mySQL 3.23 which lacks subselects, so that informed my design as well.
Paul Bramscher

pillepop2003 wrote:
Hey!

Can anyone give me a hint, how this problem is best implemented:
I have a table of users (see below), where every user has one
"superior user" (= parent node), this should be a fully unambigous
tree structure. The root node can have whatever value you prefer, I
suppose NULL would be good for a start. What I want to do is finding
the way from an arbitrary node in the tree.

Example:

ROOT
|
----------
A B
---------
C D
-------
E F G
The query should return C, B, ROOT if i give it the input of G for
example.

This is my users table:

users
=====================
user_id (primary key)
name
superior_id

Do you think there's any possibility to do this without firing several
queries? I mean I could send a query for each 'get-parent' lookup, but
that seams to be rather inefficient on large trees, no?

Thank you very much
Phil


Have a look at this link:

http://www.intelligententerprise.com...questid=327517

I think here is the solution for your problem. It is a very clear
implementation of how to solve trees and getting information about them
in (my)SQL. It is more complex than your solution, but it has some big
advantages over your model, especially the easy way to get a lot of
information from the tree, not only all the levels above from somewhere
in the tree.

The method you use is called the adjacency list model, after the graph
theory technique of the same name; the pairs of nodes are adjacent to
each other, and is the textbook solution for creating trees. The author
of this article gives a better solution based on nested sets.

In example 1 the tree nodes of all the above levels are returned from
the database. If I were you I would go this way in stead of using the
adjacency model.

I hope this helps.

Jonathan

--
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?
Jul 23 '05 #5
Jonathan wrote:
Paul Bramscher wrote:
It depends on how large a table you have and how it is used. Is it
prone to heavy updating?

I built a mechanism to store forests of trees, tied to web pages. My
needs involved slight denormalization, and I wanted to do my recursive
logic in PHP -- to avoid costly recursive queries whenever possible.

It worked like this:

page_id
element_id
parent_id (another element_id)
element_order (1-N, for the number of elements on the page)
gen_level (generational level)
{node data fields}

With this structure I'm actually able to do quite a bit of logic in
PHP, and avoid costly recursive queries a number of times. Also, I'm
using mySQL 3.23 which lacks subselects, so that informed my design as
well.
Paul Bramscher

pillepop2003 wrote:
Hey!

Can anyone give me a hint, how this problem is best implemented:
I have a table of users (see below), where every user has one
"superior user" (= parent node), this should be a fully unambigous
tree structure. The root node can have whatever value you prefer, I
suppose NULL would be good for a start. What I want to do is finding
the way from an arbitrary node in the tree.

Example:

ROOT
|
----------
A B
---------
C D
-------
E F G
The query should return C, B, ROOT if i give it the input of G for
example.

This is my users table:

users
=====================
user_id (primary key)
name
superior_id

Do you think there's any possibility to do this without firing several
queries? I mean I could send a query for each 'get-parent' lookup, but
that seams to be rather inefficient on large trees, no?

Thank you very much
Phil

Have a look at this link:

http://www.intelligententerprise.com...questid=327517

I think here is the solution for your problem. It is a very clear
implementation of how to solve trees and getting information about them
in (my)SQL. It is more complex than your solution, but it has some big
advantages over your model, especially the easy way to get a lot of
information from the tree, not only all the levels above from somewhere
in the tree.

The method you use is called the adjacency list model, after the graph
theory technique of the same name; the pairs of nodes are adjacent to
each other, and is the textbook solution for creating trees. The author
of this article gives a better solution based on nested sets.

In example 1 the tree nodes of all the above levels are returned from
the database. If I were you I would go this way in stead of using the
adjacency model.

I hope this helps.

Jonathan


Not quite, my model is better than a straight adjaceny list in a few
ways (offering denormalized values of generational level and overall
cardinal order). Though I agree, the nested method seems to be best of
all from a back-end perspective.

But let's say your application is about building websites that look like
newsgroup threads. You have root-level elements, child element,
grandchild element, etc. Each root-level element is a tree in itself,
so the page is a forest of trees.

In this nested model, what's a non-recursive way to quickly pull out the
trees and their elements for display in that fashion.

Imagine you've got a web site with 40,000 elements that gets hit 1-5
times/second. You're thinking more about denormalization and less about
perfection.

But I'll study your reference more. It looks like this traversal is
anticipated in the design, and a simple order by the left value will
accomplish this. Good article, wish I'd seen it 3-4 years ago.
Jul 23 '05 #6
Jonathan wrote:
Paul Bramscher wrote:
It depends on how large a table you have and how it is used. Is it
prone to heavy updating?

I built a mechanism to store forests of trees, tied to web pages. My
needs involved slight denormalization, and I wanted to do my recursive
logic in PHP -- to avoid costly recursive queries whenever possible.

It worked like this:

page_id
element_id
parent_id (another element_id)
element_order (1-N, for the number of elements on the page)
gen_level (generational level)
{node data fields}

With this structure I'm actually able to do quite a bit of logic in
PHP, and avoid costly recursive queries a number of times. Also, I'm
using mySQL 3.23 which lacks subselects, so that informed my design as
well.
Paul Bramscher

pillepop2003 wrote:
Hey!

Can anyone give me a hint, how this problem is best implemented:
I have a table of users (see below), where every user has one
"superior user" (= parent node), this should be a fully unambigous
tree structure. The root node can have whatever value you prefer, I
suppose NULL would be good for a start. What I want to do is finding
the way from an arbitrary node in the tree.

Example:

ROOT
|
----------
A B
---------
C D
-------
E F G
The query should return C, B, ROOT if i give it the input of G for
example.

This is my users table:

users
=====================
user_id (primary key)
name
superior_id

Do you think there's any possibility to do this without firing several
queries? I mean I could send a query for each 'get-parent' lookup, but
that seams to be rather inefficient on large trees, no?

Thank you very much
Phil

Have a look at this link:

http://www.intelligententerprise.com...questid=327517

I think here is the solution for your problem. It is a very clear
implementation of how to solve trees and getting information about them
in (my)SQL. It is more complex than your solution, but it has some big
advantages over your model, especially the easy way to get a lot of
information from the tree, not only all the levels above from somewhere
in the tree.

The method you use is called the adjacency list model, after the graph
theory technique of the same name; the pairs of nodes are adjacent to
each other, and is the textbook solution for creating trees. The author
of this article gives a better solution based on nested sets.

In example 1 the tree nodes of all the above levels are returned from
the database. If I were you I would go this way in stead of using the
adjacency model.

I hope this helps.

Jonathan


I've looked at this article again, and it commits a little dishonesty
when it produces an adjacency implementation which is not normalized.
That's a problem with that implementation, not adjacency models in
general. This portion of my example is fully normalized:

element_id
parent_id (another element_id)
{node data fields}

el_id name parent_id (another el_id)
1 bob NULL
2 george 1
3 bill 1

etc. This is a normalized adjacency implementation. If bob decided to
change his name to "Robert", then we don't have to update multiple
columns of his subordinates -- only one. So the example provided in the
article is a deliberately poor one.

Only one other concern comes to mind, and that's performance of
implementations in terms of where new nodes are added. Behaviorally,
are they generally added at the bottom? How many rows in a nested model
do you need to adjust to make room for a new node at the bottom? In the
middle somewhere? The top?

My implementation involves the least cost for nodes at the bottom, the
greatest for new nodes at the top. I'm not sure yet, but it seems that
the nested performance hit for new nodes works a little differently, and
is more complex.
Jul 23 '05 #7
Hey!

I'm so sorry it took me soooo long to answer my one thread. First of
all I want to thank you all for your extensive replies. I might be busy
for the next few days to work through all your tips, but as soon as I'm
done with that, I'll answer ;-)

So long
Philipp

Paul Bramscher wrote:
Jonathan wrote:
Paul Bramscher wrote:
It depends on how large a table you have and how it is used. Is it prone to heavy updating?

I built a mechanism to store forests of trees, tied to web pages. My needs involved slight denormalization, and I wanted to do my recursive logic in PHP -- to avoid costly recursive queries whenever possible.
It worked like this:

page_id
element_id
parent_id (another element_id)
element_order (1-N, for the number of elements on the page)
gen_level (generational level)
{node data fields}

With this structure I'm actually able to do quite a bit of logic in PHP, and avoid costly recursive queries a number of times. Also, I'm using mySQL 3.23 which lacks subselects, so that informed my design as well.
Paul Bramscher

pillepop2003 wrote:

Hey!

Can anyone give me a hint, how this problem is best implemented:
I have a table of users (see below), where every user has one
"superior user" (= parent node), this should be a fully unambigous tree structure. The root node can have whatever value you prefer, I suppose NULL would be good for a start. What I want to do is finding the way from an arbitrary node in the tree.

Example:

ROOT
|
----------
A B
---------
C D
-------
E F G
The query should return C, B, ROOT if i give it the input of G for example.

This is my users table:

users
=====================
user_id (primary key)
name
superior_id

Do you think there's any possibility to do this without firing several queries? I mean I could send a query for each 'get-parent' lookup, but that seams to be rather inefficient on large trees, no?

Thank you very much
Phil

Have a look at this link:

http://www.intelligententerprise.com...questid=327517
I think here is the solution for your problem. It is a very clear
implementation of how to solve trees and getting information about them in (my)SQL. It is more complex than your solution, but it has some big advantages over your model, especially the easy way to get a lot of information from the tree, not only all the levels above from somewhere in the tree.

The method you use is called the adjacency list model, after the graph theory technique of the same name; the pairs of nodes are adjacent to each other, and is the textbook solution for creating trees. The author of this article gives a better solution based on nested sets.

In example 1 the tree nodes of all the above levels are returned from the database. If I were you I would go this way in stead of using the adjacency model.

I hope this helps.

Jonathan


I've looked at this article again, and it commits a little dishonesty

when it produces an adjacency implementation which is not normalized. That's a problem with that implementation, not adjacency models in
general. This portion of my example is fully normalized:

element_id
parent_id (another element_id)
{node data fields}

el_id name parent_id (another el_id)
1 bob NULL
2 george 1
3 bill 1

etc. This is a normalized adjacency implementation. If bob decided to change his name to "Robert", then we don't have to update multiple
columns of his subordinates -- only one. So the example provided in the article is a deliberately poor one.

Only one other concern comes to mind, and that's performance of
implementations in terms of where new nodes are added. Behaviorally, are they generally added at the bottom? How many rows in a nested model do you need to adjust to make room for a new node at the bottom? In the middle somewhere? The top?

My implementation involves the least cost for nodes at the bottom, the greatest for new nodes at the top. I'm not sure yet, but it seems that the nested performance hit for new nodes works a little differently, and is more complex.


Jul 23 '05 #8
Paul,

I'm not sure if your "path enumeration" model is appropriate for my
problem: I was looking for a way to query *the_way* to the root node
from an arbitrary node. If you review my "graphical" example above, the
relation is transitive, so i want to yield "C, B, ROOT" as a result set
when I provide 'G'. Your solution can't handle this in a single query,
can it? (at least as long as I don't have a database that supports
recursive queries - mysql 4.0.x does not!).

I am expecting the whole tree to grow to about 50.000+ nodes and there
will be almost no updates but mostly select which should gain
high-performance.

Am I left to implement the recursion in php? Suggestions?
(I assume that nested sets are to complex in this case, what do you
think?)

Jul 23 '05 #9
Here's my method. Given a table structure like this:

page_id
element_id
parent_id (another element id)
gen_level
element_order
{node data}

Problem: I need to get a the root R of an arbitrary element E which has
cardinal order D on page P. I do it quite cleanly and simply like this:

SELECT element_id FROM element WHERE page_id = P AND element_order < D
AND gen_level = 0 ORDER BY element_order DESC LIMIT 1

This automatically gives you the answer, since we know that the root
element *must* occur in cardinal order prior to arbitrary element E, and
that it must have a generational level of 0. Indeed, it will always be
the first such element encountered if we reverse the resultset order.

Far simpler than the nested method, and quite elegant I might add. :-)

But don't get me wrong, I think the nested set method might still offer
something. I'll need to look more carefully into it. I still have to
do recursion at times, but I generally do all of that in PHP, or with
crafty cursor manipulations. You don't want to do that with queries
against a table if at all possible.
Paul Bramscher
pi**********@yahoo.de wrote:
Paul,

I'm not sure if your "path enumeration" model is appropriate for my
problem: I was looking for a way to query *the_way* to the root node
from an arbitrary node. If you review my "graphical" example above, the
relation is transitive, so i want to yield "C, B, ROOT" as a result set
when I provide 'G'. Your solution can't handle this in a single query,
can it? (at least as long as I don't have a database that supports
recursive queries - mysql 4.0.x does not!).

I am expecting the whole tree to grow to about 50.000+ nodes and there
will be almost no updates but mostly select which should gain
high-performance.

Am I left to implement the recursion in php? Suggestions?
(I assume that nested sets are to complex in this case, what do you
think?)

Jul 23 '05 #10
Paul,

your query only finds _THE_ root node, i need to find the _WAY_ to the
root node. This cannot be done with your model imho... at least i can't
see how - correct me if I'm wrong.

Jul 23 '05 #11
pi**********@yahoo.de wrote:
Paul,

your query only finds _THE_ root node, i need to find the _WAY_ to the
root node. This cannot be done with your model imho... at least i can't
see how - correct me if I'm wrong.


I do this a couple different ways. Here's the simplist: a single SQL
and walking through the recordset with PHP. Given an arbitrary element
with order of D and generation level of G on page P, do this:

resultset = SELECT element_id FROM element WHERE page_id = P AND
element_order < D AND gen_level < G ORDER BY G DESC

This produces more than what we need in most cases, especially if you
have a forest of trees for a given page P. So we can do it this way:

Pseudocode in PHP, C, Perl, etc.:
path_stack = empty array to store parentage of element id's

// When you hit 0, you've completed the path
while (gen_level > 0) {
push element_id onto path_stack.
move to next row
}
// exit condition is reached when root element is hit

You could write this in a SQL loop also, but I prefer to do this sort of
logic in a procedural language. This doesn't use the parent_id, which
is still there and essential for doing this recursively -- and that's
another option, either doing it recursively with SQL calls (not wise at
all) or recursively in code, based on a single SQL (much more elegant).
Cheers,

-Paul
Jul 23 '05 #12
Paul Bramscher wrote:
pi**********@yahoo.de wrote:
Paul,

your query only finds _THE_ root node, i need to find the _WAY_ to the
root node. This cannot be done with your model imho... at least i can't
see how - correct me if I'm wrong.


I do this a couple different ways. Here's the simplist: a single SQL
and walking through the recordset with PHP. Given an arbitrary element
with order of D and generation level of G on page P, do this:

resultset = SELECT element_id FROM element WHERE page_id = P AND
element_order < D AND gen_level < G ORDER BY G DESC

This produces more than what we need in most cases, especially if you
have a forest of trees for a given page P. So we can do it this way:

Pseudocode in PHP, C, Perl, etc.:
path_stack = empty array to store parentage of element id's

// When you hit 0, you've completed the path
while (gen_level > 0) {
push element_id onto path_stack.
move to next row
}
// exit condition is reached when root element is hit

You could write this in a SQL loop also, but I prefer to do this sort of
logic in a procedural language. This doesn't use the parent_id, which
is still there and essential for doing this recursively -- and that's
another option, either doing it recursively with SQL calls (not wise at
all) or recursively in code, based on a single SQL (much more elegant).
Cheers,

-Paul


Oops, make that query ORDER BY D, G DESC. There's a couple different
ways to do it, and you might need to set up a little flag in code as you
walk backwards through it.
Jul 23 '05 #13

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

0
by: Steve Roberts | last post by:
I want to write a graphical view of an XML tree, using a DOM interface, that allows the XML tree to be edited, with nodes being added or removed. For this I was going to create a basic tree...
15
by: Stig Brautaset | last post by:
Hi group, I'm playing with a little generic linked list/stack library, and have a little problem with the interface of the pop() function. If I used a struct like this it would be simple: ...
3
by: Saradhi | last post by:
Hi All, Here I am facing a performance problem with the TreeView Node renaming. I am displaying a hierarchy Data in a treeview in my Windows C# Application. My tree view represents an...
3
by: Alan Silver | last post by:
Hello, I am just looking at the tree view control, which looks very good, but has some apparent limitations. This could easily be my lack of experience with it. Specifically, I have two...
3
by: Sergio Terenas | last post by:
Hi all, I've a Treeview control in a form load with 600 nodes. Each node has a text and tag associated to it at the time I add it. I need to find a node by either the text or the tag, make it...
8
by: VK | last post by:
Can be multiple instances of element used as the root element? That's a curly way of asking, but I did not come up with a better sentence, sorry. What I mean is with a document like: <?xml...
5
gekko3558
by: gekko3558 | last post by:
I am writing a simple binary search tree (nodes are int nodes) with a BSTNode class and a BST class. I have followed the instructions from my C++ book, and now I am trying to get a remove method...
0
by: sndive | last post by:
I have a weid problem. If i do this: import elementtree.ElementTree as ET .... tree = ET.parse("whatever") root = tree.getroot() r = root.find('last') print r return root where last is not an...
2
by: slizorn | last post by:
hi guys, i need to make a tree traversal algorithm that would help me search the tree.. creating a method to search a tree to find the position of node and to return its pointer value basically i...
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
1
by: nemocccc | last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...
0
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
0
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...
0
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...
0
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each...
0
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing,...
0
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.