Recursive Query Question - Table Function

I have a database which stores information about organisms collected
during sediment toxicology research. For each sample, organisms in
sediment are collected and identified taxonomically (Order, Family,
Genus, Species).

Taxonomy lookup information in the database is stored in a recursive
table in the form:

TSN (taxa serial number)
Rank (Order, Family, Genus, Species)
Name
Parent_TSN (related Taxa at higher taxonomic level)
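
As a rough sketch of that structure, here is a version that runs in SQLite through Python's sqlite3 module as a stand-in for SQL Server. The table name tblTaxonomy, the column types, the TSN values and the sample taxa are assumptions for illustration, not the real schema. The lineage helper is also hypothetical; it walks the Parent_TSN links upward with a recursive common table expression.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tblTaxonomy (                      -- hypothetical table name
    TSN        INTEGER PRIMARY KEY,
    Rank       TEXT NOT NULL
               CHECK (Rank IN ('Order', 'Family', 'Genus', 'Species')),
    Name       TEXT NOT NULL,
    Parent_TSN INTEGER REFERENCES tblTaxonomy (TSN)  -- NULL at the top level
);
-- Made-up TSN values; each row points at its parent taxon.
INSERT INTO tblTaxonomy VALUES (1, 'Order',   'Ephemeroptera',  NULL);
INSERT INTO tblTaxonomy VALUES (2, 'Family',  'Baetidae',       1);
INSERT INTO tblTaxonomy VALUES (3, 'Genus',   'Baetis',         2);
INSERT INTO tblTaxonomy VALUES (4, 'Species', 'Baetis rhodani', 3);
""")


def lineage(tsn):
    """Return [(rank, name), ...] from the given taxon up to the top level."""
    return conn.execute(
        """
        WITH RECURSIVE up(TSN, Rank, Name, Parent_TSN, depth) AS (
            SELECT TSN, Rank, Name, Parent_TSN, 0
            FROM tblTaxonomy
            WHERE TSN = ?
            UNION ALL
            SELECT t.TSN, t.Rank, t.Name, t.Parent_TSN, up.depth + 1
            FROM tblTaxonomy AS t
            JOIN up ON t.TSN = up.Parent_TSN
        )
        SELECT Rank, Name FROM up ORDER BY depth
        """,
        (tsn,),
    ).fetchall()


print(lineage(4))
```

SQL Server 2005 and later accept a similar common table expression (without the RECURSIVE keyword); earlier versions would need a loop or a user-defined function instead.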

When the number of a particular organism collected is entered into the
database, the count is stored along with the lowest level TSN the
organisms were identified to.

Okay - now the problem. Depending on the type of analysis being done,
a user may want organism counts at the lowest level, or rolled up to a
higher taxonomic level (usually Family). Can I write a recursive
function which will cycle through the Taxonomy database, and provide
the name of the organism at the appropriate taxonomic level? Is this a
reasonable approach with regard to speed and efficiency?

Something Like:
SELECT sample_id, 'Get Name Function(Rank, TSN)', Sum([count]) AS
NoTaxa FROM dbo.tblbenthic

Results could then be grouped and summed on the Name, to summarise
data for each sample/taxa.
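
One set-based way to get that grouped result, sketched in SQLite through Python's sqlite3 module rather than SQL Server: expand every taxon into itself plus all of its ancestors, then join the counts to the ancestor at the requested rank. The table name tblTaxonomy, the TSN column on tblbenthic, the rollup helper and all sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tblTaxonomy (TSN INTEGER PRIMARY KEY, Rank TEXT, Name TEXT,
                          Parent_TSN INTEGER);
CREATE TABLE tblbenthic (sample_id INTEGER, TSN INTEGER, "count" INTEGER);
INSERT INTO tblTaxonomy VALUES
    (1, 'Order',   'Ephemeroptera',  NULL),
    (2, 'Family',  'Baetidae',       1),
    (3, 'Genus',   'Baetis',         2),
    (4, 'Species', 'Baetis rhodani', 3),
    (5, 'Family',  'Heptageniidae',  1),
    (6, 'Genus',   'Heptagenia',     5);
-- Counts are stored against the lowest level each organism was identified to.
INSERT INTO tblbenthic VALUES
    (10, 4, 7), (10, 3, 2), (10, 6, 5), (11, 2, 1), (11, 1, 3);
""")

ROLLUP_SQL = """
WITH RECURSIVE climb(start_tsn, TSN, Rank, Name, Parent_TSN) AS (
    -- every taxon is its own starting point ...
    SELECT TSN, TSN, Rank, Name, Parent_TSN FROM tblTaxonomy
    UNION ALL
    -- ... and each step moves one level up the hierarchy
    SELECT c.start_tsn, t.TSN, t.Rank, t.Name, t.Parent_TSN
    FROM climb AS c
    JOIN tblTaxonomy AS t ON t.TSN = c.Parent_TSN
)
SELECT b.sample_id, c.Name, SUM(b."count") AS NoTaxa
FROM tblbenthic AS b
JOIN climb AS c ON c.start_tsn = b.TSN AND c.Rank = ?
GROUP BY b.sample_id, c.Name
ORDER BY b.sample_id, c.Name
"""


def rollup(rank):
    """Sum counts per sample, rolled up to the requested taxonomic rank."""
    return conn.execute(ROLLUP_SQL, (rank,)).fetchall()


print(rollup("Family"))
print(rollup("Order"))
```

With an inner join, organisms identified only to a level above the requested one (the Order-level row in sample 11, when rolling up to Family) drop out of the result; a LEFT JOIN would keep them with a NULL name.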

Is this a reasonable approach? Or is there a better one? Did I explain
the problem well enough?

Thanks in advance,

Tim
Jul 20 '05 #1
I do not know what a TSN (taxa serial number) looks like, but are there
organisms that do not have (Order, Family, Genus, Species)? It would
seem to be a better design to make them into columns and then use NULLs
for the missing classification data.

CREATE TABLE LabNotes
(..
ord INTEGER, -- ORDER is a reserved word, so the column is named ord
family INTEGER,
genus INTEGER,
species INTEGER,
CHECK (...),
org_counts INTEGER NOT NULL CHECK (org_counts >= 0),
..);

>> Okay - now the problem. Depending on the type of analysis being done,
a user may want organism counts at the lowest level, or rolled up to a
higher taxonomic level (usually Family). <<

Now that is easy to do

SELECT .. org_counts
FROM LabNotes
WHERE <level> IS NOT NULL;
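
To make the column-per-level idea concrete, here is a small sketch that runs in SQLite through Python's sqlite3 module. The sample_id column, the integer codes, the sample rows and the counts_at helper are all assumptions for illustration; ord stands in for the order column because ORDER is a reserved word.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE LabNotes (
    sample_id  INTEGER NOT NULL,   -- hypothetical column
    ord        INTEGER,
    family     INTEGER,
    genus      INTEGER,
    species    INTEGER,
    org_counts INTEGER NOT NULL CHECK (org_counts >= 0)
);
INSERT INTO LabNotes VALUES
    (10, 1, 2,    3,    4,    7),
    (10, 1, 2,    3,    NULL, 2),
    (10, 1, 5,    6,    NULL, 5),
    (11, 1, NULL, NULL, NULL, 3);
""")

LEVELS = ("ord", "family", "genus", "species")


def counts_at(level):
    """Sum org_counts per sample at one taxonomic level, skipping NULLs."""
    if level not in LEVELS:  # whitelist, since the name is spliced into SQL
        raise ValueError(f"unknown level: {level}")
    sql = (
        f"SELECT sample_id, {level}, SUM(org_counts) "
        f"FROM LabNotes WHERE {level} IS NOT NULL "
        f"GROUP BY sample_id, {level} ORDER BY sample_id, {level}"
    )
    return conn.execute(sql).fetchall()


print(counts_at("family"))
print(counts_at("ord"))
```

Rows that were not identified down to the requested level are simply left out of that level's totals.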

Did I miss something?
Jul 20 '05 #2
Sorry, I forgot to post the constraint and test data:

CREATE TABLE LabNotes
(ord INTEGER,
family INTEGER,
genus INTEGER,
species INTEGER,
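CHECK (CASE WHEN (ord + family + genus + species) IS NOT NULL THEN 1 -- all four levels
WHEN COALESCE (ord, family, genus, species) IS NULL THEN 1 -- nothing identified
WHEN (ord + family + genus) IS NOT NULL AND species IS NULL THEN 1 -- down to genus
WHEN (ord + family) IS NOT NULL AND COALESCE (genus, species) IS NULL THEN 1 -- down to family
WHEN ord IS NOT NULL AND COALESCE (family, genus, species) IS NULL THEN 1 -- order only
ELSE 0 END = 1)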
);
SELECT * FROM LabNotes;

--good data
INSERT INTO LabNotes VALUES (1, 2, 3, 4);
INSERT INTO LabNotes VALUES (1, 2, 3, NULL);
INSERT INTO LabNotes VALUES (1, 2, NULL, NULL);
INSERT INTO LabNotes VALUES (1, NULL, NULL, NULL);
INSERT INTO LabNotes VALUES (NULL, NULL, NULL, NULL);
-- bad data
INSERT INTO LabNotes VALUES (1, 2, NULL, 4);
INSERT INTO LabNotes VALUES (1, NULL, 3, 4);
INSERT INTO LabNotes VALUES (1, 2, NULL, 3);
INSERT INTO LabNotes VALUES (NULL, 2, NULL, 4);
INSERT INTO LabNotes VALUES (NULL, NULL, NULL, 4);
Jul 20 '05 #3
Tim Pascoe (ti********@cciw.ca) writes:
> Okay - now the problem. Depending on the type of analysis being done,
> a user may want organism counts at the lowest level, or rolled up to a
> higher taxonomic level (usually Family). Can I write a recursive
> function which will cycle through the Taxonomy database, and provide
> the name of the organism at the appropriate taxonomic level? Is this a
> reasonable approach with regard to speed and efficiency?
>
> Something Like:
> SELECT sample_id, 'Get Name Function(Rank, TSN)', Sum([count]) AS
> NoTaxa FROM dbo.tblbenthic
>
> Results could then be grouped and summed on the Name, to summarise
> data for each sample/taxa.
>
> Is this a reasonable approach? Or is there a better one? Did I explain
> the problem well enough?


I don't think so, I only understand bits of it. :-)

It may help if you post:

o CREATE TABLE statement for your table.
o INSERT statements with sample data.
o The desired result from that sample data.
--
Erland Sommarskog, SQL Server MVP, so****@algonet.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Jul 20 '05 #4

CELKO,

The move to a recursive table structure was to get away from the example
you suggested as an alternative :) Since organisms may only be IDed down
to one of the possible taxonomic levels, data may be entered to Order,
Family, Genus, or Species levels. If a row is used to identify this,
with a NULL for the 'missing' data, you end up needing to store 4
records for the complete taxonomy of a single organism, and names must
be repeated hundreds of times (say the family has 10 Genus, then that is
20 repetitions of the Family name, to store Genus and Species).

E.g. Order, NULL, NULL, NULL
Order, Family, NULL, NULL
Order, Family, Genus, NULL etc.

It becomes a management nightmare when an organism changes from one
Family to another, or one Genus to another (it happens very often,
surprisingly).

4 records are still required for each level in the recursion, but there
are no NULL values, fewer columns, and the changes in taxonomy can be
altered with the change of a single relational value (the Parent TSN).
Also, repetition of higher categories does not occur, due to the
relational nature of the links.

Thanks for the reply, however. I will need to look up the COALESCE
key-word, as I'm sure it will come in handy!

Tim

*** Sent via Developersdex http://www.developersdex.com ***
Don't just participate in USENET...get rewarded for it!
Jul 20 '05 #5
>> Since organisms may only be IDed down to one of the possible
taxonomic levels, data may be entered to Order, Family, Genus, or
Species levels. If a row is used to identify this, with a NULL for the
'missing' data, you end up needing to store 4 records [sic] for the
complete taxonomy of a single organism, and names must be repeated
hundreds of times (say the family has 10 Genus, then that is 20
repetitions of the Family name, to store Genus and Species) <<

If you get more detailed information on an organism, then you update
the NULLs with the values you just discovered, don't you? One way or
the other, each organism is going to be modeled once in my table
design.

>> It becomes a management nightmare when an organism changes from one
Family to another, or one Genus to another (it happens very often,
surprisingly). <<

So that is just one update on poor old "Omosis Jones" to his new
taxonomy. If I have to switch him to another family and I don't know
any more about him yet, I just fill in (genus, species) with NULLs in
the same single update.

I think I might see what I am missing in this problem. Off to the
side of the lab work, you can keep a nested sets model of the taxonomy
apart from particular organisms. You can Google the basics on that
model (or I can bore the regulars by posting it again).
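
For readers who have not seen it, a minimal sketch of the nested sets idea, run in SQLite through Python's sqlite3 module. The Taxa table, its column names, the lft/rgt numbering and the ancestor_at helper are assumptions made up for this illustration. Each taxon gets a (lft, rgt) pair that encloses the pairs of everything beneath it, so finding the ancestor at a given rank is a plain range join with no recursion.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Taxa (                 -- hypothetical table
    name       TEXT PRIMARY KEY,
    taxon_rank TEXT NOT NULL,
    lft        INTEGER NOT NULL,
    rgt        INTEGER NOT NULL,
    CHECK (lft < rgt)
);
-- A child's (lft, rgt) interval sits strictly inside its parent's interval.
INSERT INTO Taxa VALUES
    ('Ephemeroptera',  'Order',   1, 12),
    ('Baetidae',       'Family',  2,  7),
    ('Baetis',         'Genus',   3,  6),
    ('Baetis rhodani', 'Species', 4,  5),
    ('Heptageniidae',  'Family',  8, 11),
    ('Heptagenia',     'Genus',   9, 10);
""")


def ancestor_at(name, rank):
    """Name of the taxon at `rank` that encloses `name`, or None."""
    row = conn.execute(
        """
        SELECT a.name
        FROM Taxa AS t
        JOIN Taxa AS a ON t.lft BETWEEN a.lft AND a.rgt
        WHERE t.name = ? AND a.taxon_rank = ?
        """,
        (name, rank),
    ).fetchone()
    return row[0] if row else None


print(ancestor_at("Baetis rhodani", "Family"))
print(ancestor_at("Heptagenia", "Order"))
```

The trade-off is that moving a taxon to a new parent means renumbering lft and rgt values, whereas the Parent_TSN design changes a single value.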
Jul 20 '05 #6
CELKO,

It is indeed a Nested Sets type of a problem. While there are more
elegant ways of storing the data, a simple hierarchy in this case is
very functional. The original question was not intended to ask if the
table structure was effective (I've gone through that exercise already),
but what the implications were for applying a user-defined function to
extract recursive data when a user requests summary counts at a level
higher than what the data was entered at (e.g. data entered at
Genus/Species, but summary counts requested for Orders).

I have built an SP which returns the appropriate taxonomic name at the
requested level; now I need to figure out a way to convert it into a
function and use it in place of a column name in the query that
produces the result set.
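
The shape of that query can be sketched in SQLite through Python's sqlite3 module, where a Python function is registered as a scalar SQL function. This is only an analogue of a SQL Server user-defined function, not the actual SP. The names GetName, get_name and tblTaxonomy, the TSN column on tblbenthic and all sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tblTaxonomy (TSN INTEGER PRIMARY KEY, Rank TEXT, Name TEXT,
                          Parent_TSN INTEGER);
CREATE TABLE tblbenthic (sample_id INTEGER, TSN INTEGER, "count" INTEGER);
INSERT INTO tblTaxonomy VALUES
    (1, 'Order',   'Ephemeroptera',  NULL),
    (2, 'Family',  'Baetidae',       1),
    (3, 'Genus',   'Baetis',         2),
    (4, 'Species', 'Baetis rhodani', 3),
    (5, 'Family',  'Heptageniidae',  1),
    (6, 'Genus',   'Heptagenia',     5);
INSERT INTO tblbenthic VALUES
    (10, 4, 7), (10, 3, 2), (10, 6, 5), (11, 2, 1), (11, 1, 3);
""")

# Load the lookup table once so the function does not query per call.
taxa = {
    tsn: (rank, name, parent)
    for tsn, rank, name, parent in conn.execute(
        "SELECT TSN, Rank, Name, Parent_TSN FROM tblTaxonomy"
    )
}


def get_name(rank, tsn):
    """Follow Parent_TSN links upward until a taxon of `rank` is found."""
    while tsn is not None:
        node_rank, name, parent = taxa[tsn]
        if node_rank == rank:
            return name
        tsn = parent
    return None  # identified only above the requested rank


# Expose the Python function to SQL as a two-argument scalar function.
conn.create_function("GetName", 2, get_name)

rows = conn.execute(
    """
    SELECT sample_id, GetName('Family', TSN) AS Name, SUM("count") AS NoTaxa
    FROM tblbenthic
    GROUP BY sample_id, GetName('Family', TSN)
    ORDER BY sample_id, Name
    """
).fetchall()
print(rows)
```

A scalar function used like this is evaluated for every row of tblbenthic, so on large tables a set-based join against the taxonomy may well be faster; that is worth measuring rather than assuming.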

I think I'm on the right track - thanks again for your input.

Tim
Jul 20 '05 #7
