# Theoretical definition for the number of unique values?

Hi Everyone,

Here is a theoretical definition question for you. In databases, we have:

- Relation: a table with columns and rows
- Attribute: a named column/field of a relation
- Domain: a set of allowable values for one or more attributes
- Tuple: a row of a relation
- Degree: the number of attributes a relation contains (number of fields in a table)
- Cardinality: the number of tuples/rows a relation contains

But! What is the definition for the number of unique values in a field?

So, if you have 100 rows in a table, and the field is the gender field with only the values Y and N, you have 2 unique values.

What do we call this concept? "the number of unique values in a column?" Is there one?

Thanks a lot!

Apr 12 '07 #1

sq*************@yahoo.com wrote:
> What do we call this concept? "the number of unique values in a column?" Is there one?

It is the cardinality of the projection onto the attribute, which may or may not equal the cardinality of the domain. Same concept, just qualified differently.

Apr 12 '07 #2
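
The distinction in the reply above can be sketched in a few lines of Python. This is only an illustration, not something posted in the thread: the relation `rows`, the attribute name `flag`, and the three-value `domain` are all made up for the example.

```python
# Hypothetical relation: 100 tuples, each with an id and a two-valued attribute.
rows = [{"id": i, "flag": "Y" if i % 3 == 0 else "N"} for i in range(100)]

# Assumed domain for the attribute: every value it is *allowed* to take,
# whether or not any tuple currently uses it.
domain = {"Y", "N", "U"}

# Projection onto one attribute. A relation is a set, so duplicates collapse.
projection = {row["flag"] for row in rows}

print(len(rows))        # cardinality of the relation: 100
print(len(projection))  # cardinality of the projection onto "flag": 2
print(len(domain))      # cardinality of the domain: 3
```

Here the projection has fewer values than the domain because no tuple happens to use `"U"`; with other data the two counts could be equal.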

On Apr 12, 7:19 am, sqlservernew...@yahoo.com wrote:
> "the number of unique values in a column?"

NDV - number of distinct values. There is nothing theoretical about it.

Apr 12 '07 #3
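
In practice the number of distinct values is usually obtained with `COUNT(DISTINCT ...)`. A minimal sketch using Python's standard `sqlite3` module follows; the table name `person` and the column name `flag` are assumptions made for the example, not names from the thread.

```python
import sqlite3

# In-memory database with a hypothetical 100-row table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, flag TEXT)")
conn.executemany(
    "INSERT INTO person (id, flag) VALUES (?, ?)",
    [(i, "Y" if i % 2 == 0 else "N") for i in range(100)],
)

# COUNT(*) is the row count; COUNT(DISTINCT flag) is the NDV of the column.
row_count, ndv = conn.execute(
    "SELECT COUNT(*), COUNT(DISTINCT flag) FROM person"
).fetchone()
conn.close()

print(row_count, ndv)  # 100 2
```

Note that `COUNT(DISTINCT flag)` skips NULLs, so a column containing NULLs reports only its non-NULL distinct values.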

On Apr 12, 11:19 am, sqlservernew...@yahoo.com wrote:
> What do we call this concept? "the number of unique values in a column?" Is there one?

The Oracle statistics refer to this as the number of DISTINCT values.

Off the top of my head I do not remember any relational theory concept that applies. The range of valid values for the attribute would be the DOMAIN, and each value in the domain would be distinct, since the domain concept has no relation to the actual number of occurrences for real data.

Maybe someone else will remember a concept that applies to one of the versions of relational theory.

HTH -- Mark D Powell --

Apr 12 '07 #4

On Apr 12, 10:19 am, sqlservernew...@yahoo.com wrote:
> What do we call this concept? "the number of unique values in a column?" Is there one?

I believe it is referred to as 'cardinality', which should be covered in your text and by your instructor.

David Fitzjarrell

Apr 12 '07 #5

I found out. It is called "COLUMN CARDINALITY". Sorry, no prizes.

http://www.informatik.uni-trier.de/~...WhangVT90.html

> (1) obtaining the column cardinality (the number of unique values in a column of a relation) and (2) obtaining the join selectivity (the number of unique values in the join column resulting from an unconditional join divided by the number of unique join column values in the relation to be joined). These two parameters are important statistics that are used in relational query optimization and physical database design.

http://www.idig.za.net/mysqlindexes/2006/11/09/

> Column cardinality. This is the number of unique values contained in a column. Indexes work best when there is a high cardinality. Put another way, the more unique values there are (fewer duplicates) the better that column will be for indexing. Consider the ID number column of the previous example. Here there are no duplicates, only unique values. This column will be ideal for indexing. On the other end of the scale may be the first names column. Here there will probably be a number of duplicate names (fewer unique values) and a lower cardinality compared to the ID column.

Apr 13 '07 #6
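
The ID-versus-first-name comparison in the second excerpt can be made concrete with a small sketch. The helper `column_cardinality`, the sample `people` rows, and the distinct-to-rows ratio are illustrative assumptions; neither quoted source defines them this way.

```python
def column_cardinality(rows, column):
    """Return the number of unique values in one column of a list of dict rows."""
    return len({row[column] for row in rows})


# Hypothetical table: unique ids, repeated first names.
people = [
    {"id": 1, "first_name": "Ann"},
    {"id": 2, "first_name": "Bob"},
    {"id": 3, "first_name": "Ann"},
    {"id": 4, "first_name": "Cid"},
    {"id": 5, "first_name": "Bob"},
    {"id": 6, "first_name": "Ann"},
]

for column in ("id", "first_name"):
    distinct = column_cardinality(people, column)
    # Ratio of distinct values to rows: 1.0 means no duplicates at all.
    ratio = distinct / len(people)
    print(column, distinct, ratio)
# id 6 1.0
# first_name 3 0.5
```

A ratio near 1.0, as for `id`, matches the excerpt's "no duplicates" case that suits indexing; the lower ratio for `first_name` reflects the repeated names.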

On Apr 12, 4:19 pm, sqlservernew...@yahoo.com wrote:
> What do we call this concept? "the number of unique values in a column?" Is there one?

I do not believe there is specific terminology for such a concept, or much use for it. You may call it *domain attribute projection cardinality* (I just made it up, but it could be a description of the underlying concepts).

Apr 13 '07 #7

On Apr 12, 4:19 pm, sqlservernew...@yahoo.com wrote:
> Degree the number of attributes a relation contains Number of fields in a table

A table does not have "fields".

Apr 14 '07 #8

On 12 Apr 2007 18:23:07 -0700, sq*************@yahoo.com wrote:
> I found out. It is called "COLUMN CARDINALITY"

Yes, cardinality is the correct term.

Now, for bonus credits: can anyone tell me the correct term for someone who posts a homework question here, gets an answer, and then pretends he worked the answer out for himself?

Lemming
--
Curiosity *may* have killed Schrodinger's cat.

Apr 28 '07 #9

On Apr 27, 8:21 pm, Lemming wrote:
> Yes, cardinality is the correct term.
>
> Now, for bonus credits: can anyone tell me the correct term for someone who posts a homework question here, gets an answer, and then pretends he worked the answer out for himself?

Way to jump all over a thread that died 2 weeks ago.

Apr 28 '07 #10

On 27 Apr 2007 17:52:00 -0700, hpuxrac wrote:
> Way to jump all over a thread that died 2 weeks ago.

Mate, most of usenet died more than 2 years ago. What does it matter if I'm reviving someone's fortnight-old homework? Especially if I am taking the piss. Do try to keep up.

Unless, of course, it was *your* homework? Forgive me if so; I can't be bothered to read back. But I can understand why you might be feeling a bit sensitive about it.

Lemming
--
Curiosity *may* have killed Schrodinger's cat.

Apr 28 '07 #11

On Apr 27, 9:01 pm, Lemming wrote:
> Unless, of course, it was *your* homework? Forgive me if so; I can't be bothered to read back. But I can understand why you might be feeling a bit sensitive about it.

Not exactly. Different people pick different tools to read these postings. The cdos group is still very active. This item was cross-posted to various groups and was effectively dead until you chimed in.

Personally I use the Google Groups interface. If you take a look at that tool you might have a different opinion about the health of what used to be usenet. Plus it allows you to see the question from the OP, the replies, and the thread in context. Many of the other people responding in cdos use other tools.

Apr 28 '07 #12

