473,699 Members | 2,598 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

Sequence of columns in primary key

I am currently undertaking a review of the primary keys in a SQL Server
2000 database with a view to improving performance of queries.

I have heard that, in the case of compound primary keys, it is
important to select the correct order for the columns within the key.
For instance, imagine a table called OrderLine which has primary key
columns as follows-

Ledger
OrderNumber
OrderLineNumber

The theory I have heard is that columns with the most distinct values
should come first. In this case, Ledger is likely to have a maximum of
6 distinct values, OrderNumber a maximum of 10 million and OrderLine up
to 99. Based on this supposition I believe the best order would be-

OrderNumber
OrderLineNumber
Ledger

I have performed a few rudimentary tests that appear to bear this out.
I was wondering if anyone else has tried something similar and if so
what was the result?
Thanks,

Ross

Oct 17 '05 #1
14 4757
>> columns with the most distinct values should come first. <<

This is true.

But you also have to remember that an index (a,b,c) in effect gives
you indexng for (a,b) and (a). You might also consider a covering
index to speed up queries. A covering index has all the columns needed
to answer a query so that you do not have to read the base table at
all. Also be sure that the FK and PK lists are in the same order.

Oct 17 '05 #2
>The theory I have heard is that columns with the most distinct values
should come first


There are quite a few exceptions. For instanse, if you have a table
ORDERS with PK on (CUSTOMERS_PK, some_other_colu mn) and a FK constraint
referencing table CUSTOMERS, you might want to have CUSTOMERS_PK column
first even if it is less selective than the other column.

There aren't that many hard and fast rules in our trade

Oct 17 '05 #3
(gi******@talk2 1.com) writes:
I have heard that, in the case of compound primary keys, it is
important to select the correct order for the columns within the key.
For instance, imagine a table called OrderLine which has primary key
columns as follows-

Ledger
OrderNumber
OrderLineNumber

The theory I have heard is that columns with the most distinct values
should come first. In this case, Ledger is likely to have a maximum of
6 distinct values, OrderNumber a maximum of 10 million and OrderLine up
to 99. Based on this supposition I believe the best order would be-

OrderNumber
OrderLineNumber
Ledger

I have performed a few rudimentary tests that appear to bear this out.
I was wondering if anyone else has tried something similar and if so
what was the result?


Nah, the order in a primary key should rather reflect the logical
hierarchy. If an OrderLine can be broken into pieces called ledgers
the above would be the right model. I would guess, though, that different
ledgers have different series for order numbers, and thus the ledger is
the main key here.

I don't know the business rules here, but it seems to be that
SELECT * FROM OrderLine WHERE OrderNumber = 1234 is not even
meaninful, because you may get to see data from six different and
unrelated orders.

Now, not all tables with a compound key has a clear hiearchy. Consider
this table:

CREATE TABLE bankholidays(
coucode aba_coucode NOT NULL,
bhodate aba_date NOT NULL,
reguser aba_upduser NOT NULL,
moddate aba_updtime NOT NULL,
CONSTRAINT pk_bho PRIMARY KEY (coucode, bhodate)
)

This table lists days Mon-Fri that are not business days in various
countries. If you want to know in which countries Christmas Eve is
not a business day, you would say

SELECT coucode FROM bankholidays WHERE bhodate = '20041224'

But if you want to compute the settledate for a given trade date
and instrument, the country code is the top-level key. (And since
we mainly use the table for the latter purpose, this is why coucode
comes first.)

Thus, for such a table, it is very common to have a index on the
reverse order of the key, or some reverse order for a key with three
or more columns.

What you should make special consideration, is which index should be
the clustered index. This is necessarily not the primary key. It
could be a different combination of the PK - or on a completely
different column.

Finally, one thing also to keep in mind is how you use the columns.
If you say things like:

SELECT * FROM tbl WHERE a = 1 AND b BETWEEN 12 AND 19

An index on (a,b) is better, even if a only has 10 possible values and
b has 100.

--
Erland Sommarskog, SQL Server MVP, es****@sommarsk og.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp

Oct 17 '05 #4
Erland,

You're right to say that Ledger is the "main" key. I should have
explained but it looks like you figured it out yourself. I also
incorrectly stated the current column order in the PK. It is Ledger,
OrderLineNumber , OrderNumber and NOT Ledger, OrderNumber,
OrderLineNumber .

If I use your hierarchy theory the best order is Ledger, OrderNumber,
OrderLineNumber . This also fits in nicely with Joe's statement that "an
index (a, b, c) in effect gives you indexing for (a, b) and (a)",
because a common SELECT or JOIN on this table would involve Ledger and
OrderNumber as this points to the PK in the Orders table.

I am running SQL Profiler against a customer's database today so I hope
to have more information about the queries that our users run. This
will enable me to make a more informed decision about the sequence of
the columns in each PK in the database. It seems to me that it is more
important to understand usage of the database rather than ordering them
by number of distinct values.

Ross

Oct 18 '05 #5
(gi******@talk2 1.com) writes:
You're right to say that Ledger is the "main" key. I should have
explained but it looks like you figured it out yourself. I also
incorrectly stated the current column order in the PK. It is Ledger,
OrderLineNumber , OrderNumber and NOT Ledger, OrderNumber,
OrderLineNumber .


Assuming that OrderLineNumber refers to a detail line in an order,
having OrderLineNumber before OrderNumber sounds like someone had
a brainmelt. There is all reason to change that key.

(Unless queries like "Show me all line 5 on all orders in one ledger"
are common.)


--
Erland Sommarskog, SQL Server MVP, es****@sommarsk og.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp

Oct 18 '05 #6
Our review of the primary keys was triggered by one of our customers
complaining about the performance of the database. They claimed to have
"re-sequenced the primary key which reduced Index reads by a factor of
5". Do you have any idea as to how they arrived at this conclusion?

Oct 25 '05 #7
I created a large table (1306677 rows, 151888 KB) with the
characteristics as originally described for Ledger, OrderNumber and
OrderLineNumber . I then tried different primary key column orders (all 6
combinations). After reindexing, each combination resulted in the same
number of pages and same index depth. In other words: there will be no
performance difference for exact primary key matches.

Maybe the table was very fragmented and had much unused space. Changing
the primary key column order would effectively reindex the table, and
thus remove fragmentation and restore the original fill factor. This
could easily decrease the index depth and result in the performance
gain.

Gert-Jan
gi******@talk21 .com wrote:

Our review of the primary keys was triggered by one of our customers
complaining about the performance of the database. They claimed to have
"re-sequenced the primary key which reduced Index reads by a factor of
5". Do you have any idea as to how they arrived at this conclusion?

Oct 25 '05 #8
(gi******@talk2 1.com) writes:
Our review of the primary keys was triggered by one of our customers
complaining about the performance of the database. They claimed to have
"re-sequenced the primary key which reduced Index reads by a factor of
5". Do you have any idea as to how they arrived at this conclusion?


Sounds like hogwash to me. But if you are an evil man, please ask them
for a repro that demonstrates this. (For the sake of customer
relations, you might prefer to avoid it though. :-)

Gert-Jan's theory about fragmentation being fixed by reindexing sounds
plausible to me.

--
Erland Sommarskog, SQL Server MVP, es****@sommarsk og.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp

Oct 25 '05 #9
Erland Sommarskog wrote:
(gi******@talk2 1.com) writes:
Our review of the primary keys was triggered by one of our customers
complaining about the performance of the database. They claimed to
have "re-sequenced the primary key which reduced Index reads by a
factor of 5". Do you have any idea as to how they arrived at this
conclusion?
Sounds like hogwash to me.


I don't know the table structure and queries but there actually are
scenarios with different queries against the same table where the overall
benefit of reordering columns is significant.
But if you are an evil man, please ask them
for a repro that demonstrates this. (For the sake of customer
relations, you might prefer to avoid it though. :-)
That depends. From my experience customers actually honor showing
interest in their findings. Remember that they experienced a success
because they were able to implement such a dramatic improvement. In the
end both sides benefit which in turn can improve customer satisfaction.
Gert-Jan's theory about fragmentation being fixed by reindexing sounds
plausible to me.


Yes, that's another option.

Kind regards

robert

Oct 26 '05 #10

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

5
36770
by: Ken1 | last post by:
I am going to drop a primary key from one column and create a new column to be used as primary key in an existing database. The old column was a date column which someone earlier though was a good candidate for a primary key which we all know it's not. Now I want to add a new field, i.e. called ID, with a normal number sequence as primary key. I have dropped the primary key, created the new column, created the new sequence and created...
0
1445
by: CSN | last post by:
I was just wondering, what are the "cache_value", "long_cnt", and "is_cycled" sequence columns used for? I couldn't find anything in the docs or Google. select * from test_id_seq; sequence_name | last_value | increment_by | max_value | min_value | cache_value | log_cnt | is_cycled | is_called ---------------+------------+--------------+---------------------+-----------+-------------+---------+-----------+-----------
10
2622
by: Anthony Best | last post by:
I'm working on an idea that uses sequences. I'm going to create a table like this: id serial, sequence int, keyword varchar(32), text text for every keyword there will be a uniq sequence for it eg:
15
2786
by: Steffen Ramlow | last post by:
Hi there, is there any way besides a trigger to create a unique number for a column in a "insert into... select from" statement? Thx, sr -- Kein Plan überlebt die erste Feindberührung.
6
8962
by: Emma Middlebrook | last post by:
Hi there, I have created an XmlSchema to represent the tables in my database and I have specified a primary key for each of the tables. However, to be able to execute the following code: DatabaseTables.CategoryInfoRow categoryRow = m_dtCategories.Rows.Find(selected.category_ID) as DatabaseTables.CategoryInfoRow;
9
3389
by: Steven C. | last post by:
Hello: I'm getting an error, "primary key not defined" when trying to use the FIND method on the DataTable Rows collection. I have a typed dataset called 'MortgagesDS' that I created with the XSL builder tool in .NET. I define the dataset at the beginning the the main class: MortgageDS _MortgageDS = new MortgageDS();
115
6235
by: LurfysMa | last post by:
Most of the reference books recommend autonum primary keys, but the Access help says that any unique keys will work. What are the tradeoffs? I have several tables that have unique fields. Can I use them as primary keys or should I define an autonum primary key? One table has information about the 50 states in the US. The table looks like this:
7
3119
by: rAinDeEr | last post by:
I have a Parts Table with the following structure.. I need to decide which makes the primary key of the table. I cant have any sequences nor can I add any new fields in this Table. Table Structure :: Create table Parts (Parts_desig char(10), Sequence_no integer, Effective_Date Date,
0
3031
by: Frank Swarbrick | last post by:
So we're trying to decide if it's better to use IDENTITY columns or sequences to create a surrogate key as the primary key for our tables. I kind of like the identity column, because it's more 'tightly integrated' in to the table. With a sequence you have to make sure that each application that inserts records uses the same sequence. (Probably not likely that it wouldn't, but...) One thing where it seems like a SEQUENCE would be...
0
8687
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However, people are often confused as to whether an ONU can Work As a Router. In this blog post, we’ll explore What is ONU, What Is Router, ONU & Router’s main usage, and What is the difference between ONU and Router. Let’s take a closer look ! Part I. Meaning of...
0
8617
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can effortlessly switch the default language on Windows 10 without reinstalling. I'll walk you through it. First, let's disable language synchronization. With a Microsoft account, language settings sync across devices. To prevent any complications,...
0
9174
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed. This is as boiled down as I can make it. Here is my compilation command: g++-12 -std=c++20 -Wnarrowing bit_field.cpp Here is the code in...
0
9035
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that captivates audiences and drives business growth. The Art of Business Website Design Your website is...
1
6534
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules. He will explain when you may want to use classes instead of User Defined Types (UDT). For example, to manage the data in unbound forms. Adolph will...
0
5875
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one. At the time of converting from word file to html my equations which are in the word document file was convert into image. Globals.ThisAddIn.Application.ActiveDocument.Select();...
1
3057
by: 6302768590 | last post by:
Hai team i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated we have to send another system
2
2347
muto222
by: muto222 | last post by:
How can i add a mobile payment intergratation into php mysql website.
3
2009
bsmnconsultancy
by: bsmnconsultancy | last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating effective websites that not only look great but also perform exceptionally well. In this comprehensive...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.