By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
424,846 Members | 2,082 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 424,846 IT Pros & Developers. It's quick & easy.

SQLServer Table Partitioning

P: n/a
Hi!

I have a question:

I already have a DB that uses partitions to divide data in US
Counties, partitioned by state.

Can I use TWO levels of partitioning?

I mean... 3077 filegroups and 50 partition functions that address
them, but can I use another function to group the 50 states?

Thanks!

Piero

Jun 5 '07 #1
Share this Question
Share on Google+
15 Replies


P: n/a
Piero 'Giops' Giorgi (gi**********@gmail.com) writes:
I already have a DB that uses partitions to divide data in US
Counties, partitioned by state.

Can I use TWO levels of partitioning?

I mean... 3077 filegroups and 50 partition functions that address
them, but can I use another function to group the 50 states?
Do I understand it correctly that you already have 50 partitions, and
now you want even more? About what size do you expect per partition?

I'm not sure that partitioning by state is the best strategy. The partition
for Californina will be a lot bigger than the ones for Alaska and Rhode
Island.
--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx
Jun 5 '07 #2

P: n/a
Do I understand it correctly that you already have 50 partitions, and
now you want even more? About what size do you expect per partition?

I'm not sure that partitioning by state is the best strategy. The partition
for California will be a lot bigger than the ones for Alaska and Rhode
Island.
I know that, but partitioning by county makes the DB a lot easier to
maintain.
I have to work that way because I'm dealing with criminal records, and
they are separated by county with a ton of different files, so for
many of them I have to clear the table and reload the whole county
every time I get an update. Easier on partitions... :-)

Table size can be anywhere from 8000 to 3 million records, depending
on the county.

The best way to do that would be having a table partitioned over 3077
filegroups, so storing the data will go by COUNTY in this way:

CA_ALAMEDA
CA_ALPINE
CA_AMADOR
CA_BUTTE
CA_CALAVERAS
CA_COLUSA
CA_CONTRA_COSTA
CA_DEL_NORTE
CA_EL_DORADO
CA_FRESNO
CA_GLENN
CA_HUMBOLDT
CA_IMPERIAL
CA_INYO

With the COUNTY as the partition Parameter.

But, before trying, can I have 3077 files in ONE partition, and drop
all the states stuff?

Thanks!

Piero

Jun 5 '07 #3

P: n/a
Piero 'Giops' Giorgi (gi**********@gmail.com) writes:
I know that, but partitioning by county makes the DB a lot easier to
maintain.
I have to work that way because I'm dealing with criminal records, and
they are separated by county with a ton of different files, so for
many of them I have to clear the table and reload the whole county
every time I get an update. Easier on partitions... :-)

Table size can be anywhere from 8000 to 3 million records, depending
on the county.
Deleting 8000 rows is a breeze, but deleting 3 million rows takes
some resources, particularly if the rows are wide. But it still only
a matter of minutes.
But, before trying, can I have 3077 files in ONE partition, and drop
all the states stuff?
No, in the topic for CREATE PARTITION FUNCTION, I found that you
cannot have more than 999 boundary values.
--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx
Jun 5 '07 #4

P: n/a
On Jun 5, 3:24 pm, Erland Sommarskog <esq...@sommarskog.sewrote:
>But, before trying, can I have 3077 files in ONE partition, and drop
all the states stuff?
No, in the topic for CREATE PARTITION FUNCTION, I found that you
cannot have more than 999 boundary values.
Dang it... I kew there was a catch.

So, I'll be forced to have 3077 filegroups, grouped with 50
partitions.
Is there a way to have a partition function/scheme that sees other
schemes, instead of filegroups?

I mean Filegroups Counties (3077) - grouped by state (50) - all
together in ONE partitioned table.

Any Ideas?

Thank you!

Piero

Jun 6 '07 #5

P: n/a
Erland Sommarskog wrote:
Piero 'Giops' Giorgi (gi**********@gmail.com) writes:
>I know that, but partitioning by county makes the DB a lot easier to
maintain.
I have to work that way because I'm dealing with criminal records, and
they are separated by county with a ton of different files, so for
many of them I have to clear the table and reload the whole county
every time I get an update. Easier on partitions... :-)

Table size can be anywhere from 8000 to 3 million records, depending
on the county.

Deleting 8000 rows is a breeze, but deleting 3 million rows takes
some resources, particularly if the rows are wide. But it still only
a matter of minutes.
I do assume that (state, county) is an index. If not, then get
that fixed yesterday.
Jun 6 '07 #6

P: n/a
So, I'll be forced to have 3077 filegroups, grouped with 50
partitions.
Is there a way to have a partition function/scheme that sees other
schemes, instead of filegroups?
Why do you need separate filegoups? It seems to me that the main purpose of
partitioning here is for manageability and all those files/filegroups only
add to administration complexity and wasted space.

You might consider a hybrid solution with 50 individual state tables
included in a partitioned view, with each state table partitioned by county.
This approach would leverage partitioning to quickly reload individual
counties yet provide a seamless view of the entire country.
--
Hope this helps.

Dan Guzman
SQL Server MVP

"Piero 'Giops' Giorgi" <gi**********@gmail.comwrote in message
news:11**********************@j4g2000prf.googlegro ups.com...
On Jun 5, 3:24 pm, Erland Sommarskog <esq...@sommarskog.sewrote:
>>But, before trying, can I have 3077 files in ONE partition, and drop
all the states stuff?
>No, in the topic for CREATE PARTITION FUNCTION, I found that you
cannot have more than 999 boundary values.

Dang it... I kew there was a catch.

So, I'll be forced to have 3077 filegroups, grouped with 50
partitions.
Is there a way to have a partition function/scheme that sees other
schemes, instead of filegroups?

I mean Filegroups Counties (3077) - grouped by state (50) - all
together in ONE partitioned table.

Any Ideas?

Thank you!

Piero
Jun 6 '07 #7

P: n/a
You might consider a hybrid solution with 50 individual state tables
included in a partitioned view, with each state table partitioned by county.
This approach would leverage partitioning to quickly reload individual
counties yet provide a seamless view of the entire country.
THANK YOU!
That is exactly what I want to do, but unfortunately I'm not (YET)
able to do it.

How can I have a partitioned view of partitioned tables?
I have the 50 state tables partitioned by county, but I can't get to
the next step.

Can someone post a small example of the thing?

Thanks

Piero

Jun 6 '07 #8

P: n/a
On Jun 5, 10:02 pm, Ed Murphy <emurph...@socal.rr.comwrote:
I do assume that (state, county) is an index. If not, then get
that fixed yesterday.
Of course!
Actually it was fixes the day BEFORE yesterday... :-)

Piero

Jun 6 '07 #9

P: n/a
Piero 'Giops' Giorgi (gi**********@gmail.com) writes:
>You might consider a hybrid solution with 50 individual state tables
included in a partitioned view, with each state table partitioned by
county. This approach would leverage partitioning to quickly reload
individual counties yet provide a seamless view of the entire country.

That is exactly what I want to do, but unfortunately I'm not (YET)
able to do it.

How can I have a partitioned view of partitioned tables?
I have the 50 state tables partitioned by county, but I can't get to
the next step.

Can someone post a small example of the thing?
To me that sounds like a managability nightmare. While you can query
the beast in one query, when you need to flush the rows for Orange
County, you would have to explicitly to go to the CA table to
switch partitions, which would mean a lot of dynamic SQL.

I don't know if there is any catch with partition views over partitioned
tables (I really need to find some time to play with partitioned tables
to learn them!), but in a normal partitioned view you would have:

CREATE TABLE CA (state char(2) DEFAULT 'CA' CHECK (state = 'CA'),
-- other columns
PRIMARY KEY (state, county, whatever))

CREATE TABLE RI (state char(2) DEFAULT 'RI' CHECK (state = 'RI'),
...

CREATE VIEW thewholebunch AS
SELECT state, county, .....
FROM CA
UNION ALL
SELECT state, county, .....
FROM RI
....

But personally I would look into make the merging of new files more
effective than just dropping all existing rows.

--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx
Jun 6 '07 #10

P: n/a
On Jun 6, 2:13 pm, Erland Sommarskog <esq...@sommarskog.sewrote:
To me that sounds like a managability nightmare. While you can query
the beast in one query, when you need to flush the rows for Orange
County, you would have to explicitly to go to the CA table to
switch partitions, which would mean a lot of dynamic SQL.
That can be, but the issue is the way counties update their records.

Some (VERY few) counties send a monthly "Update" file, but almost all
of them just send the whole dump of their DB.

This gives us two problems:

1) Normalizing the data in a common format (Naturally, no two counties
have the same record format)

2) Updating the db in an "Online" state for Orange County only (With
more than 3mil rows) checking for "Pre-Existing" records would just
kill the server, and there are 3077 counties...

Any better ideas?

Piero

Jun 6 '07 #11

P: n/a
Piero 'Giops' Giorgi (gi**********@gmail.com) writes:
1) Normalizing the data in a common format (Naturally, no two counties
have the same record format)
But this is a problem you have no matter the solution to load the data,
right? Since you work with a partitioned table, you have the same schemas
for all counties. Once you have normalised the format, you can load it
into a staging table and work from there.
2) Updating the db in an "Online" state for Orange County only (With
more than 3mil rows) checking for "Pre-Existing" records would just
kill the server, and there are 3077 counties...
Did you actually try it? With proper indexing this does not have to be
that painful.

Then again, if you have code to normalise the data for 3077 counties, I
guess 50 tables with a total of 3077 partitions is a smaller headache.

--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx
Jun 7 '07 #12

P: n/a
Erland, thanks for providing Piero with he partitioned view example.

To me that sounds like a managability nightmare. While you can query
the beast in one query, when you need to flush the rows for Orange
County, you would have to explicitly to go to the CA table to
switch partitions, which would mean a lot of dynamic SQL.
It's true that Piero's county import process will need to be state-table
aware. Rather than traditional dynamic SQL, an alternative is SQLCMD
scripts with variables executed via SQLCMD.EXE or SSMS in SQLCMD mode.
IMHO, the SQLCMD variable approach is a bit cleaner.

The example below assumes the state tables, partition functions and
partition schemes all have a state code suffix. The archive and staging
tables use the same partitioning scheme as the primary table to simply
things. This ensures those objects are on the same file groups and also
eliminates the need to create a check constraint on the staging table county
column.

I don't know if there is any catch with partition views over partitioned
tables
I must admit I had not considered a partitioned view over partitioned tables
before this thread. I did some cursory testing with Piero's state/county
scenario and it seems to work as expected but there might be gochas. My
biggest concern here is with query complexity when joining the partitioned
view. This approach is probably rarely used so Piero should probably test
thoroughly before committing.

(I really need to find some time to play with partitioned tables
to learn them!)
I'm fortunate because I had to develop a complex sliding-window partitioning
scheme for one of our applications that gave me the opportunity to learn the
finer points of SQL 2005 partitioning. There's just so many cool features
in the product nowadays that it's hard to find the time to thoroughly learn
most, let alone all. Now if I could only get fully up to speed on the new
features before SQL Server 2008 ;-)
--sample script for county import process

--define and initialize SQLCMD variables
:setvar StateCode WA
:setvar County Spokane

--create archive table for county
IF OBJECT_ID(N'dbo.CountyArchive', 'U') IS NOT NULL
DROP TABLE dbo.CountyArchive
GO
CREATE TABLE dbo.CountyArchive
(
StateCode char(2) NOT NULL,
County varchar(50) NOT NULL,
CountyData varchar(100),
CONSTRAINT PK_CountyArchive
PRIMARY KEY CLUSTERED (StateCode, County)
ON PS_State_$(StateCode)(county)
)
GO

--create staging table for county
IF OBJECT_ID(N'dbo.CountyStaging', 'U') IS NOT NULL
DROP TABLE dbo.CountyStaging
GO

CREATE TABLE dbo.CountyStaging
(
StateCode char(2) NOT NULL,
County varchar(50) NOT NULL,
CountyData varchar(100),
CONSTRAINT PK_CountyStaging
PRIMARY KEY CLUSTERED (StateCode, County)
ON PS_State_$(StateCode)(county)
)
GO

---------------------------------
--load dbo.CountyStaging table here
---------------------------------

--add constraint needed for parttitioned view
ALTER TABLE dbo.CountyStaging
ADD CONSTRAINT CK_CountyStaging_State CHECK (StateCode = '$(StateCode)')

--move old county data to archive table
ALTER TABLE dbo.State_$(StateCode)
SWITCH PARTITION $PARTITION.PF_State_$(StateCode)('$(County)') TO
dbo.CountyArchive PARTITION
$PARTITION.PF_State_$(StateCode)('$(County)')

--move new county data into state table
ALTER TABLE dbo.CountyStaging
SWITCH PARTITION $PARTITION.PF_State_$(StateCode)('$(County)') TO
dbo.State_$(StateCode) PARTITION
$PARTITION.PF_State_$(StateCode)('$(County)')
--
Hope this helps.

Dan Guzman
SQL Server MVP

"Erland Sommarskog" <es****@sommarskog.sewrote in message
news:Xn**********************@127.0.0.1...
Piero 'Giops' Giorgi (gi**********@gmail.com) writes:
>>You might consider a hybrid solution with 50 individual state tables
included in a partitioned view, with each state table partitioned by
county. This approach would leverage partitioning to quickly reload
individual counties yet provide a seamless view of the entire country.

That is exactly what I want to do, but unfortunately I'm not (YET)
able to do it.

How can I have a partitioned view of partitioned tables?
I have the 50 state tables partitioned by county, but I can't get to
the next step.

Can someone post a small example of the thing?

To me that sounds like a managability nightmare. While you can query
the beast in one query, when you need to flush the rows for Orange
County, you would have to explicitly to go to the CA table to
switch partitions, which would mean a lot of dynamic SQL.

I don't know if there is any catch with partition views over partitioned
tables (I really need to find some time to play with partitioned tables
to learn them!), but in a normal partitioned view you would have:

CREATE TABLE CA (state char(2) DEFAULT 'CA' CHECK (state = 'CA'),
-- other columns
PRIMARY KEY (state, county, whatever))

CREATE TABLE RI (state char(2) DEFAULT 'RI' CHECK (state = 'RI'),
...

CREATE VIEW thewholebunch AS
SELECT state, county, .....
FROM CA
UNION ALL
SELECT state, county, .....
FROM RI
....

But personally I would look into make the merging of new files more
effective than just dropping all existing rows.

--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx
Jun 7 '07 #13

P: n/a
On Jun 7, 5:15 am, "Dan Guzman" <guzma...@nospam-online.sbcglobal.net>
wrote:
Hope this helps.
That's exactly what I had in mind. Thank you!

One thing... is there any way to partition the states, too?
I mean Query the whole beast with only one SQL Query?

Piero

Jun 7 '07 #14

P: n/a
Piero 'Giops' Giorgi (gi**********@gmail.com) writes:
One thing... is there any way to partition the states, too?
I mean Query the whole beast with only one SQL Query?
You would query the view.

One idea to occurred to me is that you could have a mix, so that big
counties like Orange County(*) have a single partition, where as
smaller counties and states would be gathered in the same partition.
This would mean that you would have different models to load the files.
For Orange County you switch table in and out, whereas for smaller
counties you delete and insert. Of course, this means more complex
code since there would be two code paths. But there would be far
fewer partitions to care about.
(*) When I picked Orange County as an example, I did not know that it
was one of the biggies. I just picked it as it was one of the county
names I knew; the name appears in a few Zappa tracks.
--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx
Jun 7 '07 #15

P: n/a
On Jun 7, 2:16 pm, Erland Sommarskog <esq...@sommarskog.sewrote:
Piero 'Giops' Giorgi (giorgi.pi...@gmail.com) writes:
One thing... is there any way to partition the states, too?
I mean Query the whole beast with only one SQL Query?

You would query the view.
Yes, that is what I'm doing. It's not the optimal, but it's working
enough.
One idea to occurred to me is that you could have a mix, so that big
counties like Orange County(*) have a single partition, where as
smaller counties and states would be gathered in the same partition.
That would give problems while updating the DB.
Ok, I drop the big ones and update the small ones, but IMHO it would
be better stay with one partition per County, grouped by state.
The BEST thing would be to be able to partition the states (Already
partitioned by County) in one big table.

Possible?
(*) When I picked Orange County as an example, I did not know that it
was one of the biggies. I just picked it as it was one of the county
names I knew; the name appears in a few Zappa tracks.
It's a BIG one!
But you should see New York...
I'm not saying that there are more criminals in NY, but the city is
bigger and there is a lot more people, so...

For now, I'm experimenting with 50 States, partitioned over 3077
Counties. Seems promising, but the UNION query is a resource hog. :-
(

Thank you all!

P

Piero

Jun 8 '07 #16

This discussion thread is closed

Replies have been disabled for this discussion.