473,785 Members | 3,349 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

Copy 60GB table?

I'm trying to copy all 440 million rows from one table in my SQL Server
2005 db to another table with a different clustering scheme. After a
few test inserts that were successful (up to a million rows in 20
seconds with no problem), I crossed my fingers and went whole-hog with
the naive "insert into dest_table (column1, column2, ...) select
column1, column2, ... from source_table" but of course it ran out of
space in *both* the db log file (at about 130 GB) and the tempdb log
file (at about 54GB) and rolled it back, wrapping up 10 hours after I
kicked it off.

I think it would work if I could insert in batches and/or disable
logging, like with a bulk insert, so I tried exporting the source table
to a flat file in the hopes of bulk-importing it back into the
destination table, but the export just halted for no reason after 176
million rows, using all the CPU and not doing anything at all, just
sitting there. (The fact that this operation not only failed, but
brought down the whole server - any operation on any database in the
instance timed out, no clients could connect, the whole deal - I find
very disappointing for what MS claimed was a scalable, robust product
that could rival Oracle.)

What I'm wondering is, does anyone know of a way to copy a table that
is apparently just too large for SQL Server 2005 to handle?

Thanks,
Seth

May 10 '06 #1
6 4195
sql_server_user (ka*******@gmai l.com) writes:
I'm trying to copy all 440 million rows from one table in my SQL Server
2005 db to another table with a different clustering scheme. After a
few test inserts that were successful (up to a million rows in 20
seconds with no problem), I crossed my fingers and went whole-hog with
the naive "insert into dest_table (column1, column2, ...) select
column1, column2, ... from source_table" but of course it ran out of
space in *both* the db log file (at about 130 GB) and the tempdb log
file (at about 54GB) and rolled it back, wrapping up 10 hours after I
kicked it off.


That operation is definitely not a walk in a park.

I think I would attempt this strategy:

1) Set recovery for the database to simple recovery.
2) Create the clustered index on the target table with a relatively low
fill factor, say 30%. (Of course, no NC indexes would be in place.)
3) Insert batchwise from the old table, where batches are defined by
intervals of the clustered index of the source table. For batchsize,
I would use 100000 to 500000 rows, depending on wide the table is.
4) I would make the control loop so that it is easy to stop it. For
instance on each iteration read from a control table. Possibly also
a table the defines the batches, so you can keep track of what has
been copied. But the table should not be joined with the INSERT;
read interval into variales.
5) Occassionally stop the process and monitor fragmentation in
target table. It should first decrease, as holes are filled in,
but then it will increase again. (But you will probably see this
from the fact that the time for iteration increases.)
6) Once everything is done, switch to full recovery and backup the
database.

My thinking here is that creating the clustered index on this monster
is going require a lot of log and tempdb - you can guess what happens.
So let's have the clustered index in place from the start, even if
that will take longer time.

--
Erland Sommarskog, SQL Server MVP, es****@sommarsk og.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx
May 10 '06 #2
Thanks, Erland - I'm gonna have to pay you a consulting fee soon.

The recovery model is set to simple, there are no nonclustered indexes
on the table, and I agree that clustering on insert is the way to go,
since I think trying to recluster the new table with all the data in it
would just blow up the logs and tempdb like my initial insert attempt.
I like the idea of the batched insert with the low fill factor
clustered index, however, I ran a bunch of tests and the amount of log
and tempdb space required seems to be a function of the size of the
destination table, not the number of rows in the batch: I would insert
a batch (10 million rows), shrink the log and tempdb, insert another
batch, and so on, and every time, the log and tempdb would require more
space, roughly linearly as my very approximate data below indicate:

after 1st batch: log = 1GB, tempdb = 2GB
after shrinking and 2nd batch: log = 2GB, tempdb = 4GB
after shrinking and 3rd batch: log = 3 GB, tempdb = 6GB

So I'm worried about the batch approach for the full insert.

I'm currently trying an SSIS copy from one table to the other, and
since it's doing a bulk insert it doesn't seem to be touching the db
log. However, tempdb is getting pretty fat, 12GB and going strong, but
I've cleared a lot of room for it this time - it has about 179GB to go,
so hopefully that will be enough. If this doesn't work I'll definitely
try your batched insert suggestion.

Thanks again,
Seth

Erland Sommarskog wrote:
sql_server_user (ka*******@gmai l.com) writes:
I'm trying to copy all 440 million rows from one table in my SQL Server
2005 db to another table with a different clustering scheme. After a
few test inserts that were successful (up to a million rows in 20
seconds with no problem), I crossed my fingers and went whole-hog with
the naive "insert into dest_table (column1, column2, ...) select
column1, column2, ... from source_table" but of course it ran out of
space in *both* the db log file (at about 130 GB) and the tempdb log
file (at about 54GB) and rolled it back, wrapping up 10 hours after I
kicked it off.


That operation is definitely not a walk in a park.

I think I would attempt this strategy:

1) Set recovery for the database to simple recovery.
2) Create the clustered index on the target table with a relatively low
fill factor, say 30%. (Of course, no NC indexes would be in place.)
3) Insert batchwise from the old table, where batches are defined by
intervals of the clustered index of the source table. For batchsize,
I would use 100000 to 500000 rows, depending on wide the table is.
4) I would make the control loop so that it is easy to stop it. For
instance on each iteration read from a control table. Possibly also
a table the defines the batches, so you can keep track of what has
been copied. But the table should not be joined with the INSERT;
read interval into variales.
5) Occassionally stop the process and monitor fragmentation in
target table. It should first decrease, as holes are filled in,
but then it will increase again. (But you will probably see this
from the fact that the time for iteration increases.)
6) Once everything is done, switch to full recovery and backup the
database.

My thinking here is that creating the clustered index on this monster
is going require a lot of log and tempdb - you can guess what happens.
So let's have the clustered index in place from the start, even if
that will take longer time.

--
Erland Sommarskog, SQL Server MVP, es****@sommarsk og.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx


May 10 '06 #3

"sql_server_use r" <ka*******@gmai l.com> wrote in message
news:11******** **************@ u72g2000cwu.goo glegroups.com.. .
Thanks, Erland - I'm gonna have to pay you a consulting fee soon.

BTW, might be worth looking at using BCP to dump it out and then insert it
into the new table.

Then build your indices.

The recovery model is set to simple, there are no nonclustered indexes
on the table, and I agree that clustering on insert is the way to go,
since I think trying to recluster the new table with all the data in it
would just blow up the logs and tempdb like my initial insert attempt.
I like the idea of the batched insert with the low fill factor
clustered index, however, I ran a bunch of tests and the amount of log
and tempdb space required seems to be a function of the size of the
destination table, not the number of rows in the batch: I would insert
a batch (10 million rows), shrink the log and tempdb, insert another
batch, and so on, and every time, the log and tempdb would require more
space, roughly linearly as my very approximate data below indicate:

after 1st batch: log = 1GB, tempdb = 2GB
after shrinking and 2nd batch: log = 2GB, tempdb = 4GB
after shrinking and 3rd batch: log = 3 GB, tempdb = 6GB

So I'm worried about the batch approach for the full insert.

I'm currently trying an SSIS copy from one table to the other, and
since it's doing a bulk insert it doesn't seem to be touching the db
log. However, tempdb is getting pretty fat, 12GB and going strong, but
I've cleared a lot of room for it this time - it has about 179GB to go,
so hopefully that will be enough. If this doesn't work I'll definitely
try your batched insert suggestion.

Thanks again,
Seth

Erland Sommarskog wrote:
sql_server_user (ka*******@gmai l.com) writes:
I'm trying to copy all 440 million rows from one table in my SQL Server 2005 db to another table with a different clustering scheme. After a
few test inserts that were successful (up to a million rows in 20
seconds with no problem), I crossed my fingers and went whole-hog with
the naive "insert into dest_table (column1, column2, ...) select
column1, column2, ... from source_table" but of course it ran out of
space in *both* the db log file (at about 130 GB) and the tempdb log
file (at about 54GB) and rolled it back, wrapping up 10 hours after I
kicked it off.


That operation is definitely not a walk in a park.

I think I would attempt this strategy:

1) Set recovery for the database to simple recovery.
2) Create the clustered index on the target table with a relatively low
fill factor, say 30%. (Of course, no NC indexes would be in place.)
3) Insert batchwise from the old table, where batches are defined by
intervals of the clustered index of the source table. For batchsize,
I would use 100000 to 500000 rows, depending on wide the table is.
4) I would make the control loop so that it is easy to stop it. For
instance on each iteration read from a control table. Possibly also
a table the defines the batches, so you can keep track of what has
been copied. But the table should not be joined with the INSERT;
read interval into variales.
5) Occassionally stop the process and monitor fragmentation in
target table. It should first decrease, as holes are filled in,
but then it will increase again. (But you will probably see this
from the fact that the time for iteration increases.)
6) Once everything is done, switch to full recovery and backup the
database.

My thinking here is that creating the clustered index on this monster
is going require a lot of log and tempdb - you can guess what happens.
So let's have the clustered index in place from the start, even if
that will take longer time.

--
Erland Sommarskog, SQL Server MVP, es****@sommarsk og.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx

May 11 '06 #4
> 2) Create the clustered index on the target table with a relatively low
fill factor, say 30%. (Of course, no NC indexes would be in place.)
As the table is empty there is no point putting on the fillfactor because
its not enforced for new pages.

Only good for creating indexes on existing data.

I agree though - clustered index should be in place before you start.

--
Tony Rogerson
SQL Server MVP
http://sqlserverfaq.com - free video tutorials
"Erland Sommarskog" <es****@sommars kog.se> wrote in message
news:Xn******** **************@ 127.0.0.1... sql_server_user (ka*******@gmai l.com) writes:
I'm trying to copy all 440 million rows from one table in my SQL Server
2005 db to another table with a different clustering scheme. After a
few test inserts that were successful (up to a million rows in 20
seconds with no problem), I crossed my fingers and went whole-hog with
the naive "insert into dest_table (column1, column2, ...) select
column1, column2, ... from source_table" but of course it ran out of
space in *both* the db log file (at about 130 GB) and the tempdb log
file (at about 54GB) and rolled it back, wrapping up 10 hours after I
kicked it off.


That operation is definitely not a walk in a park.

I think I would attempt this strategy:

1) Set recovery for the database to simple recovery.
2) Create the clustered index on the target table with a relatively low
fill factor, say 30%. (Of course, no NC indexes would be in place.)
3) Insert batchwise from the old table, where batches are defined by
intervals of the clustered index of the source table. For batchsize,
I would use 100000 to 500000 rows, depending on wide the table is.
4) I would make the control loop so that it is easy to stop it. For
instance on each iteration read from a control table. Possibly also
a table the defines the batches, so you can keep track of what has
been copied. But the table should not be joined with the INSERT;
read interval into variales.
5) Occassionally stop the process and monitor fragmentation in
target table. It should first decrease, as holes are filled in,
but then it will increase again. (But you will probably see this
from the fact that the time for iteration increases.)
6) Once everything is done, switch to full recovery and backup the
database.

My thinking here is that creating the clustered index on this monster
is going require a lot of log and tempdb - you can guess what happens.
So let's have the clustered index in place from the start, even if
that will take longer time.

--
Erland Sommarskog, SQL Server MVP, es****@sommarsk og.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx

May 11 '06 #5
Tony Rogerson (to**********@s qlserverfaq.com ) writes:
2) Create the clustered index on the target table with a relatively low
fill factor, say 30%. (Of course, no NC indexes would be in place.)


As the table is empty there is no point putting on the fillfactor because
its not enforced for new pages.

Only good for creating indexes on existing data.

I agree though - clustered index should be in place before you start.


Eh, I guess that I will have to stick with my general lame excuse that
the hour was late.

However, it could maybe be idea to reindex every now and then with a low
fill-factor.

--
Erland Sommarskog, SQL Server MVP, es****@sommarsk og.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx
May 11 '06 #6
SSIS worked. It looks like it just put the *entire* table in tempdb -
to sort it into the new clustering order, I guess. The key was
eliminating logging with the bulk insert. Thanks for the help,
everybody.

May 11 '06 #7

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

3
8371
by: Tlm | last post by:
Hello All, I have a form (FrmA) with a subform (SubFrmB) embedded in it. SubFrmB also has a subform embedded in it (SubFrmC) The form's recordsource is based on a table (TblA). SubFrmB's recordsource is also based on a table (TblB). SubFrmC's recordsource is also based on a table (TblC). There is a one-to-many relationship between TblA (one) and TblB (many).
19
3479
by: davidgordon | last post by:
Hi, I need some pointers/help on how to do the following if it possible: In my access db, I have the following: Tables: Products, Sub-Assembly, Product-Pack Table, Products
7
11639
by: lgbjr | last post by:
Hello All, I¡¯m using a context menu associated with some pictureboxes to provide copy/paste functionality. Copying the image to the clipboard was easy. But pasting an image from the clipboard is proving to be more difficult. These pictureboxes are bound to an AccessDB. If the user wants to add an image, they select an image using an OpenFileDialog: Dim result As DialogResult = Pic_Sel.ShowDialog() If (result = DialogResult.OK) Then
2
2672
by: Janning Vygen | last post by:
hi PGurus, i searched the archives and read the docs, because this problem shouldn't be new. But i really don't know what to search for. i am populating a database (v7.4.1) with COPY. Some triggers in the db schema fire on inserts and do some additional data manipulations, which i dont want to take place by running COPY because my data don't need those additional manipulation. Therefor and for performance reasons, i disable all...
2
4849
by: Clodoaldo Pinto Neto | last post by:
Hi all, I'm trying to copy a table with a text field column containing a new line char to a file: ksDesenv=# create table page(line text) without oids; CREATE TABLE ksDesenv=# insert into page (line) values('1stline'||chr(10)||'2ndline'); INSERT 0 1
2
2473
by: Marcin Zmyslowski | last post by:
Hello all! How to copy a table in MS SQL Server 2000 without chaning a structure? I mean, I have one table, which has autoincrement numeric field (ID). When I copy this table by exporting this table into the same database folder I loose the specification of the field ID. Now it is not autoincrement field but usual int field. Is it possible to copy this table without changing data and structure of a table? Kindest regards
2
3469
by: Swinky | last post by:
I hope someone can help...I feel like I'm walking in the dark without a flashlight (I'm NOT a programmer but have been called to task to do some work in Access that is above my head). I have code that will successfully copy a record and append the information to a new record in the same table (parent table) within a form. However, there are related child tables with primary keys (set to Autonumber) stored in sub-forms. That information...
4
4978
by: Jim Devenish | last post by:
I wish to copy a table on a SQL Server backend to a table on an Access (.mdb) front end in the simplest way. I have the following to get the recordset but am seeking something easier. Dim theConnectionString theConnectionString = "Driver=SQL Server; Server=Server;Database=myData;Trusted_Connection=Yes"
9
15148
by: fniles | last post by:
I would like to copy a table(s) from SQL Server 2005 to a CVS file and vice versa. I was thinking to use the BCP command line utility, but I have a few questions: 1. The machine where I am going to run BCP utility does not have SQL Server installed, so BCP.exe and its underlying DLLs are not there. I copied BCP.EXE, then it tells me that it expects sqlncli.dll. After I copied that DLL, it now says "Unable to load BCP resource DLL. BCP...
0
9643
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However, people are often confused as to whether an ONU can Work As a Router. In this blog post, we’ll explore What is ONU, What Is Router, ONU & Router’s main usage, and What is the difference between ONU and Router. Let’s take a closer look ! Part I. Meaning of...
0
10315
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed. This is as boiled down as I can make it. Here is my compilation command: g++-12 -std=c++20 -Wnarrowing bit_field.cpp Here is the code in...
0
10147
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that captivates audiences and drives business growth. The Art of Business Website Design Your website is...
1
10085
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows Update option using the Control Panel or Settings app; it automatically checks for updates and installs any it finds, whether you like it or not. For most users, this new feature is actually very convenient. If you want to control the update process,...
0
8968
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing, and deployment—without human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then launch it, all on its own.... Now, this would greatly impact the work of software developers. The idea...
1
7494
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules. He will explain when you may want to use classes instead of User Defined Types (UDT). For example, to manage the data in unbound forms. Adolph will...
0
6737
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one. At the time of converting from word file to html my equations which are in the word document file was convert into image. Globals.ThisAddIn.Application.ActiveDocument.Select();...
1
4045
by: 6302768590 | last post by:
Hai team i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated we have to send another system
2
3645
muto222
by: muto222 | last post by:
How can i add a mobile payment intergratation into php mysql website.

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.