
Question about moving a large amount of data from database to database

Lee
guys,

I have a project that needs to move more than 100,000 records from one
database table to another every week. Currently, users input a date
range from the web UI, and my stored procedure takes that date range,
INSERTs the matching records into a table in another database, and then
deletes them from the source. This takes a really long time to finish
(up to 1 or 2 hours).

My question is whether there is some other way to speed this up. I am
thinking about using bcp to copy those records to a data file and then
using bcp to insert them into the SQL Server table. Is this the right
way to do it, or should I consider another solution (and if so, what
is it)?

Thanks a lot!

Apr 23 '07 #1
8 replies · 4082 views
On Apr 23, 2:23 pm, Lee <lee.jenkins...@gmail.com> wrote:
[quoted text snipped]
Use a SELECT INTO statement and make sure the destination db is set to
the simple recovery model.
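
For instance, a minimal sketch of that approach, using the table names
posted later in this thread (the weekly target table name is
hypothetical, and SELECT ... INTO creates its target, so that table must
not already exist):

-- Minimal logging for SELECT INTO requires a non-FULL recovery model
ALTER DATABASE record_archive SET RECOVERY SIMPLE;

DECLARE @ddate datetime;
SET @ddate = '20070416';  -- cutoff date from the web UI

SELECT record_id, record_CC_id, record_content,
       record_date, record_ip, record_active
INTO   record_archive.dbo.tbl_record_week  -- hypothetical; created by this statement
FROM   dbo.tbl_record
WHERE  record_date <= @ddate;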

Apr 23 '07 #2
Yes, BCP will be a good option for fast data transfer. All of the BULK
operations (BULK INSERT, SELECT INTO, BCP) are minimally logged when a
non-FULL recovery model is set.
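
As a rough sketch of the bcp route (server and source database names are
placeholders; -n uses native format, -T a trusted connection, and -E
preserves the IDENTITY values on the way in):

bcp "SELECT * FROM mydb.dbo.tbl_record WHERE record_date <= '20070416'" queryout records.dat -n -T -S SOURCESERVER

bcp record_archive.dbo.tbl_record_archive in records.dat -n -T -E -b 10000 -h "TABLOCK" -S ARCHIVESERVER

The -b option commits in batches, and the -h "TABLOCK" hint takes a
bulk-update lock, which helps the load qualify for minimal logging.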

Another issue could be the purging of the archived records from your main
table. If you have it as a single DELETE and it takes a long time to
complete, then you can break it into smaller DELETE chunks.
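
On SQL Server 2005 that can be as simple as a loop over DELETE TOP (the
chunk size here is arbitrary):

DECLARE @ddate datetime;
SET @ddate = '20070416';  -- cutoff date from the web UI

WHILE 1 = 1
BEGIN
    -- Each iteration is a short transaction, keeping locks and log growth small
    DELETE TOP (10000) FROM dbo.tbl_record
    WHERE record_date <= @ddate;

    IF @@ROWCOUNT = 0 BREAK;
END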

If you have SQL Server 2005 Enterprise Edition, an interesting alternative
is to use partitioned tables. Specifically, range partitioning based on
date ranges (in your case, weekly) can help with archiving. Take a look
at the following article (in particular the section on Range Partitions):
http://msdn2.microsoft.com/en-us/library/ms345146.aspx
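
A sketch of what weekly range partitioning could look like (all names and
boundary dates here are illustrative; SWITCH needs a target table of
identical structure on the same filegroup in the same database, so the
switched-out rows would still have to be copied to the archive database
afterwards):

CREATE PARTITION FUNCTION pf_record_week (datetime)
AS RANGE RIGHT FOR VALUES ('20070402', '20070409', '20070416');

CREATE PARTITION SCHEME ps_record_week
AS PARTITION pf_record_week ALL TO ([PRIMARY]);

-- With tbl_record rebuilt on ps_record_week, a whole week can then be
-- detached as a metadata-only operation:
ALTER TABLE dbo.tbl_record SWITCH PARTITION 2 TO dbo.tbl_record_staging;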

HTH,

Plamen Ratchev
http://www.SQLStudio.com
Apr 23 '07 #3
Lee (le************@gmail.com) writes:
I have a project that needs to move more than 100,000 records from one
database table to another every week. Currently, users input a date
range from the web UI, and my stored procedure takes that date range,
INSERTs the matching records into a table in another database, and then
deletes them from the source. This takes a really long time to finish
(up to 1 or 2 hours).
It shouldn't take 1-2 hours to move 100,000 rows. It sounds like the
process is not well implemented, or that there are indexes missing. Yes,
you can gain speed by using BCP, but it also adds complexity to the
solution, which I can't really see should be needed for the volumes you
indicate.

Would it be possible for you to post the definition of the tables, including
indexes and the stored procedure?
--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx
Apr 23 '07 #4
Lee
Plamen, thanks a lot. I will try it and let you know the result.
Thanks again!

On Apr 23, 2:08 pm, "Plamen Ratchev" <Pla...@SQLStudio.com> wrote:
[quoted text snipped]
Apr 23 '07 #5
Lee
Erland, thanks a lot for the reply; I also forgot to say thanks to Brad.
Here is the table:

CREATE TABLE [dbo].[tbl_record](
    [record_id] [int] IDENTITY(1,1) NOT NULL,
    [record_CC_id] [int] NOT NULL,
    [record_content] [varchar](500) COLLATE SQL_Latin1_General_CP1_CI_AS
        NOT NULL CONSTRAINT [DF_tbl_record_record_content] DEFAULT (''),
    [record_date] [datetime] NOT NULL
        CONSTRAINT [DF_tbl_record_record_date] DEFAULT (getdate()),
    [record_ip] [varchar](20) COLLATE SQL_Latin1_General_CP1_CI_AS
        NOT NULL CONSTRAINT [DF_tbl_record_record_ip] DEFAULT (''),
    [record_active] [bit] NOT NULL
        CONSTRAINT [DF_tbl_record_record_archive] DEFAULT (1),
    CONSTRAINT [PK_tbl_record] PRIMARY KEY CLUSTERED
    (
        [record_id] ASC
    ) WITH (PAD_INDEX = OFF, IGNORE_DUP_KEY = OFF, FILLFACTOR = 90)
    ON [PRIMARY]
) ON [PRIMARY]

And the stored procedure is here:

ALTER PROCEDURE [dbo].[ArchiveRecords]
(
    @ddate datetime
)
AS
BEGIN TRAN
SET IDENTITY_INSERT record_archive.dbo.tbl_record_archive ON;
INSERT INTO record_archive.dbo.tbl_record_archive
(
    record_id,
    record_CC_id,
    record_content,
    record_date,
    record_ip,
    record_active
)
SELECT
    record_id,
    record_CC_id,
    record_content,
    record_date,
    record_ip,
    record_active
FROM tbl_record WHERE record_date <= @ddate;
DELETE FROM tbl_record WHERE record_date <= @ddate;
SET IDENTITY_INSERT record_archive.dbo.tbl_record_archive OFF;
IF @@ERROR = 0 BEGIN COMMIT TRAN END ELSE BEGIN ROLLBACK TRAN END

On Apr 23, 2:31 pm, Erland Sommarskog <esq...@sommarskog.se> wrote:
[quoted text snipped]

Apr 23 '07 #6
Lee
Should I remove the clustered index on the record_id field, create a
nonclustered index on that field instead, and create a clustered index
on the record_date field, since my query always selects a range of
data by date?

On Apr 23, 3:16 pm, Lee <lee.jenkins...@gmail.com> wrote:
[quoted text snipped]

Apr 23 '07 #7
Lee
Brad, thanks for the reply. In my situation the target table already
has lots of records, and I will just append the data to that table.

On Apr 23, 1:58 pm, Brad <Brad.Marsh...@Teksouth.com> wrote:
[quoted text snipped]

Use a SELECT INTO statement and make sure the destination db is set to
the simple recovery model.

Apr 23 '07 #8
Lee (le************@gmail.com) writes:
Should I remove the clustered index on the record_id field, create a
nonclustered index on that field instead, and create a clustered index
on the record_date field, since my query always selects a range of
data by date?
Yes, that was precisely my reaction when I saw the table. Make the primary
key on record_id non-clustered, and add a clustered index on the date
column. I would guess you should do this on the archive table as well.
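
Something along these lines (the new index name is illustrative; creating
the clustered index before re-adding the primary key means the
nonclustered PK is built only once, on top of the new clustering keys):

ALTER TABLE dbo.tbl_record DROP CONSTRAINT PK_tbl_record;

CREATE CLUSTERED INDEX IX_tbl_record_date
    ON dbo.tbl_record (record_date);

ALTER TABLE dbo.tbl_record
    ADD CONSTRAINT PK_tbl_record PRIMARY KEY NONCLUSTERED (record_id);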

Also, I don't see the point of having the IDENTITY property on the
archive table. Just make it a normal column, and then you don't need that
SET IDENTITY_INSERT. Not that it affects performance, but it looks
cleaner. However, to change this you would need to rename the existing
table, create it anew, and copy the data over. There is no ALTER syntax
for changing the IDENTITY property.
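
A sketch of that rename-and-copy (the archive table's definition was not
posted, so the columns here just mirror the source table, and the index
layout follows the advice above):

EXEC sp_rename 'dbo.tbl_record_archive', 'tbl_record_archive_old';

CREATE TABLE dbo.tbl_record_archive (
    record_id      int          NOT NULL,  -- plain column, no IDENTITY
    record_CC_id   int          NOT NULL,
    record_content varchar(500) NOT NULL,
    record_date    datetime     NOT NULL,
    record_ip      varchar(20)  NOT NULL,
    record_active  bit          NOT NULL,
    CONSTRAINT PK_tbl_record_archive PRIMARY KEY NONCLUSTERED (record_id)
);

CREATE CLUSTERED INDEX IX_tbl_record_archive_date
    ON dbo.tbl_record_archive (record_date);

INSERT INTO dbo.tbl_record_archive
SELECT record_id, record_CC_id, record_content,
       record_date, record_ip, record_active
FROM   dbo.tbl_record_archive_old;

DROP TABLE dbo.tbl_record_archive_old;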


--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx
Apr 24 '07 #9
