473,386 Members | 1,652 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,386 software developers and data experts.

To insert 60million rows to database

12
hi all,

how to insert 60million rows to the postgres datase. i want to increase the performance as it is taking 1min to insert every 1000rows. is there any easier way to do it. thanks in advance.
Oct 12 '06 #1
4 11402
hi all,

how to insert 60million rows to the postgres datase. i want to increase the performance as it is taking 1min to insert every 1000rows. is there any easier way to do it. thanks in advance.
hello dusty,actually i wanted to know that whether u have got a solution to yhis problem or not.i m facing the same problem,so u if dont mind helping me in this,then do send me the solution.thanks
Jan 25 '07 #2
michaelb
534 Expert 512MB
The insert/delete/update performance depends on how many idexes you have on the table, make sure you have only as many as you really need.

Perhaps even more importantly, when doing a bulk insert you should consider using the COPY command instead of the SQL "Insert into my_table ..."
This should speed up the process very significantly:

http://www.postgresql.org/docs/8.0/i.../sql-copy.html
Jan 25 '07 #3
iburyak
1,017 Expert 512MB
For bulk inserts I would suggest dropping indexes totally and then recreate them back after insert complete.
When data inserted with indexes on table especially clustered it rebuilds indexes after each insert. If it is a clustered index it physically moves data which takes even more time.
Rebuild indexes will take some time at the end but it will be done once and not each time record inserted.
Also check if you have triggers\constraints which should be dropped\recreated if possible.
Only triggers\constraints that check data integrity should stay in place.

Bulk insert should work faster because it doesn’t use transaction log and speeds up tremendously.
If you still want to use straight insert use Commit transaction after each 1000 records.
It will help clearing transaction log.

Good Luck.
Jan 29 '07 #4
For bulk inserts I would suggest dropping indexes totally and then recreate them back after insert complete.
This assumes that you don't have concurrent queries trying to read from those tables. If you do, and they depend on the index you won't like the effect.

When data inserted with indexes on table especially clustered it rebuilds indexes after each insert. If it is a clustered index it physically moves data which takes even more time.
This is incorrect. PostgreSQL does not dynamically maintain clustering. AFAIK there are no plans to support dynamically maintained clustering at any time in the near future.

Rebuild indexes will take some time at the end but it will be done once and not each time record inserted.
Also check if you have triggers\constraints which should be dropped\recreated if possible.
Only triggers\constraints that check data integrity should stay in place.

Bulk insert should work faster because it doesn’t use transaction log and speeds up tremendously.
If you still want to use straight insert use Commit transaction after each 1000 records.
It will help clearing transaction log.

Good Luck.
Nobody seems to have mentioned using prepared transactions yet. Which should make a significant improvement.
Jan 29 '07 #5

Sign in to post your reply or Sign up for a free account.

Similar topics

7
by: RotterdamStudents | last post by:
Hello there, i have a strange problem. I can't get php to insert multiple rows at once in a MySQL database. I use the $sql = "INSERT INTO database (a,b,c,d,e) VALUES ('$a', '$b' ,'$c', '$d',...
7
by: iqbal | last post by:
Hi all, We have an application through which we are bulk inserting rows into a view. The definition of the view is such that it selects columns from a table on a remote server. I have added the...
9
by: Martin | last post by:
Hello, I'm new with triggers and I can not find any good example on how to do the following: I have two tables WO and PM with the following fields: WO.WONUM, VARCHAR(10) WO.PMNUM,...
16
by: Philip Boonzaaier | last post by:
I want to be able to generate SQL statements that will go through a list of data, effectively row by row, enquire on the database if this exists in the selected table- If it exists, then the colums...
16
by: robert | last post by:
been ruminating on the question (mostly in a 390/v7 context) of whether, and if so when, a row update becomes an insert/delete. i assume that there is a threshold on the number of columns of the...
7
by: jafastinger | last post by:
In our shop we have a 28,000,000 row insert that gets reloaded every night. I am not looking to change this process. I would like to know why in UDB Vs 7.2 we were getting an insert rate of 3300...
11
by: Sezai YILMAZ | last post by:
Hello I need high throughput while inserting into PostgreSQL. Because of that I did some PostgreSQL insert performance tests. ------------------------------------------------------------ --...
46
by: dunleav1 | last post by:
I have a process that does inserts that runs 50% slower that Oracle and Mssql. Queries normally take 50% slower than normal. DB2, Oracle, Mssql all are configured on same os, same disk array...
2
by: cday119 | last post by:
Can someone help me out with this one. I am trying to insert a row of data into an access database. The primary key is set to auto increment. It will work once but the next time you try an insert...
0
by: taylorcarr | last post by:
A Canon printer is a smart device known for being advanced, efficient, and reliable. It is designed for home, office, and hybrid workspace use and can also be used for a variety of purposes. However,...
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
0
by: ryjfgjl | last post by:
In our work, we often receive Excel tables with data in the same format. If we want to analyze these data, it can be difficult to analyze them because the data is spread across multiple Excel files...
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
1
by: nemocccc | last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...
0
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
0
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...
0
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.