remark on a slowdown of COPY

Hi,

I would like to remark on a problem described by Stephen Livesey almost
3 years ago, about the slowdown he experienced when uploading several
million rows.

http://www.geocrawler.com/mail/threa...ecords&list=12
The first 100,000 records took 15mins.
The next 100,000 records took 30mins
The last 100,000 records took 4hours.


I'm uploading data from a pg_dump file with the COPY command: about
2.5 million rows on a 1.6 GHz Linux PC with 512 MB of RAM and ReiserFS.

I had to dump the schema and the table data separately, which ended up
as the following series of steps:

CREATE TABLE keys (
crc integer NOT NULL,
tablenr integer NOT NULL,
tableid integer NOT NULL,
tableref integer NOT NULL,
"key" character varying(250) NOT NULL,
batchid integer NOT NULL
);

ALTER TABLE ONLY keys ADD CONSTRAINT keys_pkey PRIMARY KEY (crc);

COPY keys (crc, tablenr, tableid, tableref, "key", batchid) FROM stdin;
-265889347 1 2 0 1_1_1982_1_101_1011_NULL_NULL_NULL_102_NULL 1

....

\.

With the index (primary key) already created, I stopped the load
half-way through after 2 hours, as it was getting progressively slower.
Strangely, top showed CPU usage and I/O below 5%, jumping up a bit every
now and then, while the system load stayed steadily above 2 (something
internal? Tom Lane once mentioned fsync?).

Then without the index (I removed the ADD CONSTRAINT line), the upload
time dropped to 11 minutes, including index creation afterwards, with
the load around 1.
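
The pattern itself (load into a table that has no index, then build the
index once at the end) can be tried out in a few lines. This is only a
minimal sketch: it uses Python's built-in sqlite3 module as a stand-in
for PostgreSQL, so it says nothing about COPY timings, and the trimmed
table definition, the index name keys_crc_idx, and the sample rows are
all made up for illustration.

```python
import sqlite3

# SQLite (in memory) is only a stand-in; the post itself is about PostgreSQL COPY.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE keys (crc INTEGER NOT NULL, "key" TEXT NOT NULL)')

# Step 1: bulk-load the rows while the table has no index at all.
rows = [(i, f"key_{i}") for i in range(10_000)]  # made-up sample data
with conn:  # one transaction for the whole load
    conn.executemany("INSERT INTO keys VALUES (?, ?)", rows)

# Step 2: build the unique index once, after all rows are in place.
conn.execute("CREATE UNIQUE INDEX keys_crc_idx ON keys (crc)")

row_count = conn.execute("SELECT COUNT(*) FROM keys").fetchone()[0]
index_names = [r[1] for r in conn.execute("PRAGMA index_list('keys')")]
print(row_count, index_names)
```

Building the index in one pass at the end avoids updating it for every
inserted row, which is the same ordering that brought the load above
down to minutes.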

The problem in my case was that I was dumping the schema and the tables
separately, so ADD CONSTRAINT was issued in the sequence shown (before
the data were loaded). Otherwise ADD CONSTRAINT goes at the end of the
table dump file and does not affect performance.
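
For a script that already has the statements in the slow order, one way
to fix it is to move the primary-key statements behind the data. The
sketch below is a hypothetical helper (defer_primary_keys is not part of
pg_dump or PostgreSQL) and assumes that each ALTER TABLE ... ADD
CONSTRAINT ... PRIMARY KEY statement sits on a single line, as in the
sequence shown above, and that COPY data ends with a "\." line.

```python
import re

# One-line "ALTER TABLE ... ADD CONSTRAINT ... PRIMARY KEY (...);" statement.
PK_RE = re.compile(
    r"^ALTER TABLE\b.*\bADD CONSTRAINT\b.*\bPRIMARY KEY\b.*;\s*$",
    re.IGNORECASE,
)


def defer_primary_keys(dump_text: str) -> str:
    """Move PRIMARY KEY constraints to the end of the script (hypothetical helper)."""
    kept = []      # lines that stay in their original position
    deferred = []  # PRIMARY KEY statements to run after the data is loaded
    in_copy_data = False
    for line in dump_text.splitlines():
        stripped = line.strip()
        if in_copy_data:
            # Data rows pass through untouched until the "\." terminator.
            kept.append(line)
            if stripped == "\\.":
                in_copy_data = False
        elif PK_RE.match(stripped):
            deferred.append(line)
        else:
            kept.append(line)
            upper = stripped.upper()
            if upper.startswith("COPY ") and upper.endswith("FROM STDIN;"):
                in_copy_data = True
    return "\n".join(kept + deferred) + "\n"


# Usage with a shortened version of the script from this post.
script = "\n".join([
    "CREATE TABLE keys (crc integer NOT NULL);",
    "ALTER TABLE ONLY keys ADD CONSTRAINT keys_pkey PRIMARY KEY (crc);",
    "COPY keys (crc) FROM stdin;",
    "-265889347",
    "\\.",
])
reordered = defer_primary_keys(script)
print(reordered)
```

The constraint ends up after the "\." line, so the index is built once
over the loaded rows instead of being updated for every row.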

--Vojtech
Nov 12 '05 #1