
PostgreSQL insert speed tests

Hello

I need high throughput while inserting into PostgreSQL. Because of that I
did some PostgreSQL insert performance tests.

------------------------------------------------------------
-- Test schema
create table logs (
logid serial primary key,
ctime integer not null,
stime integer not null,
itime integer not null,
agentid integer not null,
subagentid integer not null,
ownerid integer not null,
hostid integer not null,
appname varchar(64) default null,
logbody varchar(1024) not null
);

create index ctime_ndx on logs using btree (ctime);
create index stime_ndx on logs using btree (stime);
create index itime_ndx on logs using btree (itime);
create index agentid_ndx on logs using hash (agentid);
create index ownerid_ndx on logs using hash (ownerid);
create index hostid_ndx on logs using hash (hostid);
------------------------------------------------------------

Test Hardware:
IBM Thinkpad R40
CPU: Pentium 4 Mobile 1993 MHz (running at full power)
RAM: 512 MB
OS: GNU/Linux, Fedora Core 1, kernel 2.4.24

A test program developed with libpq inserts 200.000 rows into the logs
table. Insertions are made 100 rows per transaction (2.000 transactions in
total).
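
For reference, each transaction batch corresponds roughly to the following SQL
(a minimal sketch only; the actual values are generated by the test program,
so the literals below are made up):

------------------------------------------------------------
BEGIN;
INSERT INTO logs (ctime, stime, itime, agentid, subagentid,
                  ownerid, hostid, appname, logbody)
VALUES (1077000000, 1077000005, 1077000010, 1, 1, 1, 1,
        'testapp', 'sample log body');
-- ... 99 more INSERTs like the one above ...
COMMIT;
------------------------------------------------------------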

Some parameter changes from the postgresql.conf file follow:
----------------------------------------------------------------
shared_buffers = 2048 # min max_connections*2 or 16, 8KB each
max_fsm_relations = 20000 # min 10, fsm is free space map, ~40 bytes
max_fsm_pages = 200000 # min 1000, fsm is free space map, ~6 bytes
max_locks_per_transaction = 256 # min 10
wal_buffers = 64 # min 4, typically 8KB each
sort_mem = 32768 # min 64, size in KB
vacuum_mem = 16384 # min 1024, size in KB
checkpoint_segments = 6 # in logfile segments, min 1, 16MB each
checkpoint_timeout = 900 # range 30-3600, in seconds
fsync = true
wal_sync_method = fsync # the default varies across platforms
enable_seqscan = true
enable_indexscan = true
enable_tidscan = true
enable_sort = true
enable_nestloop = true
enable_mergejoin = true
enable_hashjoin = true
effective_cache_size = 2000 # typically 8KB each
geqo = true
geqo_selection_bias = 2.0 # range 1.5-2.0
geqo_threshold = 11
geqo_pool_size = 0 # default based on tables in statement,
                   # range 128-1024
geqo_effort = 1
geqo_generations = 0
geqo_random_seed = -1 # auto-compute seed
----------------------------------------------------------------

The test was run with both PostgreSQL 7.3.4 and PostgreSQL 7.4.1 (the
test program was recompiled when switching versions).

The results are below (average inserted rows per second).

# of EXISTING RECORDS      speed for PostgreSQL 7.3.4   speed for PostgreSQL 7.4.1
=======================    ==========================   ==========================
      0 initial records    1086 rows/s                  1324 rows/s
200.000 initial records     781 rows/s                   893 rows/s
400.000 initial records     576 rows/s                   213 rows/s
600.000 initial records     419 rows/s                   200 rows/s
800.000 initial records     408 rows/s                  not tested (results too poor)
When the logs table is rebuilt with only one index (the primary key), the
insert speed reaches 2941 rows/s. But I need all seven indexes.

The question is: why is PostgreSQL 7.4.1 so slow under heavy load?

Is there a way to speed up inserts without eliminating indexes?

What about concurrent inserts (running separate copies of the test program
concurrently) into the same table? It did not work.

-sezai


Nov 22 '05 #1
Sezai YILMAZ wrote:
Test Hardware:
IBM Thinkpad R40
CPU: Pentium 4 Mobile 1993 Mhz (full powered)
RAM: 512 MB
OS: GNU/Linux, Fedora Core 1, kernel 2.4.24

A test program developed with libpq inserts 200.000 rows into table
logs. Insertions are made with 100 row per transaction (total 2.000
transactions).

Some parameter changes from postgresql.conf file follows:
----------------------------------------------------------------
shared_buffers = 2048 # min max_connections*2 or 16, 8KB each
I suggest you up that to say 10000 buffers..
max_fsm_relations = 20000 # min 10, fsm is free space map, ~40 bytes
max_fsm_pages = 200000 # min 1000, fsm is free space map, ~6 bytes
max_locks_per_transaction = 256 # min 10
wal_buffers = 64 # min 4, typically 8KB each
sort_mem = 32768 # min 64, size in KB
You probably need to pull that down a little. How about 8-16MB?
vacuum_mem = 16384 # min 1024, size in KB
Not required. 1024 would do, since you are only testing inserts; it matters
only when you run vacuum.
effective_cache_size = 2000 # typically 8KB each
Is that intentional? It tells PostgreSQL it has only around 16MB of cache
(2000 x 8KB). Set it to around 15000 so that 100MB+ is assumed (15000 x 8KB is
roughly 117MB); that might change the results of index scans. I always prefer
to set it to whatever is actually available.
The test was made with both of PostgreSQL 7.3.4 and PostgreSQL 7.4.1 (the
test program was recompiled during version changes).

The results are below (average inserted rows per second).

# of EXISTING RECORDS      speed for PostgreSQL 7.3.4   speed for PostgreSQL 7.4.1
=======================    ==========================   ==========================
      0 initial records    1086 rows/s                  1324 rows/s
200.000 initial records     781 rows/s                   893 rows/s
400.000 initial records     576 rows/s                   213 rows/s
600.000 initial records     419 rows/s                   200 rows/s
800.000 initial records     408 rows/s                  not tested (results too poor)
Do you mean 800000? I believe the '.' is a thousands separator here, but I'm
not too sure. :-)
When the logs table reconstructed with only one index (primary key) then
2941 rows/s speed is reached. But I need all the seven indexes.

The question is why the PostgreSQL 7.4.1 is so slow under heavy work?
Can you run vmstat and see where things get stalled? You could probably also
up the number of WAL segments and try again.
Is there a way to speed up inserts without eliminating indexes?

What about concurrent inserts (running separate copies of the test program
concurrently) into the same table? It did not work.


What do you mean, it didn't work? Any errors?

HTH

Shridhar


Nov 22 '05 #2
I don't know the answer to the question of why 7.4 is slower, but I have
some suggestions on additional things to test, and how to make it faster.

First off, try 200 transactions of 1000 records each; you might even want
to try 20 transactions of 10,000 records each. Postgres seems to run much
faster the fewer commits you have, but different configs may change the
sweet spot.

Secondly, one possible solution to your problem is to drop the indexes,
insert the new rows, and recreate the indexes. Of course, for testing,
you'll want to time the entire drop/insert/create process and compare
it to the raw insert time with the indexes intact. I use stored procedures
on my databases, e.g.:

select drop_foo_indexes();
....
<commands to insert many rows into table foo>
....
select create_foo_indexes();
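
For the logs table above, such helper functions could be written in PL/pgSQL
along these lines (an illustrative sketch only, not actual code from this
thread; it assumes the plpgsql language is installed in the database, e.g. via
createlang):

------------------------------------------------------------
create or replace function drop_logs_indexes() returns integer as '
begin
    -- drop all secondary indexes before a bulk load
    drop index ctime_ndx;
    drop index stime_ndx;
    drop index itime_ndx;
    drop index agentid_ndx;
    drop index ownerid_ndx;
    drop index hostid_ndx;
    return 1;
end;
' language 'plpgsql';

create or replace function create_logs_indexes() returns integer as '
begin
    -- recreate the same indexes after the bulk load
    create index ctime_ndx on logs using btree (ctime);
    create index stime_ndx on logs using btree (stime);
    create index itime_ndx on logs using btree (itime);
    create index agentid_ndx on logs using hash (agentid);
    create index ownerid_ndx on logs using hash (ownerid);
    create index hostid_ndx on logs using hash (hostid);
    return 1;
end;
' language 'plpgsql';
------------------------------------------------------------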

Another thing to consider is vacuuming. You don't mention how often you
vacuumed the database during testing; I would recommend a "vacuum full"
between each test (unless, of course, you're testing how much a lack
of vacuum hurts performance ;)
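
For example, something along these lines between test runs would reclaim dead
space and refresh the planner statistics (shown for the logs table from the
first post):

------------------------------------------------------------
VACUUM FULL ANALYZE logs;
------------------------------------------------------------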

Hope this helps.


--
Bill Moran
Potential Technologies
http://www.potentialtech.com

Nov 22 '05 #3
Sezai YILMAZ wrote:
create index agentid_ndx on logs using hash (agentid);
create index ownerid_ndx on logs using hash (ownerid);
create index hostid_ndx on logs using hash (hostid);
------------------------------------------------------------
# of EXISTING RECORDS      speed for PostgreSQL 7.3.4   speed for PostgreSQL 7.4.1
=======================    ==========================   ==========================
      0 initial records    1086 rows/s                  1324 rows/s
200.000 initial records     781 rows/s                   893 rows/s
400.000 initial records     576 rows/s                   213 rows/s
600.000 initial records     419 rows/s                   200 rows/s
800.000 initial records     408 rows/s                  not tested (results too poor)


I changed the three hash indexes to btree.

The performance roughly doubled (1905 rows/s with PostgreSQL 7.3.4).

Concurrent inserts now work.

The changed columns actually look like a natural fit for hash indexes: there
is no ordering on them, they are only matched against exact values, which is
exactly what hash indexes are for. But hash indexes seem to have possible
deadlock problems on multiple concurrent inserts. I think I can live with
btree indexes. They work better. :-)
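
For reference, the conversion was presumably something along these lines
(index and column names are taken from the schema in the first post):

------------------------------------------------------------
drop index agentid_ndx;
drop index ownerid_ndx;
drop index hostid_ndx;

create index agentid_ndx on logs using btree (agentid);
create index ownerid_ndx on logs using btree (ownerid);
create index hostid_ndx on logs using btree (hostid);
------------------------------------------------------------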

-sezai


Nov 22 '05 #4

create index agentid_ndx on logs using hash (agentid);
create index ownerid_ndx on logs using hash (ownerid);
create index hostid_ndx on logs using hash (hostid);

What about concurrent inserts (running separate copies of the test program
concurrently) into the same table? It did not work.


Hash indexes have relatively poor concurrency, though I think it should still
work. You probably want to be using btree indexes for everything though,
unless you can actually profile the two and show hash indexes being a big win.

Note that there were bugs in the hash index code at least through most 7.3
versions.

--
greg

Nov 22 '05 #5
Sezai YILMAZ <se**********@pro-g.com.tr> writes:
I changed the three hash indexes to btree.
The performance is increased about 2 times (in PostgreSQL 7.3.4 1905
rows/s).
Concurrent inserts now work.
Concurrent inserts should work with hash indexes in 7.4, though not 7.3.

The slowdown you report probably is due to the rewrite of hash indexing
to allow more concurrency --- the locking algorithm is more complex than
it used to be. I am surprised that the effect is so large though.
Could you make your test program available?
Changed indexes are more suitable for hash type.


Are they? How many distinct values are there in those columns?
I suspect that your test may be stressing the case where only a few hash
buckets are used and each bucket chain gets to be very long.
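
(A query along these lines would show the cardinality of those columns; the
column names are taken from the schema in the first post:)

------------------------------------------------------------
select count(distinct agentid) as agentid_values,
       count(distinct ownerid) as ownerid_values,
       count(distinct hostid)  as hostid_values
from logs;
------------------------------------------------------------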

regards, tom lane


Nov 22 '05 #6
Tom Lane wrote:
Sezai YILMAZ <se**********@pro-g.com.tr> writes:

I changed the three hash indexes to btree.
The performance is increased about 2 times (in PostgreSQL 7.3.4 1905
rows/s).
Concurrent inserts now work.

Concurrent inserts should work with hash indexes in 7.4, though not 7.3.

I have noticed this. I do not get deadlocks with 7.4 on the schema with
hash indexes; 7.4 solves that problem, but it is very slow.
The slowdown you report probably is due to the rewrite of hash indexing
to allow more concurrency --- the locking algorithm is more complex than
it used to be. I am surprised that the effect is so large though.
Could you make your test program available?

The test program and SQL script are attached.

Compile and link scenarios:

without transactions (each insert is its own transaction):
$ gcc -o tester tester.c -lpq

with the default of 400 inserts per transaction block:
$ gcc -DTRANSACTION -o tester tester.c -lpq

with 200 inserts per transaction block:
$ gcc -DTRANSACTION -DINSERTPERTRANSACTION=200 -o tester tester.c -lpq

I run concurrent tests by starting separate tester programs from
different xterm windows.
Changed indexes are more suitable for hash type.


Are they? How many distinct values are there in those columns?
I suspect that your test may be stressing the case where only a few hash
buckets are used and each bucket chain gets to be very long.

The biggest one has 200 distinct values; the others have 5 and 10. More
information is in "tester.c", where the INSERT query string is built.

Regards,

-sezai

Nov 23 '05 #7
On Saturday 28 February 2004 13:59, Sezai YILMAZ wrote:
Tom Lane wrote:
Sezai YILMAZ <se**********@pro-g.com.tr> writes:
I changed the three hash indexes to btree.
The performance is increased about 2 times (in PostgreSQL 7.3.4 1905
rows/s).
Concurrent inserts now work.


Concurrent inserts should work with hash indexes in 7.4, though not 7.3.


I notice this condition. I do not get dead locks with 7.4 on schema with
hash indexes. 7.4 solves this problem but is very slow.
The slowdown you report probably is due to the rewrite of hash indexing
to allow more concurrency --- the locking algorithm is more complex than
it used to be. I am surprised that the effect is so large though.
Could you make your test program available?


The test program and .SQL script is attached

Compile and link scenarios:

without transactions (where each insert is a transaction)
$ gcc -o tester tester.c -lpq

with default 400 inserts per transaction blocks
$ gcc -DTRANSACTION -o tester tester.c -lpq

with 200 inserts per transaction blocks
$ gcc -DTRANSACTION -DINSERTPERTRANSACTION=200 -o tester tester.c -lpq

I do concurrent tests by starting separate tester programs from
different xterm windows.


Some tests on CVS head, in case somebody finds the data interesting. It is a
single-IDE-disk system running Linux 2.6.2, with 512MB RAM and a 2.66GHz
P-IV. The file system is reiserfs.

I pulled CVS head a couple of days back.

Everything default except for shared_buffers = 100 and effective_cache_size =
25000; I got 1980 inserts/sec in a single run.

With checkpoint_segments = 10, I got 1923 inserts/sec.

With two concurrent processes and 10 checkpoint segments, I got 1673 req/sec.

I noted in vmstat that the I/O wasn't being pushed really hard. Blocks out
varied around 1000-5000 per second, though occasionally they would spike to
18000 blocks. I guess that would be a checkpoint going on.

Also, I could not find sort_mem in postgresql.conf. Is work_mem the new name
for it? I recall the discussion to sanitize the name, but not the result of it.

Shridhar


Nov 23 '05 #8
Shridhar Daithankar <sh******@frodo.hserus.net> writes:
Everything default except for shared_buffers = 100 and effective_cache_size = 25000,
100?
and I could not find sort_mem in postgresql.conf. Is work_mem the new name for it?


Yeah.

regards, tom lane


Nov 23 '05 #9
On Saturday 28 February 2004 21:27, Tom Lane wrote:
Shridhar Daithankar <sh******@frodo.hserus.net> writes:
Everything default except for shared_buffers = 100 and effective_cache_size = 25000,


100?


1000.. That was a typo..

Shridhar


Nov 23 '05 #10
