
A Question About Insertions -- Performance

I am doing some large dataset performance tests with 7.3.4b2 today and I noticed an interesting phenomenon. My shared memory buffers are set at 128MB. Peak postmaster usage appears to be around 90MB.

My test app performs inserts across 4 related tables, each set of 4 inserts representing a single theoretical "device" object. I report how many "devices" I have inserted per second, for example:

[...]
41509 devices inserted, 36/sec
[1 second later]
41544 devices inserted, 35/sec
[...]

(to be clear, 41509 devices inserted equals 166036 actual, related rows in the db)

Performance follows an odd "peak and valley" pattern. It will start out with a high insertion rate (commits are performed after each "device set"), then after a few thousand device sets, performance will drop to 1 device/second for about 5 seconds. Then it will slowly ramp up over the next 10 seconds to /just below/ the previous high water mark. A few thousand inserts later, it will drop to 1 device/second again for 5 seconds, then slowly ramp up to just below the last high water mark.

Ad infinitum.
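
The insert pattern described above would look something like this sketch (Python with psycopg2; the driver, table names, and columns are hypothetical stand-ins, not the actual test app):

import psycopg2

# Hypothetical schema: one "device" spans four related tables.
conn = psycopg2.connect("dbname=devtest")
cur = conn.cursor()

for n in range(50000):
    cur.execute("INSERT INTO device (id, name) VALUES (%s, %s)", (n, "dev-%d" % n))
    cur.execute("INSERT INTO device_line (device_id) VALUES (%s)", (n,))
    cur.execute("INSERT INTO device_profile (device_id) VALUES (%s)", (n,))
    cur.execute("INSERT INTO device_button (device_id) VALUES (%s)", (n,))
    conn.commit()  # one commit per "device set", as in the test above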

I am wondering:

1) What am I seeing here? This is on a 4-processor machine and postmaster has a CPU all to itself, so I ruled out processor contention.

2) Is there more performance tuning I could perform to flatten this out, or is this just completely normal? Postmaster never busts over 100MB out of the 128MB shared memory I've allocated to it, and according to <mumble mumble webpage mumble>, this is just about perfect for shared memory settings (100 to 120% high water mark).

Thanks.

---
Clay
Cisco Systems, Inc.
cl*****@cisco.com
(972) 813-5004


Nov 11 '05 #1
"Clay Luther" <cl*****@cisco. com> writes:
Performance follows an odd "peak and valley" pattern. It will start
out with a high insertion rate (commits are performed after each
"device set"), then after a few thousand device sets, performance will
drop to 1 device/second for about 5 seconds. Then it will slowly ramp
up over the next 10 seconds to /just below/ the previous high water
mark. A few thousand inserts later, it will drop to 1 device/second
again for 5 seconds, then slowly ramp up to just below the last high
water mark.


My best guess is that the dropoffs occur because of background checkpoint
operations, but there's not enough info here to prove it. Four inserts
per second seems horrendously slow in any case.

What are the table schemas (in particular, are there any foreign-key
constraints to check)?

Are you doing any vacuuming in this sequence? If so where?

What's the disk hardware like? Do you have WAL on its own disk drive?

regards, tom lane

PS: pgsql-performance would be a better list for this sort of issue.


Nov 11 '05 #2
"Clay Luther" <cl*****@cisco. com> writes:
Performance follows an odd "peak and valley" pattern. It will start
out with a high insertion rate (commits are performed after each
"device set"), then after a few thousand device sets, performance will
drop to 1 device/second for about 5 seconds. Then it will slowly ramp
up over the next 10 seconds to /just below/ the previous high water
mark. A few thousand inserts later, it will drop to 1 device/second
again for 5 seconds, then slowly ramp up to just below the last high
water mark.


My best guess is that the dropoffs occur because of background checkpoint
operations, but there's not enough info here to prove it. Four inserts
per second seems horrendously slow in any case.

What are the table schemas (in particular, are there any foreign-key
constraints to check)?

Are you doing any vacuuming in this sequence? If so where?

What's the disk hardware like? Do you have WAL on its own disk drive?

regards, tom lane

PS: pgsql-performance would be a better list for this sort of issue.

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to ma*******@postg resql.org so that your
message can get through to the mailing list cleanly

Nov 11 '05 #3
>>>>> "TL" == Tom Lane <tg*@sss.pgh.pa .us> writes:

TL> My best guess is that the dropoffs occur because of background checkpoint
TL> operations, but there's not enough info here to prove it. Four inserts
TL> per second seems horrendously slow in any case.

I'll concur with this diagnosis. I've been doing a bunch of
performance testing with various parameter settings, and the
checkpoint frequency is a big influence. For me, by making the
checkpoints occur as far apart as possible, the overall speed
improvement was incredible. Try bumping the number of
checkpoint_segments in your postgresql.conf file. For my tests I
compared the default 3 with 50 segments.

Check your logs to see if you are checkpointing too frequently.
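
For example, in postgresql.conf (a sketch; 50 is the value from the tests above, and each WAL segment is 16MB, so higher settings cost disk space in pg_xlog):

checkpoint_segments = 50   # default is 3; more segments = less frequent checkpoints
checkpoint_timeout = 300   # seconds; a checkpoint is also forced on this timer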

Another thing that *really* picks up speed is to batch your inserts in
transactions. I just altered an application yesterday that had a loop
like this:

foreach row fetched from table c:
update table a where id=row.id
update table b where id2=row.id2
send notice to id
end

there were several such loops going on for distinct sets of rows in
the same tables.

Changing it to run inside a transaction, committing every 100 times
through the loop, took one large loop from 2.5 hours down to 1 hour,
and another from 2 hours down to 40 minutes.

I had to put in a bunch of additional error checking and rollback
logic, but in the last two years none of those error conditions has
ever triggered, so I think I'm pretty safe even though a transaction
error would mean redoing up to 100 records (i.e., it is unlikely to happen).
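
In code terms, the batching looks roughly like this (a Python/psycopg2 sketch with hypothetical table and column names; the per-row notice and the rollback/retry logic are omitted for brevity):

import psycopg2

conn = psycopg2.connect("dbname=app")
cur = conn.cursor()

cur.execute("SELECT id, id2 FROM c")
rows = cur.fetchall()

BATCH = 100
for i, (id_, id2) in enumerate(rows, 1):
    cur.execute("UPDATE a SET touched = now() WHERE id = %s", (id_,))
    cur.execute("UPDATE b SET touched = now() WHERE id2 = %s", (id2,))
    if i % BATCH == 0:
        conn.commit()  # one commit per 100 rows instead of one per row
conn.commit()  # commit the final partial batch
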
--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Vivek Khera, Ph.D. Khera Communications, Inc.
Internet: kh***@kciLink.com Rockville, MD +1-240-453-8497
AIM: vivekkhera Y!: vivek_khera http://www.khera.org/~vivek/


Nov 11 '05 #3
>>>>> "TL" == Tom Lane <tg*@sss.pgh.pa .us> writes:

TL> My best guess is that the dropoffs occur because of background checkpoint
TL> operations, but there's not enough info here to prove it. Four inserts
TL> per second seems horrendously slow in any case.

I'll concur with this diagnosis. I've been doing a bunch of
performance testing with various parameter settings, and the
checkpoint frequency is a big influence. For me, by making the
checkpoints occur as far apart as possible, the overall speed
improvement was incredible. Try bumping the number of
checkpoint_segm ents in your postgresql.conf file. For my tests I
compared the default 3 with 50 segments.

Check your logs to see if you are checkpointing too frequently.

Another thing that *realy* picks up speed is to batch your inserts in
transactions. I just altered an application yesterday that had a loop
like this:

foreach row fetched from table c:
update table a where id=row.id
update table b where id2=row.id2
send notice to id
end

there were several such loops going on for distinct sets of rows in
the same tables.

changing it so that it was inside a transaction, and every 100 times
thru the loop to do a commit pretty much made the time it took to run
on a large loop from 2.5 hours down to 1 hour, and another that took 2
hours down to 40 minutes.

I had to put in a bunch of additional error checking and rollback
logic, but in the last two years none of those error conditions have
ever triggered so I think I'm pretty safe even with having to redo up
to 100 records on a transaction error (ie, it is unlikely to happen).
--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Vivek Khera, Ph.D. Khera Communications, Inc.
Internet: kh***@kciLink.c om Rockville, MD +1-240-453-8497
AIM: vivekkhera Y!: vivek_khera http://www.khera.org/~vivek/

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faqs/FAQ.html

Nov 11 '05 #5
Vivek Khera wrote:
> >>>>> "TL" == Tom Lane <tg*@sss.pgh.pa.us> writes:
>
> TL> My best guess is that the dropoffs occur because of background checkpoint
> TL> operations, but there's not enough info here to prove it. Four inserts
> TL> per second seems horrendously slow in any case.
>
> I'll concur with this diagnosis. I've been doing a bunch of
> performance testing with various parameter settings, and the
> checkpoint frequency is a big influence. For me, by making the
> checkpoints occur as far apart as possible, the overall speed
> improvement was incredible. Try bumping the number of
> checkpoint_segments in your postgresql.conf file. For my tests I
> compared the default 3 with 50 segments.
>
> Check your logs to see if you are checkpointing too frequently.


That warning message is only in 7.4.

--
Bruce Momjian | http://candle.pha.pa.us
pg***@candle.pha.pa.us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073


Nov 11 '05 #4
>>>>> "BM" == Bruce Momjian <pg***@candle.p ha.pa.us> writes:
Check your logs to see if you are checkpointing too frequently.


BM> That warning message is only in 7.4.

Yes, but the checkpoint activity is still logged. On my 7.2 system,
I'm checkpointing about every 1.5 minutes at peak with 3 checkpoint
segments. I think I can speed it up even more by increasing them.

Nov 11 '05 #5
>>>>> "BM" == Bruce Momjian <pg***@candle.p ha.pa.us> writes:
Check your logs to see if you are checkpointing too frequently.


BM> That warning message is only in 7.4.

Yes, but the checkpoint activity is still logged. On my 7.2 system,
I'm checkpointing about every 1.5 minutes at peak with 3 checkpoint
segments. I think I can speed it up even more by increasing them.
---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings

Nov 11 '05 #9
