MVCC for massively parallel inserts

How good is Postgres' performance for massive simultaneous insertions
into the same heavily-indexed table? Are there any studies or
benchmarks I can look at for that?

I understand Postgres uses MVCC rather than conventional locking, which
makes it easier to do parallel inserts. In my environment, I will have
so many inserts that it is unworkable to have one machine do all the
inserting -- it would max out the CPU of even a very powerful machine,
and in any case I'd like to avoid spending that much money. One option
is to use a cluster of commodity Intel machines running Linux or one of
the BSDs. In many database environments, that wouldn't buy me much
because only one machine could do inserts while all the others would be
used for selects. But I'm going to have tons of inserts and few
selects. So I really need many of the machines in the cluster to be
able to simultaneously insert. Is this practicable in a clustered
environment for Postgres?

Thanks in advance for any insight or references,
steve

Nov 12 '05 #1
"Steven D.Arnold" <st*****@neosynapse.net> writes:
> So I really need many of the machines in the cluster to be
> able to simultaneously insert. Is this practicable in a clustered
> environment for Postgres?


No.

You are mistaken to think that CPU is the bottleneck, anyway. It is
going to be disk bandwidth that's your problem, and throwing more CPUs
at it would not help. If your insert load exceeds the available disk
bandwidth, I think you have little choice but to figure out a way to
divide the workload among multiple independent databases.

regards, tom lane


Nov 12 '05 #2
> You are mistaken to think that CPU is the bottleneck, anyway. It is
> going to be disk bandwidth that's your problem, and throwing more CPUs
> at it would not help. If your insert load exceeds the available disk
> bandwidth, I think you have little choice but to figure out a way to
> divide the workload among multiple independent databases.

If you have the money to purchase multiple machines, you can throw more
hard drives at it...

Hard drives are cheap... (well, IDE/SATA anyway).

Sincerely,

Joshua D. Drake


--
Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC and S/JDBC
Postgresql support, programming shared hosting and dedicated hosting.
+1-503-667-4564 - jd@commandprompt.com - http://www.commandprompt.com
Mammoth PostgreSQL Replicator. Integrated Replication for PostgreSQL

Nov 12 '05 #3
Joshua D. Drake wrote:
> If you have the money to purchase multiple machines, you can throw more
> hard drives at it...
>
> Hard drives are cheap... (well, IDE/SATA anyway).


Properly set up in a RAID-1/0 array, you'll get much better "bandwidth"
out of those drives. Whether you RAID in software or hardware is up to
you and your budget - but if you choose hardware, make sure you actually
get a hardware RAID controller, not one of the cheapies which are just
multiple-IDE-controllers-with-special-drivers.

We use a 3Ware 7500 (can't remember the model name for the life of me),
with 8 drives in a 4x2 RAID 1/0.

IIRC, if the inserts are done in a transaction, the indexing gets done
at the end of the batch rather than after each insert.
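
Whatever the precise index mechanics (see the correction in a later
reply), wrapping a batch of inserts in one transaction does amortize the
per-commit WAL flush. A minimal sketch, with a hypothetical table:

    -- One transaction for the whole batch means one commit (and one
    -- WAL flush) instead of one per row.
    BEGIN;
    INSERT INTO events (ts, payload) VALUES (now(), 'a');
    INSERT INTO events (ts, payload) VALUES (now(), 'b');
    -- ... many more rows ...
    COMMIT;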

Regards
Alex Satrapa

Nov 12 '05 #4
Alex Satrapa <al**@lintelsys.com.au> writes:
> Properly set up in a RAID-1/0 array, you'll get much better "bandwidth" out of
> those drives. Whether you RAID in software or hardware is up to you and your
> budget - but if you choose hardware, make sure you actually get a hardware RAID
> controller, not one of the cheapies which are just
> multiple-IDE-controllers-with-special-drivers.
>
> We use a 3Ware 7500 (can't remember the model name for the life of me), with 8
> drives in a 4x2 RAID 1/0.

I would agree, and if you really need the I/O bandwidth you can go to much
larger stripe sets than even this. The documentation I've seen before
suggested there were benefits up to stripe sets as large as twelve disks
across. That would be 24 drives if you're also doing mirroring.

Ideally separating WAL, index, and heap files is good, but you would have to
experiment to see which works out fastest for a given number of drives.
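
As a hedged aside: in PostgreSQL 8.0 and later, tablespaces make this
kind of separation straightforward for heap and index files (the path and
names below are invented); WAL is usually moved by symlinking pg_xlog to
another volume.

    -- Requires PostgreSQL 8.0+; '/mnt/disk2/pg_idx' is a hypothetical mount.
    CREATE TABLESPACE idx_disk LOCATION '/mnt/disk2/pg_idx';
    -- Place a heavily-written index on its own spindles:
    CREATE INDEX events_ts_idx ON events (ts) TABLESPACE idx_disk;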

There are also some alternative approaches that could increase your
throughput. For example, you could have the multiple machines receiving
the data log it to text files. Then you could copy the text files over to
the database periodically and load them with COPY, which is faster than
individual inserts.
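
A minimal sketch of that load step (path, table, and columns are
hypothetical; COPY's default text format expects tab-separated values):

    -- Bulk-load one batch file produced by a front-end machine.
    -- COPY parses and inserts rows server-side, far faster than
    -- issuing row-by-row INSERT statements over the wire.
    COPY events (ts, payload) FROM '/var/spool/batches/batch-0001.txt';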

Also, if it fits your model, you could load the data into fresh unindexed
tables and then build the index afterward, as sketched below. Building an
index in a single pass is quicker than maintaining it across individual
inserts. That would make selects more complex, but perhaps that's not a
concern.
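
A sketch of that pattern, with invented names:

    -- Load into a fresh, index-free table, then index it in one pass.
    CREATE TABLE events_batch (ts timestamptz, payload text);
    COPY events_batch FROM '/var/spool/batches/batch-0001.txt';
    CREATE INDEX events_batch_ts_idx ON events_batch (ts);
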
> IIRC, if the inserts are done in a transaction, the indexing gets done at the
> end of the batch rather than after each insert.


I believe this is wrong. The whole point of Postgres's style of MVCC is that
each transaction can go ahead and make whatever modifications it needs,
marking them with its transaction id; other transactions simply ignore data
marked with the ids of uncommitted transactions. When commit time arrives
there's very little work to do beyond simply marking the transaction as
committed.
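
You can see this bookkeeping through the hidden xmin system column, which
records the inserting transaction's id (table name hypothetical):

    BEGIN;
    INSERT INTO events (payload) VALUES ('a');
    -- Each new row is stamped with this transaction's id:
    SELECT xmin, payload FROM events;
    COMMIT;  -- commit just flips the transaction's status; rows aren't rewritten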

--
greg

Nov 12 '05 #5
>> We use a 3Ware 7500 (can't remember the model name for the life of me), with 8
>> drives in a 4x2 RAID 1/0.

> I would agree, and if you really need the I/O bandwidth you can go to much
> larger stripe sets than even this. The documentation I've seen before
> suggested there were benefits up to stripe sets as large as twelve disks
> across. That would be 24 drives if you're also doing mirroring.


Something I have been toying with is getting two of the 12-drive 3Ware cards
and running RAID 0+1 across them (with LVM). At just under $300 for the cards
and only $80 a drive (80 GB)... that is a lot of space, and a lot of speed,
for not a lot of money.

Sincerely,

Joshua D. Drake


Nov 12 '05 #6
"Joshua D. Drake" <jd@commandprompt.com> writes:
> Something I have been toying with is getting two of the 12-drive 3Ware cards
> and running RAID 0+1 across them (with LVM). At just under $300 for the cards
> and only $80 a drive (80 GB)... that is a lot of space, and a lot of speed,
> for not a lot of money.


Unless I have things backwards, RAID 0+1 means that if any drive fails, the
whole side of the mirror fails. If two drives fail, you have a 50/50 chance of
them being on opposite sides of the mirror and losing the whole thing.

Even if you don't have a double failure, resyncing seems like it would be a
pain in this situation. LVM wouldn't know about the stripe set so it would
mean resyncing the entire 12-disk array on the failed side of the mirror.

I thought it was generally preferable to do RAID 1+0 (aka "RAID 10"), where any
two drives can fail and as long as they aren't precisely opposite each other
you're still ok. And resyncing just means resyncing the one replacement drive,
not the whole array.

--
greg

Nov 12 '05 #7
>>>>> "GS" == Greg Stark <gs*****@mit.edu> writes:

GS> I would agree and if you really need the I/O bandwidth you can go
GS> to much larger stripe sets than even this. The documentation I've
GS> seen before suggested there were benefits up to stripe sets as
GS> large as twelve disks across. That would be 24 drives if you're
GS> also doing mirroring.

I did a bunch of testing with a 14-disk SCSI array. I found that RAID 5 came
out best, ahead of RAID 10 and RAID 50.

GS> Ideally separating WAL, index, and heap files is good, but you
GS> would have to experiment to see which works out fastest for a
GS> given number of drives.

I found that putting the WAL on its own array (in my case a mirror on
the other RAID controller channel) helped quite a bit. I don't think
it is easy to split off index files to alternate locations with Postgres.

Increasing the number of checkpoint segments was one of the biggest
improvements I observed for mass-insert performance (as tested while
doing a restore on a multi-million row database.)
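
As a hedged illustration, that knob lives in postgresql.conf (the number
below is only an example; the right value depends on write volume and how
long you can tolerate crash recovery taking):

    # postgresql.conf -- illustrative value, not a recommendation
    # Default is 3; each segment is 16 MB. More segments space
    # checkpoints out, so a bulk load isn't repeatedly stalled
    # flushing dirty buffers.
    checkpoint_segments = 30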

The combination of having the WAL on a separate disk, and letting that
grow to be quite large has been very good for my performance and also
for reducing disk bandwidth requirements.

--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Vivek Khera, Ph.D. Khera Communications, Inc.
Internet: kh***@kciLink.com Rockville, MD +1-301-869-4449 x806
AIM: vivekkhera Y!: vivek_khera http://www.khera.org/~vivek/


Nov 12 '05 #8
