By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
437,691 Members | 2,041 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 437,691 IT Pros & Developers. It's quick & easy.

MVCC for massively parallel inserts

P: n/a
How good is Postgres' performance for massive simultaneous insertions
into the same heavily-indexed table? Are there any studies or
benchmarks I can look at for that?

I understand Postgres uses MVCC rather than conventional locking, which
makes it easier to do parallel inserts. In my environment, I will have
so many inserts that it is unworkable to have one machine do all the
inserting -- it would max out the CPU of even a very powerful machine,
and in any case I'd like to avoid spending that much money. One option
is to use a cluster of commodity Intel machines running Linux or one of
the BSD's. In many database environments, that wouldn't buy me much
because only one machine could do inserts while all the others would be
used for selects. But I'm going to have tons of inserts and few
selects. So I really need many of the machines in the cluster to be
able to simultaneously insert. Is this practicable in a clustered
environment for Postgres?

Thanks in advance for any insight or references,
steve
---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings

Nov 12 '05 #1
Share this Question
Share on Google+
7 Replies


P: n/a
"Steven D.Arnold" <st*****@neosynapse.net> writes:
So I really need many of the machines in the cluster to be
able to simultaneously insert. Is this practicable in a clustered
environment for Postgres?


No.

You are mistaken to think that CPU is the bottleneck, anyway. It is
going to be disk bandwidth that's your problem, and throwing more CPUs
at it would not help. If your insert load exceeds the available disk
bandwidth, I think you have little choice but to figure out a way to
divide the workload among multiple independent databases.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match

Nov 12 '05 #2

P: n/a
You are mistaken to think that CPU is the bottleneck, anyway. It is
going to be disk bandwidth that's your problem, and throwing more CPUs
at it would not help. If your insert load exceeds the available disk
bandwidth, I think you have little choice but to figure out a way to
divide the workload among multiple independent databases.
If you have the money to purchase multiple machines, you can throw more
hard drives at it...

Hard drives are cheap... (well ide/sata anyway).

Sincerely,

Joshua D. Drake

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match

--
Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC and S/JDBC
Postgresql support, programming shared hosting and dedicated hosting.
+1-503-667-4564 - jd@commandprompt.com - http://www.commandprompt.com
Mammoth PostgreSQL Replicator. Integrated Replication for PostgreSQL
---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

Nov 12 '05 #3

P: n/a
Joshua D. Drake wrote:
If you have the money to purchase multiple machines, you can throw more
hard drives at it...

Hard drives are cheap... (well ide/sata anyway).


Properly set up in a RAID-1/0 array, you'll get much better "bandwidth"
out of those drives. Whether you RAID in software or hardware is up to
you and your budget - but if you choose hardware, make sure you actually
get a hardware RAID controller, not one of the cheapies which are just
multiple-IDE-controllers-with-special-drivers.

We use a 3Ware 7500 (can't remember the model name for the life of me),
with 8 drives in a 4x2 RAID 1/0.

IIRC, if the inserts are done in a transaction, the indexing gets done
at the end of the batch rather than after each insert.

Regards
Alex Satrapa
---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to ma*******@postgresql.org so that your
message can get through to the mailing list cleanly

Nov 12 '05 #4

P: n/a
Alex Satrapa <al**@lintelsys.com.au> writes:
Properly set up in a RAID-1/0 array, you'll get much better "bandwidth" out of
those drives. Whether you RAID in software or hardware is up to you and your
budget - but if you choose hardware, make sure you actually get a hardware RAID
controller, not one of the cheapies which are just
multiple-IDE-controllers-with-special-drivers.

We use a 3Ware 7500 (can't remember the model name for the life of me), with 8
drives in a 4x2 RAID 1/0.
I would agree and if you really need the I/O bandwidth you can go to much
larger stripe sets than even this. The documentation I've seen before
suggested there were benefits up to stripe sets as large as twelve disks
across. That would be 24 drives if you're also doing mirroring.

Ideally separating WAL, index, and heap files is good, but you would have to
experiment to see which works out fastest for a given number of drives.

There are also some alternative approaches that could increase your
throughput. For example, you could have your multiple machines receiving the
data log the data to text files. Then you could copy the text files over to
the database periodically and load the with COPY which is faster than a
database insert.

Also, if it fits your model you could load the data into fresh unindexed
tables and then build a new index. Building a new index is a quicker operation
than handling individual inserts. That would make selects more complex though,
but you perhaps that's not a concern.
IIRC, if the inserts are done in a transaction, the indexing gets done at the
end of the batch rather than after each insert.


I believe this is wrong. The whole point of postgres's style of MVCC is that
each transaction can go ahead and do whatever modifications it needs and mark
it with its transaction id, any other transaction simply ignores the data
marked with transaction ids of uncommitted transactions. When commit time
arrives there's very little work to do to do the commit beyond simply marking
the transaction as committed.

--
greg
---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match

Nov 12 '05 #5

P: n/a
We use a 3Ware 7500 (can't remember the model name for the life of me), with 8
drives in a 4x2 RAID 1/0.
I would agree and if you really need the I/O bandwidth you can go to much
larger stripe sets than even this. The documentation I've seen before
suggested there were benefits up to stripe sets as large as twelve disks
across. That would be 24 drives if you're also doing mirroring.


Something I have been toying with is getting two of the 12 drive 3Ware cards
and running RAID 0+1 across them (with LVM). At just under 300 for the cards
and only 80 bucks a drive (80 Gig)... that is alot of space, and a lot
of speed for
not a lot of money.

Sincerely,

Joshua D
Ideally separating WAL, index, and heap files is good, but you would have to
experiment to see which works out fastest for a given number of drives.

--
Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC and S/JDBC
Postgresql support, programming shared hosting and dedicated hosting.
+1-503-667-4564 - jd@commandprompt.com - http://www.commandprompt.com
Mammoth PostgreSQL Replicator. Integrated Replication for PostgreSQL
---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

Nov 12 '05 #6

P: n/a
"Joshua D. Drake" <jd@commandprompt.com> writes:
Something I have been toying with is getting two of the 12 drive 3Ware cards
and running RAID 0+1 across them (with LVM). At just under 300 for the cards
and only 80 bucks a drive (80 Gig)... that is alot of space, and a lot of
speed for not a lot of money.


Unless I have things backwards, raid 0+1 means if any drive fails the whole
side of the mirror fails. if two drives fail you have a 50/50 chance of the
them being on opposite sides of the mirror and losing the whole thing.

Even if you don't have a double failure, resyncing seems like it would be a
pain in this situation. LVM wouldn't know about the stripe set so it would
mean resyncing the entire 12-disk array on the failed side of the mirror.

I thought it was generally preferable to do Raid 1+0 (aka "raid 10") where any
two drives can fail and as long as they aren't precisely opposite each other
you're still ok. And resyncing just means resyncing the one replacement drive,
not the whole array.

--
greg
---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

Nov 12 '05 #7

P: n/a
>>>>> "GS" == Greg Stark <gs*****@mit.edu> writes:

GS> I would agree and if you really need the I/O bandwidth you can go
GS> to much larger stripe sets than even this. The documentation I've
GS> seen before suggested there were benefits up to stripe sets as
GS> large as twelve disks across. That would be 24 drives if you're
GS> also doing mirroring.

I did a bunch of testing with a 14 disk SCSI array. I found that RAID5 was
best over RAID10 and RAID50.

GS> Ideally separating WAL, index, and heap files is good, but you
GS> would have to experiment to see which works out fastest for a
GS> given number of drives.

I found that putting the WAL on its own array (in my case a mirror on
the other RAID controller channel) helped quite a bit. I don't think
it is easy to split off index files to alternate locations with Postgres.

Increasing the number of checkpoint segments was one of the biggest
improvements I observed for mass-insert performance (as tested while
doing a restore on a multi-million row database.)

The combination of having the WAL on a separate disk, and letting that
grow to be quite large has been very good for my performance and also
for reducing disk bandwidth requirements.

--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Vivek Khera, Ph.D. Khera Communications, Inc.
Internet: kh***@kciLink.com Rockville, MD +1-301-869-4449 x806
AIM: vivekkhera Y!: vivek_khera http://www.khera.org/~vivek/

---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org

Nov 12 '05 #8

This discussion thread is closed

Replies have been disabled for this discussion.