Bytes IT Community

Performance question

I'm trying to convince another open-source project (phpOpenTracker) to
modify their current INSERT SQL queries. Currently they just do an
INSERT into a table without first checking whether there might be a record
with the same primary key.

The reason for this is that they need fast inserts, and most users, I
assume, are using MySQL, which silently drops INSERT queries that violate
primary key constraints. Postgres, on the other hand (and rightly so),
issues and logs an error.

I have suggested that their current INSERT INTO t VALUES (...) be changed to:

INSERT INTO t
SELECT 'v1', 'v2'
WHERE NOT EXISTS (
    SELECT NULL FROM t WHERE pk = 'v1'
);

However, one of the developers is worried that this would cause a
performance drop for MySQL users because of the extra SELECT in my
version of the INSERT query.

I had thought that the 'extra' SELECT isn't really extra at all, since
*any* DB still has to check whether there is already a record with the
primary key we are trying to insert. So whereas in my query the SELECT
is explicit, in the regular version of a simple INSERT the check is
still there but implicit, because a primary key is defined on the
table. So there really shouldn't be much, if any, of a performance drop.
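For what it's worth, here is a minimal sketch of the pattern in Python
against SQLite (just a stand-in database for illustration; the table name
and values follow the query above, the column names are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (pk TEXT PRIMARY KEY, v TEXT)")

def insert_if_absent(pk, v):
    # The INSERT only fires when the sub-SELECT finds no row with
    # the same primary key, so a duplicate is skipped without
    # raising (or logging) an error.
    conn.execute(
        "INSERT INTO t "
        "SELECT ?, ? WHERE NOT EXISTS "
        "(SELECT NULL FROM t WHERE pk = ?)",
        (pk, v, pk),
    )
    conn.commit()

insert_if_absent("v1", "v2")   # inserted
insert_if_absent("v1", "zzz")  # skipped: pk 'v1' already exists
rows = conn.execute("SELECT pk, v FROM t").fetchall()
print(rows)  # [('v1', 'v2')]
```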

Is there any truth to my logic in the previous paragraph? :)

I'd like to test my theory on a real MySQL database, but I don't have
access to one right now, and I am not sure how to go about benchmarking
a MySQL db or even what kind of testing to do. If I can get a spare
machine I will give it a go, though.

Thanks,

Jean-Christian Imbeault
---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org

Nov 11 '05 #1
8 Replies

Tom Lane wrote:

That doesn't really buy anything in safety terms: if two backends
execute this sort of command concurrently, it's perfectly likely
that both sub-SELECTS will find no row matching 'v1', and so they'll
both try the INSERT anyway.


I wasn't looking for safety so much as avoiding the insert if possible.
I've just finished a long thread about this, and the consensus was that
this query:

1. will always insert if there is no record with the same primary key;
2. is concurrency-safe: even if the query is executed by multiple
backends at the same time, one will always succeed.

The reason I had asked for a query that could check before insert was to
reduce the number of log errors the phpOpenTracker tool generates from
trying to insert duplicate records into a table.

The above would reduce (but not eliminate) the number of attempts to
insert duplicates, and as such reduce the number of log error entries.

At least that's what I am hoping for ...

Jean-Christian Imbeault

Nov 11 '05 #2

Jean-Christian Imbeault <jc@mega-bucks.co.jp> writes:
I wasn't looking for safety so much as avoiding the insert if possible.
...
The above would reduce (but not eliminate) the number of attempts to
insert duplicates and as such reduce the number of log error entries.


Fair enough, it would do that. But you still need the surrounding retry
logic to cope when a collision does happen.
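That surrounding logic is roughly this shape (a Python/SQLite sketch,
not Postgres-specific; a real client would catch the driver's
unique-violation error, and might generate a new key rather than give up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (pk TEXT PRIMARY KEY, v TEXT)")

def insert_once(pk, v):
    # Just try the INSERT; if another backend won the race, the
    # unique index raises an error, and we roll back and report
    # the collision instead of letting it propagate.
    try:
        conn.execute("INSERT INTO t VALUES (?, ?)", (pk, v))
        conn.commit()
        return True
    except sqlite3.IntegrityError:
        conn.rollback()
        return False

print(insert_once("v1", "a"))  # True: first insert wins
print(insert_once("v1", "b"))  # False: duplicate, rolled back
```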

regards, tom lane


Nov 11 '05 #3

Tom Lane wrote:

Fair enough, it would do that. But you still need the surrounding retry
logic to cope when a collision does happen.


Why retry logic? I thought that if there was a collision, one of the
insert attempts would necessarily succeed and all the others would fail.
So no need for a retry, since the record (pk) was inserted.

What I am trying to duplicate is this: if an insert into a primary key
field is a duplicate, it fails silently. If it isn't, or if there are
multiple simultaneous attempts to insert the *same* new primary key, the
new pk is inserted.
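In other words, semantics like the following sketch, which uses SQLite's
INSERT OR IGNORE as a stand-in for MySQL's behavior (toy table, made-up
values):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (pk TEXT PRIMARY KEY)")

# The duplicate is dropped silently: no exception, no log entry,
# and the first row wins.
conn.execute("INSERT OR IGNORE INTO t VALUES ('v1')")
conn.execute("INSERT OR IGNORE INTO t VALUES ('v1')")  # ignored
count = conn.execute("SELECT count(*) FROM t").fetchone()[0]
print(count)  # 1
```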

The suggested method does guarantee an insert even when there is a
collision, right? If not, I need to find a new solution :(

Jean-Christian Imbeault



Nov 11 '05 #4

Jean-Christian Imbeault <jc@mega-bucks.co.jp> writes:
Tom Lane wrote:
Fair enough, it would do that. But you still need the surrounding retry
logic to cope when a collision does happen.
Why retry logic? I thought that if there was a collision, one of the
insert attempts would necessarily succeed and all the others would fail.
So no need for a retry, since the record (pk) was inserted.


Well, if that's how your application logic works then you're done.
I was envisioning a case where you'd prefer to generate a new key
value and try the insert again.

regards, tom lane


Nov 11 '05 #5

On Wed, 2 Jul 2003, Tom Lane wrote:
Jean-Christian Imbeault <jc@mega-bucks.co.jp> writes:
I have suggested that their current INSERT INTO t VALUES() be changed to:
INSERT INTO t
SELECT 'v1', 'v2'
WHERE NOT EXISTS (
    SELECT NULL FROM t WHERE pk = 'v1'
)


That doesn't really buy anything in safety terms: if two backends
execute this sort of command concurrently, it's perfectly likely
that both sub-SELECTS will find no row matching 'v1', and so they'll
both try the INSERT anyway.

IMO the best way to do this (assuming that you have a unique index
defined on the primary key column) is to just go ahead and try the
INSERT, but be prepared to roll back your transaction and retry
if you get a failure.

You might find it useful to read the slides from my talk at last
year's O'Reilly conference about this and related concurrency
problems:
http://conferences.oreillynet.com/cs...ew/e_sess/2681


I'd like to see all the presentations in one collection. It'd be a nice
addition to the documentation.



Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: ol**@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83


Nov 11 '05 #6

Oleg Bartunov <ol**@sai.msu.su> writes:
On Wed, 2 Jul 2003, Tom Lane wrote:
You might find it useful to read the slides from my talk at last
year's O'Reilly conference about this and related concurrency
problems:
http://conferences.oreillynet.com/cs...ew/e_sess/2681
I'd like to see all the presentations in one collection. It'd be a nice
addition to the documentation.


Yes. Last year I asked Vince to put those slides up somewhere on the
postgresql.org website, but he never got around to it (I think he got
stuck wondering where they should go). Bruce has materials for several
different talks he's given that should be there somewhere, too.

Perhaps someone in the new webmastering crew would like to give the
idea some thought. Or would techdocs be the right place to go?

regards, tom lane


Nov 11 '05 #7

Tom Lane wrote:
Yes. Last year I asked Vince to put those slides up somewhere on the
postgresql.org website, but he never got around to it (I think he got
stuck wondering where they should go). Bruce has materials for several
different talks he's given that should be there somewhere, too.

Perhaps someone in the new webmastering crew would like to give the
idea some thought. Or would techdocs be the right place to go?


I've always thought what the PHP group does with presentation materials
is nice (simple but sufficient). See:

http://conf.php.net/

Could we have a conf.postgresql.org with links from the home page?

Joe

Nov 11 '05 #8

Tom Lane wrote:
Oleg Bartunov <ol**@sai.msu.su> writes:
On Wed, 2 Jul 2003, Tom Lane wrote:
You might find it useful to read the slides from my talk at last
year's O'Reilly conference about this and related concurrency
problems:
http://conferences.oreillynet.com/cs...ew/e_sess/2681

I'd like to see all the presentations in one collection. It'd be a nice
addition to the documentation.


Yes. Last year I asked Vince to put those slides up somewhere on the
postgresql.org website, but he never got around to it (I think he got
stuck wondering where they should go). Bruce has materials for several
different talks he's given that should be there somewhere, too.

Perhaps someone in the new webmastering crew would like to give the
idea some thought. Or would techdocs be the right place to go?


All my stuff is on my home page under Writings/Computer.

--
Bruce Momjian | http://candle.pha.pa.us
pg***@candle.pha.pa.us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073


Nov 11 '05 #9
