By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
429,422 Members | 1,615 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 429,422 IT Pros & Developers. It's quick & easy.

Duplicate key insert question

P: n/a
I have a table with a primary field and a few other fields. What is the
fastest way to do an insert into that table assuming that sometimes I
might try to insert a record with a duplicate primary key and want that
to fail?

I know that if I try a plain insert this will work, but in the case
where I am trying to insert a duplicate key, the insert fails (as it
should) and an error is logged.

I could first do a check to see if there is already an entry with the
same key as the one I am trying to insert but that would mean quite a
few extra operations.

Is there a quick and clean way of doing something like:

INSERT into table values(...) IF there isn't already a row with pkey=x

Thanks,

Jean-Christian Imbeault

PS The reason I am asking is that an open source project is using MySQL
as their DB and they have a port to PG that isn't very clean b/c the DB
code keeps trying to insert duplicate primary keys. According to them
MySQL doesn't complain and just drops the insert whereas PG (as is
right) complains. I've offered to clean up their PG insertion code but
they say that they don't want too many extra checks as their app writes
to the DB a *lot* and any extra check is going to slow down the
application noticeably ...
---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

Nov 11 '05 #1
Share this Question
Share on Google+
11 Replies


P: n/a
On Tuesday 01 July 2003 08:10 pm, Jean-Christian Imbeault wrote:
I have a table with a primary field and a few other fields. What is the
fastest way to do an insert into that table assuming that sometimes I
might try to insert a record with a duplicate primary key and want that
to fail?

I know that if I try a plain insert this will work, but in the case
where I am trying to insert a duplicate key, the insert fails (as it
should) and an error is logged.

I could first do a check to see if there is already an entry with the
same key as the one I am trying to insert but that would mean quite a
few extra operations.

Is there a quick and clean way of doing something like:

INSERT into table values(...) IF there isn't already a row with pkey=x

Thanks,

Jean-Christian Imbeault


Hi, not sure if this is answering your question, but I just asked similar
questions here. I asked about using INSERT WHERE NOT EXISTS (which you can do
in PostgreSQL). Here is what you can do:

INSERT INTO mytable
SELECT 'value1', 'value2'
WHERE NOT EXISTS
(SELECT NULL FROM mytable
WHERE mycondition)

This will just return 0 when fails, but it does check first. Don't know if you
can really afford that. Just for reference, this brought up some discussion
here. Here is a link to the archive:
http://marc.theaimsgroup.com/?l=post...NOT+EXISTS&q=b

Hope that helps.
RDB
--
Reuben D. Budiardja
Department of Physics and Astronomy
The University of Tennessee, Knoxville, TN
-------------------------------------------------
/"\ ASCII Ribbon Campaign against HTML
\ / email and proprietary format
X attachments.
/ \
-------------------------------------------------
Have you been used by Microsoft today?
Choose your life. Choose freedom.
Choose LINUX.
-------------------------------------------------
---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

Nov 11 '05 #2

P: n/a
On Wed, Jul 02, 2003 at 09:58:28AM +0900, Jean-Christian Imbeault wrote:
Alvaro Herrera wrote:

No, only the "second" one will fail (though it's difficult which one is
the second)


I couldn't get the link to work so I couldn't read why Tom shot it down.
But if Tom shot down this idea down ... then it mustn't be correct.


The thread is here:
http://groups.google.com/groups?hl=e...EXISTS%26ie%3D

The solution is not correct in that there _is_ a race condition.
You should check the returned value from the insertion
function to see if it succeeded or not.


No, what I want if to have one query that will *always* insert if there
is no record with this primary key and *always* do nothing (not fail,
not generate an error) if there is already a record with this primary
key. I don't want to check return values :)


No way.

--
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"No es bueno caminar con un hombre muerto"

---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org

Nov 11 '05 #3

P: n/a
Alvaro Herrera wrote:

No, only the "second" one will fail (though it's difficult which one is
the second)
From:

http://marc.theaimsgroup.com/?l=post...6988915991&w=2

Ian Barwick wrote:

[...]

I proposed that same solution 3 years ago. Tom shoots it down:

[...]

I couldn't get the link to work so I couldn't read why Tom shot it down.
But if Tom shot down this idea down ... then it mustn't be correct.
If I followed all the arguments correctly according to the thread there
is *no* way to do what I (and you ;) want in one simple query.

No, there's not.


You say no, but at first you say that the proposed method works. The
proposed method, if it is correct, is simple enough for me. By simple I
mean all can be done with one query.
You should check the returned value from the insertion
function to see if it succeeded or not.


No, what I want if to have one query that will *always* insert if there
is no record with this primary key and *always* do nothing (not fail,
not generate an error) if there is already a record with this primary
key. I don't want to check return values :)

Thanks,

Jean-Christian Imbeault
---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

Nov 11 '05 #4

P: n/a
On Tuesday 01 July 2003 08:45 pm, Jean-Christian Imbeault wrote:
Reuben D. Budiardja wrote:
Hi, not sure if this is answering your question, but I just asked similar
questions here. I asked about using INSERT WHERE NOT EXISTS (which you
can do in PostgreSQL). Here is what you can do:

INSERT INTO mytable
SELECT 'value1', 'value2'
WHERE NOT EXISTS
(SELECT NULL FROM mytable
WHERE mycondition)

http://marc.theaimsgroup.com/?l=post...s=WHERE+NOT+EX
ISTS&q=b


Thanks for the link!

I read the thread and it looks like even the above solution is not
perfect because of a possible race condition where two inserts trying to
insert a row with a pk not in the table will both get think it is ok to
do so, try it and then both will fail?


No, onlu *one* of them will fail, but yes, the other will then generate error.
So it really is a trade off. Another way would be to lock the table, as other
has suggested. But then there is disadvantages to that also.

RDB
--
Reuben D. Budiardja
Department of Physics and Astronomy
The University of Tennessee, Knoxville, TN
-------------------------------------------------
/"\ ASCII Ribbon Campaign against HTML
\ / email and proprietary format
X attachments.
/ \
-------------------------------------------------
Have you been used by Microsoft today?
Choose your life. Choose freedom.
Choose LINUX.
-------------------------------------------------
---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match

Nov 11 '05 #5

P: n/a
Reuben D. Budiardja wrote:

No, onlu *one* of them will fail, but yes, the other will then generate error.
So it really is a trade off. Another way would be to lock the table, as other
has suggested. But then there is disadvantages to that also.


Really? I just got a post form Alvaro Herrera saying;

"The solution is not correct in that there _is_ a race condition."

Maybe I misunderstood, but "not correct" doesn't sound good :)

Jean-Christian Imbeault
---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to ma*******@postgresql.org)

Nov 11 '05 #6

P: n/a
Alvaro Herrera wrote:

Well, he is right. One will fail, the other will not. The race
condition is for the application. If you want to ignore it, you can do
that, but there _will_ be an ERROR thrown and the transaction will be
aborted.


Ah ... then maybe this solution is 'good enough'. It will still generate
an error message some of the time (when there is a race condition) but
will definitely generate fewer error messages than the current method
used which is just to do the insert and let it fail if there is already
a record with the same primary key.

Thanks for the help!

Jean-Christian Imbeault
---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to ma*******@postgresql.org

Nov 11 '05 #7

P: n/a
On Wed, Jul 02, 2003 at 10:25:54AM +0900, Jean-Christian Imbeault wrote:
Reuben D. Budiardja wrote:

No, onlu *one* of them will fail, but yes, the other will then generate error.
So it really is a trade off. Another way would be to lock the table, as other
has suggested. But then there is disadvantages to that also.


Really? I just got a post form Alvaro Herrera saying;

"The solution is not correct in that there _is_ a race condition."

Maybe I misunderstood, but "not correct" doesn't sound good :)


Well, he is right. One will fail, the other will not. The race
condition is for the application. If you want to ignore it, you can do
that, but there _will_ be an ERROR thrown and the transaction will be
aborted. The other transaction _will_ insert the tuple, though, and it
won't be aborted.

Note that for the race condition to show there has to be a race, i.e.
two backends trying to insert the same primary key at the same time. If
one finishes half a second before the other, they will behave that way
you want, i.e. there will one tuple inserted and no error generated.

--
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"No deja de ser humillante para una persona de ingenio saber
que no hay tonto que no le pueda enseñar algo." (Jean B. Say)

---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to ma*******@postgresql.org)

Nov 11 '05 #8

P: n/a
On Tuesday 01 July 2003 09:25 pm, Jean-Christian Imbeault wrote:
Reuben D. Budiardja wrote:
No, onlu *one* of them will fail, but yes, the other will then generate
error. So it really is a trade off. Another way would be to lock the
table, as other has suggested. But then there is disadvantages to that
also.


Really? I just got a post form Alvaro Herrera saying;

"The solution is not correct in that there _is_ a race condition."

Maybe I misunderstood, but "not correct" doesn't sound good :)

If you want to avoid the race condition as well, then use the locking
mechanism for transaction. Combine it with the previous INSERT ... SELECT ...
WHERE NOT EXISTS, it should give you what you want. I suspect it's slower
though. Eg:

BEGIN WORK;

INSERT INTO mytable
SELECT 'value1', 'value2'
WHERE NOT EXISTS
(SELECT NULL FROM mytable
WHERE mycondition)

COMMIT WORK;

This should solve the Race Condition, since other transaction have to wait.
But if the PK already exists, this will quit without error.

RDB

---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to ma*******@postgresql.org)

Nov 11 '05 #9

P: n/a
On Tuesday 01 July 2003 11:08 pm, Jean-Christian Imbeault wrote:
Reuben D. Budiardja wrote:
INSERT INTO mytable
SELECT 'value1', 'value2'
WHERE NOT EXISTS
(SELECT NULL FROM mytable
WHERE mycondition)


Thank you to everyone who helped out on my question. I am trying to
implement the above solution but I'm having problems getting this to
work when I want to insert more than one value:

TAL=# create table b (a text primary key, b text);
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index 'b_pkey'
for table 'b'
CREATE TABLE
TAL=# insert into b select 'a';
INSERT 335311 1
TAL=# insert into b select 'b', select 'b';
ERROR: parser: parse error at or near "select" at character 27


I don't see what you're trying to do. Why do you have two select ?

RDB
---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings

Nov 11 '05 #10

P: n/a
On Wed, Jul 02, 2003 at 12:08:56 +0900,
Jean-Christian Imbeault <jc@mega-bucks.co.jp> wrote:
TAL=# insert into b select 'b', select 'b';
ERROR: parser: parse error at or near "select" at character 27


You probably want:
insert into b select 'b', 'b';

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faqs/FAQ.html

Nov 11 '05 #11

P: n/a
On Wed, 2 Jul 2003, Jean-Christian Imbeault wrote:
I have a table with a primary field and a few other fields. What is the
fastest way to do an insert into that table assuming that sometimes I
might try to insert a record with a duplicate primary key and want that
to fail?

I know that if I try a plain insert this will work, but in the case
where I am trying to insert a duplicate key, the insert fails (as it
should) and an error is logged.

I could first do a check to see if there is already an entry with the
same key as the one I am trying to insert but that would mean quite a
few extra operations.

Is there a quick and clean way of doing something like:

INSERT into table values(...) IF there isn't already a row with pkey=x

Thanks,

Jean-Christian Imbeault

PS The reason I am asking is that an open source project is using MySQL
as their DB and they have a port to PG that isn't very clean b/c the DB
code keeps trying to insert duplicate primary keys. According to them
MySQL doesn't complain and just drops the insert whereas PG (as is
right) complains. I've offered to clean up their PG insertion code but
they say that they don't want too many extra checks as their app writes
to the DB a *lot* and any extra check is going to slow down the
application noticeably ...


Skipping ahead without reading all the messages in this thread (got to rush
out) what about using a before insert trigger, doing the check in there,
returning null if the insert would fail and see if they complain about the slow
down :)

Of course it's still got the race condition for the application unless you also
lock the table and it'll screw up any use of currval(sequence) afterwards that
expects to get the id of the row inserted with a id obtained from
nextval(sequence)

Nigel Andrews

---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to ma*******@postgresql.org)

Nov 11 '05 #12

This discussion thread is closed

Replies have been disabled for this discussion.