472,992 Members | 3,451 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 472,992 software developers and data experts.

DBD::Pg handling duplicates in a transaction

**** Post for FREE via your newsreader at post.usenet.com ****

I am a newbie try to port my applications to Postgres. I have an
application that is bulk loading a table with autocommit off (with it on
it's way to slow). The logic of the application dictates that I try the
insert, and if it fails because of a duplicate, update the record instead.
The entire bulk load is wrapped in a transaction for speed purposes. When I
hit a duplicate I can detect it, but when I try the update it fails with:

ERROR: current transaction is aborted, commands ignored until end of
transaction block

I assume this is a result of the duplicate error, but is there a way I can
get around this. In my case duplicate is fatal to my transaction...

Thanks. Tim


-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
*** Usenet.com - The #1 Usenet Newsgroup Service on The Planet! ***
http://www.usenet.com
Unlimited Download - 19 Seperate Servers - 90,000 groups - Uncensored
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Jul 19 '05 #1
1 2408
After a long battle with technology, "Tim Nelson" <ti***************@softhome.net>, an earthling, wrote:
I am a newbie try to port my applications to Postgres. I have an
application that is bulk loading a table with autocommit off (with
it on it's way to slow). The logic of the application dictates that
I try the insert, and if it fails because of a duplicate, update the
record instead. The entire bulk load is wrapped in a transaction
for speed purposes. When I hit a duplicate I can detect it, but
when I try the update it fails with:

ERROR: current transaction is aborted, commands ignored until end of
transaction block

I assume this is a result of the duplicate error, but is there a way
I can get around this. In my case duplicate is fatal to my
transaction...


There is no ready way of blindly doing this.

Several improvements would seem plausible:

1. Try to do groups of 1000 records, committing every 1K, so that if
there's a failure, it is only expected to roll back on the order
of 500 rows.

2. Along with that approach, keep the last 1K records 'buffered' in
memory, so that if you hit a failure, you can quickly go back and
redo the ones that _were_ good, throwing in a COMMIT as soon as
you complete all the records you _know_ to be good.

3. Alternatively, before doing each row, do a query that checks to
see if the relevant key is already there. That is regrettably
likely to be rather time-consuming, and you'll be firing in
great gobs of little queries.

4. You could create a stored procedure that amounts to "insert or
ignore", and pass the data through that.

Thus, instead of
insert into mytable (f1,f2,f3,f4) values ('this', 'that',
'other', 'foo');

You might do...
select add_to_my_table ('this','that','other', 'foo');

5. You can cut down on the number of statements by grouping these
together...

select atmt ([row 1 data]), atmt([row 2 data]), atmt([row3
data]), ... atmt([row 20 data]).

That allows you to submit 20 records in one statement, which
is fairly likely to save some query submission overhead.
It oughtn't make parsing too much more expensive.

6. Stage the bulk load.

Load the data into a "staging" table, which has minimal (if any)
integrity checks; this would be most efficiently handled via
COPY, which would load it all in one transaction.

Then take that data and insert the relevant bits into the new
location.

Thus:

-- This'll be way fast
copy stage_table from '/tmp/cruddy_data.txt';

-- Maybe do some validation, fix crud, delete crud, and such...
select *; -- Look for crud
update *; -- Fix crud
delete *; -- Delete crud

-- stage_table now contains gleaming fixed data; load into final table...
insert into ultimate_table
select * from stage_table s where
not exists (select * from ultimate_table u where u.pkey = s.pkey);

-- Alternatively, if you had deleted the cruddy data, it might
-- just be:

insert into ultimate_table
select * from stage_table s;

I rather like approach #6, FYI...
--
output = ("cbbrowne" "@" "ntlug.org")
http://www3.sympatico.ca/cbbrowne/lsf.html
"The wrath of Holloway is nothing compared to the wrath of Moon."
-- Fred Drenckhahn
Jul 19 '05 #2

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

0
by: Pablo S | last post by:
Hi there mod_perl/Pg folks, I have 2 systems, one OLD, (linux 2.4 running postgresql-7.2.1-5 with a perl web db on Apache/1.3.23 mod_perl 1.26 and dbi 1.37 with peristent db connections via...
0
by: Piotr B. | last post by:
Hello, I want to make use of a Perl script "ora2pg" (Oracle to PostgreSQL schema converter), which requires the following modules: DBI, DBD::Oracle and DBD::Pg. As I don't use Perl on a...
0
by: Envex Developments | last post by:
Hey guys, I have a need to install the DBD::Pg Perl module on many shared web servers, which do not have PostgreSQL installed. Then the DBD::Pg module will just connect to a remote PostgreSQL...
0
by: Envex Developments | last post by:
Hey guys, I have a need to install the DBD::Pg Perl module on many shared web servers, which do not have PostgreSQL installed. Then the DBD::Pg module will just connect to a remote PostgreSQL...
0
by: Alex | last post by:
Hi, I am using DBD::Pg in some of my scripts. I want to customize the error login based on the error received. While I am happy with the Errstr message I want to take specific actions depending...
10
by: Ausrack Webmaster | last post by:
Hi I am trying to insert a simple email address into a text field, and I get the below error: DBD::Pg::st execute failed: ERROR: pg_atoi: error in "<support@somedomain.com>": can't parse...
11
by: greg | last post by:
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 A new version of DBD::Pg is about to be released (1.32), and we need your help to test it out. If you use DBD::Pg, please download and test the...
4
by: Patrick Hatcher | last post by:
Pg: 7.4.2 I use perl scripts to import data into my db. When errors occurred uploading files in Pg ver 7.3.x, $DBI::errstr used to return a row number from the input file. I could then go to...
2
by: David Siebert | last post by:
Anyone using Activestate Perl and DBD-Pg? I am using perl 5.8.3 ppm does not seem to work. I downloaded the DBD-Pg ..zip file I found through google but ppm could not seem to install that. Any...
2
isladogs
by: isladogs | last post by:
The next Access Europe meeting will be on Wednesday 4 Oct 2023 starting at 18:00 UK time (6PM UTC+1) and finishing at about 19:15 (7.15PM) The start time is equivalent to 19:00 (7PM) in Central...
0
tracyyun
by: tracyyun | last post by:
Hello everyone, I have a question and would like some advice on network connectivity. I have one computer connected to my router via WiFi, but I have two other computers that I want to be able to...
2
by: giovanniandrean | last post by:
The energy model is structured as follows and uses excel sheets to give input data: 1-Utility.py contains all the functions needed to calculate the variables and other minor things (mentions...
3
NeoPa
by: NeoPa | last post by:
Introduction For this article I'll be using a very simple database which has Form (clsForm) & Report (clsReport) classes that simply handle making the calling Form invisible until the Form, or all...
1
by: Teri B | last post by:
Hi, I have created a sub-form Roles. In my course form the user selects the roles assigned to the course. 0ne-to-many. One course many roles. Then I created a report based on the Course form and...
3
by: nia12 | last post by:
Hi there, I am very new to Access so apologies if any of this is obvious/not clear. I am creating a data collection tool for health care employees to complete. It consists of a number of...
0
isladogs
by: isladogs | last post by:
The next online meeting of the Access Europe User Group will be on Wednesday 6 Dec 2023 starting at 18:00 UK time (6PM UTC) and finishing at about 19:15 (7.15PM). In this month's session, Mike...
4
by: GKJR | last post by:
Does anyone have a recommendation to build a standalone application to replace an Access database? I have my bookkeeping software I developed in Access that I would like to make available to other...
3
SueHopson
by: SueHopson | last post by:
Hi All, I'm trying to create a single code (run off a button that calls the Private Sub) for our parts list report that will allow the user to filter by either/both PartVendor and PartType. On...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.