473,408 Members | 2,813 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,408 software developers and data experts.

Cursors and Transactions, why?

Why must a cursor be defined in an open transaction? Obviously there's
a good reason, but I can't figure it out. On a high level, what would
be involved in allowing a cursor to outlive the transaction that
created it?

Cursors seem as if they have some nice performance benefits (esp. if
you're not using all rows found), but their usefulness drops
considerably since you must leave a transaction open.

eric
---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to ma*******@postgresql.org

Nov 23 '05 #1
18 3305
On Apr 5, 2004, at 6:44 PM, Joe Conway wrote:
Eric Ridge wrote:
Why must a cursor be defined in an open transaction? Obviously
there's a good reason, but I can't figure it out. On a high level,
what would be involved in allowing a cursor to outlive the
transaction that created it?


Historically I think it was because the memory was released at the end
of the current transaction (i.e. allocations were made in
TopTransactionContext). But as of 7.4, cursors *can* outlive
transactions:
http://www.postgresql.org/docs/curre...l-declare.html

WITH HOLD
WITHOUT HOLD


holy cow! This is fantastic. I had no idea. <short pause> ooh, and
I see FETCH, in 7.4, supports absolute positioning. Must upgrade.

thanks!

eric
---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match

Nov 23 '05 #2
Eric Ridge wrote:
Why must a cursor be defined in an open transaction? Obviously there's
a good reason, but I can't figure it out. On a high level, what would
be involved in allowing a cursor to outlive the transaction that created
it?


Historically I think it was because the memory was released at the end
of the current transaction (i.e. allocations were made in
TopTransactionContext). But as of 7.4, cursors *can* outlive transactions:
http://www.postgresql.org/docs/curre...l-declare.html

WITH HOLD
WITHOUT HOLD

WITH HOLD specifies that the cursor may continue to be used after
the transaction that created it successfully commits. WITHOUT HOLD
specifies that the cursor cannot be used outside of the transaction that
created it. If neither WITHOUT HOLD nor WITH HOLD is specified, WITHOUT
HOLD is the default.

HTH,

Joe
---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faqs/FAQ.html

Nov 23 '05 #3
Eric Ridge wrote:
Why must a cursor be defined in an open transaction? Obviously there's
a good reason, but I can't figure it out. On a high level, what would
be involved in allowing a cursor to outlive the transaction that
created it?
Because the transaction is what protects the rows that build the result
set from being removed by vacuum. In PostgreSQL, a cursor is a running
query executor just sitting in the middle of its operation. If the
underlying query is for example a simple sequential scan, then the
result set is not materialized but every future fetch operation will
read directly from the base table. This would obviously get screwed up
if vacuum would think nobody needs those rows any more.

Cursors seem as if they have some nice performance benefits (esp. if
you're not using all rows found), but their usefulness drops
considerably since you must leave a transaction open.


And now you know why they are so good if you don't use all rows. This
benefit I think goes away if you use Joe Conway's suggestion of WITH HOLD.
Jan

--
#================================================= =====================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================= = Ja******@Yahoo.com #
---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

Nov 23 '05 #4
On 4/6/04 10:54 AM, "Jan Wieck" <Ja******@yahoo.com> wrote:
Cursors seem as if they have some nice performance benefits (esp. if
you're not using all rows found), but their usefulness drops
considerably since you must leave a transaction open.


And now you know why they are so good if you don't use all rows. This
benefit I think goes away if you use Joe Conway's suggestion of WITH HOLD.


I tried using WITH HOLD in the following case (using an ecpg C program):

foreach row in table A
update table B with value from table A
commit once every 10,000 updates
forend

I created a cursor on table A. Without WITH HOLD, obviously I got an error
on the next TABLE A fetch because the COMMIT closed the cursor. I added
'WITH HOLD' to the cursor. On the first COMMIT, the application hung. I
assume the COMMIT would have completed after some period of time, but I
didn't wait that long.

There are 20 million rows in table A and 60 million in table B (one to many
relationship).

Is this hang on COMMIT when using WITH HOLD to be expected? Is there a way
around it? I don't think it's reasonable put the entire 60 million updates
in a single transaction. The kludge solution I implemented was to write out
all the data I needed from table A to a file, then read that file and update
table B.

Wes
---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match

Nov 23 '05 #5
<we****@syntegra.com> writes:
Is this hang on COMMIT when using WITH HOLD to be expected?
Yes. WITH HOLD is not magic, it just makes a materialized copy of the
SELECT result. If you're talking about a multi-million-row result,
it's gonna take awhile.
The kludge solution I implemented was to write out all the data I
needed from table A to a file, then read that file and update table B.


In theory at least, that should not be any faster than a WITH HOLD
cursor, since you're effectively replicating the same functionality
outside the database ...

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to ma*******@postgresql.org

Nov 23 '05 #6
On Apr 6, 2004, at 11:54 AM, Jan Wieck wrote:
Eric Ridge wrote:
Why must a cursor be defined in an open transaction? Obviously
there's a good reason, but I can't figure it out. On a high level,
what would be involved in allowing a cursor to outlive the
transaction that created it?
Because the transaction is what protects the rows that build the
result set from being removed by vacuum. In PostgreSQL, a cursor is a
running query executor just sitting in the middle of its operation.


That's a good thing to know.
If the underlying query is for example a simple sequential scan, then
the result set is not materialized but every future fetch operation
will read directly from the base table. This would obviously get
screwed up if vacuum would think nobody needs those rows any more.


Is vacuum the only thing that would muck with the rows?
Cursors seem as if they have some nice performance benefits (esp. if
you're not using all rows found), but their usefulness drops
considerably since you must leave a transaction open.


And now you know why they are so good if you don't use all rows. This
benefit I think goes away if you use Joe Conway's suggestion of WITH
HOLD.


Okay, so WITH HOLD is actually materializing the entire resultset
(sequential scan or otherwise)? If that's true, you're right, some of
the benefits do go away.

I need to setup a 7.4 test server and play with this some, and figure
out if the benefits are really what I want them to be. I do appreciate
the insight into how cursors work... it helps a lot!

eric
---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings

Nov 23 '05 #7
On 4/6/04 3:55 PM, "Tom Lane" <tg*@sss.pgh.pa.us> wrote:
The kludge solution I implemented was to write out all the data I
needed from table A to a file, then read that file and update table B.


In theory at least, that should not be any faster than a WITH HOLD
cursor, since you're effectively replicating the same functionality
outside the database ...


Except for the "out of memory" thing...

Are you saying that once the first COMMIT completed, all COMMIT's after that
would function at normal speed - only the first one has to save the result
set?

Wes
---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

Nov 23 '05 #8
<we****@syntegra.com> writes:
On 4/6/04 3:55 PM, "Tom Lane" <tg*@sss.pgh.pa.us> wrote:
In theory at least, that should not be any faster than a WITH HOLD
cursor, since you're effectively replicating the same functionality
outside the database ...
Except for the "out of memory" thing...


What "out of memory thing"? The tuplestore code is perfectly capable of
spilling to disk --- in fact the usual performance gripe against it has
to do with spilling too soon, because sort_mem is set too small.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match

Nov 23 '05 #9
Eric Ridge wrote:
On Apr 6, 2004, at 11:54 AM, Jan Wieck wrote:
And now you know why they are so good if you don't use all rows. This
benefit I think goes away if you use Joe Conway's suggestion of WITH
HOLD.


Okay, so WITH HOLD is actually materializing the entire resultset
(sequential scan or otherwise)? If that's true, you're right, some of
the benefits do go away.


Keep in mind that the tuplestore stays in memory as long as it fits
within sort_mem kilobytes. And you can do:

set sort_mem to <some_large_number>;

prior to COMMIT, and then

set sort_mem to default;

after COMMIT, as long as you can afford the memory use. A bit ugly, but
it might come in handy ;-)

Joe

---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

Nov 23 '05 #10
Eric Ridge wrote:
On Apr 6, 2004, at 11:54 AM, Jan Wieck wrote:
If the underlying query is for example a simple sequential scan, then
the result set is not materialized but every future fetch operation
will read directly from the base table. This would obviously get
screwed up if vacuum would think nobody needs those rows any more.
Is vacuum the only thing that would muck with the rows?


Vacuum is the only thing that cares for the dustmites, yes.
I need to setup a 7.4 test server and play with this some, and figure
out if the benefits are really what I want them to be. I do appreciate
the insight into how cursors work... it helps a lot!


Experience and knowledge can only be replaced by more experience and
more knowledge.
Jan

--
#================================================= =====================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================= = Ja******@Yahoo.com #
---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings

Nov 23 '05 #11
On Apr 7, 2004, at 12:43 AM, Joe Conway wrote:
Eric Ridge wrote:
On Apr 6, 2004, at 11:54 AM, Jan Wieck wrote:
And now you know why they are so good if you don't use all rows.
This benefit I think goes away if you use Joe Conway's suggestion of
WITH HOLD.

Okay, so WITH HOLD is actually materializing the entire resultset
(sequential scan or otherwise)? If that's true, you're right, some
of the benefits do go away.


Keep in mind that the tuplestore stays in memory as long as it fits
within sort_mem kilobytes. And you can do:


More good information. Thanks!

Is the tuplestore basically just an array of ItemPointer-s? In mean,
it's not a copy of each entire row, is it?

eric
---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faqs/FAQ.html

Nov 23 '05 #12
On Apr 7, 2004, at 7:51 AM, Jan Wieck wrote:
Eric Ridge wrote:
On Apr 6, 2004, at 11:54 AM, Jan Wieck wrote:
If the underlying query is for example a simple sequential scan,
then the result set is not materialized but every future fetch
operation will read directly from the base table. This would
obviously get screwed up if vacuum would think nobody needs those
rows any more.

Is vacuum the only thing that would muck with the rows?


Vacuum is the only thing that cares for the dustmites, yes.


And WITH HOLD is strong enough to defend against a vacuum, I hope...
I need to setup a 7.4 test server and play with this some, and figure
out if the benefits are really what I want them to be. I do
appreciate the insight into how cursors work... it helps a lot!


Experience and knowledge can only be replaced by more experience and
more knowledge.


Very wise words.

My real problem is that the JDBC drivers (and I assume this is true for
all client interfaces) buffer the results of a SELECT in memory,
because the backend pushes out all the tuples as the response. I'm not
dealing with a large number of rows (only a few thousand), but they've
very wide, and many contain fields with multi-megabyte data. In some
situations, when I've got a lot of open ResultSets, the JVM throws
OutOfMemory errors.

One half-baked thought was to hack the JDBC drivers to have 'em gzip
large resultsets in memory. Wouldn't completely solve the problem, but
would probably help quite a bit. But the better solution is to use
cursors. We're not in a position to upgrade to 7.4 just yet, so we'll
just deal with the OutOfMemory errors until we can.

eric
---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to ma*******@postgresql.org

Nov 23 '05 #13
On 4/6/04 11:09 PM, "Tom Lane" <tg*@sss.pgh.pa.us> wrote:
What "out of memory thing"? The tuplestore code is perfectly capable of
spilling to disk --- in fact the usual performance gripe against it has
to do with spilling too soon, because sort_mem is set too small.


I tried doing a mass update of all rows with a single SQL statement in psql
and after it ran for many hours, I got 'out of memory'. I didn't try that
using C and WITH HOLD. I assumed it ran out of swap space, but was sleeping
at the time.

Wes
---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings

Nov 23 '05 #14


On Wed, 7 Apr 2004, Eric Ridge wrote:

My real problem is that the JDBC drivers (and I assume this is true for
all client interfaces) buffer the results of a SELECT in memory,
because the backend pushes out all the tuples as the response. I'm not
dealing with a large number of rows (only a few thousand), but they've
very wide, and many contain fields with multi-megabyte data. In some
situations, when I've got a lot of open ResultSets, the JVM throws
OutOfMemory errors.


The 7.4 jdbc driver has the ability to use cursors behind the scenes on
queries. This is done by calling Statement.setFetchSize(n) to retrieve n
rows at a time. There are a number of other restrictions: you must be in
a transaction and the ResultSet type must be FORWARD_ONLY. You can use
the 7.4 jdbc driver against a 7.3 server as well so this may provide some
relief.

Kris Jurka
---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to ma*******@postgresql.org so that your
message can get through to the mailing list cleanly

Nov 23 '05 #15
Eric Ridge wrote:
Is the tuplestore basically just an array of ItemPointer-s? In mean,
it's not a copy of each entire row, is it?


Yup, it is copied:

src/backend/utils/sort/tuplestore.c:tuplestore_puttuple()
8<--------------------------------------------------------
/*
* Accept one tuple and append it to the tuplestore.
*
* Note that the input tuple is always copied; the caller need not save
* it.
8<--------------------------------------------------------

Joe
---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend

Nov 23 '05 #16
On Apr 7, 2004, at 2:01 PM, Kris Jurka wrote:
The 7.4 jdbc driver has the ability to use cursors behind the scenes on
queries. This is done by calling Statement.setFetchSize(n) to
retrieve n
rows at a time. There are a number of other restrictions: you must be
in
a transaction and the ResultSet type must be FORWARD_ONLY. You can use
the 7.4 jdbc driver against a 7.3 server as well so this may provide
some
relief.


I remember reading about this long ago. It's not really an option in
my little world because of the open transaction bit. My solution is
eventually going to be to upgrade to 7.4... heh, probably about the
time 7.5 is released. *sigh*

eric
---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings

Nov 23 '05 #17
On Apr 7, 2004, at 7:48 PM, Joe Conway wrote:
Eric Ridge wrote:
Is the tuplestore basically just an array of ItemPointer-s? In mean,
it's not a copy of each entire row, is it?


Yup, it is copied:


wow. I should go back and read the archives to see if this was
discussed already, but I can't help but wonder if there's a way to only
copy pointers to the tuples. I realize VACUUM could screw it up, but
maybe something could be invented (or re-used) to help guard against
that.

eric
---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match

Nov 23 '05 #18
Eric Ridge <eb*@tcdi.com> writes:
wow. I should go back and read the archives to see if this was
discussed already, but I can't help but wonder if there's a way to only
copy pointers to the tuples. I realize VACUUM could screw it up, but
maybe something could be invented (or re-used) to help guard against
that.


Still looking for that free lunch, eh?

If you want to leave the tuples on-disk, then you hold a transaction
open to ensure that VACUUM won't whisk them out from under you. That's
what the normal non-HOLD cursor case will do.

If you don't want to leave the tuples on-disk, you have to copy them
someplace. You can do that with a HOLD cursor.

AFAICS any other solution will simply reinvent one or the other of these
techniques.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

Nov 23 '05 #19

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

11
by: Alban Hertroys | last post by:
Oh no! It's me and transactions again :) I'm not really sure whether this is a limitation of psycopg or postgresql. When I use multiple cursors in a transaction, the records inserted at the...
19
by: Dave Anderson | last post by:
Can anyone point me to a good resource for learning cursors in MSSQL? Thanks Dave
22
by: T.S.Negi | last post by:
Hi All, I want to avoid using cursors and loops in stored procedures. Please suggest alternate solutions with example (if possible). Any suggestion in these regards will be appreciated. ...
5
by: Mark Harrison | last post by:
In the program testlibpq.c, these five SQL statements are executed: res = PQexec(conn, "BEGIN"); res = PQexec(conn, "DECLARE myportal CURSOR FOR select * from pg_database"); res = PQexec(conn,...
5
by: Todd Huish | last post by:
I have noticed something disturbing when retrieving datasets over a relatively slow line (multiple T1). I am looking at about 25 seconds to retrieve 500 rows via a php-odbc link. This same select...
6
by: a | last post by:
Hello, I am doing some multithreading in an MDI app, and I can't seem to get the cursor to stay as an Hourglass. I call: Cursor.Current = cursors.wait at the beginning of my routing, and...
10
by: Just Me | last post by:
Does Me.Cursor.Current=Cursors.WaitCursor set the current property of Me.Cursor to Cursors.WaitCursor And Me.Cursor.Current=Cursors.Default set the Me.Current property to something (default)...
7
by: H. Williams | last post by:
I know the .Net Cursor class doesn't work with color cursors. So I'm currently using the LoadCursorFromFile API with reflection to set color cursors: here is my code: public static extern...
1
by: Dima Kuchin | last post by:
Hello, I was trying to find the information about when and where should I use cursors in DB2, no luck. Maybe you can point me to some article that describes just that (or tell me which page is...
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
1
by: nemocccc | last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...
0
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
0
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
0
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...
0
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each...
0
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing,...
0
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.