473,513 Members | 2,684 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

how not to run out of memory in cursor.execute

I am using cx_Oracle and MySQLdb to pull a lot of data from some tables
and I find that the cursor.execute method uses a lot of memory that
never gets garbage collected. Using fetchmany instead of fetchall does
not seem to make any difference, since it's the execute that uses
memory. Breaking the query down to build lots of small tables doesn't
help, since execute doesn't give its memory back, after reading enough
small tables execute returns a memory error. What is the trick to get
memory back from execute in cx_Oracle and MySQLdb?

Jun 4 '06 #1
8 18717

jo**************@gmail.com wrote:
I am using cx_Oracle and MySQLdb to pull a lot of data from some tables
and I find that the cursor.execute method uses a lot of memory that
never gets garbage collected. Using fetchmany instead of fetchall does
not seem to make any difference, since it's the execute that uses
memory. Breaking the query down to build lots of small tables doesn't
help, since execute doesn't give its memory back, after reading enough
small tables execute returns a memory error. What is the trick to get
memory back from execute in cx_Oracle and MySQLdb?


cx_Oracle and MySQLdb must be handled differently, due to the fact that
MySQL does not actually have cursors (MySQLdb fakes them for you).

To handle large resultsets efficiently in cx_Oracle simply use the
cursor iteration idiom:

for row in cursor:
# do stuff with the row

cx_Oracle takes care of the fetching for you, and your memory usage
should remain fairly constant when using this idiom.

To handle large resultsets in MySQLdb, you have to resort to multiple
queries:

l = 1000
o = 0

cursor.execute('SELECT foo FROM Bar LIMIT %d OFFSET %d', (l, o))
rows = cursor.fetchall()
while len(rows) > 0:
# process the fetched rows
o += l
cursor.execute('SELECT foo FROM Bar LIMIT %d OFFSET %d', (l, o))
rows = cursor.fetchall()

cursor.close()

As you can see, the MySQLdb version is more involved, due to the lack
of real cursor support in the MySQL database. Any database with good
cursor support will likely have good cursor iteration support in the
corresponding DBAPI driver.

Hope this helps,

L. Daniel Burr

Jun 5 '06 #2
amberite wrote:
jo**************@gmail.com wrote:
I am using cx_Oracle and MySQLdb to pull a lot of data from some tables
and I find that the cursor.execute method uses a lot of memory that
never gets garbage collected. Using fetchmany instead of fetchall does
not seem to make any difference, since it's the execute that uses
memory. Breaking the query down to build lots of small tables doesn't
help, since execute doesn't give its memory back, after reading enough
small tables execute returns a memory error. What is the trick to get
memory back from execute in cx_Oracle and MySQLdb?

cx_Oracle and MySQLdb must be handled differently, due to the fact that
MySQL does not actually have cursors (MySQLdb fakes them for you).

To handle large resultsets efficiently in cx_Oracle simply use the
cursor iteration idiom:

for row in cursor:
# do stuff with the row

cx_Oracle takes care of the fetching for you, and your memory usage
should remain fairly constant when using this idiom.

To handle large resultsets in MySQLdb, you have to resort to multiple
queries:

l = 1000
o = 0

cursor.execute('SELECT foo FROM Bar LIMIT %d OFFSET %d', (l, o))
rows = cursor.fetchall()
while len(rows) > 0:
# process the fetched rows
o += l
cursor.execute('SELECT foo FROM Bar LIMIT %d OFFSET %d', (l, o))
rows = cursor.fetchall()

cursor.close()

As you can see, the MySQLdb version is more involved, due to the lack
of real cursor support in the MySQL database. Any database with good
cursor support will likely have good cursor iteration support in the
corresponding DBAPI driver.

Hope this helps,

L. Daniel Burr

The MySQLdb solution you give is way more complicated than it needs to
be, thereby skewing your opinion towards cx_Oracle unnecessarily.

Look up the .fetchmany() method of cursors in the DB API. There is only
any need to execute a single query no matter how large the result set:
you simply need to keep calling .fetchmany(N) (where N is whatever
you've decided by testing is your optimum chunk size) until it returns
less than N rows, at which point you have exhausted the query.

It's very little more effort to wrap this all up as a generator that
effectively allows you to use the same solution as you quote for cx_Oracle.

regards
Steve

--
Steve Holden +44 150 684 7255 +1 800 494 3119
Holden Web LLC/Ltd http://www.holdenweb.com
Love me, love my blog http://holdenweb.blogspot.com
Recent Ramblings http://del.icio.us/steve.holden

Jun 5 '06 #3
On Mon, Jun 05, 2006 at 07:34:05PM +0100, Steve Holden wrote:
amberite wrote:
jo**************@gmail.com wrote:
I am using cx_Oracle and MySQLdb to pull a lot of data from some tables
and I find that the cursor.execute method uses a lot of memory that
never gets garbage collected. Using fetchmany instead of fetchall does
not seem to make any difference, since it's the execute that uses
memory. Breaking the query down to build lots of small tables doesn't
help, since execute doesn't give its memory back, after reading enough
small tables execute returns a memory error. What is the trick to get
memory back from execute in cx_Oracle and MySQLdb?

cx_Oracle and MySQLdb must be handled differently, due to the fact that
MySQL does not actually have cursors (MySQLdb fakes them for you).

To handle large resultsets efficiently in cx_Oracle simply use the
cursor iteration idiom:

for row in cursor:
# do stuff with the row

cx_Oracle takes care of the fetching for you, and your memory usage
should remain fairly constant when using this idiom.

To handle large resultsets in MySQLdb, you have to resort to multiple
queries:

l = 1000
o = 0

cursor.execute('SELECT foo FROM Bar LIMIT %d OFFSET %d', (l, o))
rows = cursor.fetchall()
while len(rows) > 0:
# process the fetched rows
o += l
cursor.execute('SELECT foo FROM Bar LIMIT %d OFFSET %d', (l, o))
rows = cursor.fetchall()

cursor.close()

As you can see, the MySQLdb version is more involved, due to the lack
of real cursor support in the MySQL database. Any database with good
cursor support will likely have good cursor iteration support in the
corresponding DBAPI driver.

Hope this helps,

L. Daniel Burr

The MySQLdb solution you give is way more complicated than it needs to
be, thereby skewing your opinion towards cx_Oracle unnecessarily.

Look up the .fetchmany() method of cursors in the DB API. There is only
any need to execute a single query no matter how large the result set:
you simply need to keep calling .fetchmany(N) (where N is whatever
you've decided by testing is your optimum chunk size) until it returns
less than N rows, at which point you have exhausted the query.

It's very little more effort to wrap this all up as a generator that
effectively allows you to use the same solution as you quote for cx_Oracle.


MySQL will keep table locks until the results are all fetched so even though
the DB API allows fetchone() or fetchmany() using those with MySQLdb is
dangerous.

-Jack

Jun 5 '06 #4

Steve Holden wrote:

The MySQLdb solution you give is way more complicated than it needs to
be, thereby skewing your opinion towards cx_Oracle unnecessarily.

I respectfully disagree with your assertion here. The code I presented
for MySQLdb is what you *have* to do, to avoid using up too much
memory. This thread is about how to avoid running out of memory when
handling large resultsets, and the approach I outlined is really the
only way to do that with MySQLdb's DBAPI support.
Look up the .fetchmany() method of cursors in the DB API. There is only
any need to execute a single query no matter how large the result set:
you simply need to keep calling .fetchmany(N) (where N is whatever
you've decided by testing is your optimum chunk size) until it returns
less than N rows, at which point you have exhausted the query.

..fetchmany() buys you nothing in the case of large resultsets. The
memory usage will continue to climb with each call to fetchmany. This
isn't the fault of MySQLdb, it is just that MySQL doesn't support
cursors, so MySQLdb has to fake it.
It's very little more effort to wrap this all up as a generator that
effectively allows you to use the same solution as you quote for cx_Oracle.


Again, I respectfully disagree. Your proposed solution, while
resulting in shorter code, will continue to eat memory until the entire
resultset has been delivered.

L. Daniel Burr

Jun 6 '06 #5
Jack Diederich <ja**@performancedrivers.com> writes:
[...]
MySQL will keep table locks until the results are all fetched so even though
the DB API allows fetchone() or fetchmany() using those with MySQLdb is
dangerous.

[...]

That's not true of InnoDB tables.
John
Jun 6 '06 #6
whenever you are using a package that leaks memory.
it can be appropriate to use Rpyc (http://rpyc.wikispaces.com/) to run
the leaking code in a different process, and restart it from time to
time.
I've been using this method to avoid the leaks of matplotlib.

Jun 7 '06 #7
>>>>> "yairchu@gmail" == yairchu@gmail com <ya*****@gmail.com> writes:

yairchu@gmail> whenever you are using a package that leaks memory.
yairchu@gmail> it can be appropriate to use Rpyc
yairchu@gmail> (http://rpyc.wikispaces.com/) to run the leaking
yairchu@gmail> code in a different process, and restart it from
yairchu@gmail> time to time. I've been using this method to avoid
yairchu@gmail> the leaks of matplotlib.

The only known leak in matplotlib is in the tkagg backend which we
believe comes from tkinter and is not in matplotlib proper. There are
a variety of ways to make it look like matplotlib is leaking memory,
eg overplotting when you want to first clear the plot, or failing to
close figures properly. We have unit tests to check for leaks, and
they are passing. Perhaps you can post some code which exposes the
problem.

JDH
Jun 7 '06 #8
mjp
jo**************@gmail.com wrote:
I am using cx_Oracle and MySQLdb to pull a lot of data from some tables
and I find that the cursor.execute method uses a lot of memory that
never gets garbage collected. Using fetchmany instead of fetchall does
not seem to make any difference, since it's the execute that uses
memory. [...]
For MySQLdb, the SSCursor class ("Server Side Cursor"), rather than the
default cursor class, may do what you want: retrieve the result set
row-by-row on demand, rather than all at once at the time of
..execute(). You'll need to remember to .fetch...() every row and call
..close(), however.

See the docstrings for CursorUseResultMixIn and CursorStoreResultMixIn
classes in MySQLdb.cursors for more information.
[...] Breaking the query down to build lots of small tables doesn't
help, since execute doesn't give its memory back, after reading enough
small tables execute returns a memory error. What is the trick to get
memory back from execute in cx_Oracle and MySQLdb?


Generally, the trick is to avoid consuming the memory in the first
place. :)

Regards,
Mike

Jun 11 '06 #9

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

3
8624
by: Michel Combe | last post by:
Hi all, I'm writing a program that will read an ASCII file periodically and update several tables in a MySQL database. My question is "Can I use the same cursor for several SQL requests (SELECT...
7
10532
by: Brian Kelley | last post by:
I am trying to use threads and mysqldb to retrieve data from multiple asynchronous queries. My basic strategy is as follows, create two cursors, attach them to the appropriate databases and then...
2
5033
by: Tim Williams | last post by:
I'm trying to write a simple python program to access a MySQL database. I'm having a problem with using MySQLdb to get the results of a SQL command in a cursor. Sometimes the cursor.execute works,...
2
1980
by: TA | last post by:
Hi, This might be a silly question, but I was wondering how you would navigate backwards in a PostgreSQL cursor when the Python DB-API 2.0 allows records to be fetched in a forward-only manner?...
7
8637
by: William Gill | last post by:
I have been trying to pass parameters as indicated in the api. when I use: sql= 'select * from %s where cusid = %s ' % name,recID) Cursor.execute(sql) it works fine, but when I try : sql=...
4
6823
by: Rachel McConnell | last post by:
Hello, I have a Java web application using Hibernate to connect to a PostgreSQL backend. I am seeing the below stack trace during processing of a set of data consisting of around 1000 objects;...
10
10862
by: technocrat | last post by:
Hi, I am trying to declare and cursor and thn load from that cursor into another table. Since I have almost 4 million records, I cant do it without the cursor which reduces the time by almost...
7
6637
by: ChaosKCW | last post by:
Hi I am trying to use pymssql, and have an issue where by the execute (not the fetch) is appearing to load all records into memory. if I execute con = pymssql.connect(...) cur =...
1
6738
by: vituko | last post by:
plpgsql (postgresql 8.3 but I can upgrade) I can open a cursor with a dynamic query (table / column variable) : -open cursor for execute '...' ; But if I want do updates... - execute 'update...
0
7257
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
0
7157
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
1
7098
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows...
0
7521
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each...
0
5682
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing,...
1
5084
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new...
0
1591
by: 6302768590 | last post by:
Hai team i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated ...
1
798
muto222
by: muto222 | last post by:
How can i add a mobile payment intergratation into php mysql website.
0
455
bsmnconsultancy
by: bsmnconsultancy | last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.