Bytes | Software Development & Data Engineering Community

how not to run out of memory in cursor.execute

I am using cx_Oracle and MySQLdb to pull a lot of data from some tables,
and I find that the cursor.execute method uses a lot of memory that
never gets garbage collected. Using fetchmany instead of fetchall does
not seem to make any difference, since it is the execute call itself
that uses the memory. Breaking the query down to build lots of small
tables doesn't help either: because execute never gives its memory back,
after reading enough small tables it raises a memory error. What is the
trick to getting memory back from execute in cx_Oracle and MySQLdb?

Jun 4 '06 #1

jo*************@gmail.com wrote:
I am using cx_Oracle and MySQLdb to pull a lot of data from some tables
and I find that the cursor.execute method uses a lot of memory that
never gets garbage collected. [...]


cx_Oracle and MySQLdb must be handled differently, due to the fact that
MySQL does not actually have cursors (MySQLdb fakes them for you).

To handle large resultsets efficiently in cx_Oracle, simply use the
cursor iteration idiom:

for row in cursor:
    # do stuff with the row

cx_Oracle takes care of the fetching for you, and your memory usage
should remain fairly constant when using this idiom.
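The number of rows cx_Oracle fetches per round trip during that iteration is controlled by the cursor's arraysize attribute. A minimal sketch of the idiom with the chunk size tuned explicitly; the connection string, table, and column names are placeholders, not from the original post, and this obviously needs a live Oracle database to run:

```python
import cx_Oracle

# Placeholder credentials/DSN.
conn = cx_Oracle.connect('user/password@tnsname')
cursor = conn.cursor()
cursor.arraysize = 500                # rows fetched per round trip
cursor.execute('SELECT foo FROM Bar')
for row in cursor:                    # driver refills its buffer as needed
    pass                              # do stuff with the row
cursor.close()
conn.close()
```

Memory use stays roughly proportional to arraysize rather than to the full result set.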

To handle large resultsets in MySQLdb, you have to resort to multiple
queries:

limit = 1000
offset = 0

while True:
    cursor.execute('SELECT foo FROM Bar LIMIT %s OFFSET %s', (limit, offset))
    rows = cursor.fetchall()
    if not rows:
        break
    # process the fetched rows
    offset += limit

cursor.close()

As you can see, the MySQLdb version is more involved, due to the lack
of real cursor support in the MySQL database. Any database with good
cursor support will likely have good cursor iteration support in the
corresponding DBAPI driver.

Hope this helps,

L. Daniel Burr

Jun 5 '06 #2
amberite wrote:
jo*************@gmail.com wrote:
[...]

cx_Oracle and MySQLdb must be handled differently, due to the fact that
MySQL does not actually have cursors (MySQLdb fakes them for you).
[...]
To handle large resultsets in MySQLdb, you have to resort to multiple
queries:

[...]

As you can see, the MySQLdb version is more involved, due to the lack
of real cursor support in the MySQL database. Any database with good
cursor support will likely have good cursor iteration support in the
corresponding DBAPI driver.

Hope this helps,

L. Daniel Burr

The MySQLdb solution you give is way more complicated than it needs to
be, thereby skewing your opinion towards cx_Oracle unnecessarily.

Look up the .fetchmany() method of cursors in the DB API. You only ever
need to execute a single query, no matter how large the result set:
simply keep calling .fetchmany(N) (where N is whatever you've decided
by testing is your optimum chunk size) until it returns fewer than N
rows, at which point you have exhausted the query.

It's very little more effort to wrap this all up as a generator that
effectively allows you to use the same solution as you quote for cx_Oracle.
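That generator can be sketched as a generic DB API helper; the name result_iter and the default chunk size here are illustrative choices, not code from the thread:

```python
def result_iter(cursor, arraysize=1000):
    """Yield rows one at a time, pulling them from the cursor in
    chunks of `arraysize` via the DB API's fetchmany(), so that the
    full result set is never held in a single Python list."""
    while True:
        rows = cursor.fetchmany(arraysize)
        if not rows:        # empty sequence means the query is exhausted
            break
        for row in rows:
            yield row
```

With this in place, `for row in result_iter(cursor):` reads the same as the cx_Oracle idiom quoted above. (As the next replies point out, whether it actually bounds memory depends on the driver: with MySQLdb's default cursor the rows have already been fetched client-side.)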

regards
Steve

--
Steve Holden +44 150 684 7255 +1 800 494 3119
Holden Web LLC/Ltd http://www.holdenweb.com
Love me, love my blog http://holdenweb.blogspot.com
Recent Ramblings http://del.icio.us/steve.holden

Jun 5 '06 #3
On Mon, Jun 05, 2006 at 07:34:05PM +0100, Steve Holden wrote:
amberite wrote:
[...]

The MySQLdb solution you give is way more complicated than it needs to
be, thereby skewing your opinion towards cx_Oracle unnecessarily.

Look up the .fetchmany() method of cursors in the DB API. You only ever
need to execute a single query, no matter how large the result set:
simply keep calling .fetchmany(N) (where N is whatever you've decided
by testing is your optimum chunk size) until it returns fewer than N
rows, at which point you have exhausted the query.

It's very little more effort to wrap this all up as a generator that
effectively allows you to use the same solution as you quote for cx_Oracle.


MySQL will keep table locks until the results are all fetched, so even
though the DB API allows fetchone() and fetchmany(), using them with
MySQLdb is dangerous.

-Jack

Jun 5 '06 #4

Steve Holden wrote:

The MySQLdb solution you give is way more complicated than it needs to
be, thereby skewing your opinion towards cx_Oracle unnecessarily.

I respectfully disagree with your assertion here. The code I presented
for MySQLdb is what you *have* to do, to avoid using up too much
memory. This thread is about how to avoid running out of memory when
handling large resultsets, and the approach I outlined is really the
only way to do that with MySQLdb's DBAPI support.
Look up the .fetchmany() method of cursors in the DB API. You only ever
need to execute a single query, no matter how large the result set:
simply keep calling .fetchmany(N) (where N is whatever you've decided
by testing is your optimum chunk size) until it returns fewer than N
rows, at which point you have exhausted the query.

.fetchmany() buys you nothing in the case of large resultsets. The
memory usage will continue to climb with each call to fetchmany(). This
isn't the fault of MySQLdb; it is just that MySQL doesn't support
cursors, so MySQLdb has to fake it.
It's very little more effort to wrap this all up as a generator that
effectively allows you to use the same solution as you quote for cx_Oracle.


Again, I respectfully disagree. Your proposed solution, while
resulting in shorter code, will continue to eat memory until the entire
resultset has been delivered.

L. Daniel Burr

Jun 6 '06 #5
Jack Diederich <ja**@performancedrivers.com> writes:
[...]
MySQL will keep table locks until the results are all fetched, so even
though the DB API allows fetchone() and fetchmany(), using them with
MySQLdb is dangerous.

[...]

That's not true of InnoDB tables.
John
Jun 6 '06 #6
Whenever you are using a package that leaks memory, it can be
appropriate to use RPyC (http://rpyc.wikispaces.com/) to run the
leaking code in a different process, and restart it from time to time.
I've been using this method to avoid the leaks of matplotlib.
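The run-it-elsewhere-and-restart idea doesn't require RPyC specifically; here is a minimal sketch of the same pattern using only the standard library's multiprocessing module (a deliberate substitution for a self-contained example, where leaky_task stands in for whatever code leaks):

```python
import multiprocessing

def leaky_task(n):
    # Stand-in for code that leaks memory (a big query, a plot, ...).
    return sum(range(n))

def run_isolated(func, *args):
    """Run func(*args) in a worker process. maxtasksperchild=1 makes
    the pool replace the worker after every task, so any memory the
    task leaked is reclaimed by the OS when the old process exits."""
    with multiprocessing.Pool(processes=1, maxtasksperchild=1) as pool:
        return pool.apply(func, args)
```

Only the task's return value crosses the process boundary (it must be picklable); the leaked memory dies with the worker.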

Jun 7 '06 #7
>>>>> "yairchu@gmail" == yairchu@gmail.com <ya*****@gmail.com> writes:

yairchu@gmail> whenever you are using a package that leaks memory.
yairchu@gmail> it can be appropriate to use Rpyc
yairchu@gmail> (http://rpyc.wikispaces.com/) to run the leaking
yairchu@gmail> code in a different process, and restart it from
yairchu@gmail> time to time. I've been using this method to avoid
yairchu@gmail> the leaks of matplotlib.

The only known leak in matplotlib is in the tkagg backend, which we
believe comes from tkinter and is not in matplotlib proper. There are
a variety of ways to make it look like matplotlib is leaking memory,
e.g. overplotting when you want to first clear the plot, or failing to
close figures properly. We have unit tests to check for leaks, and
they are passing. Perhaps you can post some code which exposes the
problem.

JDH
Jun 7 '06 #8
mjp
jo*************@gmail.com wrote:
I am using cx_Oracle and MySQLdb to pull a lot of data from some tables
and I find that the cursor.execute method uses a lot of memory that
never gets garbage collected. Using fetchmany instead of fetchall does
not seem to make any difference, since it's the execute that uses
memory. [...]
For MySQLdb, the SSCursor class ("Server Side Cursor"), rather than the
default cursor class, may do what you want: retrieve the result set
row-by-row on demand, rather than all at once at the time of
.execute(). You'll need to remember to .fetch...() every row and call
.close(), however.

See the docstrings for the CursorUseResultMixIn and CursorStoreResultMixIn
classes in MySQLdb.cursors for more information.
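A sketch of that SSCursor approach; the connection parameters, table, and column are placeholders, and this needs a live MySQL server to run:

```python
import MySQLdb
import MySQLdb.cursors

# Placeholder connection parameters.
conn = MySQLdb.connect(host='localhost', user='user', passwd='secret',
                       db='test',
                       cursorclass=MySQLdb.cursors.SSCursor)
cursor = conn.cursor()
cursor.execute('SELECT foo FROM Bar')  # rows stay on the server
for row in cursor:     # streamed to the client one row at a time
    pass               # process the row
cursor.close()         # must exhaust and close before the next query
conn.close()
```

Note the caveat raised earlier in the thread: while a server-side result set is open, you cannot issue further queries on the connection, and the server holds resources until every row is fetched.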
[...] Breaking the query down to build lots of small tables doesn't
help, since execute doesn't give its memory back, after reading enough
small tables execute returns a memory error. What is the trick to get
memory back from execute in cx_Oracle and MySQLdb?


Generally, the trick is to avoid consuming the memory in the first
place. :)

Regards,
Mike

Jun 11 '06 #9

This thread has been closed and replies have been disabled. Please start a new discussion.

