By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
443,965 Members | 1,690 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 443,965 IT Pros & Developers. It's quick & easy.

psycopg2 & large result set

P: n/a
Hi All.

I'm using psycopg2 to retrieve results from a rather large query (it
returns 22m records); unsurprisingly this doesn't fit in memory all at
once. What I'd like to achieve is something similar to a .NET data
provider I have which allows you to set a 'FetchSize' property; it
then retrieves 'n' many rows at a time, and fetches the next 'chunk'
after you read past the end of the current chunk. I suppose I could
use Python for .NET or IronPython but I'd rather stick with CPython
2.5 if possible.

I'm not 100% sure if it's an interface or a server thing. Any ideas
are most welcome.

Cheers,

Jon.

May 25 '07 #1
Share this Question
Share on Google+
2 Replies


P: n/a

Jon Clements wrote:
Hi All.

I'm using psycopg2 to retrieve results from a rather large query (it
returns 22m records); unsurprisingly this doesn't fit in memory all at
once. What I'd like to achieve is something similar to a .NET data
provider I have which allows you to set a 'FetchSize' property; it
then retrieves 'n' many rows at a time, and fetches the next 'chunk'
after you read past the end of the current chunk. I suppose I could
use Python for .NET or IronPython but I'd rather stick with CPython
2.5 if possible.
psycopg2 is DB-API 2.0 [1]_ compliant, so you can use
``fetchmany`` method and set ``cursor.arraysize`` accordingly.

[1] .. http://www.python.org/dev/peps/pep-0249/

--
HTH,
Rob

May 25 '07 #2

P: n/a
On 25 May, 11:16, Jon Clements <jon...@googlemail.comwrote:
>
I'm using psycopg2 to retrieve results from a rather large query (it
returns 22m records); unsurprisingly this doesn't fit in memory all at
once. What I'd like to achieve is something similar to a .NET data
provider I have which allows you to set a 'FetchSize' property; it
then retrieves 'n' many rows at a time, and fetches the next 'chunk'
after you read past the end of the current chunk. I suppose I could
use Python for .NET or IronPython but I'd rather stick with CPython
2.5 if possible.

I'm not 100% sure if it's an interface or a server thing. Any ideas
are most welcome.
It's an interface thing. The DB-API has fetchone, fetchmany and
(optionally) iteration methods on cursor objects; PostgreSQL supports
what you have in mind; pyPgSQL supports it at the interface level, but
psycopg2 only supports it if you use "named cursors", which is not
part of the DB-API specification as far as I recall, and not
particularly convenient if you're thinking of targeting more than one
database system with the same code. See this bug filed against
psycopg2 and the resulting discussion:

http://www.initd.org/tracker/psycopg/ticket/158

I've been running a patched version of psycopg2, but haven't developed
the patch further since it may be more convenient for me to switch
back to pyPgSQL eventually.

Paul

May 25 '07 #3

This discussion thread is closed

Replies have been disabled for this discussion.