473,385 Members | 1,357 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,385 software developers and data experts.

Geneator/Iterator Nesting Problem - Any Ideas? 2.4

Hi

Using Python 2.4 I am trying to procduce a generator which will return
the results of dbi SQL statement using fetchmany for performance.

So instead of fetching one record for each call, I will fetch batches
of X (eg 100) and yeild each record in turn.

For reasons of pure asthetics and my own learning I wanted it to look
like this:

</B>
def resultsetbatchgen(cursor, size=100):
for results in (recordbatch for recordbatch in
cursor.fetchmany(size)):
for rec in results:
yield rec
</B>

Problem is this this gives spurious results. To understand the problem
if I excute the following in the console I get the correct results:

</B>
cur.execute(<QUERY WITH MOER THAN 1000 records>)
recs = (recordbatch for recordbatch in cur.fetchmany(1000))
sum(map(lambda x: 1, (rec for rec in recs))) 1000
</B>

This is PERFECT! 1000 is what I expected. but now lets nest this inside
another for construct like so

</B> cur.execute(<QUERY WITH MOER THAN 1000 records>)
for results in (recordbatch for recordbatch in

cur.fetchmany(1000)):
.... print sum(map(lambda x: 1, (rec for rec in results)))
76
76
76
76
76
76
76
76
............
</B>

Now it thinks each batch size is 76 ... ? This completly wrong, and
baffling.

The commands are exactly the same as far as I can tell, the only
difference is that it is now nested wihtin another for loop?

Any help would be greatly aprpeciated.

PS a working but far less elegant version of this below but I would
like to understand why teh above doesnt work and doesnt work
consitantly:

def resultsetbatch(cursor, size=100):
results = cursor.fetchmany(size)
while results <> []:
for rec in results:
yield rec
results = cursor.fetchmany(size)

Jul 18 '05 #1
4 1861
ChaosKCW wrote:
For reasons of pure asthetics and my own learning I wanted it to look
like this:

</B>
def resultsetbatchgen(cursor, size=100):
for results in (recordbatch for recordbatch in
cursor.fetchmany(size)):
for rec in results:
yield rec
</B>


Note that this is equivalent to:

def resultsetbatchgen(cursor, size=100):
for results in cursor.fetchmany(size):
for rec in results:
yield rec

That is, your generator expression isn't really doing anything.
cur.execute(<QUERY WITH MOER THAN 1000 records>)
recs = (recordbatch for recordbatch in cur.fetchmany(1000))
sum(map(lambda x: 1, (rec for rec in recs))) 1000


This should be equivalent to
....
recs = iter(cur.fetchmany(1000))
len(list(recs)) cur.execute(<QUERY WITH MOER THAN 1000 records>)
for results in (recordbatch for recordbatch in cur.fetchmany(1000)): ... print sum(map(lambda x: 1, (rec for rec in results)))


This is now equivalent to:
cur.execute(<QUERY WITH MOER THAN 1000 records>)
for rec in cur.fetchmany(1000):

.... print len(list(results))

Note that you're counting the number of elements in record, which I'm
guessing is 76?

I'm thinking you want to do something like:

def resultsetbatchgen(cursor, size=100):
while True:
results = cursor.fetchmany(size)
if not results:
break
for rec in results:
yield rec

I don't actually know how you tell when there isn't anything more to
fetch, but if it returns an empty list, you could also write this like:

def resultsetbatchgen(cursor, size=100):
for results in iter(lambda: cursor.fetchmany(size), [])
for rec in results:
yield rec
HTH,

STeVe
Jul 18 '05 #2
Hi

Thanks this was very helpfull.

Your final solution seems like the best (most elegant). I was trying to
avoid the ugly while loops with breaks and this final one certainly
does that.

Thanks for your help .

Jul 18 '05 #3
For those that are interested I ran a performance comparison of various
functions for wrapping sql results in a interables and generators. The
results are below and the code is at the very bottom.

Surprisinly (in a happy way) the functional version of the batch
retrieve comes in a very close second beating out two other common
iterative approaches. However the winner by a small margin is still an
iterative routine. The elegance of the more functional approach (thanks
to Steve) and its excellent performance makes it a clear winner in my
mind.

The other intersting conclusion is that batch fetching results gives a
fairly significant and real performance boost. Its not just academic.

The Winner:

def resultset_functional_batchgenerator(cursor, size=100):
for results in iter(lambda: cursor.fetchmany(size), []):
for rec in results:
yield rec
Test Results (P2.4 IBM T41 Thinkpad):
Ordered by: cumulative time
List reduced from 57 to 7 due to restriction <'test_'>

ncalls tottime percall cumtime percall filename:lineno(function)
1 2.140 2.140 54.002 54.002
PerformanceTestGenerators.py:102(test_resultset_fu nctional_generator)
1 1.957 1.957 45.484 45.484
PerformanceTestGenerators.py:98(test_resultset_ite rative_generator)
1 2.433 2.433 41.844 41.844
PerformanceTestGenerators.py:94(test_resultset_ite rator)
1 1.930 1.930 39.793 39.793
PerformanceTestGenerators.py:110(test_resultset_it erative_batchgenerator_2)
1 1.734 1.734 35.561 35.561
PerformanceTestGenerators.py:114(test_resultset_it erative_batchgenerator_3)
1 1.980 1.980 34.579 34.579
PerformanceTestGenerators.py:118(test_resultset_fu nctional_batchgenerator)
1 1.780 1.780 31.696 31.696
PerformanceTestGenerators.py:106(test_resultset_it erative_batchgenerator_1)

Code:

import unittest
import odbc
import profile
import pstats

class resultset_iterator:
"Iterate over the recordset and frees the cursor afterwards."
def __init__(self, cursor):
self.cursor = cursor
def __iter__(self): return self
def next(self):
rec = self.cursor.fetchone()
if not rec:
raise StopIteration
return rec

def resultset_iterative_generator(cursor):
rec = cursor.fetchone();
while rec:
yield rec
rec = cursor.fetchone();

def resultset_functional_generator(cursor):
for rec in iter(lambda: cursor.fetchone(), None):
yield rec

def resultset_iterative_batchgenerator_1(cursor, size=100):
results = cursor.fetchmany(size)
while results:
for rec in results:
yield rec
results = cursor.fetchmany(size)

def resultset_iterative_batchgenerator_2(cursor, arraysize=100):
'An iterator that uses fetchmany to keep memory usage down'
done = False
while not done:
results = cursor.fetchmany(arraysize)
if results == []:
done = True
for result in results:
yield result

def resultset_iterative_batchgenerator_3(cursor, size=100):
while True:
results = cursor.fetchmany(size)
if not results:
break
for rec in results:
yield rec

def resultset_functional_batchgenerator(cursor, size=100):
for results in iter(lambda: cursor.fetchmany(size), []):
for rec in results:
yield rec

class testResultSetGenerators(unittest.TestCase):

connectstring = "*REMOVED*"
sql = "*REMOVED*"

def setUp(self):
self.con = odbc.odbc(self.connectstring)
self.cur = self.con.cursor()
self.cur.execute(self.sql)

def tearDown(self):
self.cur.close()
self.con.close()

def test_resultset_iterator(self):
for row in resultset_iterator(self.cur):
pass

def test_resultset_iterative_generator(self):
for row in resultset_iterative_generator(self.cur):
pass

def test_resultset_functional_generator(self):
for row in resultset_functional_generator(self.cur):
pass

def test_resultset_iterative_batchgenerator_1(self):
for row in resultset_iterative_batchgenerator_1(self.cur):
pass

def test_resultset_iterative_batchgenerator_2(self):
for row in resultset_iterative_batchgenerator_2(self.cur):
pass

def test_resultset_iterative_batchgenerator_3(self):
for row in resultset_iterative_batchgenerator_3(self.cur):
pass

def test_resultset_functional_batchgenerator(self):
for row in resultset_functional_batchgenerator(self.cur):
pass

if __name__ == '__main__':
suite = unittest.makeSuite(testResultSetGenerators)
profile.run('unittest.TextTestRunner(verbosity=2). run(suite)',
'c:\\temp\\profile')

p = pstats.Stats('c:\\temp\\profile')
p.strip_dirs().sort_stats('cumulative').print_stat s('test_')

Jul 18 '05 #4
ChaosKCW wrote:
def resultset_functional_generator(cursor):
for*rec*in*iter(lambda:*cursor.fetchone(),*None):
yield*rec


This can be simplified to

def resultset_functional_generator(cursor):
return iter(cursor.fetchone, None)

It should be a bit faster, too.

Peter

Jul 18 '05 #5

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

38
by: Grant Edwards | last post by:
In an interview at http://acmqueue.com/modules.php?name=Content&pa=showpage&pid=273 Alan Kay said something I really liked, and I think it applies equally well to Python as well as the languages...
2
by: Rex_chaos | last post by:
Hi all, I am writing my own container and need an iterator. I am hesitating if I should my iterator should inherited from std::iterator or just write my own one. Please give me an idea. BTW,...
10
by: Sonoman | last post by:
Hi all: I am trying to write a simple program that simulates asking several persons their birth day and it counts how many persons are asked until two have the same birth day. The problem that I...
6
by: John | last post by:
I want to store an array of pointers (void *) to iterators inside a map. The idea is to reach the map <key,info> pair faster if the data is in the array. If not use a O(log) search. But stl...
13
by: Adam Hartshorne | last post by:
Hi All, I was wondering if anybody can tell me what is wrong with the following code, in a .h file I have std::list<std::vector<Site> > positions ; std::list<std::vector<Site> >::iterator...
8
by: Hardrock | last post by:
I encountered some difficulty in implementing dynamic loop nesting. I.e. the number of nesting in a for(...) loop is determined at run time. For example void f(int n) { For(i=0; i<=K; i++)...
16
by: akameswaran | last post by:
Ok, this is really irritating me. I'm sure there are different ways of doing this - I'm interested in the algo, not the practical solution, I'm more trying to play with iterators and recursion. I...
27
by: Steven D'Aprano | last post by:
I thought that an iterator was any object that follows the iterator protocol, that is, it has a next() method and an __iter__() method. But I'm having problems writing a class that acts as an...
7
by: Eric Wertman | last post by:
I have a set of files with this kind of content (it's dumped from WebSphere): ]
1
by: CloudSolutions | last post by:
Introduction: For many beginners and individual users, requiring a credit card and email registration may pose a barrier when starting to use cloud servers. However, some cloud server providers now...
0
by: ryjfgjl | last post by:
In our work, we often need to import Excel data into databases (such as MySQL, SQL Server, Oracle) for data analysis and processing. Usually, we use database tools like Navicat or the Excel import...
0
by: taylorcarr | last post by:
A Canon printer is a smart device known for being advanced, efficient, and reliable. It is designed for home, office, and hybrid workspace use and can also be used for a variety of purposes. However,...
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
0
by: aa123db | last post by:
Variable and constants Use var or let for variables and const fror constants. Var foo ='bar'; Let foo ='bar';const baz ='bar'; Functions function $name$ ($parameters$) { } ...
0
by: ryjfgjl | last post by:
If we have dozens or hundreds of excel to import into the database, if we use the excel import function provided by database editors such as navicat, it will be extremely tedious and time-consuming...
0
by: ryjfgjl | last post by:
In our work, we often receive Excel tables with data in the same format. If we want to analyze these data, it can be difficult to analyze them because the data is spread across multiple Excel files...
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
1
by: nemocccc | last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.