473,626 Members | 3,041 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

Geneator/Iterator Nesting Problem - Any Ideas? 2.4

Hi

Using Python 2.4 I am trying to procduce a generator which will return
the results of dbi SQL statement using fetchmany for performance.

So instead of fetching one record for each call, I will fetch batches
of X (eg 100) and yeild each record in turn.

For reasons of pure asthetics and my own learning I wanted it to look
like this:

</B>
def resultsetbatchg en(cursor, size=100):
for results in (recordbatch for recordbatch in
cursor.fetchman y(size)):
for rec in results:
yield rec
</B>

Problem is this this gives spurious results. To understand the problem
if I excute the following in the console I get the correct results:

</B>
cur.execute(<QU ERY WITH MOER THAN 1000 records>)
recs = (recordbatch for recordbatch in cur.fetchmany(1 000))
sum(map(lambda x: 1, (rec for rec in recs))) 1000
</B>

This is PERFECT! 1000 is what I expected. but now lets nest this inside
another for construct like so

</B> cur.execute(<QU ERY WITH MOER THAN 1000 records>)
for results in (recordbatch for recordbatch in

cur.fetchmany(1 000)):
.... print sum(map(lambda x: 1, (rec for rec in results)))
76
76
76
76
76
76
76
76
............
</B>

Now it thinks each batch size is 76 ... ? This completly wrong, and
baffling.

The commands are exactly the same as far as I can tell, the only
difference is that it is now nested wihtin another for loop?

Any help would be greatly aprpeciated.

PS a working but far less elegant version of this below but I would
like to understand why teh above doesnt work and doesnt work
consitantly:

def resultsetbatch( cursor, size=100):
results = cursor.fetchman y(size)
while results <> []:
for rec in results:
yield rec
results = cursor.fetchman y(size)

Jul 18 '05 #1
4 1888
ChaosKCW wrote:
For reasons of pure asthetics and my own learning I wanted it to look
like this:

</B>
def resultsetbatchg en(cursor, size=100):
for results in (recordbatch for recordbatch in
cursor.fetchman y(size)):
for rec in results:
yield rec
</B>


Note that this is equivalent to:

def resultsetbatchg en(cursor, size=100):
for results in cursor.fetchman y(size):
for rec in results:
yield rec

That is, your generator expression isn't really doing anything.
cur.execute (<QUERY WITH MOER THAN 1000 records>)
recs = (recordbatch for recordbatch in cur.fetchmany(1 000))
sum(map(lam bda x: 1, (rec for rec in recs))) 1000


This should be equivalent to
....
recs = iter(cur.fetchm any(1000))
len(list(recs)) cur.execute (<QUERY WITH MOER THAN 1000 records>)
for results in (recordbatch for recordbatch in cur.fetchmany(1 000)): ... print sum(map(lambda x: 1, (rec for rec in results)))


This is now equivalent to:
cur.execute(<QU ERY WITH MOER THAN 1000 records>)
for rec in cur.fetchmany(1 000):

.... print len(list(result s))

Note that you're counting the number of elements in record, which I'm
guessing is 76?

I'm thinking you want to do something like:

def resultsetbatchg en(cursor, size=100):
while True:
results = cursor.fetchman y(size)
if not results:
break
for rec in results:
yield rec

I don't actually know how you tell when there isn't anything more to
fetch, but if it returns an empty list, you could also write this like:

def resultsetbatchg en(cursor, size=100):
for results in iter(lambda: cursor.fetchman y(size), [])
for rec in results:
yield rec
HTH,

STeVe
Jul 18 '05 #2
Hi

Thanks this was very helpfull.

Your final solution seems like the best (most elegant). I was trying to
avoid the ugly while loops with breaks and this final one certainly
does that.

Thanks for your help .

Jul 18 '05 #3
For those that are interested I ran a performance comparison of various
functions for wrapping sql results in a interables and generators. The
results are below and the code is at the very bottom.

Surprisinly (in a happy way) the functional version of the batch
retrieve comes in a very close second beating out two other common
iterative approaches. However the winner by a small margin is still an
iterative routine. The elegance of the more functional approach (thanks
to Steve) and its excellent performance makes it a clear winner in my
mind.

The other intersting conclusion is that batch fetching results gives a
fairly significant and real performance boost. Its not just academic.

The Winner:

def resultset_funct ional_batchgene rator(cursor, size=100):
for results in iter(lambda: cursor.fetchman y(size), []):
for rec in results:
yield rec
Test Results (P2.4 IBM T41 Thinkpad):
Ordered by: cumulative time
List reduced from 57 to 7 due to restriction <'test_'>

ncalls tottime percall cumtime percall filename:lineno (function)
1 2.140 2.140 54.002 54.002
PerformanceTest Generators.py:1 02(test_results et_functional_g enerator)
1 1.957 1.957 45.484 45.484
PerformanceTest Generators.py:9 8(test_resultse t_iterative_gen erator)
1 2.433 2.433 41.844 41.844
PerformanceTest Generators.py:9 4(test_resultse t_iterator)
1 1.930 1.930 39.793 39.793
PerformanceTest Generators.py:1 10(test_results et_iterative_ba tchgenerator_2)
1 1.734 1.734 35.561 35.561
PerformanceTest Generators.py:1 14(test_results et_iterative_ba tchgenerator_3)
1 1.980 1.980 34.579 34.579
PerformanceTest Generators.py:1 18(test_results et_functional_b atchgenerator)
1 1.780 1.780 31.696 31.696
PerformanceTest Generators.py:1 06(test_results et_iterative_ba tchgenerator_1)

Code:

import unittest
import odbc
import profile
import pstats

class resultset_itera tor:
"Iterate over the recordset and frees the cursor afterwards."
def __init__(self, cursor):
self.cursor = cursor
def __iter__(self): return self
def next(self):
rec = self.cursor.fet chone()
if not rec:
raise StopIteration
return rec

def resultset_itera tive_generator( cursor):
rec = cursor.fetchone ();
while rec:
yield rec
rec = cursor.fetchone ();

def resultset_funct ional_generator (cursor):
for rec in iter(lambda: cursor.fetchone (), None):
yield rec

def resultset_itera tive_batchgener ator_1(cursor, size=100):
results = cursor.fetchman y(size)
while results:
for rec in results:
yield rec
results = cursor.fetchman y(size)

def resultset_itera tive_batchgener ator_2(cursor, arraysize=100):
'An iterator that uses fetchmany to keep memory usage down'
done = False
while not done:
results = cursor.fetchman y(arraysize)
if results == []:
done = True
for result in results:
yield result

def resultset_itera tive_batchgener ator_3(cursor, size=100):
while True:
results = cursor.fetchman y(size)
if not results:
break
for rec in results:
yield rec

def resultset_funct ional_batchgene rator(cursor, size=100):
for results in iter(lambda: cursor.fetchman y(size), []):
for rec in results:
yield rec

class testResultSetGe nerators(unitte st.TestCase):

connectstring = "*REMOVED*"
sql = "*REMOVED*"

def setUp(self):
self.con = odbc.odbc(self. connectstring)
self.cur = self.con.cursor ()
self.cur.execut e(self.sql)

def tearDown(self):
self.cur.close( )
self.con.close( )

def test_resultset_ iterator(self):
for row in resultset_itera tor(self.cur):
pass

def test_resultset_ iterative_gener ator(self):
for row in resultset_itera tive_generator( self.cur):
pass

def test_resultset_ functional_gene rator(self):
for row in resultset_funct ional_generator (self.cur):
pass

def test_resultset_ iterative_batch generator_1(sel f):
for row in resultset_itera tive_batchgener ator_1(self.cur ):
pass

def test_resultset_ iterative_batch generator_2(sel f):
for row in resultset_itera tive_batchgener ator_2(self.cur ):
pass

def test_resultset_ iterative_batch generator_3(sel f):
for row in resultset_itera tive_batchgener ator_3(self.cur ):
pass

def test_resultset_ functional_batc hgenerator(self ):
for row in resultset_funct ional_batchgene rator(self.cur) :
pass

if __name__ == '__main__':
suite = unittest.makeSu ite(testResultS etGenerators)
profile.run('un ittest.TextTest Runner(verbosit y=2).run(suite) ',
'c:\\temp\\prof ile')

p = pstats.Stats('c :\\temp\\profil e')
p.strip_dirs(). sort_stats('cum ulative').print _stats('test_')

Jul 18 '05 #4
ChaosKCW wrote:
def resultset_funct ional_generator (cursor):
for*rec*in*iter (lambda:*cursor .fetchone(),*No ne):
yield*rec


This can be simplified to

def resultset_funct ional_generator (cursor):
return iter(cursor.fet chone, None)

It should be a bit faster, too.

Peter

Jul 18 '05 #5

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

38
3660
by: Grant Edwards | last post by:
In an interview at http://acmqueue.com/modules.php?name=Content&pa=showpage&pid=273 Alan Kay said something I really liked, and I think it applies equally well to Python as well as the languages mentioned: I characterized one way of looking at languages in this way: a lot of them are either the agglutination of features or they're a crystallization of style. Languages such as APL, Lisp, and Smalltalk are what you might call style...
2
2843
by: Rex_chaos | last post by:
Hi all, I am writing my own container and need an iterator. I am hesitating if I should my iterator should inherited from std::iterator or just write my own one. Please give me an idea. BTW, if I write my own one, should I let it defined inside my container(inner class) or just be a complelely independent class. Thanks.
10
2888
by: Sonoman | last post by:
Hi all: I am trying to write a simple program that simulates asking several persons their birth day and it counts how many persons are asked until two have the same birth day. The problem that I have is that the first loop I get a sequence of random numbers untuil I get a match, BUT then on the following loops I get the SAME random(?) sequence. I am using rand(). I do not want to get too fancy with the random number generator, but is there...
6
3694
by: John | last post by:
I want to store an array of pointers (void *) to iterators inside a map. The idea is to reach the map <key,info> pair faster if the data is in the array. If not use a O(log) search. But stl wont let me typecast a map iterator to a void * and store it in an array. Can this be done. I was looking at the code of stl_tree and it seems that the "value" = key,info pair is stored in a node and never moved unless deleted. So if map points...
13
4905
by: Adam Hartshorne | last post by:
Hi All, I was wondering if anybody can tell me what is wrong with the following code, in a .h file I have std::list<std::vector<Site> > positions ; std::list<std::vector<Site> >::iterator itSteps ;
8
2849
by: Hardrock | last post by:
I encountered some difficulty in implementing dynamic loop nesting. I.e. the number of nesting in a for(...) loop is determined at run time. For example void f(int n) { For(i=0; i<=K; i++) For(i=0; i<=K; i++)
16
3532
by: akameswaran | last post by:
Ok, this is really irritating me. I'm sure there are different ways of doing this - I'm interested in the algo, not the practical solution, I'm more trying to play with iterators and recursion. I want to create a program that generates every possible combination of a set of a n dice, with s sides. so I started with an iterator class die(object): def __init__(self,sides):
27
5306
by: Steven D'Aprano | last post by:
I thought that an iterator was any object that follows the iterator protocol, that is, it has a next() method and an __iter__() method. But I'm having problems writing a class that acts as an iterator. I have: class Parrot(object): def __iter__(self): return self def __init__(self): self.next = self._next()
7
1966
by: Eric Wertman | last post by:
I have a set of files with this kind of content (it's dumped from WebSphere): ]
0
8259
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However, people are often confused as to whether an ONU can Work As a Router. In this blog post, we’ll explore What is ONU, What Is Router, ONU & Router’s main usage, and What is the difference between ONU and Router. Let’s take a closer look ! Part I. Meaning of...
0
8192
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can effortlessly switch the default language on Windows 10 without reinstalling. I'll walk you through it. First, let's disable language synchronization. With a Microsoft account, language settings sync across devices. To prevent any complications,...
1
8358
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows Update option using the Control Panel or Settings app; it automatically checks for updates and installs any it finds, whether you like it or not. For most users, this new feature is actually very convenient. If you want to control the update process,...
0
8502
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each protocol has its own unique characteristics and advantages, but as a user who is planning to build a smart home system, I am a bit confused by the choice of these technologies. I'm particularly interested in Zigbee because I've heard it does some...
0
7188
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing, and deployment—without human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then launch it, all on its own.... Now, this would greatly impact the work of software developers. The idea...
1
6119
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules. He will explain when you may want to use classes instead of User Defined Types (UDT). For example, to manage the data in unbound forms. Adolph will...
0
4090
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols. I succeeded, with both firewalls in the same network. But I'm wondering if it's possible to do the same thing, with 2 Pfsense firewalls...
0
4195
by: adsilva | last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
1
1805
muto222
by: muto222 | last post by:
How can i add a mobile payment intergratation into php mysql website.

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.