Hi
Using Python 2.4 I am trying to procduce a generator which will return
the results of dbi SQL statement using fetchmany for performance.
So instead of fetching one record for each call, I will fetch batches
of X (eg 100) and yeild each record in turn.
For reasons of pure asthetics and my own learning I wanted it to look
like this:
</B>
def resultsetbatchg en(cursor, size=100):
for results in (recordbatch for recordbatch in
cursor.fetchman y(size)):
for rec in results:
yield rec
</B>
Problem is this this gives spurious results. To understand the problem
if I excute the following in the console I get the correct results:
</B> cur.execute(<QU ERY WITH MOER THAN 1000 records>) recs = (recordbatch for recordbatch in cur.fetchmany(1 000)) sum(map(lambda x: 1, (rec for rec in recs)))
1000
</B>
This is PERFECT! 1000 is what I expected. but now lets nest this inside
another for construct like so
</B> cur.execute(<QU ERY WITH MOER THAN 1000 records>) for results in (recordbatch for recordbatch in
cur.fetchmany(1 000)):
.... print sum(map(lambda x: 1, (rec for rec in results)))
76
76
76
76
76
76
76
76
............
</B>
Now it thinks each batch size is 76 ... ? This completly wrong, and
baffling.
The commands are exactly the same as far as I can tell, the only
difference is that it is now nested wihtin another for loop?
Any help would be greatly aprpeciated.
PS a working but far less elegant version of this below but I would
like to understand why teh above doesnt work and doesnt work
consitantly:
def resultsetbatch( cursor, size=100):
results = cursor.fetchman y(size)
while results <> []:
for rec in results:
yield rec
results = cursor.fetchman y(size) 4 1888
ChaosKCW wrote: For reasons of pure asthetics and my own learning I wanted it to look like this:
</B> def resultsetbatchg en(cursor, size=100): for results in (recordbatch for recordbatch in cursor.fetchman y(size)): for rec in results: yield rec </B>
Note that this is equivalent to:
def resultsetbatchg en(cursor, size=100):
for results in cursor.fetchman y(size):
for rec in results:
yield rec
That is, your generator expression isn't really doing anything. cur.execute (<QUERY WITH MOER THAN 1000 records>) recs = (recordbatch for recordbatch in cur.fetchmany(1 000)) sum(map(lam bda x: 1, (rec for rec in recs))) 1000
This should be equivalent to
.... recs = iter(cur.fetchm any(1000)) len(list(recs))
cur.execute (<QUERY WITH MOER THAN 1000 records>) for results in (recordbatch for recordbatch in cur.fetchmany(1 000)): ... print sum(map(lambda x: 1, (rec for rec in results)))
This is now equivalent to: cur.execute(<QU ERY WITH MOER THAN 1000 records>) for rec in cur.fetchmany(1 000):
.... print len(list(result s))
Note that you're counting the number of elements in record, which I'm
guessing is 76?
I'm thinking you want to do something like:
def resultsetbatchg en(cursor, size=100):
while True:
results = cursor.fetchman y(size)
if not results:
break
for rec in results:
yield rec
I don't actually know how you tell when there isn't anything more to
fetch, but if it returns an empty list, you could also write this like:
def resultsetbatchg en(cursor, size=100):
for results in iter(lambda: cursor.fetchman y(size), [])
for rec in results:
yield rec
HTH,
STeVe
Hi
Thanks this was very helpfull.
Your final solution seems like the best (most elegant). I was trying to
avoid the ugly while loops with breaks and this final one certainly
does that.
Thanks for your help .
For those that are interested I ran a performance comparison of various
functions for wrapping sql results in a interables and generators. The
results are below and the code is at the very bottom.
Surprisinly (in a happy way) the functional version of the batch
retrieve comes in a very close second beating out two other common
iterative approaches. However the winner by a small margin is still an
iterative routine. The elegance of the more functional approach (thanks
to Steve) and its excellent performance makes it a clear winner in my
mind.
The other intersting conclusion is that batch fetching results gives a
fairly significant and real performance boost. Its not just academic.
The Winner:
def resultset_funct ional_batchgene rator(cursor, size=100):
for results in iter(lambda: cursor.fetchman y(size), []):
for rec in results:
yield rec
Test Results (P2.4 IBM T41 Thinkpad):
Ordered by: cumulative time
List reduced from 57 to 7 due to restriction <'test_'>
ncalls tottime percall cumtime percall filename:lineno (function)
1 2.140 2.140 54.002 54.002
PerformanceTest Generators.py:1 02(test_results et_functional_g enerator)
1 1.957 1.957 45.484 45.484
PerformanceTest Generators.py:9 8(test_resultse t_iterative_gen erator)
1 2.433 2.433 41.844 41.844
PerformanceTest Generators.py:9 4(test_resultse t_iterator)
1 1.930 1.930 39.793 39.793
PerformanceTest Generators.py:1 10(test_results et_iterative_ba tchgenerator_2)
1 1.734 1.734 35.561 35.561
PerformanceTest Generators.py:1 14(test_results et_iterative_ba tchgenerator_3)
1 1.980 1.980 34.579 34.579
PerformanceTest Generators.py:1 18(test_results et_functional_b atchgenerator)
1 1.780 1.780 31.696 31.696
PerformanceTest Generators.py:1 06(test_results et_iterative_ba tchgenerator_1)
Code:
import unittest
import odbc
import profile
import pstats
class resultset_itera tor:
"Iterate over the recordset and frees the cursor afterwards."
def __init__(self, cursor):
self.cursor = cursor
def __iter__(self): return self
def next(self):
rec = self.cursor.fet chone()
if not rec:
raise StopIteration
return rec
def resultset_itera tive_generator( cursor):
rec = cursor.fetchone ();
while rec:
yield rec
rec = cursor.fetchone ();
def resultset_funct ional_generator (cursor):
for rec in iter(lambda: cursor.fetchone (), None):
yield rec
def resultset_itera tive_batchgener ator_1(cursor, size=100):
results = cursor.fetchman y(size)
while results:
for rec in results:
yield rec
results = cursor.fetchman y(size)
def resultset_itera tive_batchgener ator_2(cursor, arraysize=100):
'An iterator that uses fetchmany to keep memory usage down'
done = False
while not done:
results = cursor.fetchman y(arraysize)
if results == []:
done = True
for result in results:
yield result
def resultset_itera tive_batchgener ator_3(cursor, size=100):
while True:
results = cursor.fetchman y(size)
if not results:
break
for rec in results:
yield rec
def resultset_funct ional_batchgene rator(cursor, size=100):
for results in iter(lambda: cursor.fetchman y(size), []):
for rec in results:
yield rec
class testResultSetGe nerators(unitte st.TestCase):
connectstring = "*REMOVED*"
sql = "*REMOVED*"
def setUp(self):
self.con = odbc.odbc(self. connectstring)
self.cur = self.con.cursor ()
self.cur.execut e(self.sql)
def tearDown(self):
self.cur.close( )
self.con.close( )
def test_resultset_ iterator(self):
for row in resultset_itera tor(self.cur):
pass
def test_resultset_ iterative_gener ator(self):
for row in resultset_itera tive_generator( self.cur):
pass
def test_resultset_ functional_gene rator(self):
for row in resultset_funct ional_generator (self.cur):
pass
def test_resultset_ iterative_batch generator_1(sel f):
for row in resultset_itera tive_batchgener ator_1(self.cur ):
pass
def test_resultset_ iterative_batch generator_2(sel f):
for row in resultset_itera tive_batchgener ator_2(self.cur ):
pass
def test_resultset_ iterative_batch generator_3(sel f):
for row in resultset_itera tive_batchgener ator_3(self.cur ):
pass
def test_resultset_ functional_batc hgenerator(self ):
for row in resultset_funct ional_batchgene rator(self.cur) :
pass
if __name__ == '__main__':
suite = unittest.makeSu ite(testResultS etGenerators)
profile.run('un ittest.TextTest Runner(verbosit y=2).run(suite) ',
'c:\\temp\\prof ile')
p = pstats.Stats('c :\\temp\\profil e')
p.strip_dirs(). sort_stats('cum ulative').print _stats('test_')
ChaosKCW wrote: def resultset_funct ional_generator (cursor): for*rec*in*iter (lambda:*cursor .fetchone(),*No ne): yield*rec
This can be simplified to
def resultset_funct ional_generator (cursor):
return iter(cursor.fet chone, None)
It should be a bit faster, too.
Peter This thread has been closed and replies have been disabled. Please start a new discussion. Similar topics |
by: Grant Edwards |
last post by:
In an interview at http://acmqueue.com/modules.php?name=Content&pa=showpage&pid=273
Alan Kay said something I really liked, and I think it applies
equally well to Python as well as the languages mentioned:
I characterized one way of looking at languages in this
way: a lot of them are either the agglutination of features
or they're a crystallization of style. Languages such as
APL, Lisp, and Smalltalk are what you might call style...
|
by: Rex_chaos |
last post by:
Hi all,
I am writing my own container and need an iterator. I am hesitating
if I should my iterator should inherited from std::iterator or just
write my own one. Please give me an idea.
BTW, if I write my own one, should I let it defined inside my
container(inner class) or just be a complelely independent class.
Thanks.
|
by: Sonoman |
last post by:
Hi all:
I am trying to write a simple program that simulates asking several persons
their birth day and it counts how many persons are asked until two have the
same birth day. The problem that I have is that the first loop I get a
sequence of random numbers untuil I get a match, BUT then on the following
loops I get the SAME random(?) sequence. I am using rand(). I do not want to
get too fancy with the random number generator, but is there...
|
by: John |
last post by:
I want to store an array of pointers (void *) to
iterators inside a map. The idea is to reach the
map <key,info> pair faster if the data is in the
array. If not use a O(log) search.
But stl wont let me typecast a map iterator to a void *
and store it in an array. Can this be done. I was looking
at the code of stl_tree and it seems that the "value" =
key,info pair is stored in a node and never moved unless
deleted. So if map points...
|
by: Adam Hartshorne |
last post by:
Hi All,
I was wondering if anybody can tell me what is wrong with the following
code,
in a .h file I have
std::list<std::vector<Site> > positions ;
std::list<std::vector<Site> >::iterator itSteps ;
| |
by: Hardrock |
last post by:
I encountered some difficulty in implementing dynamic loop nesting.
I.e. the number of nesting in a for(...) loop is determined at run
time. For example
void f(int n)
{
For(i=0; i<=K; i++)
For(i=0; i<=K; i++)
|
by: akameswaran |
last post by:
Ok, this is really irritating me. I'm sure there are different ways of
doing this - I'm interested in the algo, not the practical solution,
I'm more trying to play with iterators and recursion. I want to create
a program that generates every possible combination of a set of a n
dice, with s sides.
so I started with an iterator
class die(object):
def __init__(self,sides):
|
by: Steven D'Aprano |
last post by:
I thought that an iterator was any object that follows the iterator
protocol, that is, it has a next() method and an __iter__() method.
But I'm having problems writing a class that acts as an iterator. I have:
class Parrot(object):
def __iter__(self):
return self
def __init__(self):
self.next = self._next()
|
by: Eric Wertman |
last post by:
I have a set of files with this kind of content (it's dumped from WebSphere):
]
|
by: marktang |
last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However, people are often confused as to whether an ONU can Work As a Router. In this blog post, we’ll explore What is ONU, What Is Router, ONU & Router’s main usage, and What is the difference between ONU and Router. Let’s take a closer look !
Part I. Meaning of...
|
by: Hystou |
last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can effortlessly switch the default language on Windows 10 without reinstalling. I'll walk you through it.
First, let's disable language synchronization. With a Microsoft account, language settings sync across devices. To prevent any complications,...
| |
by: Hystou |
last post by:
Overview:
Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows Update option using the Control Panel or Settings app; it automatically checks for updates and installs any it finds, whether you like it or not. For most users, this new feature is actually very convenient. If you want to control the update process,...
|
by: tracyyun |
last post by:
Dear forum friends,
With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each protocol has its own unique characteristics and advantages, but as a user who is planning to build a smart home system, I am a bit confused by the choice of these technologies. I'm particularly interested in Zigbee because I've heard it does some...
|
by: agi2029 |
last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing, and deployment—without human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then launch it, all on its own....
Now, this would greatly impact the work of software developers. The idea...
|
by: isladogs |
last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM).
In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules.
He will explain when you may want to use classes instead of User Defined Types (UDT). For example, to manage the data in unbound forms.
Adolph will...
|
by: TSSRALBI |
last post by:
Hello
I'm a network technician in training and I need your help.
I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs.
The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols.
I succeeded, with both firewalls in the same network. But I'm wondering if it's possible to do the same thing, with 2 Pfsense firewalls...
|
by: adsilva |
last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
| |
by: muto222 |
last post by:
How can i add a mobile payment intergratation into php mysql website.
| |