
Fw: Python Database Objects (PDO) 1.2.0 Released


----- Original Message -----
From: "Jon Franz" <jf****@neurokode.com>
To: "Serge Orlov" <so********@pobox.ru>
Sent: Wednesday, November 19, 2003 2:39 PM
Subject: Re: Python Database Objects (PDO) 1.2.0 Released

> Yes, if .open() is a generator then it must return a sequence of items,
> but only one at a time. If the loop body doesn't keep the result object,
> it will be garbage collected pretty soon. You don't need to return
> a dictionary; you can return a special "coupler" object that will bind
> the column description data (created only once) with the column
> values. Of course, it means one more allocation per row and extra
> references, but I don't really think it's very expensive. After all, it is
> idiomatic iteration over a sequence. Without hard data to prove
> that it's really expensive, I don't think it's right to say it's
> expensive.
Well, it is; hard data follows below. But first, an aside: generators are
something we want to avoid right now, since we support older
versions of Python that do not support generators. Thus, older
interpreters would suffer from the memory bloat I've already
described.

Sorry for the caps below; DB2 uses caps everywhere and I'm just
trying to be correct.

However, the creation of a mapping object for each result adds up
over time. Coupling objects perform badly as well - I think it has to do
with the sheer number of objects that are created as you loop over the
results - garbage-collecting them takes time.
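
To be concrete, by a "coupler" I mean something along these lines - a
minimal sketch, not the code I actually benchmarked; the class and the
column names are made up for illustration:

    # Illustrative sketch only - not the actual PDO/test code.
    class Coupler:
        """Binds shared column-description data to one row's values."""
        def __init__(self, columns, values):
            self.columns = columns   # {name: index}, built once per query
            self.values = values     # this row's values

        def __getitem__(self, name):
            return self.values[self.columns[name]]

    # One extra allocation per row, but the column map is shared:
    columns = {'ID': 0, 'NAME': 1, 'DEPT': 2}        # made-up columns
    for values in [(10, 'Sanders', 20), (20, 'Pernal', 20)]:
        row = Coupler(columns, values)
        print row['NAME']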

* Hard data.
I populated the SAMPLE.STAFF table of my DB2 installation with
5235 records.
I created two methods that grab all records from this table and loop
over them, outputting each field for every record. The regtimer
method used the .next() call, while the itertimer method
used a generator function to get a mapping object per record.

Note that this test is not very optimized - for every record I loop
over the .fields member of the Resultset, or in the iterator case,
I loop over the members of the returned mapping object.
The iterator mapping object is a plain Python dictionary - I do no
tricks to provide associated column data - it's just name:value.
'Coupler' objects were even more expensive, and I didn't want to
lose the use of my test server for too long, so I dropped them
before progressing to the 100-run tests.
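
In outline, the two methods did something like the following (the exact
PDO calls are paraphrased from memory - treat everything except .next()
and .fields as placeholder names):

    # Paraphrased test loops - placeholder names, not the real test code.
    def regtimer(rs, out):
        # rs is the Resultset; .next() advances the current row in place
        # and .fields holds the column names, so no per-row objects are built.
        while rs.next():
            for name in rs.fields:
                out.write(str(rs[name]) + ' ')

    def itertimer(rows, out):
        # rows is a generator yielding one plain {name: value} dict per
        # record, so a new mapping object is allocated for every row.
        for row in rows:
            for name, value in row.items():
                out.write(str(value) + ' ')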

I used the timeit module to call each method 100 times,
and reran the tests 3 times.
Python was restarted in between runs.
Output was identical between the tests.
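
The timing harness itself was nothing fancy - roughly this, with the
connection setup and method arguments elided:

    import timeit

    # 100 calls per run; the whole thing was rerun 3 times with
    # Python restarted in between runs.
    t = timeit.Timer("regtimer()", "from __main__ import regtimer")
    print t.timeit(number=100)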

regtimer method: 281.4 seconds (avg)
itertimer method: 627.1 seconds (avg)

Building 5235 * 3 * 100 mapping objects seems very expensive
(even when using an iterator/generator system) versus providing
a mapping-object interface and keeping track of indexes
internally.

With smaller numbers of records, the difference is less pronounced,
but it's always there.
It may help to quit thinking of a Resultset as a sequence of dictionaries - PDO explicitly avoids that.
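
To sketch what I mean by keeping track of indexes internally (again, a
rough illustration, not the actual PDO source): the Resultset offers
mapping-style access, but advancing to the next record only moves an
internal row pointer, so nothing new is allocated per row.

    # Rough illustration - not the actual PDO Resultset implementation.
    class Resultset:
        def __init__(self, fields, rows):
            self.fields = fields        # column names, built once per query
            self._index = {}
            for i in range(len(fields)):
                self._index[fields[i]] = i
            self._rows = rows
            self._pos = -1

        def next(self):
            # Just move the current-row pointer; no per-row objects.
            self._pos = self._pos + 1
            return self._pos < len(self._rows)

        def __getitem__(self, name):
            return self._rows[self._pos][self._index[name]]

    rs = Resultset(['ID', 'NAME'], [(10, 'Sanders'), (20, 'Pernal')])
    while rs.next():
        print rs['NAME']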

> Isn't it premature optimization?


Nope, we'd already thought about this and investigated it - until you
asked, no one knew that we had. Our older tests showed that memory
bloat was a problem, since we created the mapping objects all
at once, and that performance was horrible.

The iterator/generator version is faster than what we saw in our old
tests, but it is still slower than the current approach.

cheers.

~Jon Franz
NeuroKode Labs, LLC
