Inefficient summing

Hi All,

I have a list of records like below:

rec=[{"F1":1, "F2":2}, {"F1":3, "F2":4} ]

Now I want to write code to find out the ratio of the sums of the two
fields.

One thing I can do is:

sum(r["F1"] for r in rec)/sum(r["F2"] for r in rec)

But this is slow because I have to iterate through the list twice.
Also, in the case where rec is an iterator, it does not work.
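To make the iterator problem concrete, here is a small sketch, assuming Python 3 and the sample data from this post (the names `records`, `it`, `first` and `second` are only for illustration). The first sum() drains a one-shot iterator, so the second sum() has nothing left to add up:

```python
# Sample data from the post; assumes Python 3.
records = [{"F1": 1, "F2": 2}, {"F1": 3, "F2": 4}]

# With a list, each generator expression walks the data independently.
list_ratio = sum(r["F1"] for r in records) / sum(r["F2"] for r in records)

# With a one-shot iterator, the first sum() consumes every record,
# so the second sum() sees an exhausted iterator and returns 0.
it = iter(records)
first = sum(r["F1"] for r in it)
second = sum(r["F2"] for r in it)

print(first, second)  # dividing first by second would raise ZeroDivisionError
```

So the two-pass version does not merely get slower with an iterator; it silently computes the wrong denominator.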

I can also do this:

sum1, sum2= reduce(lambda x, y: (x[0]+y[0], x[1]+y[1]), ((r["F1"],
r["F2"]) for r in rec))
sum1/sum2

This loops through the list only once, and is probably more efficient,
but it is less readable.

I can of course use an old-fashioned loop. This is more readable, but
also more verbose.

What is the best way, I wonder?
-a new python programmer
Oct 8 '08 #1
I personally would probably do:

from collections import defaultdict

label2sum = defaultdict(lambda: 0)
for r in rec:
    for key, value in r.iteritems():
        label2sum[key] += value

ratio = label2sum["F1"] / label2sum["F2"]

This iterates through each 'r' only once, and (imho) is pretty
readable provided you know how defaultdicts work. Not everything has
to unnecessarily be made a one-liner. Coding is about readability
first, optimization second. And optimized code should not be
abbreviated, which would make it even harder to understand.

I probably would have gone with your second solution if performance
was no object.
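For anyone reading this on Python 3, a rough sketch of the same idea follows. It assumes the sample `rec` from the question; `.items()` stands in for Python 2's `.iteritems()`, and `defaultdict(int)` gives the same zero default as the lambda. Note that in Python 2 the final division truncates when both sums are ints, while Python 3 does true division:

```python
from collections import defaultdict

# Sample data from the question; assumes Python 3.
rec = [{"F1": 1, "F2": 2}, {"F1": 3, "F2": 4}]

label2sum = defaultdict(int)  # missing keys start at 0
for r in rec:
    for key, value in r.items():  # .items() replaces Python 2's .iteritems()
        label2sum[key] += value

# True division in Python 3; Python 2 would need float() to avoid truncation.
ratio = label2sum["F1"] / label2sum["F2"]
print(ratio)
```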

Cheers,
Chris
--
Follow the path of the Iguana...
http://rebertia.com

Oct 8 '08 #2
beginner:
I can of course use an old-fashioned loop. This is more readable, but
also more verbose.
What is the best way, I wonder?
In this situation the plain old loop seems the best solution. Short code is
good only when it doesn't make the code too slow or too difficult to
understand. Keeping the code readable is very important. So I
think a simple solution is the best in this situation. The following
code can be understood quickly:

records = [{"F1": 1, "F2": 2}, {"F1": 3, "F2": 4}]

f1sum, f2sum = 0, 0
for rec in records:
    f1sum += rec["F1"]
    f2sum += rec["F2"]
ratio = f1sum / float(f2sum)
print ratio

Output:
0.666666666667

Note that I allowed myself to use this line of code:
f1sum, f2sum = 0, 0
because the two values on the right are equal, so you don't need one
bit of brain to understand where each value goes :-)

You can of course generalize the code in various ways, for example:

keys = ["F1", "F2"]
totals = [0] * len(keys)

for rec in records:
    for i, key in enumerate(keys):
        totals[i] += rec[key]

ratio = totals[0] / float(totals[1])
print ratio

But that already smells of over-engineering. Generally it's better to
use the simpler solution that works in all your cases at a speed that
is acceptable for you (my variant of the KISS principle).

Bye,
bearophile
Oct 8 '08 #3
Chris Rebert wrote:
I personally would probably do:

from collections import defaultdict

label2sum = defaultdict(lambda: 0)
FWIW, you can just use:

label2sum = defaultdict(int)

You don't need a lambda.
Oct 9 '08 #4
The loop way is probably the right choice.
OTOH, you could try to make the 'reduce' approach more readable by
writing it like this:

def add_r( sums, r ): return sums[0]+r['F1'], sums[1]+r['F2']
sum_f1, sum_f2 = reduce( add_r, rec, (0,0) )
result = sum_f1/sum_f2

Less verbose than the for loop, but IMO almost as understandable: one
only needs to know the semantics of 'reduce' (which for a Python
programmer is no big thing) and, most importantly, the code does only
one thing per line.
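On Python 3 the same approach needs one extra import, since reduce moved into functools. A minimal sketch, assuming the sample `rec` from the question and passing it as an iterator to show that a single pass is enough:

```python
from functools import reduce  # reduce is not a builtin in Python 3

# Sample data from the question; assumes Python 3.
rec = [{"F1": 1, "F2": 2}, {"F1": 3, "F2": 4}]


def add_r(sums, r):
    """Fold one record into the running (F1 total, F2 total) pair."""
    return sums[0] + r["F1"], sums[1] + r["F2"]


# iter(rec) is consumed exactly once; (0, 0) is the starting pair of totals.
sum_f1, sum_f2 = reduce(add_r, iter(rec), (0, 0))
result = sum_f1 / sum_f2
print(result)
```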
Ciao
-----
FB
Oct 9 '08 #5
FB:
def add_r( sums, r ): return sums[0]+r['F1'], sums[1]+r['F2']
sum_f1, sum_f2 = reduce( add_r, rec, (0,0) )
result = sum_f1/sum_f2
Until this feature vanishes I think it's better to use it (untested):

add_r = lambda (a, b), r: (a + r['F1'], b + r['F2'])
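The feature did vanish: tuple parameters in function and lambda signatures were removed in Python 3 (PEP 3113), so the lambda above is a syntax error there. A sketch of one possible Python 3 spelling, which unpacks the pair inside the function body instead (sample data from the question):

```python
from functools import reduce

# Sample data from the question; assumes Python 3.
rec = [{"F1": 1, "F2": 2}, {"F1": 3, "F2": 4}]


def add_r(sums, r):
    # Python 3 has no tuple parameters, so unpack inside the body.
    a, b = sums
    return a + r["F1"], b + r["F2"]


sum_f1, sum_f2 = reduce(add_r, rec, (0, 0))
print(sum_f1, sum_f2)
```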

Bye,
bearophile
Oct 9 '08 #6
Matt Nordhoff wrote:
Chris Rebert wrote:
>I personally would probably do:

from collections import defaultdict

label2sum = defaultdict(lambda: 0)

FWIW, you can just use:

label2sum = defaultdict(int)

You don't need a lambda.
Indeed, in this case, with two known keys, the defaultdict is not needed
either, since a plain dict with both totals initialized to zero works just as well:

label2sum = {'F1':0,'F2':0}
for r in rec:
    for key, value in r.iteritems():
        label2sum[key] += value

ratio = label2sum["F1"] / label2sum["F2"]
Oct 9 '08 #7
beginner <zy*******@gmail.com> writes:
Hi All,

I have a list of records like below:

rec=[{"F1":1, "F2":2}, {"F1":3, "F2":4} ]

Now I want to write code to find out the ratio of the sums of the two
fields.

One thing I can do is:

sum(r["F1"] for r in rec)/sum(r["F2"] for r in rec)

But this is slow because I have to iterate through the list twice.
Also, in the case where rec is an iterator, it does not work.
how about:
ratio = (lambda c: c.real/c.imag)(sum(complex(r["F1"], r["F2"]) for r in rec))

?

:)

Oct 9 '08 #8
On Oct 9, 3:53 pm, Alexander Schmolck <a.schmo...@gmail.com> wrote:
beginner <zyzhu2...@gmail.com> writes:
Hi All,
I have a list of records like below:
rec=[{"F1":1, "F2":2}, {"F1":3, "F2":4} ]
Now I want to write code to find out the ratio of the sums of the two
fields.
One thing I can do is:
sum(r["F1"] for r in rec)/sum(r["F2"] for r in rec)
But this is slow because I have to iterate through the list twice.
Also, in the case where rec is an iterator, it does not work.

how about:

ratio = (lambda c: c.real/c.imag)(sum(complex(r["F1"], r["F2"]) for r in rec))

?

:)
Neat, but I will have a problem if I am dealing with three fields,
right?
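For what it's worth, a single pass over any number of fields can be sketched along the lines of the generalized loop posted earlier in the thread. The helper name `field_sums` and the third field "F3" are made up for illustration, and the code assumes Python 3:

```python
def field_sums(records, keys):
    """Return one total per key, reading `records` exactly once."""
    totals = [0] * len(keys)
    for r in records:
        for i, key in enumerate(keys):
            totals[i] += r[key]
    return totals


# Hypothetical three-field data, passed as a one-shot iterator.
rec = iter([{"F1": 1, "F2": 2, "F3": 5}, {"F1": 3, "F2": 4, "F3": 6}])
s1, s2, s3 = field_sums(rec, ["F1", "F2", "F3"])
print(s1, s2, s3)
```

Unlike the complex-number trick, this does not care how many fields there are, and it works when the records come from an iterator.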
Oct 9 '08 #9
beginner <zy*******@gmail.com> writes:
On Oct 9, 3:53 pm, Alexander Schmolck <a.schmo...@gmail.com> wrote:
>beginner <zyzhu2...@gmail.com> writes:
how about:

ratio = (lambda c: c.real/c.imag)(sum(complex(r["F1"], r["F2"]) for r in rec))
Neat, but I will have a problem if I am dealing with three fields,
right?
Sure, but then how often do you want to take the ratio of 3 numbers? :)

More seriously, if you often find yourself doing similar operations and are
(legitimately) worried about performance, numpy and pytables might be worth a
look. By "legitimately" I mean that I wouldn't be bothered by iterating twice
over rec; it doesn't affect the algorithmic complexity at all, and I wouldn't be
surprised if sum(imap(itemgetter("F1"),rec))/sum(imap(itemgetter("F2"),rec))
were faster than the explicit loop version for the cases you care about
(timeit will tell you). You're right that you lose some generality in not
being able to deal with arbitrary iterables in that case, though.
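A sketch of how the timeit comparison might be set up, assuming Python 3, where the builtin map() is lazy and takes the place of itertools.imap. The data and the function names `two_pass` and `one_pass` are made up for illustration, and no particular winner is claimed; the timings depend on the interpreter and the data:

```python
import timeit
from operator import itemgetter

# Made-up test data: 1000 records with two integer fields.
rec = [{"F1": i, "F2": i + 1} for i in range(1000)]


def two_pass(records):
    # map() is lazy in Python 3, like itertools.imap in Python 2.
    return sum(map(itemgetter("F1"), records)) / sum(map(itemgetter("F2"), records))


def one_pass(records):
    f1sum = f2sum = 0
    for r in records:
        f1sum += r["F1"]
        f2sum += r["F2"]
    return f1sum / f2sum


if __name__ == "__main__":
    for fn in (two_pass, one_pass):
        seconds = timeit.timeit(lambda: fn(rec), number=200)
        print(f"{fn.__name__}: {seconds:.4f}s for 200 runs")
```

Note that two_pass needs a list (or another re-iterable container), while one_pass also accepts a one-shot iterator.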

'as
Oct 10 '08 #10