472,374 Members | 1,597 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 472,374 software developers and data experts.

Performance Issues please help

I was testing this piece of code and it takes about 24-30 seconds to do
a look up on a list(m) of size 1000x1000
m -> list of size 1000x1000
import time
print time.ctime()
box = {}
for r,row in enumerate(m):
for c,e in enumerate(row):
if box.has_key(e):
params = box[e]
box[e] = ( min(c, params[0]), min(r, params[1]),
max(c, params[2]), max(r, params[3] ) )
else:
box[e] = [c, r, c, r]
print time.ctime()

Can some one tell me what exactly is taking more time here. Is it
because I am using dictionaries or what is the problem. Can some one
help me improve this .Is there a better way to write this.

Jul 19 '05 #1
9 1425
PyPK wrote:
I was testing this piece of code and it takes about 24-30 seconds to do
a look up on a list(m) of size 1000x1000
m -> list of size 1000x1000
import time
print time.ctime()
box = {}
for r,row in enumerate(m):
for c,e in enumerate(row):
if box.has_key(e):
params = box[e]
box[e] = ( min(c, params[0]), min(r, params[1]),
max(c, params[2]), max(r, params[3] ) )
else:
box[e] = [c, r, c, r]
print time.ctime()

Can some one tell me what exactly is taking more time here. Is it
because I am using dictionaries or what is the problem. Can some one
help me improve this .Is there a better way to write this.


Without gross changes to the algorithm, and in no particular order:
0. Stop worrying. Find something else to do during the 30 seconds.
1. Use psyco, if your [unspecified] platform supports it.
2. Why has_key()? Try "if e in box:" instead, if your [unspecified]
version of Python supports it. If it doesn't, (a) consider upgrading (b)
try doing box_has_key = box.has_key outside the loops and using the
result inside the loops.
3. Ensure this code is inside a function, not at global level in a
script [not apparent from your post].
4. Outside the loops, put "local_min = min", ditto for max. Then use
box[e] = (local_min(c, etc etc
5. Upgrade your [unspecified] hardware.
6. Use try/except, if indicated by the [unspecified] percentage of times
"e in box" is true. Hint: printing len(box) at the end might be useful.
7. Consider using pyrex.
8. Consider using numarray/mumeric.

Info req'd for discussing algorithm changes:
1. How much free memory do you have?
2. What can you tell us about the distribution of "e"?
3. Is "m" a rectangle, or are the rows of variable length?
4. Do you have control over "m" i.e. are you stuck with that data
structure, or can we design a different one?
Jul 19 '05 #2
>Info req'd for discussing algorithm changes:
1. How much free memory do you have?
- Memory is not a problem but I am working on some timing
constraints.Thats is the reason I am a bit concerned abt these 30
seconds
2. What can you tell us about the distribution of "e"?
- the distribution of e is not fixed.It changes based on what comes
first in the scan of the list.
3. Is "m" a rectangle, or are the rows of variable length?
- m is not a restnagle .its of varied dimensions.
4. Do you have control over "m" i.e. are you stuck with that data

structure, or can we design a different one?

-I can use 'm' as an array or a list. Other than that I am stuck.

I am suspecious abput dictionary approach here. Is this slower than if
i do it with arrays or lists with index of array as my key ?

Jul 19 '05 #3
PyPK wrote:

I am suspecious abput dictionary approach here. Is this slower than if
i do it with arrays or lists with index of array as my key ?


"with index of array as my key" is either redundant -- indicating that
your answer to my question about "e" should have included something like
"e is an integer that can range from blah0 to blah_max" -- or not
understandable.

In any case, seeing that the code in question is only a few lines, why
don't *you* write an alternative version and see how fast it runs? And
let us know?

Jul 19 '05 #4
PyPK wrote:
I was testing this piece of code and it takes about 24-30 seconds to do
a look up on a list(m) of size 1000x1000
m -> list of size 1000x1000
import time
print time.ctime()
box = {}
for r,row in enumerate(m):
for c,e in enumerate(row):
if box.has_key(e):
params = box[e]
box[e] = ( min(c, params[0]), min(r, params[1]),
max(c, params[2]), max(r, params[3] ) )
else:
box[e] = [c, r, c, r]
print time.ctime()

Can some one tell me what exactly is taking more time here. Is it
because I am using dictionaries or what is the problem. Can some one
help me improve this .Is there a better way to write this.


AFAICT the row index 'r' can only grow. Therefore one min() and one max()
should be superfluous (untested):

def make_box(m):
box = {}
for r, row in enumerate(m):
for c, e in enumerate(row):
if e in box:
left, top, right, bottom = box[e]
box[e] = (min(c, left), top, max(c, right), r)
else:
box[e] = (c, r, c, r)
return box

Peter
Jul 19 '05 #5
Yep that improved the speed by about 50% now it takes about 10 secs
instead of 24 seconds..Thanks much. I guess that is the best we could
do right.It would be really helpful if I could get it less than 5
seconds. Any suggestions on that??

Jul 19 '05 #6
PyPK wrote:
Yep that improved the speed by about 50% now it takes about 10 secs
instead of 24 seconds..Thanks much. I guess that is the best we could
do right.It would be really helpful if I could get it less than 5
seconds. Any suggestions on that??

Things to try:
* in-lining the min and max expressions
* depending on the distribution of e, it may be faster to catch KeyErrors

def search1(m):
box = {}
for r,row in enumerate(m):
for c,e in enumerate(row):
try:
minc, minr, maxc, maxr = box[e]
box[e] = ( c < minc and c or minc,
r < minr and r or minr,
c > maxc and c or maxc,
r > maxr and r or maxr)
except KeyError:
box[e] = (c, r, c, r)
return box

Michael

Jul 19 '05 #7
Michael Spencer wrote:
def search1(m):
box = {}
for r,row in enumerate(m):
for c,e in enumerate(row):
try:
minc, minr, maxc, maxr = box[e]
box[e] = ( c < minc and c or minc,
r < minr and r or minr,
c > maxc and c or maxc,
r > maxr and r or maxr)
except KeyError:
box[e] = (c, r, c, r)
return box

Michael


Sorry, my earlier solution was incorrect since:
c < minc and c or minc # evaluates to minc if c == 0
Two of the tests are unnecessary, as pointed out by Peter
The remaining test:
c > maxc and c or maxc
does not suffer from the same problem, since c cannot both be 0 and greater than
maxc

The following version, still untested ;-) has the correction:

def search2(m):
box = {}
for r,row in enumerate(m):
for c,e in enumerate(row):
try: # use this form if e will be found 90%+ of the time
# otherwise, use: if e in box:
minc, minr, maxc, maxr = box[e]
# note correction for c == 0
# also Peter's simplification
box[e] = ( c and (c < minc and c or minc),
minr,
c > maxc and c or maxc,
r)

except KeyError:
box[e] = (c, r, c, r)
return box

Michael

Jul 19 '05 #8
Michael Spencer wrote:
minc, minr, maxc, maxr = box[e]
# note correction for c == 0
# also Peter's simplification
box[e] = ( c and (c < minc and c or minc),
minr,
c > maxc and c or maxc,
r)


This may be slightly faster (when c > 0), and slightly less opaque:

minc, minr, maxc, maxr = box[e]
# note correction for c == 0
# also Peter's simplification
if c < minc:
minc = c
box[e] = ( minc,
minr,
c > maxc and c or maxc,
r)

Jul 19 '05 #9
Michael Spencer wrote:
minc, minr, maxc, maxr = box[e]
# note correction for c == 0
# also Peter's simplification
box[e] = ( c and (c < minc and c or minc),
minr,
c > maxc and c or maxc,
r)


This may be slightly faster (when c > 0), and slightly less opaque:

minc, minr, maxc, maxr = box[e]
# note correction for c == 0
# also Peter's simplification
if c < minc:
minc = c
box[e] = ( minc,
minr,
c > maxc and c or maxc,
r)
Jul 19 '05 #10

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

1
by: Bob | last post by:
Are there any known applications out there used to test the performance of the .NET garbage collector over a long period of time? Basically I need an application that creates objects, uses them, and...
9
by: bluedolphin | last post by:
Hello All: I have been brought onboard to help on a project that had some performance problems last year. I have taken some steps to address the issues in question, but a huge question mark...
115
by: Mark Shelor | last post by:
I've encountered a troublesome inconsistency in the C-language Perl extension I've written for CPAN (Digest::SHA). The problem involves the use of a static array within a performance-critical...
4
by: Martin | last post by:
I am using graphics as backgrounds for forms,buttons,labels etc. The question is: is it faster to load all graphics from files on app start or to use it embeded (places in editor during design)....
2
by: Curtis Justus | last post by:
Hi, I've been searching for solutions to two issues that are undoubtedly common to everybody. The first is how do my team and I adequately perform unit testing. The second is how can I measure...
0
by: Lakshmi | last post by:
Hi All, I am having performance issues with the .NET client calling the Java Webservice running on axis. Have detailed the problem below. Please help. I wrote a webservice in Java. Lets name...
2
by: Brian Tabios | last post by:
Hello Everyone, I have a very complex performance issue with our production database. Here's the scenario. We have a production webserver server and a development web server. Both are running...
1
by: marcfischman | last post by:
Please help. I have a website running on a linux/apache/mysql/php server. I receive about 8,000-10,000 visitors a day with about 200,000 to 300,000 page views. The server is a RedHat Linux...
2
by: Ryan | last post by:
My apologies if this is not the forum to post questions regarding .NET Remoting, but I figured WebServices would be the most appropriate forum of the bunch. We're currently completely re-arching...
3
by: =?Utf-8?B?bXVkR2ls?= | last post by:
Hi, I have the Enterprise Architect version installed on one machine, and the developper edition on another (VS.Net 2005). To my surprise, the performance tools are only available (in the menu...
2
by: Kemmylinns12 | last post by:
Blockchain technology has emerged as a transformative force in the business world, offering unprecedented opportunities for innovation and efficiency. While initially associated with cryptocurrencies...
0
by: Naresh1 | last post by:
What is WebLogic Admin Training? WebLogic Admin Training is a specialized program designed to equip individuals with the skills and knowledge required to effectively administer and manage Oracle...
0
hi
by: WisdomUfot | last post by:
It's an interesting question you've got about how Gmail hides the HTTP referrer when a link in an email is clicked. While I don't have the specific technical details, Gmail likely implements measures...
0
Oralloy
by: Oralloy | last post by:
Hello Folks, I am trying to hook up a CPU which I designed using SystemC to I/O pins on an FPGA. My problem (spelled failure) is with the synthesis of my design into a bitstream, not the C++...
0
by: Carina712 | last post by:
Setting background colors for Excel documents can help to improve the visual appeal of the document and make it easier to read and understand. Background colors can be used to highlight important...
0
BLUEPANDA
by: BLUEPANDA | last post by:
At BluePanda Dev, we're passionate about building high-quality software and sharing our knowledge with the community. That's why we've created a SaaS starter kit that's not only easy to use but also...
1
by: Johno34 | last post by:
I have this click event on my form. It speaks to a Datasheet Subform Private Sub Command260_Click() Dim r As DAO.Recordset Set r = Form_frmABCD.Form.RecordsetClone r.MoveFirst Do If...
0
by: jack2019x | last post by:
hello, Is there code or static lib for hook swapchain present? I wanna hook dxgi swapchain present for dx11 and dx9.
0
DizelArs
by: DizelArs | last post by:
Hi all) Faced with a problem, element.click() event doesn't work in Safari browser. Tried various tricks like emulating touch event through a function: let clickEvent = new Event('click', {...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.