Strange Execution Times

curi42

I am running two functions in a row that do the same thing. One runs
in .14 seconds, the other 56. I'm confused. I wrote another version
of the program and couldn't get the slow behavior again, only the fast.
I'm not sure what is causing it. Can anyone figure it out?

Here is my code (sorry it's a bit of a mess, but my cleaned up version
isn't slow!). Just skim to the bottom where the timing is. The first
time printed out is .14, the second is 56.56.

f = open("/Users/curi/data.xml")
o = open("/Users/curi/out2.xml", "w")
import md5
import array
p1 = "<Password>"
p2 = "</Password>"

cnt = 0

m = md5.new
jo = "".join
adjust = len(p1) - 1

i = 1
s = f.read()
a = array.array('c', s).tolist()
spot = 0
k = 0
find = s.find

starts = []
ends = []

while k != -1:

    #print len(s)
    k = find(p2, spot)
    if k != -1:
        starts.append(find(p1, spot) + adjust)
        ends.append(k)
    spot = k + 1

    #s = "".join([s[:j+1], md5.new(s[j+1:k-1]).hexdigest(), s[k:]])

    #if k != -1: a[j+1:k-1] = m(jo(a[j+1:k-1])).hexdigest()

r = range(len(starts))
#r = range(20)
r.reverse()
import time
data = a[:]

md5 = m
join = jo

t1 = time.clock()
for j in r:
    #print jo(s[starts[j]+1:ends[j]])
    digest = m(jo(s[starts[j]+1:ends[j]])).hexdigest()

    a[starts[j]+1:ends[j]] = digest
    #cnt += 1
    #if cnt % 100 == 0: print cnt
t2 = time.clock()
print "time is", round(t2-t1, 5)

t1 = time.clock()
for i in r:
    data[starts[i]:ends[i]] = \
        md5(join(s[starts[i]:ends[i]])).hexdigest()
t2 = time.clock()
print "second time is", round(t2-t1, 5)
o.write(jo(a))

Jul 19 '05 #1

Fredrik Lundh

<cu****@gmail.com> wrote:

> I am running two functions in a row that do the same thing. One runs
> in .14 seconds, the other 56. I'm confused. I wrote another version
> of the program and couldn't get the slow behavior again, only the fast.
> I'm not sure what is causing it. Can anyone figure it out?

it would be a lot easier to help if you posted a self-contained example.

</F>

Jul 19 '05 #2

John Machin

cu****@gmail.com wrote:

> I am running two functions in a row that do the same thing.

1. I see no functions here.

You should set out a script like this:

def main():
    your_code_goes_here()

if __name__ == '__main__':
    main()

for two reasons: (a) your code will be referring to locals instead of
globals, which is faster and might appeal to you; (b) if somebody
accidentally imports the script, nothing happens.
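
A rough way to see reason (a) for yourself is sketched below. It assumes Python 3 (the script in this thread is Python 2), and the names loop_with_globals, loop_with_locals and compare are made up for the illustration. The actual timings depend on your machine and interpreter, so treat the numbers as indicative only.

```python
import timeit

counter_global = 0  # module-level name, looked up as a global on every access


def loop_with_globals(n):
    """Increment a module-level counter n times."""
    global counter_global
    counter_global = 0
    for _ in range(n):
        counter_global += 1
    return counter_global


def loop_with_locals(n):
    """Increment a function-local counter n times."""
    counter = 0
    for _ in range(n):
        counter += 1
    return counter


def compare(n=100000, repeat=3):
    """Return (best global-loop time, best local-loop time) in seconds."""
    t_glob = min(timeit.repeat(lambda: loop_with_globals(n), number=1, repeat=repeat))
    t_loc = min(timeit.repeat(lambda: loop_with_locals(n), number=1, repeat=repeat))
    return t_glob, t_loc


if __name__ == "__main__":
    t_glob, t_loc = compare()
    print("globals: %.5f s, locals: %.5f s" % (t_glob, t_loc))
```

Both functions do the same work; only where the counter lives differs.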

2. The two loops to which you refer do *not* do the same thing; see later.

> One runs
> in .14 seconds, the other 56. I'm confused. I wrote another version
> of the program and couldn't get the slow behavior again, only the fast.
> I'm not sure what is causing it. Can anyone figure it out?
>
> Here is my code (sorry it's a bit of a mess, but my cleaned up version
> isn't slow!). Just skim to the bottom where the timing is. The first
> time printed out is .14, the second is 56.56.

[snip]
[following has extraneous blank lines and comments removed]

t1 = time.clock()
for j in r:
    digest = m(jo(s[starts[j]+1:ends[j]])).hexdigest()
    a[starts[j]+1:ends[j]] = digest
t2 = time.clock()
print "time is", round(t2-t1, 5)

t1 = time.clock()
for i in r:
    data[starts[i]:ends[i]] = \
        md5(join(s[starts[i]:ends[i]])).hexdigest()
t2 = time.clock()
print "second time is", round(t2-t1, 5)

General questions: what platform? what version of Python? how large is
the file? how much free memory do you have? how many passwords are
there? what is the average length of a password?

Ignoring the superficial-but-meaningless differences (i vs j, md5
[aarrgghh!!] vs m, jo vs join), these two loops differ in the following
respects:

(1) 'data' is a copy of 'a'
(2) the first loop's body is effectively: digest = RHS; LHS = digest
whereas the 2nd loop's body is: LHS = RHS
(3) the first loop uses starts[j]+1 whereas the second loop uses starts[i]

Item (1) may affect the timing if the file is large compared with available
memory -- it could be that 'a' has to be swapped out and 'data' swapped in.

Item (2) should make the 2nd loop very slightly faster, so we'll ignore
that :-)

Item (3) means you are not comparing like with like. It means that the
1st loop has less work to do. So this could make an observable
difference for very short passwords -- but still nothing like 0.14
compared with 56.
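
To make Item (3) concrete, here is a small sketch of what each loop's slice selects. It assumes Python 3; the function name slices_for and the sample record are made up, while P1, P2 and the adjust offset mirror p1, p2 and adjust in the posted script.

```python
P1 = "<Password>"
P2 = "</Password>"


def slices_for(s):
    """Return the text each loop's slice selects for the first password in s."""
    adjust = len(P1) - 1            # same offset the posted script computes
    start = s.find(P1) + adjust     # index of the '>' closing the opening tag
    end = s.find(P2)                # index of the '<' opening the closing tag
    first_loop = s[start + 1:end]   # the password only
    second_loop = s[start:end]      # the '>' plus the password, one char longer
    return first_loop, second_loop


sample = "<Password>hunter2</Password>"  # made-up record for illustration
print(slices_for(sample))  # ('hunter2', '>hunter2')
```

So the second loop's slice is always one character longer than the first loop's, and it includes the '>' of the opening tag.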

So, some more questions:

The 56.56 is suspiciously precise -- you ran it a few times and it
printed exactly 56.56 each time?

Did you try putting the 2nd loop first [refer to Item (1) above]?
Did you try putting in a switch so that your script runs either 1st loop
or 2nd loop but not both? Note that each loop is making its target list
expand in situ; this may after a while (like inside loop 2) cause the
memory arena to become so fragmented that swapping will occur. This of
course can vary wildly depending on the platform; Win95 used to be the
most usual suspect but you're obviously not running on that.

Some observations:

(1) 's' is already a string, so ''.join(s[x:y]) is a slow way of doing
s[x:y]

(2) 'a' ends up as a list of one-byte strings, via a very circuitous
process: a = array.array('c', s).tolist()

A shorter route would be: a = list(s)

However what's wrong with what you presumably tried out first i.e. a =
array.array('c', s) ?? It doesn't need the final ''.join() before
writing to disk, and it takes up less memory. NOTE: the array variety
takes up 1 byte per character. The list variety takes up at least 4
bytes per character (on a machine where sizeof(PyObject *) == 4); to the
extent that the file contains characters that are not interned (i.e. not
[A-Za-z_] AFAIK), much more memory is required as a separate object
will be created for each such character. Was it consistently slower?

(3) If memory is your problem, you could rewrite the whole thing to
simply do one write per password; that way you only need 1.x copy of the
file contents in memory, not 2.x.
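
Observation (3) could look something like the sketch below. It assumes Python 3, so hashlib.md5 stands in for the old md5.new, and the text is encoded as UTF-8 before hashing (also an assumption). The names hash_passwords, OPEN_TAG and CLOSE_TAG are invented for the example; this is one possible shape, not a drop-in replacement for the posted script.

```python
import hashlib
import io

OPEN_TAG = "<Password>"
CLOSE_TAG = "</Password>"


def hash_passwords(text, out):
    """Copy text to out, replacing each password body with its MD5 hex digest.

    Nothing is modified in place: each untouched stretch and each digest
    is written straight to out, so no list of characters is ever built.
    """
    pos = 0
    while True:
        start = text.find(OPEN_TAG, pos)
        if start == -1:
            break
        body_start = start + len(OPEN_TAG)
        end = text.find(CLOSE_TAG, body_start)
        if end == -1:
            break  # unterminated tag: leave the remainder untouched
        out.write(text[pos:body_start])  # everything up to and including the tag
        digest = hashlib.md5(text[body_start:end].encode("utf-8")).hexdigest()
        out.write(digest)                # one write per password
        pos = end
    out.write(text[pos:])                # trailing text after the last password


if __name__ == "__main__":
    buf = io.StringIO()
    hash_passwords("<a><Password>abc</Password></a>", buf)
    print(buf.getvalue())
```

In a real run, out would be the output file opened for writing instead of a StringIO buffer.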

Hoping some of this helps,
John

Jul 19 '05 #3

Elliot Temple

hey FYI i found the problem: i accidentally copied an output file for
my test data. so all the passwords were exactly 32 chars long. so
when replacing them with new 32 char passwords, it went much much
faster, I guess because the list kept the same number of chars in it
and didn't have to copy lots of data around.
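
For anyone who wants to reproduce the effect, here is a rough sketch, assuming Python 3 and CPython's list implementation (where, as far as I know, a slice assignment that changes the list's length has to shift everything after the slice). The helpers build, replace_slices and timed and the "<P>" records are made up for the test; the timings will vary by machine.

```python
import time


def build(n_records, body_len):
    """Build a list of characters plus the (start, end) span of each body."""
    record = "<P>" + "x" * body_len + "</P>"
    spans = []
    for k in range(n_records):
        start = k * len(record) + len("<P>")
        spans.append((start, start + body_len))
    return list(record * n_records), spans


def replace_slices(chars, spans, replacement):
    """Assign replacement over each span, right to left so offsets stay valid."""
    for start, end in reversed(spans):
        chars[start:end] = replacement
    return chars


def timed(n_records, body_len, new_len):
    """Return (seconds taken, final list length) for one replacement run."""
    chars, spans = build(n_records, body_len)
    replacement = list("y" * new_len)
    t0 = time.perf_counter()
    replace_slices(chars, spans, replacement)
    return time.perf_counter() - t0, len(chars)


if __name__ == "__main__":
    same, _ = timed(2000, 32, 32)     # 32 chars replaced by 32 chars
    shrink, _ = timed(2000, 33, 32)   # 33 chars replaced by 32 chars
    print("same length: %.5f s, length changes: %.5f s" % (same, shrink))
```

The first call matches the case where every password was already 32 chars; the second matches a slice that is one char longer than the digest.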

Jul 19 '05 #4
