Strange Execution Times

I am running two functions in a row that do the same thing. One runs
in .14 seconds, the other 56. I'm confused. I wrote another version
of the program and couldn't get the slow behavior again, only the fast.
I'm not sure what is causing it. Can anyone figure it out?

Here is my code (sorry it's a bit of a mess, but my cleaned up version
isn't slow!). Just skim to the bottom where the timing is. The first
time printed out is .14, the second is 56.56.
f = open("/Users/curi/data.xml")
o = open("/Users/curi/out2.xml", "w")
import md5
import array
p1 = "<Password> "
p2 = "</Password>"

cnt = 0

m = md5.new
jo = "".join
adjust = len(p1) - 1

i = 1
s = f.read()
a = array.array('c', s).tolist()
spot = 0
k = 0
find = s.find

starts = []
ends = []

while k != -1:

    #print len(s)
    k = find(p2, spot)
    if k != -1:
        starts.append(find(p1, spot) + adjust)
        ends.append(k)
        spot = k + 1

#s = "".join([s[:j+1], md5.new(s[j+1:k-1]).hexdigest(), s[k:]])

#if k != -1: a[j+1:k-1] = m(jo(a[j+1:k-1])).hexdigest()

r = range(len(starts))
#r = range(20)
r.reverse()
import time
data = a[:]

md5 = m
join = jo

t1 = time.clock()
for j in r:
    #print jo(s[starts[j]+1:ends[j]])
    digest = m(jo(s[starts[j]+1:ends[j]])).hexdigest()

    a[starts[j]+1:ends[j]] = digest
    #cnt += 1
    #if cnt % 100 == 0: print cnt
t2 = time.clock()
print "time is", round(t2-t1, 5)

t1 = time.clock()
for i in r:
    data[starts[i]:ends[i]] = \
        md5(join(s[starts[i]:ends[i]])).hexdigest()
t2 = time.clock()
print "second time is", round(t2-t1, 5)
o.write(jo(a))

Jul 19 '05 #1
<cu****@gmail.com> wrote:
> I am running two functions in a row that do the same thing. One runs
> in .14 seconds, the other 56. I'm confused. I wrote another version
> of the program and couldn't get the slow behavior again, only the fast.
> I'm not sure what is causing it. Can anyone figure it out?


it would be a lot easier to help if you posted a self-contained example.

</F>

Jul 19 '05 #2
cu****@gmail.com wrote:
> I am running two functions in a row that do the same thing.

1. I see no functions here.

You should set out a script like this:

def main():
    your_code_goes_here()

if __name__ == '__main__':
    main()

for two reasons: (a) your code will be referring to locals instead of
globals; this is faster, which might appeal to you; (b) if somebody
accidentally imports the script, nothing happens.

2. The two loops to which you refer do *not* do the same thing; see later.

> One runs
> in .14 seconds, the other 56. I'm confused. I wrote another version
> of the program and couldn't get the slow behavior again, only the fast.
> I'm not sure what is causing it. Can anyone figure it out?
>
> Here is my code (sorry it's a bit of a mess, but my cleaned up version
> isn't slow!). Just skim to the bottom where the timing is. The first
> time printed out is .14, the second is 56.56.


[snip]
[following has extraneous blank lines and comments removed]

> t1 = time.clock()
> for j in r:
>     digest = m(jo(s[starts[j]+1:ends[j]])).hexdigest()
>     a[starts[j]+1:ends[j]] = digest
> t2 = time.clock()
> print "time is", round(t2-t1, 5)
>
> t1 = time.clock()
> for i in r:
>     data[starts[i]:ends[i]] = \
>         md5(join(s[starts[i]:ends[i]])).hexdigest()
> t2 = time.clock()
> print "second time is", round(t2-t1, 5)


General questions: what platform? what version of Python? how large is
the file? how much free memory do you have? how many passwords are
there? what is the average length of a password?

Ignoring the superficial-but-meaningless differences (i vs j, md5
[aarrgghh!!] vs m, jo vs join), these two loops differ in the following
respects:

(1) 'data' is a copy of 'a'
(2) the first loop's body is effectively: digest = RHS; LHS = digest
whereas the 2nd loop's body is: LHS = RHS
(3) the first loop uses starts[j]+1 whereas the second loop uses starts[j]

Item (1) may affect the timing if the file is large compared with available
memory -- could be 'a' has to be swapped out, and 'data' swapped in.

Item (2) should make the 2nd loop very slightly faster, so we'll ignore
that :-)

Item (3) means you are not comparing like with like. It means that the
1st loop has less work to do. So this could make an observable
difference for very short passwords -- but still nothing like 0.14
compared with 56.
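
To make Item (3) concrete, here is a small sketch on a made-up one-line document. The sample string is illustrative, and p1 is written without the trailing space shown in the original post (an assumption). It suggests that the second loop's slice starts one character earlier, so it takes in the closing '>' of the opening tag as well as the password.

```python
p1 = "<Password>"   # assumption: no trailing space
p2 = "</Password>"
s = "<User><Password>secret</Password></User>"  # made-up sample data

adjust = len(p1) - 1
start = s.find(p1) + adjust   # index of the last character of the opening tag
end = s.find(p2)              # index where the closing tag begins

first_loop_slice = s[start + 1:end]   # what the first loop hashes and replaces
second_loop_slice = s[start:end]      # what the second loop hashes and replaces

print(repr(first_loop_slice))    # 'secret'
print(repr(second_loop_slice))   # '>secret'
```

The second slice is always one character longer than the first, so the two loops hash different strings and overwrite spans of different lengths.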

So, some more questions:

The 56.56 is suspiciously precise -- you ran it a few times and it
printed exactly 56.56 each time?

Did you try putting the 2nd loop first [refer to Item (1) above]?
Did you try putting in a switch so that your script runs either 1st loop
or 2nd loop but not both? Note that each loop is making its target list
expand in situ; this may after a while (like inside loop 2) cause the
memory arena to become so fragmented that swapping will occur. This of
course can vary wildly depending on the platform; Win95 used to be the
most usual suspect but you're obviously not running on that.

Some observations:

(1) 's' is already a string, so ''.join(s[x:y]) is a slow way of doing
s[x:y]

(2) 'a' ends up as a list of one-byte strings, via a very circuitous
process: a = array.array('c', s).tolist()

A shorter route would be: a = list(s)

However what's wrong with what you presumably tried out first i.e.
a = array.array('c', s) ?? It doesn't need the final ''.join() before
writing to disk, and it takes up less memory. NOTE: the array variety
takes up 1 byte per character. The list variety takes up at least 4
bytes per character (on a machine where sizeof(PyObject *) == 4); to the
extent that the file contains characters that are not interned (i.e. not
[A-Za-z_] AFAIK), much more memory is required as a separate object
will be created for each such character. Was it consistently slower?

(3) If memory is your problem, you could rewrite the whole thing to
simply do one write per password; that way you only need 1.x copy of the
file contents in memory, not 2.x.
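
Observations (1) and (3) can be pulled together in a short sketch. It is written for Python 3, where hashlib.md5 stands in for the md5 module used in this thread. The function name hash_passwords, the sample document, and the tag strings without a trailing space are all assumptions for illustration. It slices the string directly and does one write per password instead of editing a list in place.

```python
import hashlib
import io


def hash_passwords(text, out, open_tag="<Password>", close_tag="</Password>"):
    """Copy text to out, replacing each password with its MD5 hex digest.

    Returns the number of passwords replaced.
    """
    pos = 0
    count = 0
    while True:
        start = text.find(open_tag, pos)
        if start == -1:
            break
        start += len(open_tag)               # first character of the password
        end = text.find(close_tag, start)
        if end == -1:
            break
        out.write(text[pos:start])           # everything up to the password
        password = text[start:end]           # plain slice, no ''.join needed
        out.write(hashlib.md5(password.encode("utf-8")).hexdigest())
        pos = end                            # resume at the closing tag
        count += 1
    out.write(text[pos:])                    # remainder after the last password
    return count


# Made-up sample data; a real run would pass the file contents instead.
sample = "<U><Password>abc</Password></U><U><Password>xyz</Password></U>"
buf = io.StringIO()
n = hash_passwords(sample, buf)
print(n)
print(buf.getvalue())
```

With a real output file in place of the StringIO buffer, only the input string needs to stay in memory, and no list ever has to grow or shrink.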

Hoping some of this helps,
John
Jul 19 '05 #3
hey FYI i found the problem: i accidentally copied an output file for
my test data. so all the passwords were exactly 32 chars long. so
when replacing them with new 32 char passwords, it went much much
faster, I guess because the list kept the same number of chars in it
and didn't have to copy lots of data around.
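
A rough sketch of that effect, written for Python 3 with made-up sizes and a hypothetical helper named replace_slices: when the replacement has the same length as the slice it overwrites, CPython updates the list in place, whereas a replacement of a different length makes it move every element after the slice. That would also fit the script above, where the second loop's slice was one character longer than the 32-character digest written into it. Timings vary by machine, so none are quoted.

```python
import time


def replace_slices(chars, starts, width, replacement):
    """Overwrite chars[start:start+width] with replacement, last slice first."""
    for start in reversed(starts):
        chars[start:start + width] = replacement
    return chars


def timed(width):
    """Time the replacements for a given slice width; return (seconds, final size)."""
    chars = list("x" * 400000)                 # made-up document size
    starts = list(range(0, len(chars), 1000))  # one slice every 1000 characters
    replacement = list("0" * 32)               # same length as an MD5 hex digest
    t0 = time.perf_counter()
    replace_slices(chars, starts, width, replacement)
    return time.perf_counter() - t0, len(chars)


if __name__ == "__main__":
    same, n_same = timed(32)      # 32 chars replaced by 32: length unchanged
    shrink, n_shrink = timed(33)  # 33 chars replaced by 32: tail shifts each time
    print("same length:   %.4f s, final size %d" % (same, n_same))
    print("one char less: %.4f s, final size %d" % (shrink, n_shrink))
```

The slices are processed from the end of the list backwards, as in the original script, so earlier start positions stay valid even when the list shrinks.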

Jul 19 '05 #4
