Bytes IT Community

Huge performance gain compared to perl while loading a text file in a list ...!?

Hello,

I recently converted one of my perl scripts to python. What the script
does is simply search a lot of big mail files (~40MB) to retrieve
specific emails. I simply converted the script line by line to python,
keeping the algorithms & functions as they were in perl (no
optimization). The purpose was mainly to learn python and see the
differences with perl.

Now, once the converted script was finished, I was amazed to find that
the python version is running 8 times faster (roughly 700% faster!). Needless
to say, I was very intrigued and wanted to know what causes such a
performance gap between the two versions. So to keep my story short,
after some research and a few tests, I found that file IO is mainly
the cause of the performance diff.

I made two short test scripts, one in perl and one in python (see
below), and compared the performance difference. As we can see, the
bigger the file the larger the difference in performance....

I'm fairly new to python, and don't know much of its inner workings, so
I wonder if someone could explain to me why it is so much faster in
python to open a file and load it into a list/array?

Thanks
-----
#!/usr/bin/python

for i in range(20):
    Data = open('data.test').readlines()

-----
#!/usr/bin/perl

for ($i = 0; $i < 20; $i++) {
    open(DATA, "data.test");
    @Data = <DATA>;
    close(DATA);
}

-----
Running tests (data.test = 10MB text file):

blop@moya blop $ time ./ftest.py
real 0m6.408s
user 0m4.552s
sys 0m1.826s

blop@moya blop $ time ./ftest.pl
real 0m22.855s
user 0m21.946s
sys 0m0.822s

-----
Running tests (data.test = 40MB text file):

blop@moya blop $ time ./ftest.py
real 0m26.235s
user 0m18.238s
sys 0m7.872s

blop@moya blop $ time ./ftest.pl
real 3m26.741s
user 3m22.168s
sys 0m3.764s
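
The `time` runs above include interpreter startup and process teardown. A minimal sketch of measuring only the load loop from inside Python follows. The helper names `time_readlines` and `make_test_file` are hypothetical, and the generated file is only a stand-in for `data.test`, so the numbers it prints are not comparable to the ones above.

```python
import os
import tempfile
import time


def time_readlines(path, repeats=20):
    """Load `path` into a list `repeats` times.

    Returns (elapsed_seconds, line_count_of_last_load).
    """
    lines = []
    start = time.perf_counter()
    for _ in range(repeats):
        # Same operation as the test script: read the whole file into a list.
        with open(path) as handle:
            lines = handle.readlines()
    return time.perf_counter() - start, len(lines)


def make_test_file(num_lines=100_000):
    """Write a throwaway text file (assumed stand-in for data.test)."""
    fd, path = tempfile.mkstemp(suffix=".test", text=True)
    with os.fdopen(fd, "w") as handle:
        for i in range(num_lines):
            handle.write(f"line {i}: some mail-like text\n")
    return path


if __name__ == "__main__":
    path = make_test_file()
    try:
        elapsed, count = time_readlines(path)
        print(f"{count} lines, 20 loads in {elapsed:.3f}s")
    finally:
        os.remove(path)
```

Timing inside the process separates the cost of the reads from everything else the shell's `time` reports, which helps when checking whether file IO really is the dominant factor.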
Jul 18 '05 #1
1 Reply


"Marc H." <co******@gmail.com> writes:
> I'm fairly new to python, and don't know much of its inner workings, so
> I wonder if someone could explain to me why it is so much faster in
> python to open a file and load it into a list/array?


My guess is readlines() in Python is separating on newlines while Perl
is doing a regexp scan for the RS string (I forget what it's called in
Perl).
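
To make the distinction in that guess concrete, here is a small sketch contrasting Python's `readlines()`, which always splits on newlines, with a reader that takes a configurable record separator (in Perl that role is played by the `$/` variable). The function `split_records` is a hypothetical illustration only. It does not show how either interpreter is implemented, and it neither confirms nor refutes the guess about where the time goes.

```python
import io


def split_records(text, separator="\n"):
    """Split text into records, each keeping its trailing separator."""
    records = []
    start = 0
    while True:
        end = text.find(separator, start)
        if end == -1:
            break
        end += len(separator)
        records.append(text[start:end])
        start = end
    # Keep a final record that has no trailing separator.
    if start < len(text):
        records.append(text[start:])
    return records


sample = "From: a\nSubject: hi\n\nbody"

# readlines() has the newline separator built in.
by_readlines = io.StringIO(sample).readlines()

# With "\n" the generic splitter gives the same list of lines.
by_separator = split_records(sample, "\n")

# With another separator (here a blank line) the records change.
paragraphs = split_records(sample, "\n\n")
```

With the separator set to a newline both approaches produce identical lists. The generic version has to handle separators of any length, which is extra work a newline-only reader can skip.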
Jul 18 '05 #2
