473,386 Members | 1,803 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,386 software developers and data experts.

Huge performance gain compared to perl while loading a text file ina list ...!?

Hello,

I recently converted one of my perl scripts to python. What the script
does is simply search a lot of big mail files (~40MB) to retrieve
specific emails. I simply converted the script line by line to python,
keeping the algorithms & functions as they were in perl (no
optimization). The purpose was mainly to learn python and see the
differences with perl.

Now, once the converted script was finished, I was amazed to find that
the python version is running 8 times faster (800% faster!). Needless
to say, I was very intrigued and wanted to know what causes such a
performance gap between the two versions. So to keep my story short,
after some research and a few tests, I found that file IO is mainly
the cause of the performance diff.

I made two short test scripts, one in perl and one in python (see
below), and compared the performance difference. As we can see, the
bigger the file the larger the difference in performance....

I'm fairly new to python, and don't know much of its inner working so
I wonder if someone could explain to me why it is so much faster in
python to open a file and load it in a list/array ?

Thanks
-----
#!/usr/bin/python

for i in range(20):
Data = open('data.test').readlines()

-----
#!/usr/bin/perl

for ($i = 0; $i < 20; $i++) {
open(DATA, "data.test");
@Data = <DATA>;
close(DATA);
}

-----
Running tests (data.test = 10MB text file):

blop@moya blop $ time ./ftest.py
real 0m6.408s
user 0m4.552s
sys 0m1.826s

blop@moya blop $ time ./ftest.pl
real 0m22.855s
user 0m21.946s
sys 0m0.822s

-----
Running tests (data.test = 40MB text file):

blop@moya blop $ time ./ftest.py
real 0m26.235s
user 0m18.238s
sys 0m7.872s

blop@moya blop $ time ./ftest.pl
real 3m26.741s
user 3m22.168s
sys 0m3.764s
Jul 18 '05 #1
1 2209
"Marc H." <co******@gmail.com> writes:
I'm fairly new to python, and don't know much of its inner working so
I wonder if someone could explain to me why it is so much faster in
python to open a file and load it in a list/array ?


My guess is readlines() in Python is separating on newlines while Perl
is doing a regexp scan for the RS string (I forget what it's called in
Perl).
Jul 18 '05 #2

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

38
by: kbass | last post by:
In different articles that I have read, persons have constantly eluded to the productivity gains of Python. One person stated that Python's productivity gain was 5 to 10 times over Java in some in...
13
by: Paul | last post by:
Hi all I have a sorting problem, but my experience with Python is rather limited (3 days), so I am running this by the list first. I have a large database of 15GB, consisting of 10^8 entries...
3
by: Andy Tran | last post by:
I built a system using mysql innodb to archive SMS messages but the innodb databases are not keeping up with the number of SMS messages coming in. I'm looking for performance of 200 msgs/sec where...
2
by: Chris Mullins | last post by:
I'm building a GUI that needs to be able to view a large amount of text arranged in rows. Large being anywhere from a few hundred lines through a few hundred thousand. I need a way to "cap" the max...
22
by: Bob Darlington | last post by:
It has been suggested to me (by a potential client) that my app (which he is considering buying) should be web enabled to improve performance, particularly regarding screen refreshes. My initial...
26
by: pembed2003 | last post by:
Hi, I have an application where I use the strncmp function extensively to compare 2 string to see if they are the same or not. Does anyone know a faster replacement for strncmp? I notice there is...
13
by: bjarne | last post by:
Willy Denoyette wrote; > ... it > was not the intention of StrousTrup to the achieve the level of efficiency > of C when he invented C++, ... Ahmmm. It was my aim to match the performance...
2
by: Marty | last post by:
Hi, I would like to know about DLL and performance gain/penalty in an application. Let's say that I have a very big application and for component portability and easy maintenance, we fragmented...
21
KevinADC
by: KevinADC | last post by:
Note: You may skip to the end of the article if all you want is the perl code. Introduction Uploading files from a local computer to a remote web server has many useful purposes, the most...
0
by: taylorcarr | last post by:
A Canon printer is a smart device known for being advanced, efficient, and reliable. It is designed for home, office, and hybrid workspace use and can also be used for a variety of purposes. However,...
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
0
by: aa123db | last post by:
Variable and constants Use var or let for variables and const fror constants. Var foo ='bar'; Let foo ='bar';const baz ='bar'; Functions function $name$ ($parameters$) { } ...
0
by: ryjfgjl | last post by:
If we have dozens or hundreds of excel to import into the database, if we use the excel import function provided by database editors such as navicat, it will be extremely tedious and time-consuming...
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...
0
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
0
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
0
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.