Huge performance gain compared to perl while loading a text file in a list ...!?

Hello,

I recently converted one of my perl scripts to python. What the script
does is simply search a lot of big mail files (~40MB) to retrieve
specific emails. I simply converted the script line by line to python,
keeping the algorithms & functions as they were in perl (no
optimization). The purpose was mainly to learn python and see the
differences with perl.

Now, once the converted script was finished, I was amazed to find that
the python version runs about 8 times faster (roughly an eighth of the
run time!). Needless to say, I was very intrigued and wanted to know
what causes such a performance gap between the two versions. So to keep
my story short, after some research and a few tests, I found that file
IO is the main cause of the performance difference.

I made two short test scripts, one in perl and one in python (see
below), and compared the performance difference. As we can see, the
bigger the file the larger the difference in performance....

I'm fairly new to python, and don't know much of its inner workings, so
I wonder if someone could explain to me why it is so much faster in
python to open a file and load it into a list/array?

Thanks
-----
#!/usr/bin/python

# Read the whole file into a list of lines, 20 times over.
for i in range(20):
    Data = open('data.test').readlines()

-----
#!/usr/bin/perl

# Slurp the whole file into an array of lines, 20 times over.
for ($i = 0; $i < 20; $i++) {
    open(DATA, "data.test") or die "Cannot open data.test: $!";
    @Data = <DATA>;
    close(DATA);
}

-----
Running tests (data.test = 10MB text file):

blop@moya blop $ time ./ftest.py
real 0m6.408s
user 0m4.552s
sys 0m1.826s

blop@moya blop $ time ./ftest.pl
real 0m22.855s
user 0m21.946s
sys 0m0.822s

-----
Running tests (data.test = 40MB text file):

blop@moya blop $ time ./ftest.py
real 0m26.235s
user 0m18.238s
sys 0m7.872s

blop@moya blop $ time ./ftest.pl
real 3m26.741s
user 3m22.168s
sys 0m3.764s
Jul 18 '05 #1
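
The ratios behind "the bigger the file the larger the difference" can be worked out directly from the "real" times posted above. The following is only a small arithmetic sketch; the names `to_seconds`, `REAL_TIMES` and `speedup` are made up for this illustration and are not part of either test script.

```python
import re


def to_seconds(stamp):
    """Convert a `time`-style stamp such as '3m26.741s' to seconds."""
    match = re.fullmatch(r"(\d+)m([\d.]+)s", stamp)
    if match is None:
        raise ValueError(f"unrecognised time stamp: {stamp!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


# Wall-clock ("real") times copied from the runs posted above.
REAL_TIMES = {
    "10MB": {"python": "0m6.408s", "perl": "0m22.855s"},
    "40MB": {"python": "0m26.235s", "perl": "3m26.741s"},
}


def speedup(size):
    """Return how many times longer the Perl run took than the Python run."""
    run = REAL_TIMES[size]
    return to_seconds(run["perl"]) / to_seconds(run["python"])


if __name__ == "__main__":
    for size in REAL_TIMES:
        print(f"{size}: Perl took {speedup(size):.2f}x as long as Python")
```

With the posted numbers this gives about 3.57x for the 10MB file and about 7.88x for the 40MB file, so the gap roughly doubles when the file size quadruples.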
"Marc H." <co******@gmail.com> writes:
> I'm fairly new to python, and don't know much of its inner workings, so
> I wonder if someone could explain to me why it is so much faster in
> python to open a file and load it into a list/array?

My guess is readlines() in Python is separating on newlines while Perl
is doing a regexp scan for the RS string (I forget what it's called in
Perl).
Jul 18 '05 #2
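
To make the "separating on newlines" part of that guess concrete, here is a rough Python sketch of the splitting rule that readlines() follows: cut after every newline character and keep the terminator. It only illustrates the rule. It does not show how the interpreter implements readlines() internally, and it says nothing about whether Perl really does a regexp scan. The helper names `load_lines` and `split_on_newlines` are invented for this example.

```python
import os
import tempfile


def load_lines(path):
    """Load a whole text file into a list, one entry per line."""
    with open(path) as handle:
        return handle.readlines()


def split_on_newlines(text):
    """Split text after each newline character, keeping the terminator."""
    lines = []
    start = 0
    while start < len(text):
        end = text.find("\n", start)
        if end == -1:
            # The last line has no trailing newline: keep it as-is.
            lines.append(text[start:])
            break
        lines.append(text[start:end + 1])
        start = end + 1
    return lines


if __name__ == "__main__":
    sample = "From: a\nSubject: b\n\nbody"
    # newline="" writes the sample bytes unchanged on every platform.
    with tempfile.NamedTemporaryFile(
        "w", suffix=".test", delete=False, newline=""
    ) as tmp:
        tmp.write(sample)
    try:
        # Both approaches should yield the same list of lines.
        assert load_lines(tmp.name) == split_on_newlines(sample)
        print(load_lines(tmp.name))
    finally:
        os.remove(tmp.name)
```

The point is that a fixed single-character separator needs nothing more than a plain character search, which is cheap compared with general pattern matching.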
