Huge performance gain compared to perl while loading a text file in a list ...!?

Hello,

I recently converted one of my perl scripts to python. What the script
does is simply search a lot of big mail files (~40MB) to retrieve
specific emails. I simply converted the script line by line to python,
keeping the algorithms & functions as they were in perl (no
optimization). The purpose was mainly to learn python and see the
differences with perl.

Now, once the converted script was finished, I was amazed to find that
the python version is running about 8 times faster. Needless
to say, I was very intrigued and wanted to know what causes such a
performance gap between the two versions. So to keep my story short,
after some research and a few tests, I found that file IO is mainly
the cause of the performance diff.

I made two short test scripts, one in perl and one in python (see
below), and compared the performance difference. As we can see, the
bigger the file the larger the difference in performance....

I'm fairly new to python, and don't know much of its inner workings so
I wonder if someone could explain to me why it is so much faster in
python to open a file and load it in a list/array?

Thanks
-----
#!/usr/bin/python

for i in range(20):
    Data = open('data.test').readlines()

-----
#!/usr/bin/perl

for ($i = 0; $i < 20; $i++) {
    open(DATA, "data.test");
    @Data = <DATA>;
    close(DATA);
}

-----
Running tests (data.test = 10MB text file):

blop@moya blop $ time ./ftest.py
real 0m6.408s
user 0m4.552s
sys 0m1.826s

blop@moya blop $ time ./ftest.pl
real 0m22.855s
user 0m21.946s
sys 0m0.822s

-----
Running tests (data.test = 40MB text file):

blop@moya blop $ time ./ftest.py
real 0m26.235s
user 0m18.238s
sys 0m7.872s

blop@moya blop $ time ./ftest.pl
real 3m26.741s
user 3m22.168s
sys 0m3.764s
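
For reference, the wall-clock ("real") figures above work out to a gap of roughly 3.6x on the 10MB file and roughly 7.9x on the 40MB file. A small sketch of that arithmetic, using only the timings quoted in this post (the helper name `to_seconds` is made up for illustration):

```python
def to_seconds(minutes, seconds):
    """Convert a `time`-style XmY.YYYs reading to plain seconds."""
    return minutes * 60 + seconds

# "real" timings reported above, as (python, perl) pairs in seconds
timings = {
    "10MB": (to_seconds(0, 6.408), to_seconds(0, 22.855)),
    "40MB": (to_seconds(0, 26.235), to_seconds(3, 26.741)),
}

# how many times longer the perl run took than the python run
ratios = {size: perl / py for size, (py, perl) in timings.items()}

for size, ratio in ratios.items():
    print(f"{size}: perl takes {ratio:.2f}x as long as python")
```

So the gap roughly doubles when the file size quadruples, which fits the observation that the difference grows with the file.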
Jul 18 '05 #1
"Marc H." <co******@gmail.com> writes:
> I'm fairly new to python, and don't know much of its inner workings so
> I wonder if someone could explain to me why it is so much faster in
> python to open a file and load it in a list/array?


My guess is readlines() in Python is separating on newlines while Perl
is doing a regexp scan for the RS string (I forget what it's called in
Perl).
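
To make the Python half of that guess concrete: `readlines()` hands back the file contents cut after each newline, with the newline kept on every line. Here is a minimal sketch, using an in-memory `io.StringIO` and made-up sample text as a stand-in for `data.test`. It only shows the observable behaviour, not how the interpreter implements it, and it says nothing about what Perl does internally.

```python
import io

# made-up sample standing in for the contents of data.test
sample = "From: a@example.com\nSubject: hi\n\nbody text\n"

# readlines() keeps the trailing newline on every line
lines = io.StringIO(sample).readlines()

# the same result spelled out as a plain split on "\n"
# (assumes the text ends with a newline, as the sample does)
manual = [part + "\n" for part in sample.split("\n")[:-1]]

print(lines)
print(lines == manual)
```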
Jul 18 '05 #2
