Hello,
I recently converted one of my perl scripts to python. What the script
does is simply search a lot of big mail files (~40MB) to retrieve
specific emails. I simply converted the script line by line to python,
keeping the algorithms & functions as they were in perl (no
optimization). The purpose was mainly to learn python and see the
differences with perl.
Now, once the converted script was finished, I was amazed to find that
the python version is running 8 times faster (800% faster!). Needless
to say, I was very intrigued and wanted to know what causes such a
performance gap between the two versions. So to keep my story short,
after some research and a few tests, I found that file IO is mainly
the cause of the performance diff.
I made two short test scripts, one in perl and one in python (see
below), and compared the performance difference. As we can see, the
bigger the file the larger the difference in performance....
I'm fairly new to python, and don't know much of its inner working so
I wonder if someone could explain to me why it is so much faster in
python to open a file and load it in a list/array ?
Thanks
-----
#!/usr/bin/python
for i in range(20):
Data = open('data.test ').readlines()
-----
#!/usr/bin/perl
for ($i = 0; $i < 20; $i++) {
open(DATA, "data.test" );
@Data = <DATA>;
close(DATA);
}
-----
Running tests (data.test = 10MB text file):
blop@moya blop $ time ./ftest.py
real 0m6.408s
user 0m4.552s
sys 0m1.826s
blop@moya blop $ time ./ftest.pl
real 0m22.855s
user 0m21.946s
sys 0m0.822s
-----
Running tests (data.test = 40MB text file):
blop@moya blop $ time ./ftest.py
real 0m26.235s
user 0m18.238s
sys 0m7.872s
blop@moya blop $ time ./ftest.pl
real 3m26.741s
user 3m22.168s
sys 0m3.764s 1 2258
"Marc H." <co******@gmail .com> writes: I'm fairly new to python, and don't know much of its inner working so I wonder if someone could explain to me why it is so much faster in python to open a file and load it in a list/array ?
My guess is readlines() in Python is separating on newlines while Perl
is doing a regexp scan for the RS string (I forget what it's called in
Perl). This thread has been closed and replies have been disabled. Please start a new discussion. Similar topics |
by: kbass |
last post by:
In different articles that I have read, persons have constantly eluded to
the productivity gains of Python. One person stated that Python's
productivity gain was 5 to 10 times over Java in some in some cases. The
strange thing that I have noticed is that there were no examples of this
productivity gain (i.e., projects, programs, etc.,...). Can someone give me
some real life examples of productivity gains using Python as opposed other...
|
by: Paul |
last post by:
Hi all
I have a sorting problem, but my experience with Python is rather
limited (3 days), so I am running this by the list first.
I have a large database of 15GB, consisting of 10^8 entries of
approximately 100 bytes each. I devised a relatively simple key map on
my database, and I would like to order the database with respect to the
key.
|
by: Andy Tran |
last post by:
I built a system using mysql innodb to archive SMS messages but the
innodb databases are not keeping up with the number of SMS messages
coming in. I'm looking for performance of 200 msgs/sec where 1 msg is
1 database row.
I'm running on Red Linux:
2.4.20-8bigmem #1 SMP Thu Mar 13 17:32:29 EST 2003 i686 i686 i386
GNU/Linux
The machine has dual CPU and 2G of RAM.
|
by: Chris Mullins |
last post by:
I'm building a GUI that needs to be able to view a large amount of text
arranged in rows. Large being anywhere from a few hundred lines through a
few hundred thousand. I need a way to "cap" the max number of rows, so that
old rows are discarded in favor of new rows if the limit is reached.
My use case is very similar to that of the SQL Profiler GUI - I'm going to
be receiving a large amount of data from a server that I want to display....
|
by: Bob Darlington |
last post by:
It has been suggested to me (by a potential client) that my app (which he is
considering buying) should be web enabled to improve performance,
particularly regarding screen refreshes.
My initial reaction was to say "No' (I was cringing at the time), 'cos I
just don't want to do it. But after a reality check, I thought that I should
consult the experts, so here I am.
Is there any reason to expect my access 2002 application to run any...
| |
by: pembed2003 |
last post by:
Hi,
I have an application where I use the strncmp function extensively to
compare 2 string to see if they are the same or not. Does anyone know
a faster replacement for strncmp? I notice there is a memncmp function
which is very fast but it doesn't work the same like strncmp so I
don't think I can use it. I also tried to write the string_equal
function myself like:
int string_equal(const char* s1,const char* s2){
while(*s1 && *s2 &&...
|
by: bjarne |
last post by:
Willy Denoyette wrote;
> ... it
> was not the intention of StrousTrup to the achieve the level of efficiency
> of C when he invented C++, ...
Ahmmm. It was my aim to match the performance of C and I achieved that
aim very early on. See, for example "The Design and Evolution of C++".
-- Bjarne Stroustrup; http://www.research.att.com/~bs
|
by: Marty |
last post by:
Hi,
I would like to know about DLL and performance gain/penalty in an
application. Let's say that I have a very big application and for
component portability and easy maintenance, we fragmented the
application in a numerous number of sub projects compiled as DLLs. So
those sub project are very easy to use between different applications.
Our application need high level of performance, does the loading of,
let's say, 100s of DLL in...
|
by: KevinADC |
last post by:
Note: You may skip to the end of the article if all you want is the perl code.
Introduction
Uploading files from a local computer to a remote web server has many useful purposes, the most obvious of which is the sharing of files. For example, you upload images to a server to share them with other people over the Internet. Perl comes ready equipped for uploading files via the CGI.pm module, which has long been a core module and allows users...
|
by: Hystou |
last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can effortlessly switch the default language on Windows 10 without reinstalling. I'll walk you through it.
First, let's disable language synchronization. With a Microsoft account, language settings sync across devices. To prevent any complications,...
|
by: Oralloy |
last post by:
Hello folks,
I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>".
The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed.
This is as boiled down as I can make it.
Here is my compilation command:
g++-12 -std=c++20 -Wnarrowing bit_field.cpp
Here is the code in...
| |
by: jinu1996 |
last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that captivates audiences and drives business growth.
The Art of Business Website Design
Your website is...
|
by: Hystou |
last post by:
Overview:
Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows Update option using the Control Panel or Settings app; it automatically checks for updates and installs any it finds, whether you like it or not. For most users, this new feature is actually very convenient. If you want to control the update process,...
|
by: agi2029 |
last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing, and deployment—without human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then launch it, all on its own....
Now, this would greatly impact the work of software developers. The idea...
|
by: conductexam |
last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one.
At the time of converting from word file to html my equations which are in the word document file was convert into image.
Globals.ThisAddIn.Application.ActiveDocument.Select();...
|
by: TSSRALBI |
last post by:
Hello
I'm a network technician in training and I need your help.
I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs.
The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols.
I succeeded, with both firewalls in the same network. But I'm wondering if it's possible to do the same thing, with 2 Pfsense firewalls...
|
by: adsilva |
last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
| |
by: bsmnconsultancy |
last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating effective websites that not only look great but also perform exceptionally well. In this comprehensive...
| |