Can I beat perl at grep-like processing speed?

js
Just my curiosity.
Can Python beat Perl at grep-like processing speed?
$ wget http://www.gutenberg.org/files/7999/7999-h.zip
$ unzip 7999-h.zip
$ cd 7999-h
$ cat *.htm > bigfile
$ du -h bigfile
8.2M bigfile

---------- grep.pl ----------
#!/usr/local/bin/perl
open(F, 'bigfile') or die;

while(<F>) {
    s/[\n\r]+$//;
    print "$_\n" if m/destroy/oi;
}
---------- END ----------
---------- grep.py ----------
#!/usr/bin/env python
import re
r = re.compile(r'destroy', re.IGNORECASE)

for s in file('bigfile'):
    if r.search(s): print s.rstrip("\r\n")
---------- END ----------

$ time perl grep.pl > pl.out; time python grep.py > py.out
real 0m0.168s
user 0m0.149s
sys 0m0.015s

real 0m0.450s
user 0m0.374s
sys 0m0.068s
# I used python2.5 and perl 5.8.6
Dec 29 '06 #1
4 Replies


js wrote:
> Just my curiosity.
> Can Python beat Perl at grep-like processing speed?
> [...]

I'm thankful for the Python version or else, I'd never have guessed what
that code was supposed to do!

Try this:
---------- grep.py ----------
#!/usr/bin/env python
import re
def main():
    search = re.compile(r'destroy', re.IGNORECASE).search

    for s in file('bigfile'):
        if search(s): print s.rstrip("\r\n")

main()
---------- END ----------

Dec 29 '06 #2
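
The reply above does not say why its version might be faster. A likely reason in CPython is that names inside a function are locals, which are cheaper to look up than globals, and that binding `search` once avoids an attribute lookup on every line. The sketch below is written for Python 3 (an assumption; the thread used Python 2.5), and `LINES`, `PATTERN` and both function names are made up for illustration. Timings will vary by machine and interpreter version, so none are shown.

```python
import re
import timeit

# Hypothetical in-memory input, so that only the loop overhead is measured.
LINES = ["seek and Destroy\n", "nothing here\n"] * 5000
PATTERN = re.compile(r"destroy", re.IGNORECASE)


def count_attribute_lookup():
    """Look up the global PATTERN and its .search attribute on every line."""
    hits = 0
    for line in LINES:
        if PATTERN.search(line):
            hits += 1
    return hits


def count_local_binding():
    """Bind the search method to a local name once, outside the loop."""
    search = PATTERN.search
    hits = 0
    for line in LINES:
        if search(line):
            hits += 1
    return hits


# Both functions do the same work; only the name lookups differ.
for func in (count_attribute_lookup, count_local_binding):
    seconds = timeit.timeit(func, number=20)
    print("%-24s %.3fs" % (func.__name__, seconds))
```

Any difference is usually small next to the cost of reading the file, so it is worth measuring on the real input before relying on it.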

js <eb*****@gmail.com> wrote:
> Just my curiosity.
> Can Python beat Perl at grep-like processing speed?
> [...]
> #!/usr/local/bin/perl
> open(F, 'bigfile') or die;
>
> while(<F>) {
>     s/[\n\r]+$//;
>     print "$_\n" if m/destroy/oi;
> }
> [...]

Playing for the other side temporarily, this is nearly twice as fast...

$ time perl -lne 'print if m/destroy/oi' bigfile >pl.out
real 0m0.133s
user 0m0.120s
sys 0m0.012s

vs

$ time ./z.pl >pl.out.orig
real 0m0.223s
user 0m0.208s
sys 0m0.016s

Which gives the same output modulo a few \r

--
Nick Craig-Wood <ni**@craig-wood.com> -- http://www.craig-wood.com/nick
Dec 30 '06 #3

js wrote:
> Just my curiosity.
> Can Python beat Perl at grep-like processing speed?

Probably not.

> [...]

Please note that you're also benchmarking I/O here, and perl seems to
use a custom, highly optimized I/O library that is much faster than the
system's. I once made a quick-and-dirty cat-like comparison of Perl, Python
and C on my Gentoo Linux box, and the Perl version was insanely faster than
the C one.

Now the real question is, IMHO: is the Python version fast enough?

My 2 cents..
Jan 2 '07 #4
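
One way to act on the I/O remark above is to time the read and the matching separately. This is a minimal sketch for Python 3 (an assumption; the thread used Python 2.5). The function name `time_read_and_match` and the temporary file standing in for `bigfile` are hypothetical, and the numbers it prints depend entirely on the machine.

```python
import os
import re
import tempfile
import time


def time_read_and_match(path, pattern):
    """Return (matches, read_seconds, match_seconds) for one pass over path."""
    search = re.compile(pattern, re.IGNORECASE).search

    # Phase 1: file I/O and decoding only. latin-1 can decode any byte value.
    start = time.perf_counter()
    with open(path, encoding="latin-1") as handle:
        lines = handle.readlines()
    read_seconds = time.perf_counter() - start

    # Phase 2: regex matching only, on lines already held in memory.
    start = time.perf_counter()
    matches = [line.rstrip("\r\n") for line in lines if search(line)]
    match_seconds = time.perf_counter() - start

    return matches, read_seconds, match_seconds


# Hypothetical stand-in for 'bigfile' so the sketch runs anywhere.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("seek and Destroy\nnothing here\n" * 1000)

matches, read_s, match_s = time_read_and_match(tmp.name, "destroy")
os.remove(tmp.name)
print(len(matches), "matches; read %.4fs, match %.4fs" % (read_s, match_s))
```

If the read phase dominates, tuning the regular expression will change little; if the match phase dominates, the suggestions elsewhere in the thread are the ones to try.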

Nick Craig-Wood wrote:
> #!/usr/bin/env python
> import re
> r = re.compile(r'destroy', re.IGNORECASE)
>
> for s in file('bigfile'):
>     if r.search(s): print s.rstrip("\r\n")

footnote: if you're searching for literal strings with Python 2.5, using "in" is a
lot faster than using re.search.

</F>

Jan 3 '07 #5
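
The footnote in the last reply (use `in` for literal strings) can be sketched as follows. The original search is case-insensitive, so this version lowercases both the needle and each line. That matches `re.IGNORECASE` for ASCII text such as "destroy", though the two can differ for some non-ASCII characters. The code targets Python 3 (an assumption; the thread used Python 2.5), and `grep_regex`, `grep_in` and `sample` are illustrative names, not part of the thread.

```python
import re


def grep_regex(lines, word):
    """Case-insensitive match using a compiled regular expression."""
    search = re.compile(re.escape(word), re.IGNORECASE).search
    return [line.rstrip("\r\n") for line in lines if search(line)]


def grep_in(lines, word):
    """Case-insensitive match using the plain substring operator."""
    needle = word.lower()
    return [line.rstrip("\r\n") for line in lines if needle in line.lower()]


sample = [
    "They will Destroy the bridge.\r\n",
    "Nothing to see here.\n",
    "DESTROYED at dawn.\n",
]
print(grep_in(sample, "destroy"))
```

Whether `in` still wins once the per-line `lower()` call is included would need to be measured on the actual file; the footnote's claim concerns a plain, case-sensitive literal search.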
