speed


I implemented a lexer in Pylly and compared it to the version I
had written in Flex. Processing 219062 lines took 0.9 seconds in
C (from Flex), and 5 minutes 54 seconds in Python (from Pylly), a
ratio of 393 to 1.
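
As a quick check of the figures above, the ratio and the per-second throughput can be recomputed from the quoted timings. This is only arithmetic on the numbers in the post; the variable names are made up for the illustration.

```python
# Timings quoted in the post above.
lines = 219062
flex_seconds = 0.9
pylly_seconds = 5 * 60 + 54  # 5 minutes 54 seconds = 354 s

ratio = pylly_seconds / flex_seconds   # how many times slower Pylly was
flex_rate = lines / flex_seconds       # lines per second, Flex/C
pylly_rate = lines / pylly_seconds     # lines per second, Pylly/Python

print(f"slowdown: {ratio:.0f} to 1")
print(f"Flex:  {flex_rate:,.0f} lines/s")
print(f"Pylly: {pylly_rate:,.0f} lines/s")
```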

Is this normal for Python, or does Flex produce better parsers
than Pylly? I have been looking at the code produced by Flex to
see if I could translate it to Python automatically. But it has a
lot of goto statements, and I haven't figured out how to
translate those to Python efficiently.

What are the average times used for text processing of Python
compared to C?

--
Peter Kleiweg L:NL,af,da,de,en,ia,nds,no,sv,(fr,it) S:NL,de,en,(da,ia)
info: http://www.let.rug.nl/~kleiweg/ls.html

Jul 18 '05 #1
On Thu, Aug 19, 2004 at 03:37:26PM +0200, Peter Kleiweg wrote:

I implemented a lexer in Pylly and compared it to the version I
had written in Flex. Processing 219062 lines took 0.9 seconds in
C (from Flex), and 5 minutes 54 seconds in Python (from Pylly), a
ratio of 393 to 1.

Is this normal for Python, or does Flex produce better parsers
than Pylly? I have been looking at the code produced by Flex to
see if I could translate it to Python automatically. But it has a
lot of goto statements, and I haven't figured out how to
translate those to Python efficiently.


flex has an option to generate code without the gotos...

--
John Lenton (jo**@grulic.org.ar) -- Random fortune:
Don't read everything you believe.


Jul 18 '05 #2
John Lenton schreef:

flex has an option to generate code without the gotos...


I have the latest version. I can't find it, not as run time
option, not as build option.

--
Peter Kleiweg L:NL,af,da,de,en,ia,nds,no,sv,(fr,it) S:NL,de,en,(da,ia)
info: http://www.let.rug.nl/~kleiweg/ls.html

Jul 18 '05 #3
On Thu, Aug 19, 2004 at 04:16:24PM +0200, Peter Kleiweg wrote:
John Lenton schreef:

flex has an option to generate code without the gotos...


I have the latest version. I can't find it, not as run time
option, not as build option.


hmm! you're right... I wonder what lexer it was, then? I definitely
have a weak ref to the option in my head, but the owner has been gc'ed
:(

--
John Lenton (jo**@grulic.org.ar) -- Random fortune:
There was a phone call for you.


Jul 18 '05 #4
Peter Kleiweg <in*************@nl.invalid> wrote:
I implemented a lexer in Pylly and compared it to the version I
had written in Flex. Processing 219062 lines took 0.9 seconds in
C (from Flex), and 5 minutes 54 seconds in Python (from Pylly), a
ratio of 393 to 1.

Is this normal for Python, or does Flex produce better parsers
than Pylly? I have been looking at the code produced by Flex to
see if I could translate it to Python automatically. But it has a
lot of goto statements, and I haven't figured out how to
translate those to Python efficiently.

What are the average times used for text processing of Python
compared to C?


I don't know Pylly, but I guess it generates a parser using
a finite automaton -- just like lex/flex, except it handles
every single character in Python, whereas lex/flex will lead
to compiled C code. That would explain the speed difference.
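
To make the guess above concrete, here is a minimal sketch of a table-driven scanner written in pure Python. It is not Pylly's actual output (the transition table, the two token kinds and every name are invented for the illustration); it only shows why such a scanner pays interpreter overhead for every input character.

```python
# Hypothetical table-driven scanner; NOT Pylly's actual generated code.
# It recognizes two token kinds: NUMBER (digits) and WORD (a-z letters).

def char_class(ch):
    """Map a character to a column of the transition table."""
    if ch.isdigit():
        return "digit"
    if "a" <= ch <= "z":
        return "letter"
    return "other"

# TRANSITIONS[state][char_class] -> next state (missing entry = stop)
TRANSITIONS = {
    "start": {"digit": "number", "letter": "word"},
    "number": {"digit": "number"},
    "word": {"letter": "word"},
}
ACCEPTING = {"number": "NUMBER", "word": "WORD"}

def scan(text):
    """Yield (kind, lexeme) pairs, skipping characters that start no token."""
    pos = 0
    while pos < len(text):
        state, start = "start", pos
        # One function call and one dict lookup per input character,
        # all executed as interpreted bytecode.
        while pos < len(text):
            nxt = TRANSITIONS[state].get(char_class(text[pos]))
            if nxt is None:
                break
            state = nxt
            pos += 1
        if state in ACCEPTING:
            yield ACCEPTING[state], text[start:pos]
        else:
            pos += 1  # no token starts here; skip one character

tokens = list(scan("abc 123 x9"))
print(tokens)
```

The inner loop is the part that Flex turns into compiled C, so the same automaton costs far less per character there.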

When I have to parse something in Python, I try to do that
using things like string.split(), string.find(), the "re"
module etc. Those things are written in C, therefore they
are fast enough for most applications. There are also some
modules for specialized cases, such as "ConfigParser" and
"shlex". See the Python Library Reference.
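
As a rough illustration of that approach, the sketch below builds a small tokenizer on top of the "re" module, so the character-level matching runs inside the C regex engine rather than in Python bytecode. The token set and the names TOKEN_RE and tokenize are assumptions made up for this example, and it uses current Python 3 syntax.

```python
import re

# Hypothetical token set for illustration; one named group per token kind.
TOKEN_RE = re.compile(r"""
    (?P<NUMBER>\d+)
  | (?P<WORD>[A-Za-z_]\w*)
  | (?P<OP>[-+*/=()])
  | (?P<SKIP>\s+)
  | (?P<ERROR>.)
""", re.VERBOSE)

def tokenize(text):
    """Yield (kind, lexeme) pairs; whitespace is dropped."""
    for m in TOKEN_RE.finditer(text):
        kind = m.lastgroup  # name of the alternative that matched
        if kind == "SKIP":
            continue
        if kind == "ERROR":
            raise ValueError(
                f"unexpected character {m.group()!r} at offset {m.start()}"
            )
        yield kind, m.group()

tokens = list(tokenize("total = price * 42"))
print(tokens)
```

Python code still runs once per token, but no longer once per character, which is usually where most of the time goes.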

Best regards
Oliver

--
Oliver Fromme, Konrad-Celtis-Str. 72, 81369 Munich, Germany

``All that we see or seem is just a dream within a dream.''
(E. A. Poe)
Jul 18 '05 #5
Hi,

On Thu, Aug 19, 2004 at 03:37:26PM +0200, Peter Kleiweg wrote:

I implemented a lexer in Pylly and compared it to the version I
had written in Flex. Processing 219062 lines took 0.9 seconds in
C (from Flex), and 5 minutes 54 seconds in Python (from Pylly), a
ratio of 393 to 1.

Is this normal for Python, or does Flex produce better parsers
than Pylly? I have been looking at the code produced by Flex to
see if I could translate it to Python automatically. But it has a
lot of goto statements, and I haven't figured out how to
translate those to Python efficiently.

Don't try to translate the generated code to Python. Python code is
(almost) always slower than C code, because C is compiled to machine
code, while Python has to be interpreted by the VM. Besides, Python does a
lot of run-time checks.

Try PLY, <http://systems.cs.uchicago.edu/ply/>. If you have
experience with flex/yacc in C, this module should be easy to use.

You can also play with Psyco (a JIT compiler for x86) or even with
Pyrex.

But, IMHO, if you have to process very big files, don't do it with
Python. Instead, write a simple C module, which uses your Flex parser
and creates Python objects with that information. It should be trivial
if you have experience with the C API. :-)

What are the average times used for text processing of Python
compared to C?


IMO, Python is a powerful language for almost everything, but in some
cases it is a poor fit. One of these cases is intensive computing (like
parsing a big file). Use the right tool =)

--
Ayose Cazorla León
Debian GNU/Linux - setepo
Jul 18 '05 #6

Another Python parser generator to look into is SimpleParse/mxTextTools

<http://simpleparse.sourceforge.net/>

We use it to parse and process large log files. In our case, a typical
grammar contains over 250 productions and parsing a log file of 180
Klines (100 MB) takes approx. 3 min. Processing the result from the
parse step requires an additional 3 min. This is on a 2.4 GHz Xeon
machine running RedHat 8.

Obviously these figures are very grammar- and application-specific. Your
mileage may vary.

/Jean Brouwers

PS) A good reference is David Mertz' book "Text Processing in Python"

<http://www.informit.com/title/0321112547>

or several articles on (t)his web page

<http://gnosis.cx/publish/tech_index_cp.html>


In article <ma**************************************@python.org>, Ayose
<ay***********@hispalinux.es> wrote:
<http://systems.cs.uchicago.edu/ply/>.

Jul 18 '05 #7
At some point, Ayose <ay***********@hispalinux.es> wrote:
On Thu, Aug 19, 2004 at 03:37:26PM +0200, Peter Kleiweg wrote:

I implemented a lexer in Pylly and compared it to the version I
had written in Flex. Processing 219062 lines took 0.9 seconds in
C (from Flex), and 5 minutes 54 seconds in Python (from Pylly), a
ratio of 393 to 1.

Is this normal for Python, or does Flex produce better parsers
than Pylly? I have been looking at the code produced by Flex to
see if I could translate it to Python automatically. But it has a
lot of goto statements, and I haven't figured out how to
translate those to Python efficiently.

...
But, IMHO, if you have to process very big files, don't do it with
Python. Instead, write a simple C module, which uses your Flex parser
and creates Python objects with that information. It should be trivial
if you have experience with the C API. :-)


Or have a look at FlexModule at
http://www.cs.utexas.edu/users/mcgui...ware/fbmodule/
which makes it really simple, even without experience with the C API.

--
|>|\/|<
/--------------------------------------------------------------------------\
|David M. Cooke
|cookedm(at)physics(dot)mcmaster(dot)ca
Jul 18 '05 #8
