embedding executable code in a regular expression in Python

Folks,

Does regular expression processing in Python allow for executable
code to be embedded inside a regular expression?

For example, in Perl the following two statements

$regex = qr/hello(?{print "saw hello\n"})mello(?{print "saw mello\n"})/;
"jellohellomello" =~ /$regex/;

will produce the output

saw hello
saw mello

Is it possible to do the same in Python with any modules that come
with the standard distribution, or with any other modules?

Thanks in advance for any help,

Avi Kak (ka*@purdue.edu)

Jul 16 '06 #1

There's a new module out that can do it like this:

import SE
Stream_Editor = SE.SE ('<EAT> "~[hm]ello~=print \'saw =\'\n" ')
for line in Stream_Editor ('yellomellojellohello').split ('\n'):
    exec ("%s" % line)

saw mello
saw hello

You'll find SE in the Cheese Shop. It has been commented favorably. I
wouldn't comment it, because I wrote it.

Regards

Frederic
Jul 16 '06 #2
You can use re.sub and pass a function (or any "callable" thing) as the
replacement; it gets called back with the match object for each match:

import re

def f(mo):
    # Called once per match with the match object
    if "hello" in mo.groups():
        print("saw hello")
    if "mello" in mo.groups():
        print("saw mello")

re.sub(r'(hello)(mello)', f, "jellohellomello")
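
A variation on the callback above, sketched under a couple of assumptions: the group names and the `ACTIONS` table are hypothetical, introduced only for illustration. It shows one way to attach a separate action to each group. One difference from the Perl version is worth keeping in mind: Perl runs each `(?{...})` block as the engine passes that point in the pattern, whereas the callback here runs only after the whole pattern has matched.

```python
import re

seen = []  # collects messages so the result is easy to inspect

# Hypothetical dispatch table: group name -> action for that group's text.
ACTIONS = {
    "hello": lambda text: seen.append("saw " + text),
    "mello": lambda text: seen.append("saw " + text),
}

PATTERN = re.compile(r"(?P<hello>hello)(?P<mello>mello)")

def run_actions(mo):
    # Visit named groups in the order they appear in the pattern.
    for name in sorted(PATTERN.groupindex, key=PATTERN.groupindex.get):
        text = mo.group(name)
        if text is not None:
            ACTIONS[name](text)
    return mo.group(0)  # put the matched text back unchanged

result = PATTERN.sub(run_actions, "jellohellomello")
print("\n".join(seen))
```

Returning `mo.group(0)` leaves the input string unchanged, so the substitution is used purely for its side effects.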

Actually I didn't know you could do that in Perl. The time I've found
this useful in Python is substitutions to convert e.g.
"background-color" into "backgroundColor"; a function turns the c into
C. I always assumed in Perl you would need to use eval for this, but
perhaps there is another way.
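
The "background-color" to "backgroundColor" conversion mentioned above might look like the following sketch. The function name `to_camel_case` is an assumption made for illustration, not something from this thread.

```python
import re

def to_camel_case(css_name):
    # Replace each "-x" with "X"; the lambda receives the match object.
    return re.sub(r"-([a-z])", lambda mo: mo.group(1).upper(), css_name)

print(to_camel_case("background-color"))
```

The callable returns the replacement text for each match, so no eval-style trick is needed.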
Jul 16 '06 #3
Just in case you were referring to security concerns: Sufficiently complex REs
can take ages to compile and run and eat tons of memory, so there are always
issues involved here.
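
As a rough illustration of the run-time cost, here is a small sketch that times a pattern with nested quantifiers against a flat pattern accepting the same strings. The patterns and the helper `timed_match` are assumptions chosen for the demonstration; actual timings depend on the machine and the Python version, so none are quoted here.

```python
import re
import time

NESTED = re.compile(r"(a+)+$")  # nested quantifiers: prone to backtracking
FLAT = re.compile(r"a+$")       # accepts the same strings without nesting

def timed_match(pattern, text):
    """Return (matched?, seconds taken) for pattern.match(text)."""
    start = time.perf_counter()
    matched = pattern.match(text) is not None
    return matched, time.perf_counter() - start

for n in (8, 12, 16):
    text = "a" * n + "b"  # the trailing "b" forces the match to fail
    nested_ok, nested_secs = timed_match(NESTED, text)
    flat_ok, flat_secs = timed_match(FLAT, text)
    print(n, nested_ok, flat_ok, "%.5f %.5f" % (nested_secs, flat_secs))
```

The input sizes are kept small on purpose; with a backtracking engine the nested pattern tends to slow down sharply as `n` grows.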

Stefan
Jul 17 '06 #4
Not nearly so terse, but perhaps easier to follow, here is a pyparsing
version. Pyparsing parse actions are intended to do just what you ask.
Parse actions may be defined to take no arguments, just one argument
(which will be passed the list of matching token strings), 2 arguments
(the match location and the matching tokens), or 3 arguments (the
original source string, the match location, and the tokens). Parse
actions are very good for transforming input text into modified output
form, such as the "background-color" to "backgroundColor" transform -
the BoaConstructor team used pyparsing to implement a version upgrade
that transformed user source to a new version of wx (involving a
variety of such changes).

Here is your jello/mello program, with two variations of parse actions.

-- Paul

from pyparsing import *

instr = "jellorelohellomellofellowbellowmello"
searchTerm = oneOf( ["jello","mello"] )

# simple parse action, just echoes matched text
def echoMatchedText(tokens):
    print("saw " + tokens[0])

searchTerm.setParseAction( echoMatchedText )
searchTerm.searchString(instr)

# modified parse action, prints location too
def echoMatchedText(loc,tokens):
    print("saw %s at locn %d" % (tokens[0], loc))

searchTerm.setParseAction( echoMatchedText )
searchTerm.searchString(instr)

Prints out:
saw jello
saw mello
saw mello
saw jello at locn 0
saw mello at locn 14
saw mello at locn 31
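
For comparison, a standard-library sketch that produces the same location report with `re.finditer`. This is not pyparsing and has no parse actions; the helper name `find_terms` is an assumption for illustration.

```python
import re

instr = "jellorelohellomellofellowbellowmello"

def find_terms(text):
    """Return (matched text, start offset) for each jello/mello occurrence."""
    return [(mo.group(0), mo.start()) for mo in re.finditer(r"jello|mello", text)]

for word, loc in find_terms(instr):
    print("saw %s at locn %d" % (word, loc))
```

Each match object carries its own offset via `mo.start()`, which plays the role of the `loc` argument in the two-argument parse action.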

Jul 17 '06 #5
Hi, see below ...

One fine example of a pyparsing solution!

On my part I had a second thought on the SE solution I proposed. It was
needlessly complicated. Let me try again:
>>> sentence = 'On each occurrence of the word "square" in this text the variable n will be squared. When it says "double" it will be doubled. At the end print n % 11.'
>>> def square_n (): global n; n *= n
>>> def double_n (): global n; n += n
>>> def n_mod_11 (): global n; return n % 11
>>> se_definitions = """<EAT>
... "square=print 'Calling square_n () - ',; square_n (); print 'n is now %d' % n \n" # A piece of code for 'square'
... "double=print 'Calling double_n () - ',; double_n (); print 'n is now %d' % n \n" # Another piece of code for 'double'
... ".=\nprint 'n %% 11 is: %d' % n_mod_11 ()\n" # Another piece of code for the final dot
... """
>>> from SE import *
>>> Stream_Editor = SE (se_definitions)
>>> n = 9; exec (Stream_Editor (sentence))
Calling square_n () - n is now 81
Calling square_n () - n is now 6561
n % 11 is: 5
Calling double_n () - n is now 13122
Calling double_n () - n is now 26244
n % 11 is: 9
n % 11 is: 9

Suppose we now realize that the quoted words "square" and "double" should
not trigger the respective function and neither should the ending dot of
each sentence, except the last one. We fix the definitions in seconds, adding
the exceptions as deletions. The order is immaterial. Targets may be regular
expressions. (The last one is.)

>>> se_definitions = """<EAT> # Deletes everything except the defined substitutions
... "square=print 'Calling square_n () - ',; square_n (); print 'n is now %d' % n \n" # A piece of code for 'square'
... "double=print 'Calling double_n () - ',; double_n (); print 'n is now %d' % n \n" # Another piece of code for 'double'
... ".=\nprint 'n %% 11 is: %d' % n_mod_11 ()\n" # Another piece of code for the final dot
... \""square"=" \""double"=" # Deletes the quoted words
... "~\.\s~=" # Deletes dots followed by white space.
... """
>>> n = 9; exec (SE (se_definitions)(sentence))
Calling square_n () - n is now 81
Calling double_n () - n is now 162
n % 11 is: 8
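
For readers without SE, the same word-triggered behaviour can be sketched with the standard `re` module and plain function calls instead of generated code passed to `exec`. This is a different technique from the SE session above; the names `TOKENS` and `run` are assumptions for illustration.

```python
import re

sentence = ('On each occurrence of the word "square" in this text the '
            'variable n will be squared. When it says "double" it will be '
            'doubled. At the end print n % 11.')

# Alternatives are tried left to right, so the "skip" cases win over the
# bare words and the bare dot at the same position.
TOKENS = re.compile(
    r'(?P<skip>"square"|"double"|\.\s)'
    r'|(?P<square>square)|(?P<double>double)|(?P<dot>\.)'
)

def run(text, n):
    """Apply the word-triggered actions to n; return (n, list of n % 11 reports)."""
    actions = {
        "square": lambda value: value * value,
        "double": lambda value: value + value,
    }
    reports = []
    for mo in TOKENS.finditer(text):
        kind = mo.lastgroup  # name of the alternative that matched
        if kind in actions:
            n = actions[kind](n)
        elif kind == "dot":
            reports.append(n % 11)
    return n, reports

print(run(sentence, 9))
```

Starting from n = 9 this returns `(162, [8])`, which agrees with the session output above: 9 squared is 81, doubled is 162, and 162 % 11 is 8.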

Note 1: SE is no match for pyparsing on many counts. Conversely, SE has its
own strengths in the realm of its specialty, in particular ease and speed of
use, versatility and modularity.
Most any problem can be solved in a variety of different ways. Finding
good approaches is one of the appeals of programming.

Note 2: The SE solution rests entirely on Python's 'exec' functionality and
so here we have another testimony to Python's excellence.

Note 3: I shall include the example in the manual. I hadn't thought of it.
My thanks to the OP for the inspiration. Perhaps he might want to explain
his purpose. I don't quite see the solution's problem.

Frederic
Jul 18 '06 #6
