Python Regular Expressions: re.sub(regex, replacement, subject)

Hi Folks,

I put a Regular Expression question on this list a
couple days ago. I would like to rephrase my question
as below:

In the Python re.sub(regex, replacement, subject)
method/function, I need the second argument
'replacement' to be another regular expression ( not a
string) . So when I find a 'certain kind of string' in
the subject, I can replace it with 'another kind of
string' ( not a predefined string ). Note that the
'replacement' may depend on what exact string is found
as a result of match with the first argument 'regex'.

Please let me know if the question is not clear.

Peace.
Vibha

=======
"Things are only impossible until they are not."

Jul 21 '05 #1
Vibha Tripathi wrote:
In the Python re.sub(regex, replacement, subject)
method/function, I need the second argument
'replacement' to be another regular expression ( not a
string) . So when I find a 'certain kind of string' in
the subject, I can replace it with 'another kind of
string' ( not a predefined string ). Note that the
'replacement' may depend on what exact string is found
as a result of match with the first argument 'regex'.

Please let me know if the question is not clear.


It's still not very clear, but my guess is you want to supply a
replacement function instead of a replacement string, e.g.:

py> help(re.sub)
Help on function sub in module sre:

sub(pattern, repl, string, count=0)
    Return the string obtained by replacing the leftmost
    non-overlapping occurrences of the pattern in string by the
    replacement repl. repl can be either a string or a callable;
    if a callable, it's passed the match object and must return
    a replacement string to be used.

py> def repl(match):
...     print(match.group())
...     return '46'
...
py> re.sub(r'x.*?x', repl, 'yxyyyxxyyxyy')
xyyyx
xyyx
'y4646yy'
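
To make the replacement depend on exactly what was matched, the callable can look the matched text up in a table. This is only a rough sketch: the ABBREVIATIONS mapping and the expand() helper are hypothetical names made up for illustration, not anything from the re module.

```python
import re

# Hypothetical lookup table, chosen only for illustration.
ABBREVIATIONS = {"Mon": "Monday", "Tue": "Tuesday", "Wed": "Wednesday"}

def expand(match):
    # match.group() is the exact text that the pattern matched,
    # so the returned replacement differs from match to match.
    return ABBREVIATIONS[match.group()]

pattern = re.compile(r"\b(?:Mon|Tue|Wed)\b")
result = pattern.sub(expand, "Open Mon and Wed, closed Tue.")
print(result)  # Open Monday and Wednesday, closed Tuesday.
```

The pattern only matches keys that exist in the table, so the plain dictionary lookup cannot raise KeyError here.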

STeVe
Jul 21 '05 #2
Vibha Tripathi wrote:
Hi Folks,

I put a Regular Expression question on this list a
couple days ago. I would like to rephrase my question
as below:

In the Python re.sub(regex, replacement, subject)
method/function, I need the second argument
'replacement' to be another regular expression ( not a
string) . So when I find a 'certain kind of string' in
the subject, I can replace it with 'another kind of
string' ( not a predefined string ). Note that the
'replacement' may depend on what exact string is found
as a result of match with the first argument 'regex'.


Do you mean 'backreferences'?
re.sub(r"this(\d+)that", r"that\1this", "this12that foo13bar")

'that12this foo13bar'

Note that the replacement string r"that\1this" is not a regular expression;
it has completely different semantics, as described in the docs. (Just
guessing: are you coming from Perl? r"xxx" is not a regular expression in
Python the way /xxx/ is in Perl. It is just an ordinary string in which
backslashes are not interpreted by the parser, e.g. r"\x" == "\\x". Using
r"" when working with the re module is not required but is pretty useful,
because re has its own rules for backslash handling.)
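
A quick way to check the point about raw strings, as a small sketch (the variable names are just for illustration): the r"" prefix only changes how the source text is parsed, so a raw string and its doubled-backslash spelling are the same str object value, and re.sub treats them identically.

```python
import re

# A raw string literal is an ordinary str; only the parsing differs.
assert r"\x" == "\\x"
assert len(r"\1") == 2  # a backslash followed by the digit 1

# Both spellings of the replacement give the same backreference.
raw = re.sub(r"this(\d+)that", r"that\1this", "this12that foo13bar")
escaped = re.sub(r"this(\d+)that", "that\\1this", "this12that foo13bar")
print(raw)  # that12this foo13bar
```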

For more details see the docs for re.sub():
http://docs.python.org/lib/node114.html

--
Benjamin Niemann
Email: pink at odahoda dot de
WWW: http://www.odahoda.de/
Jul 21 '05 #3
"Vibha Tripathi" <vi*****@yahoo.com> wrote:
Hi Folks,

I put a Regular Expression question on this list a
couple days ago. I would like to rephrase my question
as below:

In the Python re.sub(regex, replacement, subject)
method/function, I need the second argument
'replacement' to be another regular expression ( not a
string) . So when I find a 'certain kind of string' in
the subject, I can replace it with 'another kind of
string' ( not a predefined string ). Note that the
'replacement' may depend on what exact string is found
as a result of match with the first argument 'regex'.


In re.sub, 'replacement' can be either a string, or a callable that
takes a single match argument and should return the replacement string.
So although replacement cannot be a regular expression, it can be
something even more powerful, a function. Here's a toy example of what
you can do that wouldn't be possible with regular expressions alone:
import re
from datetime import datetime

this_year = datetime.now().year
rx = re.compile(r'(born|graduated|hired) in (\d{4})')

def replace_year(match):
    # group(1) is the verb, group(2) is the four-digit year
    return "%s %d years ago" % (match.group(1), this_year - int(match.group(2)))

rx.sub(replace_year, 'I was born in 1979 and graduated in 1996.')

'I was born 26 years ago and graduated 9 years ago.'

In cases where you don't have to transform the matched string (such as
calling int() and evaluating an expression as in the example) but only
append or prepend another string, there is a simpler solution that
doesn't require writing a replacement function: backreferences.
Replacement can be a string where \1 denotes the first group of the
match, \2 the second and so on. Continuing the example, you could hide
the dates by:
rx.sub(r'\1 in ****', 'I was hired in 2001 in a company of 2001 employees.')

'I was hired in **** in a company of 2001 employees.'

By the way, run the last example without the 'r' in front of the
replacement string and you'll see what it is there for.
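
For anyone who would rather not run it, here is roughly what happens, as a self-contained sketch (rx is recompiled here so the snippet stands alone). Without the 'r', Python itself turns '\1' into the single control character chr(1) before re ever sees it, so no backreference is left in the replacement.

```python
import re

rx = re.compile(r'(born|graduated|hired) in (\d{4})')
text = 'I was hired in 2001 in a company of 2001 employees.'

# Raw string: re receives backslash + '1' and inserts group 1.
with_r = rx.sub(r'\1 in ****', text)

# Non-raw string: '\1' is an octal escape, i.e. the character chr(1).
without_r = rx.sub('\1 in ****', text)

print(repr(with_r))     # 'I was hired in **** in a company of 2001 employees.'
print(repr(without_r))  # 'I was \x01 in **** in a company of 2001 employees.'
```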

HTH,

George

Jul 21 '05 #4
