473,407 Members | 2,326 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,407 software developers and data experts.

Python Regular Expressions: re.sub(regex, replacement, subject)

Hi Folks,

I put a Regular Expression question on this list a
couple days ago. I would like to rephrase my question
as below:

In the Python re.sub(regex, replacement, subject)
method/function, I need the second argument
'replacement' to be another regular expression ( not a
string) . So when I find a 'certain kind of string' in
the subject, I can replace it with 'another kind of
string' ( not a predefined string ). Note that the
'replacement' may depend on what exact string is found
as a result of match with the first argument 'regex'.

Please let me know if the question is not clear.

Peace.
Vibha

=======
"Things are only impossible until they are not."

__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com
Jul 21 '05 #1
3 9711
Vibha Tripathi wrote:
In the Python re.sub(regex, replacement, subject)
method/function, I need the second argument
'replacement' to be another regular expression ( not a
string) . So when I find a 'certain kind of string' in
the subject, I can replace it with 'another kind of
string' ( not a predefined string ). Note that the
'replacement' may depend on what exact string is found
as a result of match with the first argument 'regex'.

Please let me know if the question is not clear.


It's still not very clear, but my guess is you want to supply a
replacement function instead of a replacement string, e.g.:

py> help(re.sub)
Help on function sub in module sre:

sub(pattern, repl, string, count=0)
Return the string obtained by replacing the leftmost
non-overlapping occurrences of the pattern in string by the
replacement repl. repl can be either a string or a callable;
if a callable, it's passed the match object and must return
a replacement string to be used.

py> def repl(match):
.... print match.group()
.... return '46'
....
py> re.sub(r'x.*?x', repl, 'yxyyyxxyyxyy')
xyyyx
xyyx
'y4646yy'

STeVe
Jul 21 '05 #2
Vibha Tripathi wrote:
Hi Folks,

I put a Regular Expression question on this list a
couple days ago. I would like to rephrase my question
as below:

In the Python re.sub(regex, replacement, subject)
method/function, I need the second argument
'replacement' to be another regular expression ( not a
string) . So when I find a 'certain kind of string' in
the subject, I can replace it with 'another kind of
string' ( not a predefined string ). Note that the
'replacement' may depend on what exact string is found
as a result of match with the first argument 'regex'.


Do mean 'backreferences'?
re.sub(r"this(\d+)that", r"that\1this", "this12that foo13bar")

'that12this foo13bar'

Note that the replacement string r"that\1this" is not a regular expression,
it has completely different semantics as described in the docs. (Just
guessing: are you coming from perl? r"xxx" is not a regular expression in
Python, like /xxx/ in perl. It's is just an ordinary string where
backslashes are not interpreted by the parser, e.g. r"\x" == "\\x". Using
r"" when working with the re module is not required but pretty useful,
because re has it's own rules for backslash handling).

For more details see the docs for re.sub():
http://docs.python.org/lib/node114.html

--
Benjamin Niemann
Email: pink at odahoda dot de
WWW: http://www.odahoda.de/
Jul 21 '05 #3
"Vibha Tripathi" <vi*****@yahoo.com> wrote:
Hi Folks,

I put a Regular Expression question on this list a
couple days ago. I would like to rephrase my question
as below:

In the Python re.sub(regex, replacement, subject)
method/function, I need the second argument
'replacement' to be another regular expression ( not a
string) . So when I find a 'certain kind of string' in
the subject, I can replace it with 'another kind of
string' ( not a predefined string ). Note that the
'replacement' may depend on what exact string is found
as a result of match with the first argument 'regex'.


In re.sub, 'replacement' can be either a string, or a callable that
takes a single match argument and should return the replacement string.
So although replacement cannot be a regular expression, it can be
something even more powerful, a function. Here's a toy example of what
you can do that wouldn't be possible with regular expressions alone:
import re
from datetime import datetime
this_year = datetime.now().year
rx = re.compile(r'(born|gratuated|hired) in (\d{4})')
def replace_year(match):
return "%s %d years ago" % (match.group(1), this_year - int(match.group(2)))
rx.sub(replace_year, 'I was born in 1979 and gratuated in 1996.') 'I was born 26 years ago and gratuated 9 years ago'

In cases where you don't have to transform the matched string (such as
calling int() and evaluating an expression as in the example) but only
append or prepend another string, there is a simpler solution that
doesn't require writing a replacement function: backreferences.
Replacement can be a string where \1 denotes the first group of the
match, \2 the second and so on. Continuing the example, you could hide
the dates by:
rx.sub(r'\1 in ****', 'I was hired in 2001 in a company of 2001 employees.')

'I was hired in **** in a company of 2001 employees.'

By the way, run the last example without the 'r' in front of the
replacement string and you'll see why it is there for.

HTH,

George

Jul 21 '05 #4

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

75
by: Xah Lee | last post by:
http://python.org/doc/2.4.1/lib/module-re.html http://python.org/doc/2.4.1/lib/node114.html --------- QUOTE The module defines several functions, constants, and an exception. Some of the...
0
by: Thomas | last post by:
Hi, I need to see if one work in a set of words is in a string. You have to ignore punctuation. You have to match whole words. Here is the actual requirement from our client:
2
by: bruce | last post by:
hi... does python provide regex handling similar to perl. can't find anything in the docs i've seen to indicate it does... -bruce
8
by: Xah Lee | last post by:
the Python regex documentation is available at: http://xahlee.org/perl-python/python_re-write/lib/module-re.html Note that, i've just made the terms of use clear. Also, can anyone answer what...
4
by: johnny | last post by:
I need to get the content inside the bracket. eg. some characters before bracket (3.12345). I need to get whatever inside the (), in this case 3.12345. How do you do this with python regular...
4
by: charonzen | last post by:
I have a list of strings. These strings are previously selected bigrams with underscores between them ('and_the', 'nothing_given', and so on). I need to write a regex that will read another text...
1
by: chris fellows | last post by:
My DotNet component needs to parse a phone number and check that it is in the range 01234560020 to 01234562500. It needs to be done using a Regular Expression. (Please don't ask why!!!) Please can...
12
by: pistacchio | last post by:
hi! i'm a php user and a python programmer. i'd love to use python for my server side needs but i can't seem to find what i'm looking for. for most of my php work i use mysql and tinyButStrong...
0
by: Tim N. van der Leeuw | last post by:
Hey Gerhard, Gerhard Häring wrote: I so far forgot to say a "thank you" for the suggestion :-) The sample code as you sent it doesn't do what I need to do, but I did look at it for...
0
by: emmanuelkatto | last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
1
by: nemocccc | last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...
0
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
0
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
0
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows...
0
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.