Squeezing replacements into strings

I've got a regular expression that finds certain words from a longer string.
From "Peter Bengtsson PETER, or PeTeR" it finds: 'Peter','PETER','PeTeR'.


What I then want to do is something like this:

    def _ok(matchobject):
        # more complicated stuff happens here
        return 1

    def _massage(word):
        return "_" + word + "_"

    for match in regex.finditer(text):
        if not _ok(match):
            continue
        text = text[:match.start()] +\
               _massage(text[match.start():match.end()]) +\
               text[match.end():]

This code works and can convert something like "don't allow the fuck swear word" to "don't allow the _fuck_ swear word".

The problem arises when there is more than one match. match.start() and match.end() refer to the original string, but after the first iteration of the loop the string has changed (it gains 2 characters in length due to the "_"s).

How can I do this concatenation correctly?

--
Peter Bengtsson,
work www.fry-it.com
home www.peterbe.com
hobby www.issuetrackerproduct.com

Jul 19 '05 #1
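
The drift described in the question can be reproduced with a minimal sketch like the one below. It assumes a case-insensitive pattern `peter` and trivial stand-ins for `_ok` and `_massage`; the names `naive` and `fixed` are made up for illustration. It also shows one possible way to keep the finditer() approach: apply the replacements from the last match to the first, so that text before each remaining match is never shifted.

```python
import re

# Hypothetical stand-ins for the helpers in the question.
def _ok(match):
    return True

def _massage(word):
    return "_" + word + "_"

regex = re.compile(r"peter", re.I)
original = "Peter Bengtsson PETER, or PeTeR"

# Naive loop: the offsets come from the original string, but the string
# being sliced grows by two characters per replacement, so later slices
# land in the wrong place.
naive = original
for match in regex.finditer(original):
    if not _ok(match):
        continue
    naive = (naive[:match.start()]
             + _massage(naive[match.start():match.end()])
             + naive[match.end():])

# Reverse order: replacing the last match first never moves any text that
# lies before it, so the remaining (earlier) offsets stay valid.
fixed = original
for match in reversed(list(regex.finditer(original))):
    if not _ok(match):
        continue
    fixed = fixed[:match.start()] + _massage(match.group(0)) + fixed[match.end():]

print(naive)
print(fixed)
```

Collecting the matches into a list costs some memory on very large inputs; re.sub() with a callable avoids the bookkeeping altogether.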
Peter Bengtsson wrote:
I've got a regular expression that finds certain words from a longer string.
From "Peter Bengtsson PETER, or PeTeR" it finds: 'Peter','PETER','PeTeR'.
The problem is when there is more than one match. The match.start() and match.end() are for the original string but after the first iteration in the loop the original string changes (it gains 2 characters in length due to the "_"s).
How can I do this concatenation correctly?


I think sub() is more appropriate than finditer() for your problem, e.g.:

>>> import re
>>> def process(match):
...     return "_%s_" % match.group(1).title()
...
>>> re.compile("(peter)", re.I).sub(process, "Peter Bengtsson PETER, or PeTeR")
'_Peter_ Bengtsson _Peter_, or _Peter_'


Peter
Jul 19 '05 #2
As Peter Otten said, sub() is probably what you want. Try:

---------------------------------------------------
import re

def _ok(matchobject):
    # more complicated stuff happens here
    return 1

def _massage(word):
    return "_" + word + "_"

def _massage_or_not(matchobj):
    # Return the matched text unchanged when it should not be replaced.
    if not _ok(matchobj):
        return matchobj.group(0)
    else:
        word = matchobj.group(0)
        return _massage(word)

text = "don't allow the fuck swear word"

rtext = re.sub(r'fuck', _massage_or_not, text)
print(rtext)
---------------------------------------------------

No need to hassle with the changing length of the replaced string.

Best regards,
Adriano.
Jul 19 '05 #3
Peter Otten <__peter__ <at> web.de> writes:
How can I do this concatenation correctly?

I think sub() is more appropriate than finditer() for your problem, e.g.:

>>> import re
>>> def process(match):
...     return "_%s_" % match.group(1).title()
...
>>> re.compile("(peter)", re.I).sub(process, "Peter Bengtsson PETER, or PeTeR")
'_Peter_ Bengtsson _Peter_, or _Peter_'


Ahaa! Great. I didn't realise that I can substitute with a callable that gets
the match object. Hadn't thought of it that way.
Will try this now.

Jul 19 '05 #4
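
To make concrete what "a callable that gets the match object" sees, here is a small sketch. The `seen` list and the `replace` function are hypothetical names used only for illustration. Each call receives a match whose start() and end() index into the original input, however much replacement text has already been produced, which is why the length bookkeeping from the question is not needed.

```python
import re

def _massage(word):
    return "_" + word + "_"

seen = []  # (matched text, start, end) for every match, in order

def replace(match):
    # start()/end() refer to positions in the original input string.
    seen.append((match.group(0), match.start(), match.end()))
    return _massage(match.group(0))

text = "Peter Bengtsson PETER, or PeTeR"
result = re.sub(r"peter", replace, text, flags=re.I)

print(seen)
print(result)
```

Unlike the `.title()` example earlier in the thread, this version keeps each match's original capitalisation, since it wraps `match.group(0)` unchanged.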
