search/replace in Python

Hello,
I'm having some problems understanding Regexps in Python. I want
to replace "<google>PHRASE</google>" with
"<a href=http://www.google.com/search?q=PHRASE>PHRASE</a>" in a block of
text. How can I achieve this in Python? Sorry for the naive question but
the documentation is really bad :-(

Regards,
GVK
Jul 19 '05 #1
Hi,

2005/5/28, Vamsee Krishna Gomatam <va******@nospam.students.iiit.ac.in>:
Hello,
I'm having some problems understanding Regexps in Python. I want
to replace "<google>PHRASE</google>" with
"<a href=http://www.google.com/search?q=PHRASE>PHRASE</a>" in a block of
text. How can I achieve this in Python? Sorry for the naive question but
the documentation is really bad :-(


It is pretty easy and straightforward. After you have imported the re
module, you can do the job with re.sub or, if you have to do it often,
you can compile the regex first with re.compile
and afterwards use the sub method of the compiled regex.

The interesting part is

re.sub(r"<google>(.*)</google>", r"<a href=http://www.google.com/search?q=\1>\1</a>", text)

because that is where the job is done. The first raw string is the regex
pattern. The second one is the replacement, in which \1 stands for
whatever the group enclosed in () in the pattern matched.

Here is the output of my ipython session.

In [15]: text="This is a <google>Python</google>. And some random nonsense test....."

In [16]: re.sub(r"<google>(.*)</google>", r"<a href=http://www.google.com/search?q=\1>\1</a>", text)

Out[16]: 'This is a <a href=http://www.google.com/search?q=Python>Python</a>. And some random nonsense test.....'
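
The compile-once variant mentioned above could look roughly like this. It is only a sketch, assuming Python 3; the names GOOGLE_TAG, REPLACEMENT and linkify are made up for illustration and do not appear in the thread.

```python
import re

# Compile the pattern once so it can be reused for many blocks of text.
GOOGLE_TAG = re.compile(r"<google>(.*)</google>")

# \1 refers to whatever the parenthesized group in the pattern matched.
REPLACEMENT = r"<a href=http://www.google.com/search?q=\1>\1</a>"


def linkify(text):
    """Replace <google>PHRASE</google> with a Google search link."""
    return GOOGLE_TAG.sub(REPLACEMENT, text)


print(linkify("This is a <google>Python</google>. And some random nonsense test....."))
# prints: This is a <a href=http://www.google.com/search?q=Python>Python</a>. And some random nonsense test.....
```

Calling GOOGLE_TAG.sub(...) behaves like re.sub(...) with the same pattern; compiling mainly saves repeating the pattern string when the substitution is done in many places.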

Best regards,
Oliver
Jul 19 '05 #2
Vamsee Krishna Gomatam wrote:
Hello,
I'm having some problems understanding Regexps in Python. I want
to replace "<google>PHRASE</google>" with
"<a href=http://www.google.com/search?q=PHRASE>PHRASE</a>" in a block of
text. How can I achieve this in Python? Sorry for the naive question but
the documentation is really bad :-(
VKG,

Sorry you had such difficulty; if you can explain in a bit more detail,
we could perhaps fix that "really bad" [compared with what?] documentation.

Whether you are doing a substitution programmatically in Python, or in
any other language, or manually using a text editor, you need to specify
some input text, a pattern, and a replacement.

The syntax for pattern and replacement for what you want to do differs
in only minor details (if at all) among major languages and common text
editors, and it's been that way for years. For example, see the
retro-computing museum exhibit at the end of this posting.

So, did you have any problem determining that PHRASE is represented by
(.*) -- good enough if there is only one occurrence of your target -- in
the pattern, and by \1 in the replacement?
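
The "only one occurrence" caveat is there because (.*) is greedy: it grabs as much as it can. A quick sketch, assuming Python 3 and a made-up sample string with two tags, compares it with the non-greedy (.*?):

```python
import re

# Hypothetical input with two <google> tags on one line.
text = "<google>spam</google> and <google>eggs</google>"
repl = r"<a href=http://www.google.com/search?q=\1>\1</a>"

# Greedy: (.*) runs from the first <google> to the LAST </google>,
# so both tags are swallowed by a single match.
greedy = re.sub(r"<google>(.*)</google>", repl, text)

# Non-greedy: (.*?) stops at the first </google> it can, one match per tag.
lazy = re.sub(r"<google>(.*?)</google>", repl, text)

print(greedy.count("<a href"))  # prints: 1
print(lazy)
# prints: <a href=http://www.google.com/search?q=spam>spam</a> and <a href=http://www.google.com/search?q=eggs>eggs</a>
```

A negated character class such as ([^<]*) is another common way to keep each match inside one tag pair.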

Did you have a problem with this part of the docs (which is closely
followed by an example with \1 in the replacement), and if so, what was
the problem?

sub( pattern, repl, string[, count])

Return the string obtained by replacing the leftmost non-overlapping
occurrences of pattern in string by the replacement repl.
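
As a small illustration of that signature (a sketch assuming Python 3, with a made-up sample string), the optional count argument limits how many of the leftmost matches are replaced:

```python
import re

text = "a-b-c-d"

# Without count, every non-overlapping match is replaced.
all_replaced = re.sub("-", "+", text)

# With count=1, only the leftmost match is replaced.
first_only = re.sub("-", "+", text, count=1)

print(all_replaced)  # prints: a+b+c+d
print(first_only)    # prints: a+b-c-d
```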

Have you used regular expressions before? If not, then you shouldn't
expect to learn how to use them from the documentation, which does
adequately _document_ the provided functionality. This is just like how
the manual supplied with a new car documents the car's functionality,
but doesn't attempt to teach how to drive it. If you haven't done so
already, you may like to do what the documentation suggests:

consult the Regular Expression HOWTO, accessible from
http://www.python.org/doc/howto/.

And out with the magic lantern ... here's the promised blast from the
past, using Oliver's sample input:

C:\junk>dir \bin\ed.com
[snip]
30/11/1985 04:43p 18,936 ed.com
[snip]
C:\junk>ed
Memory available : 59K bytes
a
This is a <google>Python</google>. And some random nonsense test.....
.
.p
This is a <google>Python</google>. And some random nonsense test.....
s/<google>\(.*\)<\/google>/<a href=http:\/\/www.google.com\/search?q=\1>\1<\/a>/p
This is a <a href=http://www.google.com/search?q=Python>Python</a>. And some random nonsense test.....


Cheers,
John
Jul 19 '05 #3
Oliver Andrich wrote:
re.sub(r"<google>(.*)</google>", r"<a href=http://www.google.com/search?q=\1>\1</a>", text)


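For real-world use you'll want to URL-encode the phrase and escape HTML
entities in it:

# Python 2 names; in Python 3 these are html.escape and urllib.parse.quote
import cgi
import re
import urllib

def google_link(match):
    # re.sub passes a match object; group(1) is the text inside the tags
    text = match.group(1)
    # URL-encode the phrase for the query string, then HTML-escape both parts
    url = "http://www.google.com/search?q=" + urllib.quote(text)
    return '<a href="%s">%s</a>' % (cgi.escape(url), cgi.escape(text))

re.sub(r"<google>(.*)</google>", google_link, "<google>foo bar</google>")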
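
For readers on Python 3, where cgi.escape and urllib.quote no longer exist under those names, the same idea might be sketched as follows. The search URL prefix is taken from the original question, the sample phrase is made up, and the non-greedy (.*?) is a small departure from the post.

```python
import html
import re
from urllib.parse import quote


def google_link(match):
    """Build a search link from the text captured inside <google>...</google>."""
    phrase = match.group(1)
    # URL-encode the phrase for the query string (space -> %20, & -> %26).
    url = "http://www.google.com/search?q=" + quote(phrase)
    # HTML-escape both the attribute value and the visible link text.
    return '<a href="%s">%s</a>' % (html.escape(url), html.escape(phrase))


# re.sub accepts a function as the replacement; it is called once per match.
result = re.sub(r"<google>(.*?)</google>", google_link, "<google>foo & bar</google>")
print(result)
# prints: <a href="http://www.google.com/search?q=foo%20%26%20bar">foo &amp; bar</a>
```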
Jul 19 '05 #4
Leif K-Brooks wrote:
Oliver Andrich wrote:
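For real-world use you'll want to URL-encode the phrase and escape HTML
entities in it:

# Python 2 names; in Python 3 these are html.escape and urllib.parse.quote
import cgi
import re
import urllib

def google_link(match):
    # re.sub passes a match object; group(1) is the text inside the tags
    text = match.group(1)
    # URL-encode the phrase for the query string, then HTML-escape both parts
    url = "http://www.google.com/search?q=" + urllib.quote(text)
    return '<a href="%s">%s</a>' % (cgi.escape(url), cgi.escape(text))

re.sub(r"<google>(.*)</google>", google_link, "<google>foo bar</google>")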

Thanks a lot for your reply. I was able to solve it this way:
text = re.sub( "<google>([^<]*)</google>", r'<a href="http://www.google.com/search?q=\1">\1</a>', text )
GVK
Jul 19 '05 #5
Vamsee Krishna Gomatam wrote:
text = re.sub( "<google>([^<]*)</google>", r'<a href="http://www.google.com/search?q=\1">\1</a>', text )


But see what happens when text contains spaces, or quotes, or
ampersands, or...
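
To make that warning concrete, here is a small sketch (Python 3 assumed; naive_link and the sample phrase are made up for illustration) that feeds such a phrase through the plain-string substitution quoted above:

```python
import re


def naive_link(text):
    # The phrase is copied verbatim into both the URL and the link text,
    # with no URL-encoding and no HTML escaping.
    return re.sub("<google>([^<]*)</google>",
                  r'<a href="http://www.google.com/search?q=\1">\1</a>',
                  text)


broken = naive_link('<google>fish & "chips"</google>')
print(broken)
# prints: <a href="http://www.google.com/search?q=fish & "chips"">fish & "chips"</a>
```

The double quote inside the phrase ends the href attribute early, and the raw spaces and ampersand are not valid in a URL or in HTML text, which is why a replacement function that encodes and escapes the phrase is the safer route.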
Jul 19 '05 #6
