Trouble escaping regex strings

Xx r3negade
I'm using regular expressions to parse HTML hyperlinks and I've run into a problem. I'm trying to escape characters
such as '.' and '?' for use in regular expressions, but it's not working.

    import re

    # Grabs a link. For this example, let's say that the string grabbed is '<a href="http://google.com/?q=foo">Click</a>'
    link_url_original = GetLink()

    # Sanitize string for regex use
    link_url_original = re.sub(r"\.", r"\\.", link_url_original)
    link_url_original = re.sub(r"\?", r"\\?", link_url_original)

    toSub = 'http://google.com/?q=foo'
    toRpl = 'http://www.yahoo.com'

    final = re.sub(toSub, toRpl, link_url_original)
    print(final)
The output is:
    <a href="http://google\.com/\?q=foo">Click</a>
Why aren't the added slashes being interpreted as escape characters?
Aug 4 '08 #1
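To see why the output comes back unchanged, here is a minimal sketch using the sample strings from the question. The names `link`, `escaped_link`, and `to_sub` are illustrative, and `str.replace` stands in for the two sanitizing `re.sub` calls. It suggests that the unescaped pattern matches neither the sanitized string nor the original one, so `re.sub` has nothing to replace.

```python
import re

# Sample strings taken from the question.
link = '<a href="http://google.com/?q=foo">Click</a>'
to_sub = "http://google.com/?q=foo"

# What the "sanitize" step does to the subject string: it adds backslashes.
escaped_link = link.replace(".", r"\.").replace("?", r"\?")

# Used as a pattern, to_sub is not literal text: "." means "any character"
# and "/?" means "an optional slash", so the literal "?" is never matched.
match_in_escaped = re.search(to_sub, escaped_link)
match_in_original = re.search(to_sub, link)

print(match_in_escaped, match_in_original)
```

Both searches come back empty, which is why the final `re.sub` returns its input untouched, backslashes included.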
Das123
The regular expression is the match string, not the replace string. The correct syntax would be like:

    result = re.sub(r"\.", ".", subject)
Aug 4 '08 #2
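A short sketch of the point about the match string, assuming a made-up `subject` value: only the first argument of `re.sub` is parsed as a regular expression, so escaping matters there and not in the text being searched.

```python
import re

subject = "google.com"  # hypothetical sample value

# Unescaped "." in the pattern matches every character.
any_char = re.sub(".", "-", subject)

# Escaped r"\." in the pattern matches only a literal dot.
literal_dot = re.sub(r"\.", "-", subject)

# Replacing a literal dot with a dot leaves the string unchanged.
noop = re.sub(r"\.", ".", subject)

print(any_char)
print(literal_dot)
print(noop)
```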
Quote:
The regular expression is the match string, not the replace string. The correct syntax would be like:

    result = re.sub(r"\.", ".", subject)

Replace a "." with another "."? What???
Aug 5 '08 #3
Das123
Ahh, sorry. I didn't understand what you were trying to do.

The answer is that toSub needs to be escaped, not link_url_original...

    import re

    #link_url_original = GetLink()
    link_url_original = '<a href="http://google.com/?q=foo">Click</a>'

    toSub = "http://google.com/?q=foo"
    # Sanitize the pattern (not the subject) for regex use
    toSub = re.sub(r"\.", r"\\.", toSub)
    toSub = re.sub(r"\?", r"\\?", toSub)
    toRpl = "http://www.yahoo.com"

    final = re.sub(toSub, toRpl, link_url_original)
    print(final)
The result is...

<a href="http://www.yahoo.com">Click</a>
Aug 5 '08 #4
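As an alternative to escaping each metacharacter by hand, the standard library's `re.escape` can build the literal pattern in one call. The sketch below reuses the strings from the thread; the snake_case names are illustrative. Since nothing here needs regex features, plain `str.replace` is shown as well.

```python
import re

link_url_original = '<a href="http://google.com/?q=foo">Click</a>'
to_sub = "http://google.com/?q=foo"
to_rpl = "http://www.yahoo.com"

# re.escape backslash-escapes the regex metacharacters in to_sub,
# so the pattern matches the URL as literal text.
pattern = re.escape(to_sub)
final = re.sub(pattern, to_rpl, link_url_original)
print(final)

# For a purely literal substitution, no regex is required at all.
final_plain = link_url_original.replace(to_sub, to_rpl)
print(final_plain)
```

Both lines print `<a href="http://www.yahoo.com">Click</a>`. If the replacement text could ever contain backslashes, passing a function as the replacement (for example `lambda m: to_rpl`) avoids having them read as escape sequences.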
Ah, I can't believe I didn't catch that, thanks.
Aug 5 '08 #5
