Regular Expression for URL

JJ
I can get a set of URL matches by using a regular expression that has a
named capture group in it, i.e.:

Regex reg_linkTags = new Regex("(?:.....long regular
expression......)(?<url>\\w+|\"[^\"]*\"|'[^']*')(?:(?:\\s+\\w+\\s*=\\s*)(?:
......long regular expression......)", RegexOptions.IgnoreCase |
RegexOptions.Compiled | RegexOptions.Multiline);
MatchCollection tagMatches = reg_linkTags.Matches(myString);

Then get the url named capture group by:
for (int i = 0; i <= (tagMatches.Count - 1); i++)
{

CurrentUrl = tagMatches[i].Result("${url}");
}

But how do I _replace_ the URL in the original string ('myString')? I can
match the URLs, but I cannot seem to replace them. I tried using Regex.Replace,
but I don't think I can use it with either non-capturing groups (which I
wrap the named capture group 'url' in) or with named capture groups. Can I?

Thanks,
JJ
Jun 1 '07 #1
What about a simple replace?

CurrentUrl = tagMatches[i].Result("${url}");
myString = myString.Replace(CurrentUrl, newUrl);

Jun 1 '07 #2
JJ
> What about simple replace?
>
> CurrentUrl = tagMatches[i].Result("${url}");
> myString = myString.Replace(CurrentUrl, newUrl);

Ah, that's what I did in the first place. However, when the href and the src
parts of the tag had identical beginning sections, both were replaced. In
my case I didn't want that to happen.

What I ended up doing was creating another regular expression to pull out just
the href URL so that I could replace it. I just thought there might be an
easier way to use named capture groups to replace (not just capture) text.
Maybe there is, but as yet I haven't a clue how to do it.

Thanks,

JJ
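
To make the problem above concrete, here is a minimal sketch of why a plain String.Replace changes both attributes. The markup, the URLs, and the names SimpleReplaceDemo and ReplaceEverywhere are made up for illustration; they are not taken from the code in this thread.

```csharp
using System;

public static class SimpleReplaceDemo
{
    // String.Replace has no notion of "which attribute": it swaps every
    // occurrence of oldUrl, wherever it appears in the markup.
    public static string ReplaceEverywhere(string html, string oldUrl, string newUrl)
    {
        return html.Replace(oldUrl, newUrl);
    }

    public static void Main()
    {
        // Hypothetical markup: the href and src values start with the same text.
        string html = "<a href=\"/img/cat\"><img src=\"/img/cat.png\"></a>";

        Console.WriteLine(ReplaceEverywhere(html, "/img/cat", "/pages/cat"));
        // Both attributes change:
        // <a href="/pages/cat"><img src="/pages/cat.png"></a>
    }
}
```

Because the href value is a prefix of the src value, the src attribute is rewritten too, which is the unwanted side effect described above.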
Jun 4 '07 #3
I think there is one more possibility.

Change your pattern so that all of the matched content is captured in groups:

(?<textbefore>...)(?<url>...)(?<textafter>...)

In this case you could use the following replacement statement:

string newUrl = "http://.......";
string result = reg_linkTags.Replace(text, "${textbefore}" + newUrl +
"${textafter}");
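
A runnable sketch of this idea, under stated assumptions: the pattern below is a much simplified stand-in for the elided one in the question (it only handles double-quoted href values on <a> tags and keeps the quotes outside the url group), and the names GroupReplaceDemo, HrefPattern and ReplaceHref are hypothetical. Escaping "$" in the new URL is an extra precaution that was not part of the suggestion above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupReplaceDemo
{
    // Simplified, hypothetical pattern: everything up to the opening quote of
    // href goes into textbefore, the value into url, the rest of the tag into
    // textafter.
    private static readonly Regex HrefPattern = new Regex(
        @"(?<textbefore><a\b[^>]*?\bhref\s*=\s*"")(?<url>[^""]*)(?<textafter>""[^>]*>)",
        RegexOptions.IgnoreCase);

    public static string ReplaceHref(string html, string newUrl)
    {
        // In a replacement pattern "$" introduces a substitution, so a literal
        // dollar sign in the new URL has to be written as "$$".
        string safeUrl = newUrl.Replace("$", "$$");
        return HrefPattern.Replace(html, "${textbefore}" + safeUrl + "${textafter}");
    }

    public static void Main()
    {
        string html = "<a href=\"/img/cat\"><img src=\"/img/cat.png\"></a>";

        Console.WriteLine(ReplaceHref(html, "/pages/cat"));
        // Only the href changes:
        // <a href="/pages/cat"><img src="/img/cat.png"></a>
    }
}
```

Since the whole match is rebuilt from textbefore, the new URL and textafter, the src attribute of the nested img tag is never touched.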

You can also use a MatchEvaluator delegate: a custom function that is
called for each match and returns the replacement value.

string result = reg_linkTags.Replace(text, new MatchEvaluator(NewUrl));

string NewUrl(Match m)
{
    // read a named group from the current match
    string x = m.Groups["url"].Value;
    ....
    return "something";
}
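
A self-contained sketch of the MatchEvaluator approach, again assuming a simplified pattern in place of the elided one. The names EvaluatorReplaceDemo, RewriteLink and RewriteLinks, the example.com host, and the rule "only rewrite URLs that start with a slash" are all made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EvaluatorReplaceDemo
{
    // Simplified, hypothetical pattern with the three named groups.
    private static readonly Regex HrefPattern = new Regex(
        @"(?<textbefore><a\b[^>]*?\bhref\s*=\s*"")(?<url>[^""]*)(?<textafter>""[^>]*>)",
        RegexOptions.IgnoreCase);

    // Called once per match; the returned string replaces the whole match.
    private static string RewriteLink(Match m)
    {
        string url = m.Groups["url"].Value;

        // Hypothetical rule: prefix site-relative URLs, leave the others alone.
        string newUrl = url.StartsWith("/", StringComparison.Ordinal)
            ? "http://example.com" + url
            : url;

        return m.Groups["textbefore"].Value + newUrl + m.Groups["textafter"].Value;
    }

    public static string RewriteLinks(string html)
    {
        return HrefPattern.Replace(html, new MatchEvaluator(RewriteLink));
    }

    public static void Main()
    {
        string html = "<a href=\"/img/cat\"><img src=\"/img/cat.png\"></a> "
                    + "<a href=\"http://other.test/\">x</a>";

        Console.WriteLine(RewriteLinks(html));
        // <a href="http://example.com/img/cat"><img src="/img/cat.png"></a> <a href="http://other.test/">x</a>
    }
}
```

Compared with a replacement string, the evaluator can decide per match what to return, and the new URL is inserted literally, so no "$" escaping is needed.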

Jun 4 '07 #4
JJ
I didn't think of that. Thanks,

JJ

"Alexey Smirnov" <al************@gmail.com> wrote in message
news:11**********************@q75g2000hsh.googlegroups.com...
> I think there is one more possibility.
>
> Change your pattern so that all of the matched content is captured in groups:
>
> (?<textbefore>...)(?<url>...)(?<textafter>...)
>
> In this case you could use the following replacement statement:
>
> string newUrl = "http://.......";
> string result = reg_linkTags.Replace(text, "${textbefore}" + newUrl +
> "${textafter}");
>
> You can also use a MatchEvaluator delegate: a custom function that is
> called for each match and returns the replacement value.
>
> string result = reg_linkTags.Replace(text, new MatchEvaluator(NewUrl));
>
> string NewUrl(Match m)
> {
>     string x = m.Groups["url"].Value;
>     ...
>     return "something";
> }

Jun 5 '07 #5
