RegEx : Match and replace term within HTML tags

I have a search app that searches local HTML files for a specified
term. I then display the pages that contain the term.

I would like to highlight the search term within the HTML when it is
viewed.

I have the following regular expression code:

string searchTerm = "(?<STARTTAG>(<[^>]*>.*))(?<MATCHTERM>(" +
lastSearchTerm + "))(?<ENDTAG>(.*<[^>]*>))";

string replaceString = "${STARTTAG}<span style=\"background-color:#FFFFCC\">${MATCHTERM}</span>${ENDTAG}";

Regex.Replace(htmlBody, searchTerm, replaceString,
RegexOptions.IgnoreCase);

I am trying to match the search term within HTML tags. i.e.

<htmltag>searchterm</htmltag>

and then replace the search term with a span tag to color it, like so:

<htmltag><span style="background-color:#FFFFCC">searchterm</span></htmltag>

This works, but inconsistently (and without a discernible pattern
when it fails).

So, does anyone see anything obviously wrong with my Regular
Expressions? I am pretty new to regular expressions, although I
usually know enough to get stuff done.

mike c
Nov 16 '05 #1
Hi,
inline

"mike c" <m@foo.com> wrote in message
news:r1********************************@4ax.com...
> I have a search app that searches local HTML files for a specified
> term. I then display the pages that contain the term.
>
> I would like to highlight the search term within the HTML when it is
> viewed.
>
> I have the following regular expression code:
>
> string searchTerm = "(?<STARTTAG>(<[^>]*>.*))(?<MATCHTERM>(" +
> lastSearchTerm + "))(?<ENDTAG>(.*<[^>]*>))";
>
> string replaceString = "${STARTTAG}<span style=\"background-color:#FFFFCC\">${MATCHTERM}</span>${ENDTAG}";
>
> Regex.Replace(htmlBody, searchTerm, replaceString,
> RegexOptions.IgnoreCase);
>
> I am trying to match the search term within HTML tags. i.e.
>
> <htmltag>searchterm</htmltag>

Because the .* in ENDTAG is greedy, it will run on to the last tag on the
line. Even if you replace it with .*? (non-greedy), there are still some
problems:

<h1> searchterm <b> searchterm </b> </h1>
<h1> searchterm <br> searchterm </h1>
<h1> searchterm searchterm </h1>

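In all three cases only one searchterm will be replaced, because each match
also consumes the tags on either side of the term and matches cannot overlap.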
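
To see this for yourself, here is a small sketch that runs the pattern from the question over the sample lines above and counts the inserted spans. The class and method names (GreedyDemo, HighlightOriginal, CountSpans) are made up for illustration, and the term is passed through Regex.Escape, which the original code does not do.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GreedyDemo
{
    // The pattern from the question, unchanged except that the term is escaped.
    public static string HighlightOriginal(string html, string term)
    {
        string pattern = "(?<STARTTAG>(<[^>]*>.*))(?<MATCHTERM>(" +
            Regex.Escape(term) + "))(?<ENDTAG>(.*<[^>]*>))";
        string replacement =
            "${STARTTAG}<span style=\"background-color:#FFFFCC\">" +
            "${MATCHTERM}</span>${ENDTAG}";
        return Regex.Replace(html, pattern, replacement, RegexOptions.IgnoreCase);
    }

    // Counts how many highlight spans were inserted into the result.
    public static int CountSpans(string html)
    {
        return Regex.Matches(html, "<span ").Count;
    }
}
```

Calling GreedyDemo.HighlightOriginal on the first sample line wraps only the second searchterm: the greedy .* in STARTTAG swallows the first one, and the single match then covers the whole line.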
If you have valid HTML, a word is outside a tag whenever the first angle
bracket that follows it is a < rather than a >. Expressed as a positive
lookahead, this becomes:

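// Escape the term so that characters such as . or + are matched literally.
string searchTerm = Regex.Escape(lastSearchTerm) + "(?=[^>]*<)";

// $& inserts the matched text itself, so its original case is preserved.
string replaceString = "<span style=\"background-color:#FFFFCC\">$&</span>";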
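
Put together as a complete method, the lookahead approach might look like the sketch below. The class and method names are hypothetical. The sketch escapes the term with Regex.Escape and substitutes the matched text with $& so that the casing found in the page survives a case-insensitive match.

```csharp
using System.Text.RegularExpressions;

public static class LookaheadHighlighter
{
    public static string Highlight(string html, string term)
    {
        // Match the term only when the next angle bracket after it is '<',
        // i.e. when the term sits in text and not inside a tag.
        string pattern = Regex.Escape(term) + "(?=[^>]*<)";

        // "$&" stands for the whole match, so the original casing is kept.
        string replacement = "<span style=\"background-color:#FFFFCC\">$&</span>";

        // Regex.Replace returns the new string; the input is not modified.
        return Regex.Replace(html, pattern, replacement, RegexOptions.IgnoreCase);
    }
}
```

For example, htmlBody = LookaheadHighlighter.Highlight(htmlBody, lastSearchTerm); highlights every occurrence in the text, while an occurrence inside an attribute such as href="searchterm.html" is left alone because the next bracket after it is a >.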

It may still go wrong inside <title> and <script> elements, where the term
would also be wrapped in a span.
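
One possible way around that, offered as a sketch and not as a full HTML parser: let the regex match whole title and script elements first and copy them through unchanged, and only wrap the term when it is found elsewhere. SafeHighlighter is a hypothetical name, and adding style to the skipped elements is my own assumption. Unclosed or nested elements are not handled.

```csharp
using System.Text.RegularExpressions;

public static class SafeHighlighter
{
    public static string Highlight(string html, string term)
    {
        // First alternative: a whole <script>, <style> or <title> element,
        // captured in the "skip" group so it can be copied through untouched.
        // Second alternative: the escaped term, outside any tag.
        string pattern =
            @"(?<skip><(?<tag>script|style|title)\b[^>]*>.*?</\k<tag>\s*>)|" +
            Regex.Escape(term) + "(?=[^>]*<)";

        return Regex.Replace(
            html,
            pattern,
            // Skipped elements are returned as they are; everything else
            // that matched is the term and gets wrapped in the span.
            m => m.Groups["skip"].Success
                ? m.Value
                : "<span style=\"background-color:#FFFFCC\">" + m.Value + "</span>",
            // Singleline lets .*? run across line breaks inside a script.
            RegexOptions.IgnoreCase | RegexOptions.Singleline);
    }
}
```

Because the regex scans left to right and tries the skip alternative first, the opening tag of a script or title element is reached before its contents, so the whole element is consumed before the term inside it can match.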

hth,
greetings



Nov 16 '05 #2
