Replacing whole word using regex in C#

Hi All

Really hoping someone can help me out here with my deficient regex skills :)

I have a function which takes a string of HTML and replaces a term (a word or phrase) with a link. The purpose is to seek out terms which are in a glossary on our site and automatically link them to their definitions. It's slightly complex because certain elements have to be ignored: for example, I don't want to add links within existing links, or link terms contained in e.g. <h1></h1> tags.
Anyway, I have a function which I didn't actually write, hence the problem I have in modifying it.

The problem I have is that I don't want it to replace terms which are not WHOLE words. So for example, if I am searching for the term "fund", it currently does the replace if it finds the word "funds".

I have tried using the regex \b escape, but this doesn't seem to work.

Any help would be much appreciated.

Thanks in advance
John
// Replaces all instances of text match in HTML string, ignoring instances in HTML 
#region public static string PlainTextReplace(string html, string oldString, string newString, string Definition)

// Regex matches for PlainTextReplace
static Regex rxPlainText = new Regex(@"^[^\<]+", RegexOptions.IgnoreCase);
static Regex rxTag = new Regex(@"</?\s*(?'tagname'[^>\s]+).*?>", RegexOptions.Compiled);
static Regex[] rxForbiddenTags = new Regex[]{
new Regex(@"^h\d$", RegexOptions.Compiled), // Matches <h?>
new Regex("^a$", RegexOptions.Compiled)    // Matches <a>
}; 

public static string PlainTextReplace(string html, string oldString, string urlString, string Definition)
{
    int iStringPos=0;
    Stack tagStack = new Stack();
    StringBuilder sbResult = new StringBuilder();
    Match match;
    while (iStringPos < html.Length)
    {
        bool bContainsForbiddenTag = false;
        IEnumerator enumTags = tagStack.GetEnumerator();

        while (enumTags.MoveNext())
        {
            string sCurrentTag = (string) enumTags.Current;
            foreach (Regex rxForbiddenTag in rxForbiddenTags) // loop through all enclosing tags and check for forbidden ones.
            {
                match = rxForbiddenTag.Match(sCurrentTag);
                if (match.Success)
                {
                    bContainsForbiddenTag = true;
                    break;
                }
            }
            if (bContainsForbiddenTag)
                break;
        }

        //if (tagStack.Count == 0) // only perform replacement at tag depth 0.
        if (!bContainsForbiddenTag) // Ignores tag depth. Skips all text enclosed in one or more forbidden tags.
        {
            match = rxPlainText.Match(html, iStringPos, html.Length - iStringPos);
            if (match.Success)
            {
                string searchString = match.Value;
                int index = searchString.ToLower().IndexOf(oldString.ToLower());
                if (index != -1)
                {
                    searchString = "<a href=\"/" + Globals.SiteAlias + "/jargon-" + urlString + ".aspx\" class=\"jargon\" title=\"" + Definition + "\">" + searchString.Substring(index, oldString.Length) + "</a>";
                }

                // Do the replace and move on.
                sbResult.Append( Regex.Replace(match.Value, oldString, searchString, RegexOptions.IgnoreCase) );
                // THIS DOESNT WORK
                //sbResult.Append( Regex.Replace(match.Value, @"\b" + oldString + "\b", searchString, RegexOptions.IgnoreCase) );
                iStringPos = match.Index + match.Length;
            }
        }

        match = rxTag.Match(html, iStringPos, html.Length - iStringPos);
        if (match.Success)
        {
            if (match.Value.StartsWith("</"))
            {
                try
                {
                    if(match.Groups["tagname"].Value.ToLower().Trim().Equals(((string) tagStack.Peek())))
                        tagStack.Pop();    
                }
                catch
                {

                }                    
            }
            else if (match.Value.EndsWith("/>") || match.Value.StartsWith("<!--"))
            {
                // ignore
            }
            else
            {
                tagStack.Push(match.Groups["tagname"].Value.ToLower().Trim());
            }
            sbResult.Append( html.Substring(iStringPos, match.Index + match.Length - iStringPos));
            iStringPos = match.Index + match.Length;
        }

    }
    return sbResult.ToString();
}
#endregion
Oct 2 '07 #1
gauravbhatt16
Hi,

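sbResult.Append(Regex.Replace(match.Value, "\\b" + oldString + "\\b", searchString, RegexOptions.IgnoreCase));

The \\b will work :) In your commented-out line only the first \b is inside a verbatim string (@"\b"). The second one is a regular string literal, where "\b" is the backspace character rather than the regex word boundary, so the pattern never matches. Use "\\b" (or @"\b") on both sides.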

Regards,
Gaurav Bhatt
Apr 27 '10 #2
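
For anyone landing here later, here is a minimal sketch of the whole-word replace discussed above. The names WholeWordDemo, WholeWordPattern and LinkWholeWords are made up for illustration and are not part of the PlainTextReplace function in the question. It assumes the term should be treated as literal text, so it is passed through Regex.Escape, and it uses a MatchEvaluator so each link keeps the casing of the text it matched.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, for illustration only.
public static class WholeWordDemo
{
    // Builds a whole-word pattern for a literal term.
    // Verbatim strings (@"...") keep \b as the two characters '\' and 'b',
    // which the regex engine reads as a word boundary.
    public static string WholeWordPattern(string term)
    {
        return @"\b" + Regex.Escape(term) + @"\b";
    }

    // Wraps each whole-word occurrence of term in a link to href,
    // ignoring case and keeping the casing of the matched text.
    public static string LinkWholeWords(string text, string term, string href)
    {
        return Regex.Replace(
            text,
            WholeWordPattern(term),
            m => "<a href=\"" + href + "\">" + m.Value + "</a>",
            RegexOptions.IgnoreCase);
    }

    // Shows why the original attempt failed: in a regular string literal
    // "\b" is a single backspace character (U+0008), not '\' followed by 'b'.
    public static bool PlainLiteralIsBackspace()
    {
        return "\b".Length == 1 && "\b"[0] == '\u0008' && @"\b".Length == 2;
    }
}
```

Calling WholeWordDemo.LinkWholeWords("Fund managers run funds.", "fund", "/jargon-fund.aspx") links "Fund" and leaves "funds" alone. One caveat: \b only matches between a word character and a non-word character, so a term that starts or ends with punctuation (for example "C#") would probably need a different boundary test.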
