Regex replace question

Hi,
I have an XML document like this:
<?xml version="1.0" ?>
<object>
<comments>www.site.com/page.aspx?param1=value1&param2=value2</comments>
</object>

Since a bare "&" is invalid in XML, I need to replace every "&" with "&amp;",
but only inside the <comments> element. So I need a Regex pattern that
replaces "&" only between <comments> and </comments>.
Does anybody have an idea how to do this?
Thanks!

--
WWW: http://hardywang.1accesshost.com
ICQ: 3359839
yours Hardy
Nov 15 '05 #1
I'd try something like:

(?<start>\<comments\>.*?)&(?<end>.*?\</comments\>)

The .*? is a non-greedy match, so the start group stops at the first "&"
instead of running on to a later one. You'll need to refer to the start and
end captures in your replacement string (for example "${start}&amp;${end}")
so that the surrounding text ends up back in the result.
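
A minimal sketch of how the replacement with named captures might look in C#, assuming each <comments> element sits on a single line with no nested tags. The class and method names are made up for illustration. The (?!amp;) guard and the loop are additions, not part of the suggestion above: one Replace pass only fixes one "&" per element, because each match consumes the whole element, so the sketch repeats until nothing changes.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper; the name is an assumption made for this example.
public static class CommentAmpersandFixer
{
    // "start" captures <comments> plus the text before a bare '&';
    // "end" captures the rest of the element through </comments>.
    // (?!amp;) skips ampersands that an earlier pass already escaped,
    // which is what lets the loop below terminate.
    private static readonly Regex Pattern = new Regex(
        @"(?<start><comments>.*?)&(?!amp;)(?<end>.*?</comments>)");

    public static string Fix(string xml)
    {
        string previous;
        do
        {
            previous = xml;
            // Put both captures back around the escaped ampersand.
            xml = Pattern.Replace(xml, "${start}&amp;${end}");
        } while (xml != previous); // stop once a pass changes nothing
        return xml;
    }
}
```

Calling CommentAmpersandFixer.Fix(input) on the sample document should leave everything outside <comments> untouched.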

--
Eric Gunnerson

Visit the C# product team at http://www.csharp.net
Eric's blog is at http://weblogs.asp.net/ericgu/

This posting is provided "AS IS" with no warranties, and confers no rights.

Nov 15 '05 #2
Hi
Because .NET supports variable-length lookbehind (which is unusual among
regex engines), you can do something like this:
string output = Regex.Replace(input,
"(?<=\\<comments\\>[^\\<\\>]*?)&(?=[^\\<\\>]*\\</comments\\>)", "&amp;");

This uses lookahead and lookbehind to make sure the & is inside the
comments tags.
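
Here is a small sketch of the lookaround approach wrapped in a method, in case it helps to see it in one piece. The class and method names are assumptions for illustration. The negative lookahead for the predefined XML entities is an extra guard that is not in the pattern above; it is there so that text which is already escaped (for example "&amp;") is not escaped a second time.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper; the name is an assumption made for this example.
public static class CommentAmpersandLookaround
{
    // Lookbehind: <comments> followed only by non-tag characters.
    // Lookahead:  only non-tag characters up to </comments>.
    // (?!(?:amp|lt|gt|quot|apos);) is an added guard that skips an '&'
    // which already begins one of the five predefined XML entities.
    private static readonly Regex Pattern = new Regex(
        @"(?<=<comments>[^<>]*)&(?!(?:amp|lt|gt|quot|apos);)(?=[^<>]*</comments>)");

    public static string Escape(string input)
    {
        // Lookarounds consume nothing, so every bare '&' inside the
        // element is replaced in a single pass.
        return Pattern.Replace(input, "&amp;");
    }
}
```

Because only the "&" itself is matched, the replacement string is just "&amp;" and no captures have to be put back.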

HTH,
greetings

Nov 15 '05 #3
Thanks, man. My program has to receive a text file from a third party, and
unfortunately we cannot control the output on their side. The text file
SHOULD be a plain XML document, but sadly it contains some unescaped "&"
characters. That is why I need to clean them up.

"Nick Malik" <ni*******@hotmail.nospam.com> wrote in message
news:lDCQb.110670$Rc4.804177@attbi_s54...
You stated >> I have a XML like<<
Clearly, you don't have XML, because the string is not well formed. I
assume, therefore, that you are actually CREATING the XML in your code.

So you are creating the XML document in code, and you want to replace all of
the & characters because the resulting XML would otherwise be invalid.

Why not just create an XML object, add the <object> node, add the <comments>
node under it, and put the text in that? The XML object will escape the
characters for you when you output the document.

Building the document as a string is the problem.

On the other hand, if you are creating it in code, you can replace all of
the invalid characters BEFORE placing the text in the XML tags. I believe
that there is a method similar to HTMLEncode that will do this for you...
and then you can add the resulting string to the tags.

So, two solutions... neither requiring difficult Regex programming.
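
Both suggestions can be sketched roughly as follows, assuming the text for the <comments> node is available as a plain string. The class and method names are invented for the example. SecurityElement.Escape is one real .NET method that escapes the XML special characters; whether it is the method meant by "similar to HTMLEncode" is a guess.

```csharp
using System;
using System.Security;
using System.Xml;

// Hypothetical helper; the names are assumptions made for this example.
public static class XmlBuilderSketch
{
    // Approach 1: build the document with the DOM and let it do the escaping.
    public static string BuildWithDom(string commentText)
    {
        var doc = new XmlDocument();
        XmlElement root = doc.CreateElement("object");
        doc.AppendChild(root);

        XmlElement comments = doc.CreateElement("comments");
        comments.InnerText = commentText; // stored as raw, unescaped text
        root.AppendChild(comments);

        return doc.OuterXml;              // '&' is written out as "&amp;"
    }

    // Approach 2: escape the text first, then place it between the tags.
    public static string BuildWithEscape(string commentText)
    {
        return "<object><comments>"
            + SecurityElement.Escape(commentText)
            + "</comments></object>";
    }
}
```

Either way the ampersand is escaped before the markup is ever treated as XML, so no pattern matching is needed.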

Hope this helps,
--- Nick

Nov 15 '05 #4
I think this would do it for you:

string goodXML = Regex.Replace(badXML,
@"(?<=\<comments\>.*)&(?=.*\</comments\>)", "&amp;");

Regular expressions are in a strange but seemingly beautiful domain ;-)

cheers,

mortb

Nov 15 '05 #5
