Regex replace question

Hi:
I have XML like this:
<?xml version="1.0" ?>
<object>
<comments>www.site.com/page.aspx?param1=value1&param2=value2</comments>
</object>

Since a bare "&" is invalid in XML, I need to replace every "&" with "&amp;",
but only within the <comments> tag. So I need a Regex pattern that replaces "&"
only between <comments> and </comments>.
Does anybody have an idea how to do this?
Thanks!

--
WWW: http://hardywang.1accesshost.com
ICQ: 3359839
yours Hardy
Nov 15 '05 #1
I'd try something like:

(?<start>\<comments\>.+?)&(?<end>.+?\</comments\>)

The .+? is a non-greedy match, so each group takes as little text as it can:
"start" stops at the first & after the opening tag, and "end" stops at the
first closing tag.
You'll need to refer to the start and end captures in your replacement
string (for example, ${start}&amp;${end}) so that that part of the text ends
up back in the string.
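
A minimal sketch of the named-capture approach described above, under a few assumptions that are not in the original post: the class and method names are made up for illustration, [^<]* is swapped in for .+? so a match cannot run past a tag, and (?!amp;) is added so ampersands that are already escaped are skipped. Each match consumes the whole element, so one pass fixes only one ampersand per element; the loop repeats until nothing changes.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not from the original thread.
static class NamedGroupEscape
{
    // "start" captures the opening tag plus the text before the ampersand;
    // "end" captures the text after it plus the closing tag.
    // (?!amp;) skips ampersands that are already escaped, so the loop terminates.
    static readonly Regex Pattern = new Regex(
        @"(?<start><comments>[^<]*?)&(?!amp;)(?<end>[^<]*</comments>)");

    public static string Escape(string input)
    {
        string previous;
        do
        {
            previous = input;
            // ${start} and ${end} put the captured text back around the new "&amp;".
            input = Pattern.Replace(input, "${start}&amp;${end}");
        } while (input != previous);
        return input;
    }

    static void Main()
    {
        string bad = "<object><comments>a.aspx?x=1&y=2&z=3</comments></object>";
        Console.WriteLine(Escape(bad));
        // <object><comments>a.aspx?x=1&amp;y=2&amp;z=3</comments></object>
    }
}
```

The repeated passes are the cost of consuming the element in the match; a lookaround-based pattern avoids them because it matches only the ampersand itself.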

--
Eric Gunnerson

Visit the C# product team at http://www.csharp.net
Eric's blog is at http://weblogs.asp.net/ericgu/

This posting is provided "AS IS" with no warranties, and confers no rights.

Nov 15 '05 #2
Hi
Because .NET supports variable-length lookbehind (which few regex engines do), you can do
something like this:
string output = Regex.Replace(input,
"(?<=\\<comments\\>[^\\<\\>]*?)&(?=[^\\<\\>]*\\</comments\\>)", "&amp;");

This uses lookahead and lookbehind to make sure the & is inside the comments tags.

HTH,
greetings
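
To see what the lookbehind/lookahead pattern does and does not touch, here is a small sketch. The class name and the sample input (including the <title> element) are assumptions made up for illustration; the pattern is the one from the post above, written as a verbatim string so the backslashes are not needed.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not from the original thread.
static class LookaroundEscape
{
    // Lookbehind: <comments> precedes the ampersand with no other tag in between.
    // Lookahead: </comments> follows the ampersand with no other tag in between.
    // Neither lookaround consumes text, so every ampersand is found in one pass.
    static readonly Regex Pattern = new Regex(
        @"(?<=<comments>[^<>]*)&(?=[^<>]*</comments>)");

    public static string Escape(string input)
    {
        return Pattern.Replace(input, "&amp;");
    }

    static void Main()
    {
        string input =
            "<object><title>R&D</title>" +
            "<comments>page.aspx?param1=value1&param2=value2&param3=value3</comments></object>";
        // Only the two ampersands inside <comments> are rewritten;
        // the one inside <title> is left alone.
        Console.WriteLine(Escape(input));
    }
}
```

Note that running this twice over the same text would turn "&amp;" into "&amp;amp;", since the pattern does not check whether an ampersand is already escaped.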




Nov 15 '05 #3
Thanks man. My program needs to receive a text file passed from a
third party, and unfortunately we cannot control the output from the other side. The
text file SHOULD be just an XML document, but sadly there are some bare "&" characters in it. So
that is the reason I need to clean them up.

--
WWW: http://hardywang.1accesshost.com
ICQ: 3359839
yours Hardy
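
For the third-party-file case described above, one possible cleanup step is sketched below. It uses a different technique from the lookaround patterns in this thread: a MatchEvaluator that rewrites the body of each <comments> element, skipping anything that already looks like an entity or character reference, so it is safe to run more than once. The class name, method name, and the choice to verify the result with XmlDocument.LoadXml are assumptions for illustration.

```csharp
using System;
using System.Text.RegularExpressions;
using System.Xml;

// Hypothetical helper, not from the original thread.
static class ThirdPartyCleanup
{
    // Matches each <comments> element; its text is captured as "body".
    static readonly Regex CommentsElement = new Regex(
        @"<comments>(?<body>[^<]*)</comments>");

    // An ampersand that does not already start an entity or character reference.
    static readonly Regex BareAmpersand = new Regex(
        @"&(?!(?:amp|lt|gt|quot|apos|#\d+|#x[0-9A-Fa-f]+);)");

    public static string Clean(string raw)
    {
        // The lambda is called once per <comments> element and returns its replacement.
        return CommentsElement.Replace(raw, m =>
            "<comments>" + BareAmpersand.Replace(m.Groups["body"].Value, "&amp;") + "</comments>");
    }

    static void Main()
    {
        string raw = "<?xml version=\"1.0\" ?><object>" +
            "<comments>www.site.com/page.aspx?param1=value1&param2=value2</comments></object>";

        var doc = new XmlDocument();
        doc.LoadXml(Clean(raw)); // throws XmlException if the text is still not well formed
        Console.WriteLine(doc.SelectSingleNode("/object/comments").InnerText);
    }
}
```

Loading the cleaned text with a real parser is a cheap way to catch any other malformed input the regex does not cover.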
"Nick Malik" <ni*******@hotmail.nospam.com> wrote in message
news:lDCQb.110670$Rc4.804177@attbi_s54...
You stated >> I have a XML like<<
Clearly, you don't have XML, because the string is not well formed. I
assume, therefore, that you are actually CREATING the XML in your code.

So you are creating the XML document in code, and you want to replace all of
the & characters because the resulting XML would otherwise be invalid.

Why not just create an XML object, add the <object> node, add the
<comments> node under it, and put the text in that? The XML object will escape
the characters for you when you output the document.

Creating the object in a string is the problem.

On the other hand, if you are creating it in code, you can replace all of
the invalid characters BEFORE placing the text in the XML tags. I believe that
there is a method similar to HTMLEncode that will do this for you... and
then you can add the resulting string to the tags.

So, two solutions... neither requiring difficult Regex programming.

Hope this helps,
--- Nick
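
Nick's two suggestions might look roughly like this when you do control how the XML is created. The class and method names are made up for illustration, and SecurityElement.Escape is only one candidate for the "method similar to HTMLEncode"; the post does not name a specific method, so that choice is an assumption.

```csharp
using System;
using System.Security;
using System.Xml;

// Hypothetical helper, not from the original thread.
static class BuildInsteadOfPatch
{
    // Option 1: build the document with the XML API, which escapes text on output.
    public static string BuildWithDom(string commentText)
    {
        var doc = new XmlDocument();
        XmlElement root = doc.CreateElement("object");
        XmlElement comments = doc.CreateElement("comments");
        comments.InnerText = commentText; // stored as raw text, escaped when serialized
        root.AppendChild(comments);
        doc.AppendChild(root);
        return doc.OuterXml;
    }

    // Option 2: escape the text first, then place it between the tags by hand.
    public static string BuildWithEscape(string commentText)
    {
        return "<object><comments>" + SecurityElement.Escape(commentText) + "</comments></object>";
    }

    static void Main()
    {
        string text = "www.site.com/page.aspx?param1=value1&param2=value2";
        Console.WriteLine(BuildWithDom(text));
        Console.WriteLine(BuildWithEscape(text));
    }
}
```

Neither option helps when the malformed text arrives from someone else, which is the situation Hardy describes.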
Nov 15 '05 #4
I think this would do it for you:

string goodXML = Regex.Replace(badXML,
@"(?<=\<comments\>.*)&(?=.*\</comments\>)", "&amp;");

Regular expressions are in a strange but seemingly beautiful domain ;-)

cheers,

mortb
Nov 15 '05 #5
