  #1  
Old November 12th, 2005, 04:39 AM
beanweed
Guest
 
Posts: n/a
Default ?Best way to escape for xml?

I have an Oracle Procedure that returns xml. There was an exception thrown
when I tried to load an XmlDocument (using LoadXml(String)) because the value
returned from Oracle had an exclamation mark in a text node. So I wrote an
escape method to encode the text coming into the .Net class. But now I am not
sure how to write the unescape method to handle the text when it goes back to
the Oracle db.

Here's what I mean:
- if the data in the db is "abc ! & ' " < > efg", the procedure produces
this element

<data>abc &#x21; & " ' < > xyz</data>

- loading this into an XmlDocument using thedocument.LoadXml(oraclestring)
leaves

<data>abc ! & ' " < > xyz</data>

So do I write a method to change the "&", "<" and ">" back to "&<>" in
Oracle? How many other characters are changed and to what? Why was it bad for
me to give an "!" in the string when .Net then uses it unescaped thereafter?
I await enlightenment peacefully.

  #2  
Old November 12th, 2005, 04:39 AM
beanweed
Guest
 
Posts: n/a
Default RE: ?Best way to escape for xml?

Doh! I forgot to escape the message I sent. The middle part should look like
this:

Here's what I mean:
- if the data in the db is "abc ! & ' " < > efg", the procedure produces
this element

<data>abc &#x21; &#x26; &#x27; &#x22; &#x3C; &#x3E; xyz</data>

- loading this into an XmlDocument using thedocument.LoadXml(oraclestring)
leaves

<data>abc ! &amp; ' " &lt; &gt; xyz</data>

So do I write a method to change the "&amp;", "&lt;" and "&gt;" back to
"&<>" in Oracle?
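
On the question of changing things back: a conforming XML parser resolves numeric character references such as &#x21; and the predefined entities (&amp; &lt; &gt; &apos; &quot;) while it loads the document, so the text node already holds the raw characters and a separate unescape step should not be needed. Here is a minimal sketch in Java using the JDK's built-in DOM parser. It is only an illustration: the class and method names are made up for this example, and the .NET side of this thread would use XmlDocument rather than this code.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper, not code from the thread.
final class EntityRoundTrip {
    private EntityRoundTrip() {}

    /** Parses a small XML string and returns the text of its root element. */
    static String textOf(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            // The parser has already replaced references with real characters.
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalArgumentException("Not well-formed XML: " + xml, e);
        }
    }

    public static void main(String[] args) {
        String xml = "<data>abc &#x21; &#x26; &#x27; &#x22; &#x3C; &#x3E; xyz</data>";
        System.out.println(textOf(xml)); // prints: abc ! & ' " < > xyz
    }
}
```

When the document is written out again, the serializer re-escapes only what has to be escaped, which is why "!" comes back as a plain character while "&" and "<" come back as &amp; and &lt;.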
  #3  
Old November 12th, 2005, 04:39 AM
Ion Vasilian
Guest
 
Posts: n/a
Default Re: ?Best way to escape for xml?

Hi,

You might want to take a look at http://www.w3.org/TR/REC-xml/#syntax
first. It is unlikely that the problem is caused by the '!'; '&' is the
culprit.
Whatever you're getting is not well-formed xml. You'll need to change the
source so that it applies the following character mapping on serialization:
for character data:
< -> &lt;
& -> &amp;
for attribute values the above plus:
' -> &apos;
" -> &quot;
Except in a very specialized scenario, you cannot consistently filter the
input without parsing the xml at the same time, and you probably don't want
to do that. Another alternative is to use CDATA sections for the text. Again,
that's reasonable provided that you have control of the source serialization.
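
The character mapping listed above could be sketched in Java roughly as follows. The class and method names are hypothetical, not an existing library API. One addition that is not part of the mapping above: ">" is also escaped, which is always allowed and avoids accidentally producing the sequence "]]>" in character data.

```java
// Hypothetical helper illustrating the mapping; not code from the thread.
final class XmlEscaper {
    private XmlEscaper() {}

    /** Escapes a string for use as element character data. */
    static String escapeText(String s) {
        return escape(s, false);
    }

    /** Escapes a string for use inside a quoted attribute value. */
    static String escapeAttribute(String s) {
        return escape(s, true);
    }

    private static String escape(String s, boolean attribute) {
        StringBuilder out = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '&') {
                out.append("&amp;");
            } else if (c == '<') {
                out.append("&lt;");
            } else if (c == '>') {
                out.append("&gt;"); // assumption: escaped for safety, see note above
            } else if (attribute && c == '\'') {
                out.append("&apos;");
            } else if (attribute && c == '"') {
                out.append("&quot;");
            } else {
                out.append(c); // everything else, including '!', passes through
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String raw = "abc ! & ' \" < > xyz";
        System.out.println("<data>" + escapeText(raw) + "</data>");
        // prints: <data>abc ! &amp; ' " &lt; &gt; xyz</data>
    }
}
```

Scanning the input once, character by character, avoids the classic mistake of chained replace calls that re-escape the "&" of an entity produced by an earlier replacement.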

Ion
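
For the CDATA alternative mentioned above, a minimal sketch in Java might look like this (the class and method names are hypothetical). The one thing to watch is that the text itself may contain "]]>", which would end the section early, so that sequence is split across two adjacent CDATA sections.

```java
// Hypothetical helper illustrating the CDATA approach; not code from the thread.
final class CdataWriter {
    private CdataWriter() {}

    /** Wraps arbitrary text in one or more CDATA sections. */
    static String wrapInCdata(String s) {
        // "]]>" cannot appear inside a CDATA section, so close the section
        // after "]]" and reopen a new one before ">".
        String safe = s.replace("]]>", "]]]]><![CDATA[>");
        return "<![CDATA[" + safe + "]]>";
    }

    public static void main(String[] args) {
        System.out.println("<data>" + wrapInCdata("abc ! & ' \" < > xyz") + "</data>");
        // prints: <data><![CDATA[abc ! & ' " < > xyz]]></data>
    }
}
```

Inside a CDATA section "&" and "<" need no escaping, but a parser hands back the same text either way, so the choice mostly affects how readable the serialized xml is.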

  #4  
Old November 12th, 2005, 04:39 AM
beanweed
Guest
 
Posts: n/a
Default Re: ?Best way to escape for xml?

Thanks for your help. Subsequent investigation leads me to believe that,
although the original error was and is reported to have been caused by the
line containing "xmldoc.LoadXml(oraclestring)", it was the next line
containing "xmlcontrol.Document = xmldoc" and the transformation of the xml
using an xsl document that is at fault. (Removing this line stops the
problem, changing the xsl solves it.) I will change my encode and decode
methods (in Java on the Oracle server) to comply with your suggestion and the
W3 specification. But I will remain puzzled as to why .Net just ignored
"&#x21;"(?).