XmlDocument converts "

Hello,
I'm using .NET's System.Xml.XmlDocument.
When I do the following:
XmlDocument xml = new XmlDocument();
xml.Load("blah");
....
xml.Save("blub");

I've got the problem that the following expression:
.... snip ...
<Comment>&apos;Depot&apos;</Comment>
.... snip ...

gets converted to

.... snip ...
<Comment>'Depot'</Comment>
.... snip ...

and the parser of another company refuses to parse my XML file :-(
I've searched the web for quite a while now, but didn't find any switch to say:
don't touch this special character.
&amp; stays as it is, but &quot; and &apos; get converted to " and '

It would be great if anybody could help me.

thanks in advance
tobias
__********@gmx.net__
Nov 12 '05 #1
4 Replies


barney wrote:
> I've got the problem that the following expression:
> ... snip ...
> <Comment>&apos;Depot&apos;</Comment>
> ... snip ...
>
> gets converted to
>
> ... snip ...
> <Comment>'Depot'</Comment>
> ... snip ...
>
> and the parser of another Company refuses to parse my Xml - File :-(


They are definitely using a severely broken parser. There is no difference
between &apos; and the ' character; the former is just a reference to the
latter. They are fully equivalent in XML.
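
A minimal sketch of that equivalence, assuming nothing beyond System.Xml. The class name AposEquivalenceDemo and the helper TextOf are made up for this illustration; it loads both spellings of the element and compares the character data a conforming parser reports.

```csharp
using System;
using System.Xml;

public static class AposEquivalenceDemo
{
    // Hypothetical helper: parses an XML string and returns the text
    // content of its root element.
    public static string TextOf(string xml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);
        return doc.DocumentElement.InnerText;
    }

    public static void Main()
    {
        string withEntity = TextOf("<Comment>&apos;Depot&apos;</Comment>");
        string withLiteral = TextOf("<Comment>'Depot'</Comment>");

        // Both spellings yield the same character data: 'Depot'
        Console.WriteLine(withEntity == withLiteral); // True
    }
}
```

Since the parsed content is identical, the serializer is free to choose either spelling when the document is saved again.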

--
Oleg Tkachenko [XML MVP]
http://blog.tkachenko.com
Nov 12 '05 #2

"barney" <b-******@gmx.net> wrote in message news:dd**************************@posting.google.c om...
> I've got the problem that the following expression:
> ... snip ...
> <Comment>&apos;Depot&apos;</Comment>
> ... snip ...
>
> gets converted to
>
> ... snip ...
> <Comment>'Depot'</Comment>
> ... snip ...
>
> and the parser of another Company refuses to parse my Xml - File :-(


Have you tried wrapping the output filename that you pass to the
XmlDocument::Save( ) method in an XmlTextWriter that substitutes
the &apos; and &quot; back into the text nodes that get written?

- - - KeepEntityXmlTextWriter.cs
using System;
using System.Diagnostics;
using System.IO;
using System.Xml;

// XmlTextWriter that emits &apos; and &quot; for ' and " inside text nodes.
public class KeepEntityXmlTextWriter : XmlTextWriter
{
    private static readonly string[] ENTITY_SUBS = new string[] { "&apos;", "&quot;" };
    private static readonly char[] REPLACE_CHARS = new char[] { '\'', '"' };
    private bool insideAttribute;

    // A null encoding makes XmlTextWriter write the file as UTF-8.
    public KeepEntityXmlTextWriter(string filename) : base(filename, null) { }

    public override void WriteStartAttribute(string prefix, string localName, string ns)
    {
        this.insideAttribute = true;
        base.WriteStartAttribute(prefix, localName, ns);
    }

    // Also clear the flag here, so that an attribute with an empty value
    // (for which WriteString may never be called) cannot leave it set
    // for the next text node.
    public override void WriteEndAttribute()
    {
        base.WriteEndAttribute();
        this.insideAttribute = false;
    }

    private void WriteStringWithReplace(string text)
    {
        string[] textSegments = text.Split(KeepEntityXmlTextWriter.REPLACE_CHARS);

        if (textSegments.Length > 1)
        {
            // Assertion: Replace the if-else below when the number of
            // replacement characters and substitute entities has grown
            // greater than 2.
            Debug.Assert(2 == KeepEntityXmlTextWriter.REPLACE_CHARS.Length);

            // pos is the index in text of the delimiter that follows segment i.
            for (int pos = -1, i = 0; i < textSegments.Length; ++i)
            {
                base.WriteString(textSegments[i]);
                pos += textSegments[i].Length + 1;

                if (pos != text.Length)
                {
                    if (text[pos] == KeepEntityXmlTextWriter.REPLACE_CHARS[0])
                        base.WriteRaw(KeepEntityXmlTextWriter.ENTITY_SUBS[0]);
                    else
                        base.WriteRaw(KeepEntityXmlTextWriter.ENTITY_SUBS[1]);
                }
            }
        }
        else base.WriteString(text);
    }

    public override void WriteString(string text)
    {
        // Attribute values (and null text) go straight to the base class.
        if (this.insideAttribute || text == null)
            base.WriteString(text);
        else
            this.WriteStringWithReplace(text);
        this.insideAttribute = false;
    }
}
- - -

This specialized XmlTextWriter replaces ' and " in the XML written to a file
(hardcoded to UTF-8 encoding above, but you can add additional ctors)
when these characters appear in child text nodes. It doesn't touch these
characters when they appear in attribute values (depending on an attribute's
quote character, one or the other character entity reference may be
necessary for escaping). A WriteString( ) call follows a WriteStartAttribute( )
call when it is emitting an attribute value, so I raise a flag in the override
of WriteStartAttribute( ) to detect this scenario. Every time WriteString( )
is called, it resets the flag.

When one or more of these replacement characters is present, the
fragments of the text not containing these characters are passed up to
the base XmlTextWriter's WriteString( ), which emits them with any
necessary substitutions (like ampersand). To prevent XmlTextWriter
from converting the "&apos;" and "&quot;" entities into "&amp;apos;" and
"&amp;quot;" respectively, these must be written out using WriteRaw( ),
which sidesteps this conversion.

If this other vendor's parser gives you further trouble with other characters,
the KeepEntityXmlTextWriter should be straightforward to extend with
additional replacement characters and character entity reference substitutes.
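
A usage sketch of the approach described above, under some loudly stated assumptions: QuoteEntityWriter is a compact in-memory variant written for this illustration (the TextWriter constructor, the class names and the Roundtrip helper are not part of the class posted above), so that the result can be inspected as a string instead of a file. It passes the writer to XmlDocument.Save and checks that text nodes keep the entities while attribute values are left alone.

```csharp
using System;
using System.IO;
using System.Xml;

// Hypothetical in-memory variant of the writer; same idea, different ctor.
public class QuoteEntityWriter : XmlTextWriter
{
    private bool insideAttribute;

    public QuoteEntityWriter(TextWriter output) : base(output) { }

    public override void WriteStartAttribute(string prefix, string localName, string ns)
    {
        insideAttribute = true;
        base.WriteStartAttribute(prefix, localName, ns);
    }

    public override void WriteEndAttribute()
    {
        base.WriteEndAttribute();
        insideAttribute = false;
    }

    public override void WriteString(string text)
    {
        if (insideAttribute || text == null)
        {
            base.WriteString(text);
            return;
        }

        int start = 0;
        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];
            if (c != '\'' && c != '"') continue;

            // Let the base class escape the run before the quote (& and <).
            if (i > start) base.WriteString(text.Substring(start, i - start));
            // WriteRaw bypasses escaping, so the & of the entity survives.
            base.WriteRaw(c == '\'' ? "&apos;" : "&quot;");
            start = i + 1;
        }
        if (start < text.Length) base.WriteString(text.Substring(start));
    }
}

public static class QuoteEntityWriterDemo
{
    // Loads the XML and saves it again through the custom writer.
    public static string Roundtrip(string xml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);
        StringWriter sw = new StringWriter();
        using (QuoteEntityWriter writer = new QuoteEntityWriter(sw))
        {
            doc.Save(writer);
        }
        return sw.ToString();
    }

    public static void Main()
    {
        string xml = "<Comment title=\"it's\">&apos;Depot&apos; &amp; Co</Comment>";
        // The text node keeps &apos; and &amp;, the attribute keeps its plain '.
        Console.WriteLine(Roundtrip(xml));
    }
}
```

With the file-based class from this post, the equivalent call would be to construct it with the output filename and hand it to xml.Save( ) in place of the filename.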
Derek Harmon
Nov 12 '05 #3

Thanks Oleg,

but according to the XML rules I found on the Internet, you have to use the
following entities for those "special" characters:
< &lt;
> &gt;
& &amp;
" &quot;
' &apos;
I think I have to change them manually ...

bye tobias

"Oleg Tkachenko [MVP]" <oleg@no_!spam!_please!tkachenko.com> wrote in message news:<OH**************@TK2MSFTNGP12.phx.gbl>... barney wrote:
> > I've got the problem that the following expression:
> > ... snip ...
> > <Comment>&apos;Depot&apos;</Comment>
> > ... snip ...
> >
> > gets converted to
> >
> > ... snip ...
> > <Comment>'Depot'</Comment>
> > ... snip ...
> >
> > and the parser of another Company refuses to parse my Xml - File :-(
>
> They are using definitely severely broken parser. There is no difference
> between &apos; and ' character - former is just a reference to it. They
> are fully equivalent in XML.

Nov 12 '05 #4

barney wrote:
> but according to Xml rules I found on Internet you have to use the
> following entities for those "special" characters:


That's wrong. You MUST escape < and & (the XML API does it for you)
and MAY escape >, ' and " (usually it's done only when necessary).
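
A small sketch of that rule as seen through XmlDocument, assuming only System.Xml; the class name EscapingDemo and the helper Serialize are invented for this example. It sets raw text on an element and prints what the API writes out.

```csharp
using System;
using System.Xml;

public static class EscapingDemo
{
    // Hypothetical helper: builds <Comment> with the given raw text and
    // returns its serialized form.
    public static string Serialize(string text)
    {
        XmlDocument doc = new XmlDocument();
        XmlElement comment = doc.CreateElement("Comment");
        comment.InnerText = text; // raw, unescaped characters
        return comment.OuterXml;
    }

    public static void Main()
    {
        // < and & must be escaped; the API does it on output.
        Console.WriteLine(Serialize("a < b & c"));
        // prints <Comment>a &lt; b &amp; c</Comment>

        // ' and " are legal as-is in element content, so they are left alone.
        Console.WriteLine(Serialize("'Depot' \"Depot\""));
        // prints <Comment>'Depot' "Depot"</Comment>
    }
}
```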

--
Oleg Tkachenko [XML MVP]
http://blog.tkachenko.com
Nov 12 '05 #5
