illegal character in xml file

Hi,

I have an XML file that was created as a DOM tree in .Net 1.1 and serialized
to disk. If I try to put character code 1 inside one of the attributes
(don't ask why), it seems to serialize perfectly ok and I get a file that
looks like this:

<element attribute="&#x1;" />

which looks perfectly valid but won't open in an XML viewer because it
says it is an illegal character reference.

What am I missing here? Surely it's legal to put any character reference in
an XML file as long as it's correctly encoded? And if it's not, how come the
framework serialized it for me without complaining?

TIA

Andy
Feb 6 '07 #1
* Andy Fish wrote in microsoft.public.dotnet.xml:
> What am I missing here? Surely it's legal to put any character reference in
> an XML file as long as it's correctly encoded? And if it's not, how come the
> framework serialized it for me without complaining?
No, that's not legal; see http://www.w3.org/TR/xml for which characters
are allowed. In XML 1.0, U+0001 is not one of them, and a character
reference must itself refer to an allowed character, so escaping it does
not help. I think the API docs warn you that some of the serialization
functions do not check for well-formedness.
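
As a rough illustration of the rule above, here is a minimal Python sketch of the `Char` production from the XML 1.0 specification. The function names `is_xml10_char` and `find_illegal_chars` are made up for this example and are not part of any library.

```python
def is_xml10_char(code_point: int) -> bool:
    """Return True if the code point matches the XML 1.0 Char production:
    #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    """
    return (
        code_point in (0x9, 0xA, 0xD)
        or 0x20 <= code_point <= 0xD7FF
        or 0xE000 <= code_point <= 0xFFFD
        or 0x10000 <= code_point <= 0x10FFFF
    )


def find_illegal_chars(text: str) -> list:
    """Return (index, 'U+XXXX') pairs for characters XML 1.0 does not allow."""
    return [
        (index, f"U+{ord(ch):04X}")
        for index, ch in enumerate(text)
        if not is_xml10_char(ord(ch))
    ]


# U+0001 is reported; tab (U+0009) is one of the three allowed control characters.
print(find_illegal_chars("a\x01b"))
print(find_illegal_chars("tab\there"))
```

Running a check like this on attribute values before building the DOM is one way to catch the problem early when the serializer itself does not.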
--
Björn Höhrmann · mailto:bj****@hoehrmann.de · http://bjoern.hoehrmann.de
Weinh. Str. 22 · Telefon: +49(0)621/4309674 · http://www.bjoernsworld.de
68309 Mannheim · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/
Feb 6 '07 #2
Andy Fish wrote:
> I have an XML file that was created as a DOM tree in .Net 1.1 and serialized
> to disk. If I try to put character code 1 inside one of the attributes
> (don't ask why), it seems to serialize perfectly ok and I get a file that
> looks like this:
>
> <element attribute="&#x1;" />
>
> which looks perfectly valid but won't open in an XML viewer because it
> says it is an illegal character reference.
>
> What am I missing here? Surely it's legal to put any character reference in
> an XML file as long as it's correctly encoded? And if it's not, how come the
> framework serialized it for me without complaining?

With XML 1.0, &#x1; is not well-formed. The XML parser and serializer in
.NET 1.x allow it nevertheless, but that is a known flaw. With .NET 2.0
(still supporting XML 1.0), parsing and serialization are stricter
by default, although there is a setting you can use to turn off the
checking of characters and character references.
See
<http://msdn2.microsoft.com/en-us/library/system.xml.xmlreadersettings.checkcharacters.aspx>

I don't think there is anything you can do with .NET 1.x to have the
parser or serializer throw an error on e.g. &#x1;.
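
The same mismatch between a lenient serializer and a strict parser can be reproduced outside .NET. The sketch below uses Python's standard-library `xml.etree.ElementTree` purely as an illustration (the thread itself is about .NET, and `parses_as_xml` is a hypothetical helper name): the serializer writes U+0001 without complaint, while the parser rejects both the raw character and the `&#x1;` reference.

```python
import xml.etree.ElementTree as ET


def parses_as_xml(text: str) -> bool:
    """Return True if the standard-library (expat-based) parser accepts the text."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# The document from the thread: a character reference to U+0001.
rejected = parses_as_xml('<element attribute="&#x1;" />')

# A reference to tab (U+0009) is fine, because tab is an allowed character.
accepted = parses_as_xml('<element attribute="&#x9;" />')

# ElementTree's serializer does not validate characters either, so it
# emits the control character and produces output it cannot read back.
element = ET.Element("element", {"attribute": "\x01"})
serialized = ET.tostring(element, encoding="unicode")
round_trips = parses_as_xml(serialized)

print(rejected, accepted, round_trips)  # False True False
```

So on a platform whose serializer does not check characters, validating the data before it goes into the tree is the practical safeguard.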
--

Martin Honnen --- MVP XML
http://JavaScript.FAQTs.com/
Feb 6 '07 #3
