According to the XML 1.0 (Third Edition) W3C Recommendation
(http://www.w3.org/TR/2004/REC-xml-20...#sec-line-ends) all #xD, #xA,
and #xD#xA character combinations should be converted to a single #xA
character.
According to the "Reading XML with the XmlReader" section of the ".NET
Framework Developer's Guide" on-line help, the XmlReader will not perform
this normalization by default. You can cause the XmlReader to perform this
normalization by setting the Normalization property (exposed on
XmlTextReader) to true. However, setting it does not appear to normalize
line ends in every situation.
Sample XML File:
<?xml version="1.0"?>
<test>
<input>12345</input>
<input>12
3</input>
<input>12
34</input>
<input>12
3</input>
<input>12
34</input>
<input>12
3</input>
<input>12
34</input>
<input>12
3</input>
<input>12
34</input>
</test>
Sample XSD Schema File:
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="test">
    <xsd:complexType>
      <xsd:choice minOccurs="0" maxOccurs="unbounded">
        <xsd:element name="input">
          <xsd:simpleType>
            <xsd:restriction base="xsd:string">
              <xsd:maxLength value="5"/>
            </xsd:restriction>
          </xsd:simpleType>
        </xsd:element>
      </xsd:choice>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
If the XML file above is loaded using an XmlReader and an XmlValidatingReader
object with the XmlReader.Normalization property set to false, the following
two errors are generated:
Error 1:
The 'input' element has an invalid value according to its data type. An
error occurred at file: Test Case.xml, (7, 5).
<input>12
34</input>
^
Error 2:
The 'input' element has an invalid value according to its data type. An
error occurred at file: Test Case.xml, (9, 25).
<input>12
34</input>
^
These errors are expected, since the input file was not normalized and the
<input> element can be at most 5 characters long. One would assume that
setting the XmlReader.Normalization property to true would eliminate both
errors; however, that is not the case. The following error still occurs even
with the XmlReader.Normalization property set to true:
Error 1:
The 'input' element has an invalid value according to its data type. An
error occurred at file: Test Case.xml, (9, 25).
<input>12
34</input>
^
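The length arithmetic behind the maxLength errors above can be sketched as
follows. This is a minimal illustration in Python rather than .NET, and the
helper name fits_max_length is hypothetical; it only mimics the
<xsd:maxLength value="5"/> facet from the sample schema by counting
characters.

```python
MAX_LENGTH = 5  # mirrors <xsd:maxLength value="5"/> in the sample schema


def fits_max_length(value: str, limit: int = MAX_LENGTH) -> bool:
    """Return True if value has at most `limit` characters (hypothetical helper)."""
    return len(value) <= limit


# "12" + CR LF + "34" is six characters, so it exceeds the facet.
print(fits_max_length("12\r\n34"))  # False
# Once CR LF is collapsed to a single LF, the value is five characters.
print(fits_max_length("12\n34"))    # True
# "12" + CR LF + "3" is five characters and passes either way.
print(fits_max_length("12\r\n3"))   # True
```

Only the "12...34" values are sensitive to whether CR LF counts as one
character or two, which matches the elements the validator complains about.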
It appears as if the XmlReader does not perform normalization if the CR-LF
appears as a character reference (for example, &#xD;&#xA;) rather than as
literal characters. Am I misinterpreting the XML specification, or is the
XmlReader not handling this case properly?
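One way to cross-check the two forms is to feed them to a different parser.
The sketch below assumes Python's standard xml.etree.ElementTree module
(backed by expat), which is not part of the setup described above; it shows
only how that parser treats a literal CR-LF versus one written as character
references, and says nothing about how the XmlReader is implemented.

```python
import xml.etree.ElementTree as ET

# Literal CR LF in the document text (a line break typed into the file).
literal = ET.fromstring("<input>12\r\n34</input>")
# CR LF written as character references instead of literal characters.
referenced = ET.fromstring("<input>12&#xD;&#xA;34</input>")

# Print each parsed value and its length for comparison.
print(repr(literal.text), len(literal.text))
print(repr(referenced.text), len(referenced.text))
```

With this parser the literal form comes back with a single line feed (five
characters), while the character-reference form keeps both characters (six
characters).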
----------------------------------------------------------------------------
Excerpt from the XML 1.0 (Third Edition) W3C Recommendation
(http://www.w3.org/TR/2004/REC-xml-20...ec-line-ends):
2.11 End-of-Line Handling
XML parsed entities are often stored in computer files which, for editing
convenience, are organized into lines. These lines are typically separated
by some combination of the characters CARRIAGE RETURN (#xD) and LINE FEED
(#xA).
To simplify the tasks of applications, the XML processor MUST behave as if
it normalized all line breaks in external parsed entities (including the
document entity) on input, before parsing, by translating both the
two-character sequence #xD #xA and any #xD that is not followed by #xA to a
single #xA character.
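
The translation described in the excerpt can be sketched in a few lines. This
is an illustrative Python helper (the name normalize_line_ends is
hypothetical), applied to the raw entity text before parsing; it is not the
code any particular XML processor uses.

```python
import re


def normalize_line_ends(text: str) -> str:
    """Translate #xD #xA, and any #xD not followed by #xA, to a single #xA."""
    # "\r\n?" matches CR LF as one unit, or a lone CR.
    return re.sub(r"\r\n?", "\n", text)


print(repr(normalize_line_ends("12\r\n34")))  # '12\n34'
print(repr(normalize_line_ends("12\r34")))    # '12\n34'
print(repr(normalize_line_ends("12\n34")))    # '12\n34'
```

Matching CR LF as a single unit matters: replacing every #xD and every #xA
separately would turn one CR LF line break into two line feeds.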