
XML and carriage returns

Hi,

I'm using System.Data.DataSet.ReadXml to convert some XML from a web service
to a DataSet. The XML looks like:

<?xml version="1.0"
encoding="UTF-8"?><root><prioritiesTable><description>blah blah&#13;blah
blah</description></prioritiesTable></root>

The important thing here is the &#13;, which I believe should be a single
carriage return. However, when ReadXml has finished, it has been converted to
what appears to be a carriage return and line feed. That is a big problem
for me because it adds an extra character, which in turn can cause a
constraint violation in my database.

Is it doing it wrong? What can I do to change this behaviour??

please help!
thanks in advance
Josh
Nov 12 '05 #1
4 Replies


Hi Josh -

ASCII character 13 is the carriage return character - but in your XML you're using #13 which is the hexadecimal form for the number 19. This won't give you the results you're after. Instead use &13; (i.e., remove the # character) or &0d; to represent the carriage return.

Because this might have been a typo when you made your original post, I would also suggest that you create an XmlTextReader object and supply that to the DataSet.ReadXml() method instead. This will at the very least allow you to experiment with the properties on the XmlTextReader class.

Hope this helps, or points you in the right direction.

--
Tim Roberts
Kilostar Solutions Ltd.

Nov 12 '05 #2

Hi, thanks for your reply Tim.

I'm probably more confused though! lol
I didn't write the component that supplied the XML containing "&#13;". But I
know that in the DB where it got the data from it was a carriage return; I
inserted it myself using Ctrl+M. So is "&#13;" ('and hash one three') a valid
character for UTF-8 encoding? I thought nothing below 20 other than carriage
return and a couple of others was legal in UTF-8? If it's ASCII 19, that's
character DC3, which definitely isn't what it's meant to be.
Is it perhaps an HTML escape sequence that isn't strictly XML?

Any ideas would be much appreciated.
thanks
Josh

Nov 12 '05 #3

tim-kilostar wrote:
Hi Josh -

ASCII character 13 is the carriage return character - but in your XML
you're using #13 which is the hexadecimal form for the number 19.


No, in XML, numeric character references of the form
&#dddd;
(d meaning 0..9) use decimal notation, so
&#13;
is the Unicode character with character code 13, and that is the carriage
return character.
If you want to use hexadecimal notation, then you need to use
&#xdd;
(d meaning 0..9A..F); for instance, for the carriage return you need
&#xD;
See the XML specification
http://www.w3.org/TR/REC-xml/
in particular
http://www.w3.org/TR/REC-xml/#sec-references

--

Martin Honnen
http://JavaScript.FAQTs.com/

Nov 12 '05 #4
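
The difference between the reference forms discussed in this reply can be checked with a few lines of code. What follows is only a minimal sketch, written in Python rather than .NET (an assumption made to keep it short and runnable) and using the standard library's xml.etree.ElementTree. The helper names parse_description and is_well_formed are hypothetical. It shows that &#13; and &#xD; resolve to the same character, while the &13; and &0d; forms suggested earlier in the thread are rejected as not well-formed.

```python
import xml.etree.ElementTree as ET


def parse_description(body: str) -> str:
    """Parse a tiny document and return the text of <description>."""
    root = ET.fromstring(f"<root><description>{body}</description></root>")
    return root.find("description").text


def is_well_formed(body: str) -> bool:
    """Return True if the body parses, False if the parser rejects it."""
    try:
        parse_description(body)
        return True
    except ET.ParseError:
        return False


# Decimal and hexadecimal character references both yield U+000D.
decimal_text = parse_description("blah blah&#13;blah")
hex_text = parse_description("blah blah&#xD;blah")

# Without the '#', the parser sees an entity reference with an invalid name.
without_hash_ok = is_well_formed("blah blah&13;blah")
```

Comparing decimal_text and hex_text with "blah blah\rblah" is a quick way to confirm that both spellings carry a bare carriage return.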

Josh wrote:
I'm using System.Data.DataSet.ReadXml to convert some XML from a web service
to a DataSet. The XML looks like:

<?xml version="1.0"
encoding="UTF-8"?><root><prioritiesTable><description>blah blah&#13;blah
blah</description></prioritiesTable></root>

The important thing here is the &#13;, which I believe should be a single
carriage return. However, when ReadXml has finished, it has been converted to
what appears to be a carriage return and line feed. That is a big problem
for me because it adds an extra character, which in turn can cause a
constraint violation in my database.


XML treats end-of-line characters semantically, not syntactically, so
such characters have to be normalized for the sake of platform
independence.
According to the XML spec:
"To simplify the tasks of applications, the XML processor MUST behave as
if it normalized all line breaks in external parsed entities (including
the document entity) on input, before parsing, by translating both the
two-character sequence #xD #xA and any #xD that is not followed by #xA
to a single #xA character."
And when XML is being serialized to bytes, #xA is usually serialized in
a platform-dependent way, which is #xD#xA on Windows.

--
Oleg Tkachenko [XML MVP]
http://blog.tkachenko.com
Nov 12 '05 #5
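
The normalization rule quoted in this reply can be observed directly. The sketch below is again Python with xml.etree.ElementTree, used here only as a convenient conforming parser and not as a statement about how DataSet.ReadXml is implemented. It feeds the parser a literal CR LF, a lone literal CR, and a &#13; character reference. Because the rule applies to the input before parsing, literal line breaks become a single #xA, whereas a character reference is resolved later and keeps its #xD. The function crlf_to_cr is a hypothetical clean-up step, shown on the assumption that a single carriage return is what the database column expects.

```python
import xml.etree.ElementTree as ET


def description_text(xml_bytes: bytes) -> str:
    """Parse the document and return the text of <description>."""
    root = ET.fromstring(xml_bytes)
    return root.find("description").text


# A literal CR LF pair in the document is normalized to a single LF on input.
literal_crlf = description_text(b"<root><description>a\r\nb</description></root>")

# A lone literal CR is normalized to LF as well.
literal_cr = description_text(b"<root><description>a\rb</description></root>")

# A character reference is resolved after normalization, so the CR survives.
referenced_cr = description_text(b"<root><description>a&#13;b</description></root>")


def crlf_to_cr(text: str) -> str:
    """Hypothetical clean-up: collapse each CR LF pair back to a single CR."""
    return text.replace("\r\n", "\r")
```

If a later serialization step has expanded the line break to CR LF, something like crlf_to_cr(value) applied just before the database insert would bring the length back to what the constraint allows. Whether that is the right place to fix it depends on where the extra character is actually introduced.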
