losing carriage returns in CDATA section - how do I prevent this?

I am using Apache Xerces-J 2.5.0. I have \r\n combinations in my
CDATA sections that get converted to \n (or rather, the \r gets lost). I am
using SAX parsing. In the buffer that is passed to my handler, I can see that
wherever I have a \n, the \r is still there one character back, but the start
offset is on the \n. The source is an XML string, so the \r did not get lost
while reading a file. In any case, it seems that the parser should not be
removing the \r in the CDATA section during my SAX events. I am running this on
Windows, so the behavior of converting \r\n to \n might be
related. If it is, this means that the code would not be
portable between Unix and Windows. The parser should give me the text as is.
Isn't this one of the purposes of CDATA? I know that one can put
character references in the XML and that works, but it is really ugly. We
just want to take some text from a source location and put it into the XML
without having to replace \r with &#13;.
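
The workaround the post calls ugly, writing each carriage return as a character reference, might look like the sketch below. `CrEscaper` and `escapeForElementContent` are hypothetical names invented for this illustration, not part of Xerces or any XML library. The sketch assumes the text goes into ordinary element content rather than a CDATA section, because character references are not recognized inside CDATA.

```java
// Hypothetical helper for illustration only; not a Xerces API.
final class CrEscaper {
    private CrEscaper() {}

    /** Escapes text for ordinary element content, writing each CR as &#13;. */
    static String escapeForElementContent(String text) {
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;"); break;
                case '<':  out.append("&lt;");  break;
                case '>':  out.append("&gt;");  break;
                // A character reference is not subject to line-end normalization.
                case '\r': out.append("&#13;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The LF is written as is; only the CR becomes a character reference.
        System.out.println(escapeForElementContent("line one\r\nline two"));
    }
}
```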
Jul 20 '05 #1
In article <xa***************@newssvr14.news.prodigy.com>,
CarlosRivera <Ca**********@badnamefornospam.to> wrote:
> I am using apache xerces J 2.5.0. I have \r\n feed combinations in the
> CDATA sections that get converted to \n (or rather \r gets lost.

XML parsers convert CR-LF and CR to LF, so that you don't have to worry
about what platform you're using.

If you really want to preserve CRs, you have to use a character
reference, but think carefully before doing this: XML is a text
format, and dependence on platform-specific line-end sequences
is not usually a good idea.
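
A small sketch of both behaviours, assuming the SAX parser bundled with the JDK (obtained through `SAXParserFactory`) rather than the standalone Xerces-J 2.5.0 from the question. `LineEndDemo` and `textOf` are names made up for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Illustrative class; the names here are assumptions, not from the thread.
final class LineEndDemo {
    private LineEndDemo() {}

    /** Parses xml and returns all character data the SAX parser reports. */
    static String textOf(String xml) {
        StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                // Copy only the reported slice of the parser's buffer.
                text.append(ch, start, length);
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        // Literal CR LF inside CDATA: the parser should report a single LF.
        String viaCdata = textOf("<d><![CDATA[a\r\nb]]></d>");
        // CR written as a character reference: the CR should be reported.
        String viaCharRef = textOf("<d>a&#13;\nb</d>");
        System.out.println("CDATA keeps CR: " + viaCdata.contains("\r"));
        System.out.println("Char ref keeps CR: " + viaCharRef.contains("\r"));
    }
}
```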

-- Richard
Jul 20 '05 #2
Richard Tobin wrote:
> In article <xa***************@newssvr14.news.prodigy.com>,
> CarlosRivera <Ca**********@badnamefornospam.to> wrote:
>
>> I am using apache xerces J 2.5.0. I have \r\n feed combinations in the
>> CDATA sections that get converted to \n (or rather \r gets lost.
>
> XML parsers convert CR-LF and CR to LF, so that you don't have to worry
> about what platform you're using.

To be more specific, here is an excerpt from the XML 1.0 spec:

====

2.11 End-of-Line Handling

XML parsed entities are often stored in computer files which, for
editing convenience, are organized into lines. These lines are typically
separated by some combination of the characters CARRIAGE RETURN (#xD)
and LINE FEED (#xA).

To simplify the tasks of applications, the XML processor MUST behave as
if it normalized all line breaks in external parsed entities (including
the document entity) on input, before parsing, by translating both the
two-character sequence #xD #xA and any #xD that is not followed by #xA
to a single #xA character.

====

XML 1.1 generalizes that requirement a bit: it also treats NEL (#x85) and
LINE SEPARATOR (#x2028) as line breaks to be normalized to #xA.
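
The XML 1.0 rule quoted above can be sketched as a plain string transformation. This is only an illustration of the rule, not how Xerces implements it; `LineBreaks.normalize` is a hypothetical name.

```java
// Illustration of XML 1.0 section 2.11; not taken from any parser's source.
final class LineBreaks {
    private LineBreaks() {}

    /** Translates each CR LF pair and each lone CR to a single LF. */
    static String normalize(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '\r') {
                out.append('\n');
                // Skip the LF of a CR LF pair so the pair yields only one LF.
                if (i + 1 < input.length() && input.charAt(i + 1) == '\n') {
                    i++;
                }
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // CR LF, lone CR, and lone LF all end up as a single LF.
        System.out.println(normalize("a\r\nb\rc\nd").equals("a\nb\nc\nd")); // prints "true"
    }
}
```

Because this step happens before parsing, the parser never sees the original CR, which is why a CDATA section cannot protect it.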
John Bollinger
jo******@indiana.edu
Jul 20 '05 #3
jacok
Yes, sure, standard formatting and all that.
No problem with that.
Except that this happens with CDATA sections...
ANY character data should be able to go into CDATA sections, even binary data.

I'm working on a security application that signs and verifies XML, and I have the same problem when I'm signing data that contains CRLFs.

As far as I'm concerned, any parser should leave CDATA sections as they are.
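
Since the parser normalizes line ends before it ever looks at CDATA markers, one option for byte-exact payloads is to encode them before embedding. The sketch below uses Base64 from the JDK; this is not something proposed in the thread, and `ExactPayload` with its methods is a hypothetical name chosen for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper; the Base64 approach is an assumption, not from the thread.
final class ExactPayload {
    private ExactPayload() {}

    /** Encodes the payload so that it contains no CR or LF characters. */
    static String encode(String payload) {
        return Base64.getEncoder()
                .encodeToString(payload.getBytes(StandardCharsets.UTF_8));
    }

    /** Restores the payload exactly as it was passed to encode. */
    static String decode(String encoded) {
        return new String(Base64.getDecoder().decode(encoded), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "signed\r\ndata";
        String element = "<payload>" + encode(original) + "</payload>";
        System.out.println(element);
        // The round trip gives back the CR LF pair untouched.
        System.out.println(decode(encode(original)).equals(original)); // prints "true"
    }
}
```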
Apr 28 '06 #4
