Bytes | Developer Community

Strip CDATA with regex

Hi,

Can somebody here please help me a bit with a regular expression?
I have an XML file where I need to strip the CDATA sections of any
contained data.

E.g.
<xml>
<tag><[CDATA[ some data ]]></tag>
<tag><[CDATA[ some more data ]]></tag>
</xml>

Should end up like this:
<xml>
<tag><[CDATA[]]></tag>
<tag><[CDATA[]]></tag>
</xml>

Now, I have the start and end of the range
(\[CDATA\[)
and
(\]\]>)

But I cannot figure out how to match any run of characters that does
not contain the end of the range.

That is, > is OK and ] is OK,
but ]]> is not OK.

Thanks in advance,
Balaras
Jul 23 '05 #1
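
For the regex question as asked, one common way to "match anything up to, but not including, the end of the range" is a lazy quantifier between the two delimiters. The following is a minimal sketch, not a full XML-aware solution: the names CDATA_PATTERN and stripCdataContents are made up for illustration, and the pattern assumes the correctly spelled opener <![CDATA[. A regex does not understand XML, so text that merely looks like a CDATA section (inside a comment, for example) would be emptied as well.

```javascript
// CDATA_PATTERN and stripCdataContents are hypothetical names used only
// for this sketch.
// Group 1 captures the CDATA opener, group 2 the closer. The lazy
// quantifier [\s\S]*? consumes as few characters as possible (newlines
// included), so each match stops at the first "]]>" after the opener.
// A lone "]" or ">" inside the section is therefore consumed normally.
var CDATA_PATTERN = /(<!\[CDATA\[)[\s\S]*?(\]\]>)/g;

function stripCdataContents(xmlMarkup) {
  // Keep only the opener and the closer of every CDATA section.
  return xmlMarkup.replace(CDATA_PATTERN, '$1$2');
}

var sample = [
  '<xml>',
  '<tag><![CDATA[ some data ]]></tag>',
  '<tag><![CDATA[ some ] more > data ]]></tag>',
  '</xml>'
].join('\n');

console.log(stripCdataContents(sample));
```

The pattern uses only constructs that most regex engines share, so it should carry over to other languages with little change, though that is worth verifying in the target engine.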


Balaras wrote:

Can somebody here please help me a bit with a regular expression?
I have an XML file where I need to strip the CDATA sections of any
contained data.

E.g.
<xml>
<tag><[CDATA[ some data ]]></tag>

It should be
<![CDATA[

<tag><[CDATA[ some more data ]]></tag>
</xml>

Should end up like this:
<xml>
<tag><[CDATA[]]></tag>
<tag><[CDATA[]]></tag>
</xml>


How about parsing the XML into a DOM document, manipulating the CDATA
section nodes, and serializing the result back to markup? A Mozilla example:

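var xmlMarkup = [
  '<xml>',
  '<tag><![CDATA[ some data ]]></tag>',
  '<tag><![CDATA[ some more data ]]></tag>',
  '</xml>'
].join('\r\n');

// Parse the markup into an XML DOM document.
var xmlDocument = new DOMParser().parseFromString(xmlMarkup,
  'application/xml');

// Empty the CDATA section that is the first child of each <tag> element.
var tagElements = xmlDocument.getElementsByTagName('tag');
for (var i = 0; i < tagElements.length; i++) {
  var cdataSection = tagElements[i].firstChild;
  // nodeType 4 is CDATA_SECTION_NODE; firstChild is null for an empty element.
  if (cdataSection && cdataSection.nodeType == 4) {
    cdataSection.data = '';
  }
}

// Serialize the modified document back to markup.
var newXmlMarkup = new XMLSerializer().serializeToString(xmlDocument);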

That yields

<xml>
<tag><![CDATA[]]></tag>
<tag><![CDATA[]]></tag>
</xml>
--

Martin Honnen
http://JavaScript.FAQTs.com/
Jul 23 '05 #2
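
The loop in the reply above inspects only the first child of each <tag> element, so a CDATA section that follows whitespace or sits deeper in the tree would be left alone. A recursive walk is one way to cover those cases. This is a sketch under assumptions: clearCdataSections is a hypothetical helper name, and it relies only on the standard DOM properties nodeType, childNodes and data.

```javascript
// clearCdataSections is a hypothetical helper name used for this sketch.
// It empties every CDATA section at or below the given node and returns
// how many sections were emptied.
function clearCdataSections(node) {
  if (node.nodeType === 4) { // 4 is CDATA_SECTION_NODE
    node.data = '';
    return 1;
  }
  var cleared = 0;
  var children = node.childNodes || [];
  for (var i = 0; i < children.length; i++) {
    cleared += clearCdataSections(children[i]);
  }
  return cleared;
}
```

Calling clearCdataSections(xmlDocument) in place of the loop, before serializing, should then empty every CDATA section regardless of its position in the document.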
Thanks Martin,

Actually, I posted this to c.l.javascript by accident; it was meant for a
PHP group. I have to do some preprocessing before the XML is sent to the
client.

However, your post helped me in another way :)

var newXmlMarkup = new XMLSerializer().serializeToString(xmlDocument);


I did not know about the XMLSerializer, and I need it :)

Does IE have an equivalent, or does .innerHTML return valid XML?

/Balaras
Jul 23 '05 #3


Balaras wrote:

var newXmlMarkup = new XMLSerializer().serializeToString(xmlDocument);


I did not know about the XMLSerializer, and I need it :)

Does IE have an equivalent, or does .innerHTML return valid XML?


With IE, an XML DOM document (or any XML DOM node) has a property named
xml that gives you the serialized markup, so with IE/MSXML you can use
xmlDocument.xml
to get the markup.

--

Martin Honnen
http://JavaScript.FAQTs.com/
Jul 23 '05 #4
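
The two serialization routes mentioned in this thread, XMLSerializer and the IE/MSXML xml property, can be wrapped in a single helper. A minimal sketch, assuming the hypothetical function name serializeXml; it checks the xml property first and falls back to XMLSerializer, which is a design choice of this sketch and not something the thread prescribes.

```javascript
// serializeXml is a hypothetical helper name, not part of any library.
function serializeXml(node) {
  // IE/MSXML nodes expose the serialized markup as a string property named xml.
  if (node && typeof node.xml === 'string') {
    return node.xml;
  }
  // Mozilla and other browsers that implement XMLSerializer.
  if (typeof XMLSerializer !== 'undefined') {
    return new XMLSerializer().serializeToString(node);
  }
  throw new Error('No XML serialization method available');
}
```

With the document from the earlier example, serializeXml(xmlDocument) would return the markup through whichever route the browser supports.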
