XPath and CDATA

Hi,

Is it possible to retrieve CDATA Sections from an XML document? If so,
could someone give me a syntax example?

Cheers

Aidy

Mar 17 '06 #1
aidy wrote:
Is it possible to retrieve CDATA Sections from an XML document?


Yes, just parse it up as a textfile with Perl, or somesuch. This is
probably not what you want.

If you want to work with XML though, you'll probably work through some
tool with a DOM interface. You can't see a CDATA section through this
because there just _isn't_ one. CDATA is not part of XML-Infoset, it's
solely an artefact of the particular serialisation of that instance of
that file.
http://www.w3.org/TR/2004/REC-xml-in...40204/#omitted

<a>foo</a>
and
<a><![CDATA[foo]]></a>
are not only "indistinguishable" when viewed through the DOM, they are
absolutely _the_same_thing_. Either of them is an equally valid
serialisation of the same underlying XML content.

<a><![CDATA[<foo>]]></a> can of course be analysed by looking at its
text, and you could set a flag for
"some_encoding_maybe_a_cdata_is_needed", but that's a question of
writing, not reading.
It's fundamental to XML (or at least to good XML design) that you can't
see CDATA and similar issues, and you don't care about them either.
_Use_ the tools, don't fight them. Transparency is good, you don't care
about whether there's a CDATA in there or not. Your app works equally
well either way and doesn't need to know. If it does, then you're doing
something badly wrong.

Mar 17 '06 #2
The XPath data model considers CDATA to be just an alternative markup of
text, so no, you can't distinguish CDATA sections from any other form of
text. You can, of course, retrieve their text value...

--
Joe Kesselman / Beware the fury of a patient man. -- John Dryden
Mar 17 '06 #3


aidy wrote:
Is it possible to retrieve CDATA Sections from an XML document? If so,
could someone give me a syntax example?


The XPath data model does not distinguish between normal text nodes and
CDATA section nodes the way the W3C DOM does. In the XPath 1.0 data
model there are only text nodes, which you select with
text()
So for XPath it does not matter whether you have e.g.
<element>Kibo &amp; Xibo</element>
or
<element><![CDATA[Kibo & Xibo]]></element>
you would use e.g.
/element/text()
to select the text node with XPath and its string value is
Kibo & Xibo
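
A minimal sketch of the same point in Python, assuming the standard library's xml.etree.ElementTree as the parser (the thread itself names no language). Both spellings of the element parse to the same text value:

```python
import xml.etree.ElementTree as ET

# The same content written two ways: with an escaped ampersand,
# and inside a CDATA section.
escaped = ET.fromstring("<element>Kibo &amp; Xibo</element>")
cdata = ET.fromstring("<element><![CDATA[Kibo & Xibo]]></element>")

# ElementTree exposes only the text value; nothing records which
# spelling was used in the file.
print(escaped.text)
print(cdata.text)
print(escaped.text == cdata.text)
```

The comparison on the last line is the practical consequence: code that consumes the text cannot tell, and should not need to tell, which form the author wrote.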

--

Martin Honnen
http://JavaScript.FAQTs.com/
Mar 17 '06 #4
aidy wrote:
Hi,

Is it possible to retrieve CDATA Sections from an XML document? If so,
could someone give me a syntax example?


See http://xml.silmaril.ie/authors/cdata/

If you're using XML software, the answer is no. The CDATA markup
simply prevents its contents being parsed for more markup: the
result is passed through to the processor as text, untouched. So
an XML application never sees the CDATA markup and is unaware that
it ever existed.

If you want to get at it via a non-XML method, write a script in
your favourite language to do so. Warning: this is non-trivial.

It would help if you could tell us why you want to do this. There
may be another way around the problem.

///Peter
--
XML FAQ: http://xml.silmaril.ie/
Mar 17 '06 #5
Andy Dingley <di*****@codesmiths.com> wrote:
aidy wrote:

Is it possible to retrieve CDATA Sections from an XML document?

Yes, just parse it up as a textfile with Perl, or somesuch. This is
probably not what you want.

If you want to work with XML though, you'll probably work through some
tool with a DOM interface. You can't see a CDATA section through this
because there just _isn't_ one.


CDATAs are in the DOM
http://java.sun.com/j2se/1.4.2/docs/...TASection.html

but CDATAs are not part of the data model, so, as Andy explained, it is
useless to deal with them; what your application needs is to retrieve
some text content.
High-level tools such as XPath don't allow you to handle CDATAs;
all you will find is text: sibling text nodes (for example a
mix of CDATA and non-CDATA text) are supplied as a single text item.

CDATAs are just a convenient way to escape characters in XML, but an
application doesn't care how the text content was written.
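
The difference between the two levels can be sketched in Python, assuming xml.dom.minidom as the DOM implementation and xml.etree.ElementTree as the higher-level view (both are stand-ins chosen for illustration; the post refers to the Java DOM). The sample document is hypothetical:

```python
from xml.dom import minidom
import xml.etree.ElementTree as ET

# Hypothetical sample: plain text, a CDATA section, then plain text again.
SOURCE = "<a>foo<![CDATA[ & ]]>bar</a>"

# A DOM parser (minidom here) keeps the CDATA section as its own node type.
dom_children = minidom.parseString(SOURCE).documentElement.childNodes
node_types = [child.nodeType for child in dom_children]
has_cdata_node = minidom.Node.CDATA_SECTION_NODE in node_types

# A higher-level view (ElementTree here) hands back one merged text value.
merged_text = ET.fromstring(SOURCE).text

print(node_types, has_cdata_node)
print(repr(merged_text))
```

So the CDATA section is visible if you go looking for it at DOM level, but the merged string is what an application normally wants.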


--
Cordialement,

///
(. .)
--------ooO--(_)--Ooo--------
| Philippe Poulard |
-----------------------------
http://reflex.gforge.inria.fr/
Have the RefleX !
Mar 17 '06 #6
Hi,

<SNIP>

<?xml version="1.0" ?>
<SAFS_LOG>
  <LOG_OPENED date="16-03-2006" time="15:50:19" />
  <LOG_VERSION major="1" minor="1" />
  <LOG_MESSAGE type="GENERIC" date="16-03-2006" time="15:50:19">
    <MESSAGE_TEXT>
      <![CDATA[ getTrimmedField result 'TESTID_10' assigned to variable '^TestData'.]]>
    </MESSAGE_TEXT>
  </LOG_MESSAGE>
  <LOG_MESSAGE type="GENERIC" date="16-03-2006" time="15:50:20">
    <MESSAGE_TEXT>
      <![CDATA[ TESTID_10]]>
    </MESSAGE_TEXT>
  </LOG_MESSAGE>

<SNIP>
This is a snippet of the XML document. What I am struggling with is
retrieving, through XPath, all the CDATA where the text includes
'TESTID' (third line from the bottom).

So in the HTML I will be producing I will have

TESTID

TestID_10
TestID_20
TestID_30

is it possible to conditionally extract text?

Aidy

Mar 24 '06 #7
As far as XPath is concerned CDATA is just text. So write an XPath which
searches for text nodes (or, probably more accurately for your case,
<MESSAGE_TEXT> elements) whose text value contains "TESTID".
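
In XPath 1.0 that would be something like //MESSAGE_TEXT[contains(., 'TESTID')]. A rough Python equivalent is sketched below, assuming xml.etree.ElementTree; its XPath subset has no contains(), so the filter is done in Python. The log is a trimmed copy of the snippet from the question, with the closing tag added so that it parses and a third, made-up message added to show a non-match:

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the posted log. The closing </SAFS_LOG> and the third
# LOG_MESSAGE are additions for illustration, not part of the original.
LOG = """<SAFS_LOG>
  <LOG_MESSAGE type="GENERIC" date="16-03-2006" time="15:50:19">
    <MESSAGE_TEXT><![CDATA[ getTrimmedField result 'TESTID_10' assigned to variable '^TestData'.]]></MESSAGE_TEXT>
  </LOG_MESSAGE>
  <LOG_MESSAGE type="GENERIC" date="16-03-2006" time="15:50:20">
    <MESSAGE_TEXT><![CDATA[ TESTID_10]]></MESSAGE_TEXT>
  </LOG_MESSAGE>
  <LOG_MESSAGE type="GENERIC" date="16-03-2006" time="15:50:21">
    <MESSAGE_TEXT><![CDATA[ unrelated message]]></MESSAGE_TEXT>
  </LOG_MESSAGE>
</SAFS_LOG>"""

root = ET.fromstring(LOG)

# Keep the text of every MESSAGE_TEXT element whose text contains "TESTID".
matches = [
    (el.text or "").strip()
    for el in root.iter("MESSAGE_TEXT")
    if "TESTID" in (el.text or "")
]
print(matches)
```

Note that both of the first two messages match, because the first one also mentions 'TESTID_10' in passing; a stricter test (for example, text that starts with the marker) may be needed to get only the IDs.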

--
Joe Kesselman / Beware the fury of a patient man. -- John Dryden
Mar 24 '06 #8
> What I am struggling with is
> to retrieve through XPath all the CDATA where the text includes
> 'TESTID' (third line from the bottom).


Try this
select="//MESSAGE_TEXT[contains(translate(text(), 'TESID_- ', 'tesid'), 'testid')]"
(It might work. But it's Friday, so don't be surprised if it's buggy)

Mar 24 '06 #9
aidy wrote:
is it possible to conditionally extract text?


Yes, but by the time your application receives the information from
the parser, there won't be any evidence of there having been CDATA
markup, so what you really mean is "can I conditionally extract the
text in MESSAGE_TEXT elements?"

<xsl:template match="MESSAGE_TEXT[contains(.,'TESTID')]">
<xsl:value-of select="."/>
</xsl:template>

///Peter
--
XML FAQ: http://xml.silmaril.ie/
Mar 25 '06 #10
Below is the XSL I have got. I am trying to assign the value of the
MESSAGE_TEXT node to a variable, then do a contains to gather whether
that text has a substring of 'TESTID'. Then I wanna write that value to
the HTML. However when I run the transformation, I receive this error:
'A string literal was expected, but no opening quote character was
found. <xsl:if test=contains("$host1","TESTID")', so at the moment I
am a bit stuck.

<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:template match="/">
<HTML>
<xsl:for-each select="SAFS_LOG/LOG_MESSAGE">
<P>
<!-- <xsl:value-of select="MESSAGE_TEXT"/> -->

<xsl:variable name="host1" select="MESSAGE_TEXT"/>
<xsl:if test=contains("$host1","TESTID")
<xsl:value-of select="$host1"/>
</P>
</xsl:if>
</xsl:for-each>
</HTML>
</xsl:template>
</xsl:stylesheet>

Cheers

Aidy

Mar 25 '06 #11
That sounds like you're trying to style HTML. HTML is not XML -- it's
SGML -- and permits some things that XML doesn't, such as attribute
values without quotes around them.

If that's what you're trying to do, you need to get a parser that can
read HTML (I believe the W3C's "tidy" tool can be persuaded to do this
for you, or try the NekoHTML parser based on the Apache Xerces system)
and do a bit of simple API programming to use that to feed the document
to the stylesheet.

--
() ASCII Ribbon Campaign | Joe Kesselman
/\ Stamp out HTML e-mail! | System architexture and kinetic poetry
Mar 25 '06 #12
aidy wrote:
Below is the XSL I have got. I am trying to assign the value of the
MESSAGE_TEXT node to a variable, then do a contains to gather whether
that text has a substring of 'TESTID'. Then I wanna write that value to
the HTML. However when I run the transformation, I receive this error:
'A string literal was expected, but no opening quote character was
found. <xsl:if test=contains("$host1","TESTID")', so at the moment I
am a bit stuck.


I think you need to remove the quotes around "$host1".

///Peter
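
The quoted error reads like an XML well-formedness error rather than an XPath one: as posted, the test attribute has no quotes around its value and the xsl:if start tag is never closed. Separately, inside the expression "$host1" in quotes would be a string literal, not a variable reference. A small Python sketch of the well-formedness part, assuming xml.etree.ElementTree as the parser and a hypothetical helper named is_well_formed:

```python
import xml.etree.ElementTree as ET

# As posted: attribute value unquoted, start tag never closed.
broken = ('<xsl:if xmlns:xsl="http://www.w3.org/1999/XSL/Transform" '
          'test=contains("$host1","TESTID")<p/></xsl:if>')

# Repaired: attribute value in double quotes, variable reference unquoted,
# string literal in single quotes.
fixed = '''<xsl:if xmlns:xsl="http://www.w3.org/1999/XSL/Transform" test="contains($host1,'TESTID')"><p/></xsl:if>'''


def is_well_formed(text: str) -> bool:
    """Return True if the text parses as XML (hypothetical helper)."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


print(is_well_formed(broken))
print(is_well_formed(fixed))
```

This only checks that the stylesheet fragment is parseable XML; it does not run the transformation.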
Mar 26 '06 #13
Hi,

This is a snippet of the XML

<SAFS_LOG>
  <LOG_MESSAGE type="GENERIC" date="27-03-2006" time="10:56:06">
    <MESSAGE_TEXT><![CDATA[.TESTID_10]]></MESSAGE_TEXT>
  </LOG_MESSAGE>
  <LOG_MESSAGE type="FAILED" date="27-03-2006" time="10:56:08">
    <MESSAGE_TEXT><![CDATA[Country Code: GB <> AU ]]></MESSAGE_TEXT>
  </LOG_MESSAGE>
  <LOG_MESSAGE type="GENERIC" date="27-03-2006" time="10:56:10">
    <MESSAGE_TEXT><![CDATA[.TESTID_20]]></MESSAGE_TEXT>
  </LOG_MESSAGE>
  <LOG_MESSAGE type="PASSED" date="27-03-2006" time="10:56:13">
    <MESSAGE_TEXT><![CDATA[AddressServiceWin DOES EXIST as expected.]]></MESSAGE_TEXT>
  </LOG_MESSAGE>
</SAFS_LOG>

I have managed to extract the TESTIDs from the MESSAGE_TEXT elements by
using this XSL:

<xsl:template match="/">
  <HTML>
    <head><title>Address Service Test Log </title></head>

    <body>
      <h2>Test Summary</h2>
      <tr><th><B> TEST ID</B></th></tr>
      <xsl:for-each select="SAFS_LOG/LOG_MESSAGE">
        <xsl:variable name="host1" select="MESSAGE_TEXT"/>
        <xsl:if test="(contains($host1,'.TESTID'))">
          <table border="1">
            <td> <xsl:value-of select="$host1"/> </td>
          </table>
        </xsl:if>
      </xsl:for-each>
    </body>
  </HTML>
</xsl:template>
</xsl:stylesheet>

In the HTML I get something like this

Test Summary
TEST ID

.TESTID_10
.TESTID_20

Now I want to extract from the xml whether these tests have passed or
failed - as we can see above we have got a 'FAILED' on the TESTID_10
and a 'PASSED' on TESTID_20

The code I have added is enclosed in asterisks

<xsl:for-each select="SAFS_LOG/LOG_MESSAGE">
  <xsl:variable name="host1" select="MESSAGE_TEXT"/>
  <xsl:if test="(contains($host1,'.TESTID'))">
    <table border="1">
      <td> <xsl:value-of select="$host1"/> </td>

      **************************************************************
      <xsl:variable name="host2" select="@type"/>
      <xsl:if test="(contains($host2,'FAILED'))">
        <td> <xsl:value-of select="MESSAGE_TEXT"/> </td>
      </xsl:if>
      **************************************************************

    </table>
  </xsl:if>
</xsl:for-each>

I don't seem to be returning any 'FAILED' even though they are in
<LOG_MESSAGE>. Does anyone know why?

Cheers

Aidy
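
One likely cause, going only by the sample above: inside the for-each, @type is evaluated on the LOG_MESSAGE that carries the .TESTID text, and in the sample that element has type="GENERIC". The FAILED or PASSED value sits on the next LOG_MESSAGE. In XSLT that would presumably be reached with something like following-sibling::LOG_MESSAGE[1]/@type. A Python sketch of the pairing, assuming xml.etree.ElementTree, the sample log with its second CDATA section closed, and the assumption that the verdict is always the message immediately after the ID:

```python
import xml.etree.ElementTree as ET

# The posted sample, with the second CDATA section closed so that it parses.
LOG = """<SAFS_LOG>
<LOG_MESSAGE type="GENERIC" date="27-03-2006" time="10:56:06">
<MESSAGE_TEXT><![CDATA[.TESTID_10]]></MESSAGE_TEXT>
</LOG_MESSAGE>
<LOG_MESSAGE type="FAILED" date="27-03-2006" time="10:56:08">
<MESSAGE_TEXT><![CDATA[Country Code: GB <> AU ]]></MESSAGE_TEXT>
</LOG_MESSAGE>
<LOG_MESSAGE type="GENERIC" date="27-03-2006" time="10:56:10">
<MESSAGE_TEXT><![CDATA[.TESTID_20]]></MESSAGE_TEXT>
</LOG_MESSAGE>
<LOG_MESSAGE type="PASSED" date="27-03-2006" time="10:56:13">
<MESSAGE_TEXT><![CDATA[AddressServiceWin DOES EXIST as expected.]]></MESSAGE_TEXT>
</LOG_MESSAGE>
</SAFS_LOG>"""

messages = ET.fromstring(LOG).findall("LOG_MESSAGE")

results = []
# Walk the messages in pairs: each one together with the one after it.
for current, following in zip(messages, messages[1:]):
    text = (current.findtext("MESSAGE_TEXT") or "").strip()
    if ".TESTID" in text:
        # The verdict is on the NEXT LOG_MESSAGE, not on the one with the ID.
        results.append((text, current.get("type"), following.get("type")))

print(results)
```

The middle value of each tuple is what the posted stylesheet is testing, which is why the FAILED check never fires.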

Mar 27 '06 #14
I need to extract the data from a CDATA section of an XML document. What
the CDATA section contains does not matter to me; I just need that
section as a string, a file, or whatever. Please help.


Mar 31 '06 #15
In article <er**************@news.uswest.net>,
Rajesh Kochhar <ra************@abnamro.com> wrote:
> I need to extract the data from a CDATA section of an XML document. What
> the CDATA section contains does not matter to me; I just need that
> section as a string, a file, or whatever. Please help.


You can't extract a CDATA section with XPath. You will have to specify
the characters you want by some other means, such as the element they
are contained in.

-- Richard
Mar 31 '06 #16
