
xml4c child nodes

P: n/a
I'm trying to iterate through a list of child nodes. It seems that to
get the text value of a node, you have to call
node->getFirstChild()->getNodeValue(). That said, there is a
hasChildNodes method, but if I use that, it also counts the "text" nodes,
which I don't want to include.

If this is my XML:

<A>
<B></B>
<C></C>
</A>

If I have a node for B, I thought getNextSibling would return C, but it
didn't. It returned #text.

Confused.

Oct 12 '06 #1
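
The point about the text value living in a child node can be illustrated with a minimal sketch. It uses Python's standard xml.dom.minidom as a stand-in for xml4c, on the assumption that both follow the same W3C DOM model; the properties firstChild and nodeValue correspond to the getFirstChild() and getNodeValue() calls named above, and the helper text_of is hypothetical.

```python
from xml.dom import minidom

doc = minidom.parseString("<A><B>Data</B><C></C></A>")
b = doc.getElementsByTagName("B")[0]
c = doc.getElementsByTagName("C")[0]

# The string "Data" is not stored on B itself; it lives in a Text child node.
print(b.nodeValue)             # None
print(b.firstChild.nodeValue)  # Data

def text_of(element):
    """Return the value of the first child if it is a text node, else ''.

    Hypothetical helper: an empty element such as <C></C> has no Text
    child at all, so firstChild must be checked before it is dereferenced.
    """
    child = element.firstChild
    if child is not None and child.nodeType == child.TEXT_NODE:
        return child.nodeValue
    return ""

print(text_of(b))        # Data
print(repr(text_of(c)))  # ''
```

The guard matters because chaining the two calls on an element without a text child dereferences a missing node.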
14 Replies


P: n/a
ma*****@yahoo.com wrote:
> <A>
> <B></B>
> <C></C>
> </A>
>
> if I have a node for B, I thought getNextSibling would return C, but it
> didn't. it returned #text.

If you'd stopped to look at the value of that text node, you'd have
answered your own question -- it's the whitespace (newline and
indentation) between the B's end-tag and the start-tag for C.

XML doesn't know whether that whitespace text is meaningful or not, so
XML APIs will deliver it. Your app needs to deal with that appropriately.
--
() ASCII Ribbon Campaign | Joe Kesselman
/\ Stamp out HTML e-mail! | System architexture and kinetic poetry
Oct 12 '06 #2
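
The whitespace node described in this reply can be made visible with a short sketch. It assumes Python's xml.dom.minidom as a stand-in for xml4c (same W3C DOM model, with nextSibling in place of getNextSibling()), and the two-space indentation in the sample string is an assumption added for illustration.

```python
from xml.dom import minidom

# Same document as in the question, with a newline and two spaces of
# indentation between the elements (the indentation is assumed here).
XML = "<A>\n  <B></B>\n  <C></C>\n</A>"

doc = minidom.parseString(XML)
b = doc.getElementsByTagName("B")[0]

sibling = b.nextSibling
print(sibling.nodeName)              # #text
print(repr(sibling.nodeValue))       # '\n  '
print(sibling.nextSibling.nodeName)  # C
```

Printing the value with repr() shows that the "mystery" node holds nothing but the line break and indentation between the two tags.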

P: n/a
Joe Kesselman wrote:
> XML doesn't know whether that whitespace text is meaningful or not, so
> XML APIs will deliver it. Your app needs to deal with that appropriately.

Some XML APIs may report such whitespace as "ignorable". This is
whitespace between elements where the DTD does not allow PCDATA. This
assumes that there is a DTD.

But they are still nodes in the infoset.
// Magnus
Oct 12 '06 #3

P: n/a


ma*****@yahoo.com wrote:
> I'm trying to iterate through a list of child nodes. It seems like to
> get the text value of the node, you have to do a
> node->getFirstChild()->getNodeValue. This being said, there is a
> hasChildNodes method, but if I use that, it includes the "text" nodes
> also, which I don't want to include.
>
> if this is my xml:
>
> <A>
> <B></B>
> <C></C>
> </A>
>
> if I have a node for B, I thought getNextSibling would return C, but it
> didn't. it returned #text.

Then check nodeType (or getNodeType(), depending on the API) on each
sibling until you find an element node (node type 1).

--

Martin Honnen
http://JavaScript.FAQTs.com/
Oct 12 '06 #4
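
The node-type check suggested here might look like the following sketch. It is written against Python's xml.dom.minidom rather than xml4c, assuming the two expose the same DOM node types; next_element_sibling is a hypothetical helper name, and Node.ELEMENT_NODE is the constant whose value is 1.

```python
from xml.dom import Node, minidom

def next_element_sibling(node):
    """Return the next sibling that is an element, or None if there is none.

    Hypothetical helper: it steps over text, comment and other
    non-element siblings by testing nodeType on each one.
    """
    sibling = node.nextSibling
    while sibling is not None and sibling.nodeType != Node.ELEMENT_NODE:
        sibling = sibling.nextSibling
    return sibling

doc = minidom.parseString("<A>\n<B></B>\n<C></C>\n</A>")
b = doc.getElementsByTagName("B")[0]
c = next_element_sibling(b)

print(c.tagName)                # C
print(next_element_sibling(c))  # None
```

The loop terminates either on the first element or when the siblings run out, so the caller has to handle the None case.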

P: n/a
Magnus Henriksson wrote:
> Some XML APIs may report such whitespace as "ignorable". This is
> whitespace between elements where the DTD does not allow PCDATA. This
> assumes that there is a DTD.

Good point. *If* there is a DTD or Schema available which provides that
information, some tools can be asked to suppress whitespace that appears
where only elements were expected. That's getting beyond straight
parsing into preliminary processing/filtering, since as Magnus says it
involves delivering a modified infoset.

Since that feature is not always supported by the API -- or may be
supported in theory but not actually implemented on all parsers -- you
need to exercise a bit of care in relying on it. I've generally
preferred not to do so, for that reason and because sometimes users want
the whitespace preserved even when it isn't "meaningful" to the document.
--
() ASCII Ribbon Campaign | Joe Kesselman
/\ Stamp out HTML e-mail! | System architexture and kinetic poetry
Oct 12 '06 #5

P: n/a

Thanks for the replies. But going back to my original XML example:

<A>
<B>Data</B>
<C>Data</C>
</A>

How can I determine whether A has children? Calling hasChildNodes seems
worthless to me, since A will always have a text node underneath it.
I guess I have to write my own version that doesn't look at the
text nodes?

TIA.

Oct 12 '06 #6

P: n/a


ma*****@yahoo.com wrote:

> But going back to my original XML example.
>
> <A>
> <B>Data</B>
> <C>Data</C>
> </A>
>
> How can I determine if A has children ? calling hasChildNodes seems
> worthless to me since it will always have the text node underneath it.

Why? If you have e.g.
<A/>
or
<A />
or
<A></A>
then the element really is empty and hasChildNodes is false.
If you are looking for element child nodes only, you can check
getElementsByTagName("*").length (which counts all descendant elements)
or use XPath if your API supports it (e.g. selectNodes("*").length,
which counts child elements only).

--

Martin Honnen
http://JavaScript.FAQTs.com/
Oct 12 '06 #7
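
The difference between the three checks in this reply can be sketched as follows, using Python's xml.dom.minidom as an assumed stand-in for xml4c. minidom has no XPath support, so a plain filter over childNodes is swapped in for the selectNodes("*") variant; child_element_count and the element names R, D, E and F are hypothetical.

```python
from xml.dom import Node, minidom

def child_element_count(node):
    """Count direct children that are elements, skipping text and other nodes."""
    return sum(1 for child in node.childNodes
               if child.nodeType == Node.ELEMENT_NODE)

doc = minidom.parseString("<R><A/><B>Data</B><D>\n  <E><F/></E>\n</D></R>")
a = doc.getElementsByTagName("A")[0]
b = doc.getElementsByTagName("B")[0]
d = doc.getElementsByTagName("D")[0]

print(a.hasChildNodes())                   # False: <A/> is truly empty
print(b.hasChildNodes())                   # True: the text "Data" is a child node
print(b.getElementsByTagName("*").length)  # 0: B has no element descendants
print(d.getElementsByTagName("*").length)  # 2: E and F, i.e. all descendants
print(child_element_count(d))              # 1: only E is a direct child
```

hasChildNodes answers "is there any node below this one", while the wildcard lookup answers "is there any element anywhere below this one"; only the direct-children filter answers the narrower question.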

P: n/a

I was incorrect in my question. I meant to ask about B. I got a
DOM_Node for B and then checked hasChildNodes, and it returns true even
though there are no "real" child nodes. I didn't realize you could use "*"
in getElementsByTagName, so I can use that instead of the hasChildNodes
call. Thanks for the help.

Oct 12 '06 #8

P: n/a

Sorry to bother you again, but can someone please explain the difference
between a DOM_Node and a DOM_Element? Is a DOM_Element just a "type"
of DOM_Node?

What I did was call getElementsByTagName on the DOM_Document to get a
NodeList. Then, for each of those nodes, I was going to call
getElementsByTagName("*") and check its length to determine whether that
node has any child elements, but I can't, because getElementsByTagName
is a member of DOM_Element, not DOM_Node. What is the correct way of
doing this, please? I'm new to DOM, as you can see.

Oct 12 '06 #9

P: n/a


ma*****@yahoo.com wrote:

> But can someone please explain the difference
> between a DOM_Node and a DOM_Element. Is a DOM_Element just a "type"
> of DOM_Node ?

Yes, node is usually an abstract base class (or interface) that is
extended by several concrete subclasses (or interfaces), e.g. for
document, element, attribute, text, processing instruction, CDATA
section, and comment nodes.

> What I did was a getElementsByTagName for the DOM_Document to give me a
> NodeList, then for each of those nodes, I was going to use
> getElementsByTagName("*").length to
> determine if that node has any child elements, but can't because
> getElementsByTagName is not part of DOM_Node, but DOM_Element. What is
> the correct way of doing this please ? I'm new to DOM as you can see.

You need to cast the DOM_Node that you have to a DOM_Element. With Java
you would simply do e.g.
Element el = (Element)node;
Can't help with the exact xml4c syntax.

--

Martin Honnen
http://JavaScript.FAQTs.com/
Oct 12 '06 #10
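
The node/element relationship can be checked in a small sketch. It uses Python's xml.dom.minidom, where no cast is needed because the language is dynamically typed, so this is only an analogy to the Java cast above and says nothing about the xml4c syntax; as_element is a hypothetical helper that mirrors a checked downcast.

```python
from xml.dom import Node, minidom

def as_element(node):
    """Return node if it is an element, else None (a checked 'downcast').

    Hypothetical helper: testing nodeType first is what makes a cast
    to the element type safe in a statically typed binding.
    """
    return node if node.nodeType == Node.ELEMENT_NODE else None

doc = minidom.parseString("<A><B>Data</B></A>")
b = doc.getElementsByTagName("B")[0]
text = b.firstChild

# Both the element and the text node are kinds of Node ...
print(isinstance(b, minidom.Node), isinstance(text, minidom.Node))  # True True
# ... but only the element offers element-specific methods.
print(isinstance(text, minidom.Element))                            # False
print(hasattr(text, "getElementsByTagName"))                        # False
print(as_element(b) is b, as_element(text))                         # True None
```

Nodes returned by getElementsByTagName are always elements, so for that particular NodeList the type test always succeeds.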

P: n/a

Thanks. I got it.

Oct 12 '06 #11

P: n/a

Since getElementsByTagName("*") returns all descendant element nodes, is
there an easy way to get only the next level of elements?

i.e.
<A>
<B>
<C>
</C>
</B>
<B>
<C>
</C>
</B>
</A>

I would only want the returned NodeList to contain the B element
nodes (and not the C's).

Thanks.

Oct 12 '06 #12

P: n/a
ma*****@yahoo.com wrote:
> Since getElementsByTagName("*") returns all element nodes, is there an easy
> way to only get the next level of elements.

Simplest: getFirstChild followed by repeated getNextSibling, ignoring
those which aren't elements.

Overkill: Use one of the mechanisms in the DOM Level 2 Traversal
feature, setting its filters to show you only the nodes you're
interested in.

--
Joe Kesselman / Beware the fury of a patient man. -- John Dryden
Oct 12 '06 #13
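
The "simplest" approach from this reply could be sketched like this, again with Python's xml.dom.minidom standing in for xml4c (firstChild and nextSibling in place of getFirstChild() and getNextSibling()). The helper name direct_child_elements is hypothetical, and the DOM Level 2 Traversal alternative is not shown.

```python
from xml.dom import Node, minidom

def direct_child_elements(parent):
    """Collect only the element nodes one level below parent.

    Walks firstChild and then nextSibling, skipping every node
    that is not an element (whitespace text, comments, and so on).
    """
    result = []
    child = parent.firstChild
    while child is not None:
        if child.nodeType == Node.ELEMENT_NODE:
            result.append(child)
        child = child.nextSibling
    return result

XML = "<A>\n<B>\n<C>\n</C>\n</B>\n<B>\n<C>\n</C>\n</B>\n</A>"
a = minidom.parseString(XML).documentElement

print([e.tagName for e in direct_child_elements(a)])     # ['B', 'B']
print([e.tagName for e in a.getElementsByTagName("*")])  # ['B', 'C', 'B', 'C']
```

The second line of output shows why the wildcard lookup is not suitable here: it descends into every level, whereas the sibling walk never leaves the first one.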

P: n/a

Thanks. I've heard that using DOM can be very memory-intensive because
of the tree, and that it might not be the best approach for "large" XML
documents. Does anyone have any numbers on what counts as large, and at
what point DOM stops being the appropriate method?

The XML we're talking about here could be around 30-40K, with about
1000 simultaneous users. Each thread would have its own parser
instance, which, based on my understanding of what I've been reading,
shouldn't be an issue. But I'm a little concerned about what I'm
reading regarding memory usage.

Any ideas what kind of sizes we're talking about here?

Oct 12 '06 #14

P: n/a
See http://www.w3.org/DOM/faq.html#SAXandDOM

The DOM is only an API; if you want to talk about memory usage, you need
to discuss specific implementations, especially since the storage behind
that API doesn't necessarily use the same structure as that presented to
the user.

Oct 12 '06 #15
