
Need help with PHP DOMXML - get_elements_by_tagname

I'm having some issues with PHP DOMXML - in particular the
get_elements_by_tagname method. Now, the PHP docs on this are, well,
sparse, so maybe I'm just doing something stupid. I thought this
method would behave like the 'findnodes' XML method in Perl. Namely,
that you can pass it an XPath expression and it will find the nodes that
match:
$array = $node->get_elements_by_tagname($xpath);
This is long so here's a pagebreak:

And, indeed, this seems to have worked when I've used it in the past.
But I'm working on a more complex system now which does a lot more
sub-node access, etc, and it is failing on me. I put together this
example to demonstrate the failure modes I'm experiencing.

----------------------------------------------------------------------
<?php
if (PHP_OS == "WIN32" || PHP_OS == "WINNT") {
    define('EOL', "\r\n");
}
else {
    define('EOL', "\n");
}
header("Content-Type: text/plain");
print PHP_VERSION . EOL;

$xml =
'<TestDoc>
<level1>
<level2>
<level3>
<level4>
<fubar>fubar</fubar>
<fubaz>fubaz</fubaz>
</level4>
</level3>
</level2>
</level1>
</TestDoc>';

if (!$domResponse = domxml_open_mem($xml)) {
    print("failed!");
}

print pcGetField(&$domResponse, "fubar") . EOL;

lookup(&$domResponse);

function lookup($domResponseRef) {
    print $domResponseRef->dump_mem();
    $node =
        $domResponseRef->get_elements_by_tagname("level4");
    $node = $node[0];
    print_r($node);
    print pcGetField2(&$node, "//fubar") . EOL;
    print pcGetField(&$domResponseRef, "fubar") . EOL;
    print pcGetField(&$node, "fubar") . EOL;
    $node =
        $node->get_elements_by_tagname("fubar");
    print $node[0]->get_content() . EOL;
    $node = $node[0];
    print_r($node);
}

function pcGetField($nodeRef, $tag) {
    if(!isset($nodeRef)) {
        print("pcGetField: node is blank [" . $tag . "]");
        return "";
    }
    $node = $nodeRef->get_elements_by_tagname($tag);
    $node = $node[0];
    return $node->get_content();
}

function pcGetField2($nodeRef, $tag) {
    if(!isset($nodeRef)) {
        print("pcGetField2: node is blank [" . $tag . "]");
        return "";
    }
    $xpath = xpath_new_context($nodeRef);
    $node = &xpath_eval($xpath, $tag);
    $node = $node->nodeset[0];

    return $node->get_content();
}
?>
----------------------------------------------------------------------

With it like this everything is fine:
----------------------------------------------------------------------
4.2.2
fubar
<?xml version="1.0"?>
<TestDoc>
<level1>
<level2>
<level3>
<level4>
<fubar>fubar</fubar>
<fubaz>fubaz</fubaz>
</level4>
</level3>
</level2>
</level1>
</TestDoc>
DomElement Object
(
[type] => 1
[tagname] => level4
[0] => 3
[1] => 137395784
)
fubar
fubar
fubar
fubar
DomElement Object
(
[type] => 1
[tagname] => fubar
[0] => 2
[1] => 138335176
)
----------------------------------------------------------------------

You see, I'm jumping down to 'level4' so fubar is an immediate child
node. But if I start from *anywhere* except the root node or the
immediate parent, it fails. So if I change:
$node = $domResponseRef->get_elements_by_tagname("level4");
to
$node = $domResponseRef->get_elements_by_tagname("level1");
I get:
----------------------------------------------------------------------
4.2.2
fubar
<?xml version="1.0"?>
<TestDoc>
<level1>
<level2>
<level3>
<level4>
<fubar>fubar</fubar>
<fubaz>fubaz</fubaz>
</level4>
</level3>
</level2>
</level1>
</TestDoc>
DomElement Object
(
[type] => 1
[tagname] => level1
[0] => 3
[1] => 138064304
)
fubar
fubar
<br />
<b>Fatal error</b>: Call to a member function on a non-object in
<b>/home/megazone/scripts/PHP/Kiosk/test.php</b> on line <b>56</b><br />
----------------------------------------------------------------------

Note that it works when checking from the doc root, and later when
using xpath_eval and when going through the reference to the doc root,
but it fails the first time you try it from the subnode. It will fail with
TestDoc, level1, level2, and level3. It will fail if I try
get_elements_by_tagname("fubar"), get_elements_by_tagname("//fubar"),
etc. It also fails if you're at, say, level3 and try to look for
"level4/fubar". I think I've tried every combination I can think of.

So is this just a limitation that it only works when working with the
root node of the document or when looking for an immediate child of
the current node? What the heck am I not seeing?

Thanks.

-MZ, RHCE #806199299900541, ex-CISSP #3762
--
<URL:mailto:megazoneatmegazone.org> Gweep, Discordian, Author, Engineer, me.
"A little nonsense now and then, is relished by the wisest men" 508-755-4098
<URL:http://www.megazone.org/> <URL:http://www.eyrie-productions.com/> Eris

Jul 17 '05 #1


I had a quick skim read of your post (as it was quite long) and it seems
like you don't check to see if get_elements_by_tagname() is even
returning at least one node.

Instead of
$node = $node[0];

you should have
if(count($node)) {
    $node = $node[0];
    ...
    $node =
        $node->get_elements_by_tagname("fubar");
    print $node[0]->get_content() . EOL;
}
else {
    $node = null;
    print "There was no level4 element!\n";
}

Also, in your function you are using isset() to test whether it is a node.
Not gonna work.

if(!isset($nodeRef))

Well, of course it will exist. It will exist no matter what is in it,
because it is in the function call. The line should be

if(!is_object($nodeRef))

This would be an improvement. It would be even better if you tested it
using the is_a() function inside the object test.
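
A minimal sketch of the kind of guard described above. It assumes the PHP 5+ DOM extension (DOMDocument, DOMElement) rather than the PHP 4 DOMXML functions used in this thread, and the helper name getFieldOrDefault is hypothetical:

```php
<?php
// Hypothetical helper (not part of DOMXML or XAO): returns the text of the
// first descendant element named $tag, or $default when $node is not a
// DOM node we can search or when nothing matches.
function getFieldOrDefault($node, $tag, $default = '')
{
    // instanceof plays the role of the is_object() + is_a() checks.
    if (!($node instanceof DOMElement) && !($node instanceof DOMDocument)) {
        return $default;
    }
    // Check that at least one node came back before touching item 0.
    $matches = $node->getElementsByTagName($tag);
    if ($matches->length === 0) {
        return $default;
    }
    return $matches->item(0)->textContent;
}

$doc = new DOMDocument();
$doc->loadXML('<TestDoc><level1><level2><fubar>fubar</fubar></level2></level1></TestDoc>');
$level1 = $doc->getElementsByTagName('level1')->item(0);

echo getFieldOrDefault($level1, 'fubar'), "\n";             // match two levels below level1
echo getFieldOrDefault($level1, 'missing', '(none)'), "\n"; // no match: default is returned
echo getFieldOrDefault(null, 'fubar', '(none)'), "\n";      // not a node: default is returned
```

Returning a default instead of calling a method on a missing node avoids the "Call to a member function on a non-object" fatal error shown in the original post.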

DOM can be laborious. However, it is definitely the best way to create
XML data. DOM is fairly low-level in terms of describing a document,
which is why I've written a library which attempts to provide higher
level functionality for app designers.
http://xao-php.sf.net

There is a method to fetch one element based on its name
http://xao-php.sourceforge.net/api/X...thodndGetOneEl

There is a method for getting an array of nodes from an XPath query
http://xao-php.sourceforge.net/api/X...thodarrNdXPath

These are basically shortcuts to having to write low-level DOM, but they
also do obligatory/mundane sanity checks that you are missing here.
http://xao-php.sourceforge.net/api//....php.html#a392

XAO is for object oriented programmers who would rather leverage code
than have to re-invent the basics every time.

XAO allows you to declare call-back functions based on element names
and/or XPath queries. This provides a custom-tag facility in addition to
DOM functionality.
http://xao-php.sourceforge.net/api/X...cessCustomTags


Jul 17 '05 #2

Terence <tk******@fastmail.fm> shaped the electrons to say:
> I had a quick skim read of your post (as it was quite long) and it seems
> like you don't check to see if get_elements_by_tagname() is even
> returning at least one node.

In the example I did, no, I didn't bother with any real error
checking. I wrote it solely to illustrate the problem I'm having with
a much larger codebase. I stripped most of that stuff to try to keep
the sample from being monstrous. The overall codebase is a few
thousand lines.

> and also in your function you are using isset() to test if it is a node.
> not gonna work.
>
> if(!isset($nodeRef))

Actually that does work. If you do the lookup and get 0 results, then
set the node to the 0th element of the array, it is null. And the
isset check does catch passing in an unset node. I've had it happen
in the production code. But I'll check is_object and is_a, sounds
like they may do more appropriate checks.

> DOM can be laborious. However, it is definitely the best way to create
> XML data. DOM is fairly low-level in terms of describing a document,

I know, I've worked with DOM for a while. This codebase already
exists in other languages, the PHP version is more recent. One of the
overriding requirements is keeping the code structures of the various
implementations of the code similar. The Perl, ASP and CF
implementations all use DOM. I wrote the Perl code before we added
PHP as a supported platform. The Perl is using XML::LibXML for this,
which uses libxml2 - same as DOMXML. CF5 and ASP use MSXML, CF MX
uses the built-in XML handler CF MX provides.

It was working in our 2.5.5 revision - the 2.6.0 revision restructured
the functions and introduced more pass-by-reference calls, and more
work from subnodes instead of always working from the root node. It
worked fine in all the other languages - but when the changes were
made to PHP, it stopped working. And it stems from this function.

Doing more digging since I posted (pretty much what I've been doing
all day), it looks like I'm going to have to change anyway. While
get_elements_by_tagname worked with XPath in some situations, the
response to one bug report at php.net indicates this has been changed
in newer versions of PHP and the method will *not* support XPath. The
reply indicated that if you want to use XPath, you need to explicitly
use the xpath methods. Unfortunate, since the other languages support
XPath in their equivalent methods (ASP/VB's SelectSingleNode, Perl's
findnodes, etc). But I think the simplest solution may be replacing
all the occurrences of get_elements_by_tagname with the xpath
functions. At least all the occurrences that break at this time.
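
As a rough sketch of what routing these lookups through explicit XPath calls could look like: this assumes the PHP 5+ DOM extension (DOMXPath) rather than the PHP 4 xpath_new_context/xpath_eval functions used above, and the helper name findNodes is hypothetical, borrowed from the Perl method. One XPath detail matters when working from subnodes: an expression starting with `//` always searches from the document root, while `.//` searches below the context node.

```php
<?php
// Hypothetical helper modelled on Perl's findnodes: evaluates $expr
// relative to $context and returns the matching nodes as a plain array.
function findNodes(DOMNode $context, $expr)
{
    $owner = ($context instanceof DOMDocument) ? $context : $context->ownerDocument;
    $xpath = new DOMXPath($owner);
    $result = $xpath->query($expr, $context);
    $nodes = array();
    if ($result === false) {
        return $nodes; // malformed expression: treat as "no matches"
    }
    foreach ($result as $node) {
        $nodes[] = $node;
    }
    return $nodes;
}

$doc = new DOMDocument();
$doc->loadXML(
    '<TestDoc><level1><level2><level3><level4>'
    . '<fubar>fubar</fubar><fubaz>fubaz</fubaz>'
    . '</level4></level3></level2></level1></TestDoc>'
);

$level3 = findNodes($doc, '//level3');
$level3 = $level3[0];

$direct = findNodes($level3, 'level4/fubar'); // path relative to level3
$deep   = findNodes($level3, './/fubaz');     // any depth below level3
$none   = findNodes($level3, 'fubar');        // fubar is not a direct child of level3

echo $direct[0]->textContent, "\n";
echo $deep[0]->textContent, "\n";
echo count($none), "\n";
```

Returning a plain array keeps call sites close to the `$node = $node[0];` pattern in the sample code, at the cost of copying the node list.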

Since the code is a sample framework that goes out to customers one of
the general requirements is to try not to depend on being too current
on the releases and trying to stick to libraries that are as common as
possible. XML handling is mandatory since the framework communicates
with a kiosk system that speaks XML only. Since the requirement for
XML was there, all of the configs and such are also stored in XML
since it is a nice format.

I'm looking forward to PHP5 since the XML support is a fundamental
feature. I only got back into PHP a few months ago (I had used it in
PHP3 days, and played with it in the PHP/FI days) when we added PHP to
our list of supported platforms. The initial port of Perl 2.5.5 to
PHP 2.5.5 actually went extremely smoothly. I was kind of surprised
to run into this trouble with 2.6.0 after that experience. There are
two main contexts for the framework - one is working, the other is not.

Jul 17 '05 #3

I assumed you were asking why you got the exception about trying to use
a method on a non-object.

The fact that XPath doesn't work with get_elements_by_tagname() is
testament to the fact that a standards-based approach necessitates
implementing only features of the lowest common denominator. If you are
writing a cross-platform (i.e. multi-language) framework, you will always
have these limitations. While PHP5 will be using libxml2, I don't know
if this means it will support XPath in get_elements_by_tagname(). The
behaviour is non-standard.

Jul 17 '05 #4

Terence <tk******@fastmail.fm> shaped the electrons to say:
> I assumed you were asking why you got the exception about trying to use
> a method on a non-object.

That was one symptom - why does get_elements_by_tagname() work if you
start from the docroot or the immediate parent, but nowhere in
between? Perhaps this is fixed in a later version of PHP - from the
docs it sounds like it *should* recurse the structure no matter where
you start from.

> The fact that xpath doesn't work with get_elements_by_tagname() is

It does appear to work some of the time, and one of the comments left
in the online PHP documentation illustrates using XPath:
http://us2.php.net/manual/en/functio...by-tagname.php

That, and other examples I found when looking for sample code, seemed
to indicate XPath was valid. It seemed to make sense as that also fit
with the behavior of other languages.

But this evening I found this:
http://bugs.php.net/bug.php?id=26205

From one of the comments:
---
ch****@php.net

you are using xpath-expressions and not simple element-names. This may
had worked with older php versions, but the internal code was changed
later.

If you want to use something like "timeopen/year" then use the
appropriate xpath methods (see manual..)
---

So it sounds like the right thing to do is switch to straight tagnames
where possible, and use the xpath functions where it isn't.
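
One way that split could be sketched, assuming the PHP 5+ DOM extension rather than DOMXML. The helper name selectNodes and the "plain tag name" pattern are hypothetical choices for illustration (the pattern ignores namespace prefixes, for instance). Note the two branches differ in reach: a plain tag name matches descendants at any depth, while a relative XPath such as `level2/fubar` only follows the path given.

```php
<?php
// Hypothetical dispatcher: plain tag names go through getElementsByTagName,
// anything else is treated as an XPath expression relative to $context.
function selectNodes(DOMElement $context, $query)
{
    if (preg_match('/^[A-Za-z_][\w.-]*$/', $query)) {
        // Plain tag name: matches descendants at any depth.
        $list = $context->getElementsByTagName($query);
    } else {
        // XPath expression, evaluated with $context as the context node.
        $xpath = new DOMXPath($context->ownerDocument);
        $list = $xpath->query($query, $context);
    }
    $nodes = array();
    if ($list === false) {
        return $nodes; // malformed XPath: treat as "no matches"
    }
    foreach ($list as $node) {
        $nodes[] = $node;
    }
    return $nodes;
}

$doc = new DOMDocument();
$doc->loadXML('<TestDoc><level1><level2><fubar>fubar</fubar></level2></level1></TestDoc>');
$level1 = $doc->getElementsByTagName('level1')->item(0);

$byName  = selectNodes($level1, 'fubar');          // tag-name branch
$byPath  = selectNodes($level1, 'level2/fubar');   // XPath branch
$missing = selectNodes($level1, 'level2/nothing'); // XPath branch, no match

echo count($byName), ' ', count($byPath), ' ', count($missing), "\n";
```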

> implementing only features of the lowest common denominator. If you are
> writing a cross-platform (ie. multi language) framework, you will always

It is part of a payment system - the framework exists in different
languages because it is the piece merchants can integrate with their
backend. So we provide it in whatever language the merchant is using
for their site. Before I joined the company a year ago it was
basically a Windows shop so it was ASP and CF; I did the Perl and PHP
implementations. JSP is on the roadmap. But I readily admit I'm
still learning PHP - I tend to implement any new features in Perl
first since that's my primary language, and then port it to PHP. For
the most part that's worked rather well, the language structures are
close enough that a lot of the 'porting' can be done with emacs
regexps. Looks like this is one of the 'gotchas' though - it looked
like 'get_elements_by_tagname' was a drop-in replacement for
'findnodes', but that no longer seems to be the case.

> have these limitations. While PHP5 will be using libxml2, I don't know
> if this means it will support XPath in get_elements_by_tagname(). The
> behaviour is non-standard.

Based on what I found tonight, it sounds like the move is away from
allowing XPath in the PHP functions - except for the xpath_eval, etc,
family.

I've probably made my share of newbie errors in the PHP port - I know
I used 'isset' in a number of places and some of them I caught and
made 'is_object', but some are probably still off. Like most things,
it was done with the "we need it yesterday" mandate, so I didn't have
a lot of time to refresh my PHP knowledge. I picked up O'Reilly's
Programming PHP, the PHP.net documentation, and hit the ground
running. ;-)

Jul 17 '05 #5
