
loadHTMLFile

I read about a PHP5 function loadHTMLFile*, which I could use to grab
any URL on the Web -- say, http://www.cnn.com tag soup -- and apply
XPath, like "//body/h2/*". Is this for real or does it just work on
well-formed XHTML? Sounds too good to be true. (AFAIK in PHP4 you
mostly need to do regex screen-scraping.)

*Also found here
<http://www.php.net/manual/en/function.dom-domdocument-loadhtmlfile.php>
and here <http://trash.chregu.tv/phpconf2003/examples/src/html.php>.

--
Google Blogoscoped
http://blog.outer-court.com
Jul 17 '05 #1


*** Philipp Lenssen wrote (23 Jun 2004 15:12:03 GMT):
> I read about a PHP5 function loadHTMLFile*, which I could use to grab
> any URL on the Web -- say, http://www.cnn.com tag soup -- and apply
> XPath, like "//body/h2/*". Is this for real or does it just work on
> well-formed XHTML?


The very same page you link says:

"Unlike loading XML, HTML does not have to be well-formed to load."
--
Álvaro G. Vicario - Burgos, Spain
Jul 17 '05 #2
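
The manual line quoted in the reply can be tried out with a short sketch. This is only an illustration, not code from the manual page or from either poster: the markup string is invented, and it uses loadHTML() on a string so it runs without network access. loadHTMLFile() is the file-based counterpart; whether it can fetch a remote URL depends on the PHP configuration (assumed here to require allow_url_fopen).

```php
<?php
// Invented tag soup for illustration: no <html> wrapper, unclosed <p> tags,
// and the second <h2> is never closed.
$tagSoup = '<body><h2><a href="/a">First headline</a></h2>'
         . '<p>Unclosed paragraph<p>Another one'
         . '<h2><em>Second</em> headline';

// Collect libxml parse warnings internally instead of raising PHP warnings.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($tagSoup);   // loadHTMLFile($path) would be used for a file
libxml_clear_errors();

// Apply the XPath expression from the question to the repaired tree.
$xpath = new DOMXPath($doc);
foreach ($xpath->query('//body/h2/*') as $node) {
    echo $node->nodeName, ': ', trim($node->textContent), "\n";
}
```

The parser repairs the markup into a tree before XPath runs, so the exact nodes matched depend on how libxml chooses to close the dangling tags. With real-world pages it is worth inspecting the repaired tree (for example with $doc->saveHTML()) before relying on a particular path.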
