Parsing HTML

  #1
July 17th, 2005, 01:51 AM
Colum
Does anyone have any ideas on how to parse an HTML document?

I am trying to extract specific information from the page.
Also, what do you do if the page is dynamic (e.g. a CGI-generated page)?
How do you find it?

Thanks
Colum.


  #2
July 17th, 2005, 01:51 AM
Pedro

Re: Parsing HTML


Colum wrote:
> I am trying to extract out specific information from the page.
> Also, what do you do if the page is dynamic (e.g. a cgi generated page) how
> do you find it??

It depends *very*much* on what you're trying to extract.
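I once had my motd (message of the day) come from

<?php
// Fetch the page with the curl command-line tool (backticks run a shell command).
$x = `curl -s http://www.care2.com/`;
// Locate the marker text; strpos() returns false when it is missing.
$t = strpos((string) $x, 'DAILY QUACK UP');
if ($t === false) {
    exit("Marker not found\n");
}
// Take a 300-byte window starting at the marker and cut it at the end of the table cell.
$y = substr($x, $t, 300);
$t = strpos($y, '</td>');
$z = ($t === false) ? $y : substr($y, 0, $t);
// Strip the leftover markup around the text.
$z = str_replace('</b></font></a><br>', '', $z);
$z = str_replace('<BR>', '', $z);
echo $z;
?>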

Just retested this ... still works :)
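
The same marker-based idea can be wrapped in a small reusable function. This is only a sketch: extract_between() is a hypothetical helper name, the sample markup is made up, and unlike the snippet above it drops the marker itself and strips all remaining tags with strip_tags().

```php
<?php
// Hypothetical helper (not from the original post): returns the text found
// between $startMarker and $endMarker with tags stripped, or null when the
// start marker is absent.
function extract_between(string $html, string $startMarker, string $endMarker): ?string
{
    $start = strpos($html, $startMarker);
    if ($start === false) {
        return null;
    }
    // Skip past the marker so it is not part of the result.
    $start += strlen($startMarker);
    $end = strpos($html, $endMarker, $start);
    // Without an end marker, fall back to everything after the start marker.
    $fragment = ($end === false)
        ? substr($html, $start)
        : substr($html, $start, $end - $start);
    return trim(strip_tags($fragment));
}

// Made-up sample markup standing in for a fetched page.
$page = '<table><tr><td><b>DAILY QUACK UP</b><br>'
      . 'Why did the duck cross the road?<BR></td></tr></table>';
echo extract_between($page, 'DAILY QUACK UP', '</td>'), "\n";
```

To run it against a live page, static or CGI-generated, the string could come from curl as above or from file_get_contents() on the URL (which needs allow_url_fopen enabled). Either way the client just receives the HTML the server sends back.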

--
I have a spam filter working.
To mail me include "urkxvq" (with or without the quotes)
in the subject line, or your mail will be ruthlessly discarded.
  #3
July 17th, 2005, 01:51 AM
Manuel Lemos

Re: Parsing HTML


Hello,

On 10/30/2003 07:46 PM, Colum wrote:
> Anyone have any ideas how to parse a html document.
>
> I am trying to extract out specific information from the page.
> Also, what do you do if the page is dynamic (e.g. a cgi generated page) how
> do you find it??

You may want to try these classes:

Class: HTMLparser
http://www.phpclasses.org/browse.html/package/244.html

Class: HTMLSax
http://www.phpclasses.org/htmlsax
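
For comparison, here is a rough sketch of what parser-based extraction can look like. It does not use the API of either class listed above; it swaps in PHP's bundled DOM extension (DOMDocument and DOMXPath) instead, and the sample markup and the choice of collecting links are assumptions made for illustration.

```php
<?php
// Made-up sample markup; a real page would be fetched first.
$html = '<html><body><h1>News</h1>'
      . '<a href="/one.html">First</a> <a href="/two.html">Second</a>'
      . '</body></html>';

$doc = new DOMDocument();
// Real-world HTML is often malformed; collect parser warnings instead of printing them.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

// Query the parsed tree for every <a> element that has an href attribute.
$xpath = new DOMXPath($doc);
$links = [];
foreach ($xpath->query('//a[@href]') as $a) {
    // Map each link target to its visible text.
    $links[$a->getAttribute('href')] = trim($a->textContent);
}
print_r($links);
```

Working on the parsed tree rather than on raw strings tends to survive small changes in the page's markup, such as a tag changing case or gaining an attribute.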


--

Regards,
Manuel Lemos

Free ready to use OOP components written in PHP
http://www.phpclasses.org/
