Downloading and parsing web-stuff

Very basic:

What is the easiest way in PHP to download the source code (HTML etc.)
of a given URL (say, http://www.google.com) and parse this code for
certain patterns?

I guess my question can be split in two:

1) How do I download a webpage (into a string or whatever)?

2) How can I do string manipulation, regexp matching, information
extraction etc. on the downloaded information?

/David

Jul 17 '05 #1
2 Replies


David Rasmussen wrote:
I guess my question can be split in two:

1) How do I download a webpage (into a string or whatever)?
$string = file_get_contents('http://some.url/blah');
2) How can I do string manipulation, regexp matching, information
extraction etc. on the downloaded information?


Now look at the docs for preg_match or ereg.
I prefer preg_match (the ereg functions were deprecated in PHP 5.3 and removed in PHP 7).

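// i = case-insensitive, s = let "." match newlines, so <TITLE> and multi-line titles also match.
// $matches[0] is the whole match, $matches[1] is the text between the tags.
if ( preg_match('|<title>(.*?)</title>|is', $string, $matches) )
{
    print_r($matches);
}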
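
For what it's worth, the two steps above can be put together roughly like this. It is only a sketch: the function names fetch_page and extract_title are made up for illustration, and it assumes allow_url_fopen is enabled in php.ini so that file_get_contents can open URLs.

```php
<?php
// Hypothetical helper: download a page into a string, or return null on failure.
// Assumes allow_url_fopen is enabled; otherwise URLs cannot be opened this way.
function fetch_page(string $url): ?string
{
    // file_get_contents returns false on failure; @ silences the warning.
    $html = @file_get_contents($url);
    return $html === false ? null : $html;
}

// Hypothetical helper: return the text of the first <title> element, or null.
function extract_title(string $html): ?string
{
    // i = case-insensitive, s = "." also matches newlines
    if (preg_match('|<title>(.*?)</title>|is', $html, $matches)) {
        return trim($matches[1]);
    }
    return null;
}

// Usage (needs network access, so it is left commented out):
// $html = fetch_page('http://www.google.com/');
// if ($html !== null) {
//     echo extract_title($html), "\n";
// }
```

Checking the return value for null before parsing matters because a failed download would otherwise be passed to preg_match as if it were page content.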

Jul 17 '05 #2

Treat a full URL as a file.

$contents = implode( '', file("http://www.google.com/") );

Then go to www.php.net/preg_match/ to read up on PCRE (Perl-compatible
regular expressions). See also the ereg_* functions.
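
As a rough illustration of what PCRE can do beyond a single match, here is a sketch that uses preg_match_all to collect every link target in a downloaded page. The function name extract_links is made up for this example, and the pattern is a simplification: it only handles quoted href attributes and will miss unquoted or unusually formatted ones.

```php
<?php
// Hypothetical helper: return the href values of all <a> tags in $html.
// Only single- or double-quoted attribute values are recognised.
function extract_links(string $html): array
{
    // (["\']) captures the opening quote; \1 requires the same quote to close.
    $pattern = '/<a\s[^>]*?href\s*=\s*(["\'])(.*?)\1/is';
    preg_match_all($pattern, $html, $matches);
    // With the default PREG_PATTERN_ORDER, $matches[2] holds every
    // capture of the second group, i.e. the URLs themselves.
    return $matches[2];
}
```

With the $contents string from above, this would be called as $links = extract_links($contents); followed by print_r($links);.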

HTH.

-Mike

Jul 17 '05 #3
