Downloading and parsing web-stuff

Very basic:

What is the easiest way in PHP to download the source code (HTML etc.)
of a given URL (say, http://www.google.com) and parse this code for
certain patterns?

I guess my question can be split in two:

1) How do I download a webpage (into a string or whatever)?

2) How can I do string manipulation, regexp matching, information
extraction etc. on the downloaded information?

/David

Jul 17 '05 #1

David Rasmussen wrote:
> I guess my question can be split in two:
>
> 1) How do I download a webpage (into a string or whatever)?

$string = file_get_contents('http://some.url/blah');

> 2) How can I do string manipulation, regexp matching, information
> extraction etc. on the downloaded information?


Now look at the docs for preg_match or ereg.
I prefer preg_match:

if (preg_match('|<title>(.*?)</title>|', $string, $matches))
{
    print_r($matches);
}
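
The two calls above can be combined into one small routine. What follows is only a sketch under some assumptions: the helper names extract_title() and fetch_title() are made up for illustration, fetching a URL this way relies on allow_url_fopen being enabled, and the 'i' and 's' pattern modifiers are added here so that an uppercase <TITLE> tag or a title spanning several lines still matches.

```php
<?php
// Hypothetical helper (not from the thread): returns the contents of the
// first <title> element in $html, or null when there is none.
function extract_title(string $html): ?string
{
    // 'i' makes the tag match case-insensitive; 's' lets '.' match newlines.
    if (preg_match('|<title\b[^>]*>(.*?)</title>|is', $html, $matches)) {
        // $matches[0] is the whole match, $matches[1] the first capture group.
        return trim($matches[1]);
    }
    return null;
}

// Hypothetical helper: downloads $url and returns its title, or null when
// the download fails or the page has no title.
function fetch_title(string $url): ?string
{
    // file_get_contents() returns false on failure, so check before parsing.
    $html = @file_get_contents($url);
    return $html === false ? null : extract_title($html);
}
```

Calling fetch_title('http://some.url/blah') would then give either the title string or null, so the caller can tell a failed download or missing title apart from an empty one.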

Jul 17 '05 #2
Treat a full URL as a file.

$contents = implode('', file('http://www.google.com/'));

Then go to www.php.net/preg_match/ to read up on PCRE (Perl-compatible
regular expressions). See also the ereg_* functions, though note that
these were deprecated in PHP 5.3 and removed in PHP 7.
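
For the "information extraction" part of the question, preg_match_all() collects every match rather than just the first. A minimal sketch, assuming the downloaded page is already in a string and that only double-quoted href attributes matter; the helper name extract_links() is hypothetical, and a regular expression like this is a rough tool that will miss unusual markup.

```php
<?php
// Hypothetical helper (not from the thread): returns the value of every
// double-quoted href attribute found on an <a> tag in $html.
function extract_links(string $html): array
{
    // With the default PREG_PATTERN_ORDER flag, $matches[1] holds the
    // text captured by the first group for each match, in document order.
    preg_match_all('/<a\b[^>]*\bhref="([^"]*)"/i', $html, $matches);
    return $matches[1];
}
```

Passing in the $contents string built above would return a plain array of link targets, which is empty when the page contains no matching links.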

HTH.

-Mike


Jul 17 '05 #3
