php to spider a website

I am looking for a script that I can use to spider a website, and then pull
the images... I know how to do it for a single page, but, I would like to be
able to do this for the entire site. Any suggestions?

Thanks,
Kyle Mizell
http://www.pimpinonline.com
Jul 17 '05 #1
4 Replies


jn
"Kyle Mizell" <ky**@pimpinonline.comNOSPAM> wrote in message
news:qewyb.174752$Dw6.686810@attbi_s02...
I am looking for a script that I can use to spider a website, and then pull the images... I know how to do it for a single page, but, I would like to be able to do this for the entire site. Any suggestions?

Thanks,
Kyle Mizell
http://www.pimpinonline.com


I don't know about your question, but pimpinonline.com is awesome.
Jul 17 '05 #2

Kyle Mizell wrote:
I am looking for a script that I can use to spider a website, and then pull
the images... I know how to do it for a single page, but, I would like to be
able to do this for the entire site. Any suggestions?


Why php? Use wget if all you want is a simple spider job.

Jul 17 '05 #3

On Mon, 01 Dec 2003 00:49:26 GMT, "Kyle Mizell" <ky**@pimpinonline.comNOSPAM>
wrote:
I am looking for a script that I can use to spider a website, and then pull
the images... I know how to do it for a single page, but, I would like to be
able to do this for the entire site. Any suggestions?


PHP has HTTP client functions; you can simply use file() with a URL.

However, to extract information from the HTML, you need an HTML parser
(regular expressions alone are not sufficient). PHP doesn't have one built in
or as one of the standard extensions. Personally I'd use Perl for this (e.g.
HTML::Parser). I think there is an HTML parser for PHP called HTML-Sax, have a
search for that.
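
For readers on a newer PHP, here is a minimal sketch of the extraction step. It assumes a PHP build that includes the DOM extension (the DOMDocument class), which is not one of the options named above. The function name extract_image_urls and the sample markup are made up for illustration.

```php
<?php
// Hypothetical helper: return the distinct src values of all <img> tags
// found in an HTML string, in document order.
function extract_image_urls(string $html): array
{
    if (trim($html) === '') {
        return []; // nothing to parse
    }

    $doc = new DOMDocument();
    // Real-world HTML is rarely valid, so collect libxml parse errors
    // internally instead of emitting a warning for each one.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $urls = [];
    foreach ($doc->getElementsByTagName('img') as $img) {
        $src = trim($img->getAttribute('src'));
        if ($src !== '') {
            $urls[] = $src;
        }
    }
    // Drop duplicates and reindex from 0.
    return array_values(array_unique($urls));
}

// Sample markup, invented for this example.
$html = '<html><body><img src="/a.png"><p><img src="b.jpg" alt="">'
      . '<img src="/a.png"></p></body></html>';
print_r(extract_image_urls($html));
```

In a real spider the HTML string would come from something like implode('', file($url)), which needs allow_url_fopen to be enabled. The src values are returned exactly as written in the page, so relative ones still have to be resolved against the page URL before downloading.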

--
Andy Hassall (an**@andyh.co.uk) icq(5747695) (http://www.andyh.co.uk)
Space: disk usage analysis tool (http://www.andyhsoftware.co.uk/space)
Jul 17 '05 #4

"Kyle Mizell" <ky**@pimpinonline.comNOSPAM> wrote in message news:<qewyb.174752$Dw6.686810@attbi_s02>...
I am looking for a script that I can use to spider a website, and then pull
the images... I know how to do it for a single page, but, I would like to be
able to do this for the entire site. Any suggestions?

Thanks,
Kyle Mizell
http://www.pimpinonline.com


Do for every page what you already do for one page.
Store all the links found on the first page in an array (eliminating
duplicates), then process each of those pages the same way as the first.
I think the best approach is to write a function that saves one page and
returns the links it found, then call that function for every URL.
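
A rough sketch of that loop, under some assumptions: the name crawl_site, the page limit and the same-host check are illustrative additions, and $processPage stands in for the "save one page and return the found links" function, which is expected to return absolute URLs. The demo uses an in-memory array instead of real HTTP requests.

```php
<?php
/**
 * Breadth-first crawl starting at $startUrl.
 * $processPage is a hypothetical callback that saves one page and returns
 * the absolute URLs of the links found on it.
 * Returns the URLs that were processed, in visiting order.
 */
function crawl_site(string $startUrl, callable $processPage, int $maxPages = 100): array
{
    $host    = parse_url($startUrl, PHP_URL_HOST);
    $queue   = [$startUrl];
    $seen    = [$startUrl => true]; // array keys give cheap duplicate checks
    $visited = [];

    while ($queue && count($visited) < $maxPages) {
        $url = array_shift($queue);
        $visited[] = $url;

        foreach ($processPage($url) as $link) {
            // Ignore the fragment: page#top is the same page.
            $link = explode('#', $link, 2)[0];
            // Stay on the starting host. Relative links have no host and
            // are skipped too, so resolve them before returning them.
            if (parse_url($link, PHP_URL_HOST) !== $host) {
                continue;
            }
            if (isset($seen[$link])) {
                continue;
            }
            $seen[$link] = true;
            $queue[] = $link;
        }
    }
    return $visited;
}

// Made-up link graph standing in for a real site.
$fakeSite = [
    'http://example.com/'  => ['http://example.com/a', 'http://example.com/b', 'http://other.example/x'],
    'http://example.com/a' => ['http://example.com/', 'http://example.com/b#top'],
    'http://example.com/b' => ['http://example.com/c'],
    'http://example.com/c' => [],
];
$lookup = function (string $url) use ($fakeSite): array {
    return $fakeSite[$url] ?? [];
};
$pages = crawl_site('http://example.com/', $lookup);
print_r($pages);
```

The $seen array is what keeps the spider from looping when pages link back to each other, and $maxPages is only a safety net against very large sites.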
While saving a page you have to rewrite its links, because the static
file names will be different,
i.e.
me*************************************@pimpinonline.com&unset_search=true
is replaced with
members_php_search_sex_Male_search_kyle_pimpinonline_com_unset_search_true.HTML

and name all the stored pages the same way.
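
One possible way to build such names, as a sketch only: the function name url_to_filename, the lowercase .html suffix, the index fallback and the example.com URL are assumptions, not the exact scheme used above.

```php
<?php
// Hypothetical helper: flatten a URL's path and query string into a
// static file name by replacing every run of characters that is not a
// letter or digit with a single underscore.
function url_to_filename(string $url): string
{
    $path  = (string) parse_url($url, PHP_URL_PATH);
    $query = (string) parse_url($url, PHP_URL_QUERY);

    $name = ltrim($path, '/');
    if ($query !== '') {
        $name .= '_' . $query;
    }
    $name = trim(preg_replace('/[^A-Za-z0-9]+/', '_', $name), '_');

    // The site root has an empty path, so fall back to a fixed name.
    return ($name === '' ? 'index' : $name) . '.html';
}

echo url_to_filename('http://example.com/members.php?search_sex=Male&unset_search=true'), "\n";
echo url_to_filename('http://example.com/'), "\n";
```

Different URLs can collapse to the same name with this scheme (for example a.b and a_b), so a real spider should check for collisions, and the same function has to be applied to every link when rewriting the saved pages.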

enjoy
Jul 17 '05 #5