How To Extract/Fetch HTML source code from another website?

Newbie
 
Join Date: Jun 2007
Posts: 1
#1: Jun 23 '07
Hi, I'm trying to write a PHP script that fetches the HTML source of another website. I'll then filter it myself to get only the specific text I want, and echo that directly on my page.

E.g. when you go to my page, it displays a list of Google's search results for a fixed search string that I code into the page.

E.g. search "asdf".

In Google the URL will be "http://www.google.com.my/search?hl=en&q=asdf&btnG=Google+Search&meta="

On my page it will show:

Quote:
asdf
www.asdf.com/ - 3k - Cached - Similar pages

What is asdf?
www.asdf.com/whatisasdf.html - 5k - Cached - Similar pages

CLiki : asdf
www.cliki.net/asdf - 17k - Cached - Similar pages

CLiki : ASDF-Install
www.cliki.net/ASDF-Install - 34k - Cached - Similar pages

Association Of Synchronous Data Formats
www.asdf.org/ - 4k - Cached - Similar pages

Home row - Wikipedia, the free encyclopedia
en.wikipedia.org/wiki/Home_row - 16k - Cached - Similar pages

asdf Manual
constantly.at/lisp/asdf/ - 11k - Cached - Similar pages

ASDF - A Simple DVD Frontend for MPlayer
asdf-mplayer.sourceforge.net/ - 4k - Cached - Similar pages

asdf-jkl - Google Code
code.google.com/p/asdf-jkl/ - 7k - Cached - Similar pages

www.myspace.com/asdfrock
profile.myspace.com/index.cfm?fuseaction=user.viewprofile&friendid=31856324 - 138k - 21 Jun 2007 - Cached -
This text and these hyperlinks are extracted the moment a visitor goes to my site.


I know it's a dumb function, but I have my reasons.

Please help me.

Thanks.

Member
 
Join Date: Jun 2007
Posts: 101
#2: Jun 23 '07

I don't know if it will work on remote sites, but on my local server I indexed pages just by using file_get_contents('http://localhost/xyz').

Try checking the functions in the PHP manual, e.g. http://www.php.net/manual/en/functio...t-contents.php
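
For what it's worth, file_get_contents() can also read http:// URLs on remote sites as long as allow_url_fopen is enabled in php.ini. Here is a minimal sketch of that idea, not a definitive implementation. The helper name fetch_html(), the timeout value, and the User-Agent string are all made up for illustration; only file_get_contents() and stream_context_create() come from PHP itself.

```php
<?php
// Hypothetical helper (not from the thread): fetch a page and return its
// HTML as a string, or null if the request fails.
function fetch_html(string $url, int $timeoutSeconds = 10): ?string
{
    // The 'http' options only apply to http:// and https:// URLs.
    // Remote URLs additionally require allow_url_fopen = On in php.ini.
    $context = stream_context_create([
        'http' => [
            'method'  => 'GET',
            'timeout' => $timeoutSeconds,
            // Assumed, arbitrary User-Agent; some sites reject requests without one.
            'header'  => "User-Agent: ExampleFetcher/1.0\r\n",
        ],
    ]);

    // file_get_contents() returns false on failure; @ hides the warning so
    // the caller can decide how to report the problem.
    $html = @file_get_contents($url, false, $context);

    return $html === false ? null : $html;
}
```

You would call it like $html = fetch_html('http://localhost/xyz'); and check for null before trying to parse anything.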

Henry
pbmods
Site Moderator
 
Join Date: Apr 2007
Location: Texas
Posts: 5,435
#3: Jun 23 '07

Heya, zerodevice. Welcome to TSDN!

As an extension to henryrhenryr's suggestion, once you've loaded the HTML source from Google's results, parsing it isn't too difficult.

All you have to do is examine the source code of Google's result page. Look for an HTML tag or string that precedes every search result, then explode() or preg_split() the page on that string. You can then harvest each result from the beginning of every element of the resulting array except the 0th one, which holds everything before the first result.
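
The explode() approach could look roughly like the sketch below. Note the assumptions: the marker '<div class="g">' and the sample HTML are invented for the example (the real marker has to come from looking at the actual page source, and it can change at any time), and harvest_results() is a hypothetical helper name.

```php
<?php
// Hypothetical helper: split the page on a marker string that precedes every
// result, then pull the first link (URL and title) out of each chunk.
function harvest_results(string $html, string $marker): array
{
    $chunks = explode($marker, $html);
    array_shift($chunks); // drop index 0: everything before the first result

    $results = [];
    foreach ($chunks as $chunk) {
        // Grab the first <a href="...">...</a> in this chunk.
        if (preg_match('/<a\s[^>]*href="([^"]*)"[^>]*>(.*?)<\/a>/is', $chunk, $m)) {
            $results[] = [
                'url'   => html_entity_decode($m[1]),
                'title' => trim(strip_tags($m[2])), // remove <b> etc. from the title
            ];
        }
    }
    return $results;
}

// Made-up sample standing in for a fetched results page.
$sample = '<html><body><h1>Results</h1>'
    . '<div class="g"><a href="http://www.asdf.com/">asdf</a> - 3k</div>'
    . '<div class="g"><a href="http://www.asdf.com/whatisasdf.html">What is <b>asdf</b>?</a> - 5k</div>'
    . '</body></html>';

foreach (harvest_results($sample, '<div class="g">') as $r) {
    echo $r['title'], ' - ', $r['url'], "\n";
}
```

If the marker is not a fixed string (say the tag's attributes vary), preg_split() with a pattern can take the place of explode() in the same spot.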