Scrape parts of a page (php)

Newbie
 
Join Date: Nov 2009
Posts: 2
#1: 3 Weeks Ago
I'm looking to scrape an image off a website. The image changes daily, but its file name consistently starts with "product_image". How could I search for any image whose name begins that way and then display it on a page, probably prepending the URL of the site to it as well?

Also, is it pretty simple once I pull the image to drop it in a database?

Thanks so much for any help you can give!

Atli
Moderator
 
Join Date: Nov 2006
Location: Iceland
Posts: 3,751
#2: 3 Weeks Ago

Hey.

Yes, you only need to prepend the site's URL to the image's src value, so that it becomes a full address.
Like: http://www.example.com/images/image.jpg
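
To make that concrete, here is a minimal PHP sketch of one way to do it. It assumes the page is already in a string, that the image's src is relative to the site root, and that the file name really starts with "product_image". The function name findImageUrl and the sample markup are made up for illustration.

```php
<?php
// Hypothetical helper: find the first <img> whose file name starts with
// $prefix and return its absolute URL, or null if none is found.
function findImageUrl(string $html, string $baseUrl, string $prefix): ?string
{
    $doc = new DOMDocument();
    // Real-world pages are rarely valid HTML; collect parser warnings
    // instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('img') as $img) {
        $src = $img->getAttribute('src');
        // Compare against the file name only, ignoring any directory part.
        if (strpos(basename($src), $prefix) === 0) {
            // Already absolute? Return it unchanged.
            if (preg_match('#^https?://#i', $src)) {
                return $src;
            }
            // Otherwise glue the site URL and the relative path together.
            return rtrim($baseUrl, '/') . '/' . ltrim($src, '/');
        }
    }
    return null;
}

// Sample markup standing in for the downloaded page (an assumption).
$html = '<div><img src="images/logo.png"/><img src="images/product_image_1234.jpg"/></div>';
$url = findImageUrl($html, 'http://www.example.com', 'product_image');
echo $url, "\n";
```

For a live page you could fill $html with file_get_contents() on the page URL, provided allow_url_fopen is enabled on your server.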

Quote:

Originally Posted by edskellington

Also, is it pretty simple once I pull the image to drop it in a database?

Could you explain that a little better?
I don't really understand what you mean.
Newbie
 
Join Date: Nov 2009
Posts: 2
#3: 3 Weeks Ago

I'm looking to build an aggregation site which will contain 6 images from 6 different sites.

So for instance, I want to pull down the detail image from http://teefury.com. Here is the code.

    <div id="product_design">
      <table width="100%" cellspacing="0" cellpadding="0" border="0">
        <tbody>
          <tr>
          </tr>
          <tr>
            <td valign="top" colspan="3">
              <img border="0" src="products_large_images/bottom-sfl.jpg"/>
            </td>
          </tr>
        </tbody>
      </table>
    </div>

This image changes every day because the website sells a new shirt every day (like WOOT.com).

I want to display THAT one image, but when I pull down that chunk of code, the image obviously doesn't load because the src isn't a full path. So how do I prepend the first part of the URL, "http://teefury.com/"?

The end of that URL ("bottom-sfl.jpg") will change, but the first part ("products_large_images") won't, so that is my constant. Does that make sense?

Thanks, hope that helps! Thanks in advance!
Newbie
 
Join Date: Dec 2008
Posts: 2
#4: 2 Weeks Ago

I guess you want to construct the full URL from the source HTML. Here is a biterscripting script that does it.

    # Script ImageURL.txt
    var str html, url
    cat "http://teefury.com" > $html
    stex -p -c -r "^<img&\.jpg&\>^" $html > $url
    sap -c -r "^src&=&\"^" "http://teefury.com/" $url

We are simply extracting the <img ...> tag from the HTML and inserting "http://teefury.com/" right after src=". The matching is done with regular expressions; see the documentation at http://www.biterscripting.com/helppages/RE.html.

Save the script as C:/Scripts/ImageURL.txt, then enter the following command in biterscripting:

    script "C:/Scripts/ImageURL.txt"

This script can also be called from PHP or any other code.
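
If you would rather stay in PHP, the same extract-and-prepend idea can probably be done with preg_match. This is only a rough sketch, not a translation of the script above: the function name extractImageUrl is made up, and it assumes the src always starts with the constant "products_large_images" directory, as in the markup posted earlier in the thread.

```php
<?php
// Hypothetical helper: pull the first image under $constantDir out of the
// markup and return its full URL, or null when nothing matches.
function extractImageUrl(string $html, string $baseUrl, string $constantDir): ?string
{
    // Match src="<constantDir>/<anything up to the closing quote>".
    $pattern = '#<img\b[^>]*\bsrc=["\'](' . preg_quote($constantDir, '#') . '/[^"\']+)["\']#i';
    if (preg_match($pattern, $html, $m) !== 1) {
        return null;
    }
    // Prepend the site URL to the relative path that was captured.
    return rtrim($baseUrl, '/') . '/' . $m[1];
}

// The markup posted earlier in the thread, trimmed to the relevant part.
$html = '<div id="product_design"><table><tr><td valign="top" colspan="3">'
      . '<img border="0" src="products_large_images/bottom-sfl.jpg"/>'
      . '</td></tr></table></div>';

$url = extractImageUrl($html, 'http://teefury.com/', 'products_large_images');
if ($url !== null) {
    // Escape the URL before writing it into the page.
    echo '<img src="' . htmlspecialchars($url, ENT_QUOTES) . '" alt="">', "\n";
}
```

A regex like this is fine for one known snippet, but it is fragile if the site changes its markup, so a DOM parser may be the safer choice in the long run.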

Needless to say, you should acquire the web site owner's permission before referring to their web site urls from your web pages.