Parsing pricerunner.com results via regular expression.

P: 59
Hi, I have been trying to write a regular expression in PHP that will get the price from any product page at pricerunner.com. If you could suggest a regular expression, I would be very grateful.
Thanks
May 18 '07 #1
18 Replies


pbmods
Expert 5K+
P: 5,821
Changed forum title to better match contents.

Heya, ojsimon. Welcome to TSDN!

First thing to do is to write the script to connect to PriceRunner and grab the results page. Then all you need to do is examine the results, locate the element that contains the price and program a [regex] search for it.

Once you get to that point, let us know if you have any further problems.
May 18 '07 #2

P: 59
I have been trying to write a regular expression to do this. I have done the other things; the regular expression is where my problem lies.
May 19 '07 #3

pbmods
Expert 5K+
P: 5,821
I have been trying to write a regular expression to do this. I have done the other things; the regular expression is where my problem lies.
Post a snippet of the pricerunner.com stream that your script needs to parse so we can see how the regular expression needs to be structured.
May 19 '07 #4

P: 59
Compare prices: Nokia N93
Talk time: 5, standby time: 240, Camera: Yes, Integrated, 180 gram, WAP, GPRS, MP3 More product info
Price range:
405.38 - 409.99

From here I need the price range.
Thanks
May 20 '07 #5

pbmods
Expert 5K+
P: 5,821
Compare prices: Nokia N93
Talk time: 5, standby time: 240, Camera: Yes, Integrated, 180 gram, WAP, GPRS, MP3 More product info
Price range:
405.38 - 409.99
Is pricerunner sending your script plain text like that, or are you receiving HTML or an RSS feed?
May 20 '07 #6

P: 59
HTML, and I want it to work for all PriceRunner product pages.
Thanks
May 20 '07 #7

pbmods
Expert 5K+
P: 5,821
HTML, and I want it to work for all PriceRunner product pages.
Thanks
All you have to do is find the HTML tags that contain the data you need, then use capturing groups to pull out the values.

So for example, if your data were located here:
    <div>Price Range:</div>405.38 - 409.99
then this pattern would match it:
    /(?<=<div>Price Range:<\/div>)(\d+\.\d{2})\s-\s(\d+\.\d{2})/
Run that through preg_match, and your match array will be:
    array
    (
        [0] => 405.38 - 409.99
        [1] => 405.38
        [2] => 409.99
    )
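For anyone unsure how the pattern plugs into preg_match, here is a minimal sketch. The `$html` string is a made-up sample that mirrors the hypothetical markup above, not real PriceRunner output, and the variable names `$low` and `$high` are only for illustration.

```php
<?php
// Made-up sample input that mirrors the hypothetical markup in the example.
$html = '<div>Price Range:</div>405.38 - 409.99';

// The lookbehind requires the label just before the match without including it;
// the two parenthesized groups capture the low and high prices.
$pattern = '/(?<=<div>Price Range:<\/div>)(\d+\.\d{2})\s-\s(\d+\.\d{2})/';

// preg_match returns 1 on a match, 0 on no match, and false on error.
if (preg_match($pattern, $html, $match) === 1) {
    $low  = (float) $match[1];
    $high = (float) $match[2];
    echo "Low: {$match[1]}, high: {$match[2]}\n";
} else {
    echo "No price range found\n";
}
```

Checking the return value before reading `$match` avoids undefined-index notices when the page layout differs from what the pattern expects.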
May 20 '07 #8

P: 1
Hi There

I'm from PriceRunner.

You can just access our API and get everything back in XML format. That would be much easier for you and we would not have the load on our server :)

Send me a mail and I will ensure that you get going.

Best
Martin Andersen
GM, PriceRunner.com
May 24 '07 #9

P: 59
Sorry for such a late reply, but how do I use
/(?<=<div>Price Range:<\/div>)(\d+\.\d{2})\s-\s(\d+\.\d{2})/
in order to get the price? I don't understand how to put this into preg_match and preg_replace. What I am doing at the moment is a simple PHP get-source command; is that OK?
Thanks
Jul 4 '07 #10

pbmods
Expert 5K+
P: 5,821
Heya, ojsimon.

Sorry for such a late reply, but how do I use ... in order to get the price? I don't understand how to put this into preg_match and preg_replace. What I am doing at the moment is a simple PHP get-source command; is that OK?
The regex uses a lookbehind to require (but not include in the match) the markup that comes right before the data you want. The two parenthesized groups then capture the low and high prices.

But as PriceRunnerUS mentioned, there is an API for retrieving the info you're looking for.
Jul 4 '07 #11

P: 59
I cannot find an API for the UK version of PriceRunner, sorry. Could you please show me how to put it into preg_match and preg_replace? I still do not understand this despite quite a lot of research.
Thanks
Jul 5 '07 #12

pbmods
Expert 5K+
P: 5,821
Heya, ojsimon.

It looks like to get access to their API, you must first become a partner:
http://www.pricerunner.com/partner/partner.html

Not sure if that means that you have to give them money. I sent a PM to PriceRunnerUS and asked him to provide more details. We'll see what happens.

The search results page looks a little tricky to parse, but it looks like every price is listed like this:

    <span class="listprice">184.99</span>
So you need to grab the '184.99' inside of that SPAN. To do that, you can call preg_match_all() with a lookbehind:

    $html = file_get_contents('http://pricerunner.co.uk/search?q=' . urlencode($searchOrWhateverYouCalledIt));
    preg_match_all('/(?<=<span class="listprice">)\d+\.\d{2}/', $html, $matches);
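To see what ends up in `$matches`, here is a rough offline sketch. The `$html` string is an invented stand-in for the downloaded page (the real markup may differ), so the example runs without a network connection.

```php
<?php
// Invented stand-in for the downloaded results page.
$html = '<li><span class="listprice">184.99</span></li>'
      . '<li><span class="listprice">179.00</span></li>';

// The pattern has no capturing groups, so the prices are the full matches
// and preg_match_all() stores them in $matches[0].
$count = preg_match_all('/(?<=<span class="listprice">)\d+\.\d{2}/', $html, $matches);

if ($count === false) {
    echo "Regex error\n";
} elseif ($count === 0) {
    echo "No prices found\n";
} else {
    // Print each price on its own line.
    foreach ($matches[0] as $price) {
        echo $price, "\n";
    }
}
```

preg_match_all() returns the number of full matches (or false on error), which makes it easy to tell "nothing matched" apart from "the download or the pattern failed".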
Jul 5 '07 #13

P: 59
Thanks for all your help. I tried the code you suggested and it returned a blank page. I tried to echo $matches and $html, but neither worked. As I am an absolute beginner with PHP, I have no idea what to do. Could you please explain? Thanks again for all your help.
Olie
Jul 5 '07 #14

P: 59
Sorry, how do I use preg_match and preg_replace? Could you tell me any sites where I can learn how to use them to fulfill my previous request?
Thanks
Jul 11 '07 #15

P: 2
Sorry, how do I use preg_match and preg_replace? Could you tell me any sites where I can learn how to use them to fulfill my previous request?
Thanks

Here's a simple example of how to use preg_replace:

$pattern = "/[^a-zA-Z0-9]/";
$replacement = "_";
$replaced_name = preg_replace($pattern, $replacement, $original_name);

This example searches for the pattern, in this case any character that is not a letter or number, and replaces each occurrence with an underscore. It takes an original name variable (e.g. "Tac k#y") and, after it goes through preg_replace, returns "Tac_k_y".
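As a complete script, the same idea might look like the sketch below. The sample value of `$original_name` is taken from the description above, and the replacement is assumed to be a single underscore.

```php
<?php
// Sample input taken from the description above.
$original_name = 'Tac k#y';

// Character class negated with ^: matches any single character
// that is not an ASCII letter or digit.
$pattern = '/[^a-zA-Z0-9]/';
$replacement = '_';

// preg_replace() replaces every match, not just the first one.
$replaced_name = preg_replace($pattern, $replacement, $original_name);

echo $replaced_name, "\n";
```

Both the space and the `#` are outside the allowed character class, so each is replaced separately.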

Hope that helps
Jul 11 '07 #16

pbmods
Expert 5K+
P: 5,821
Heya, Olie.

A blank page means that your script is probably generating errors.

Check out this article.
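In case the linked article is unavailable, one common way to make errors visible during development is sketched below. The file path is a deliberately nonexistent placeholder; these settings are meant for debugging only and should not be left on for a live site.

```php
<?php
// Development-only settings: report every error and print it in the page
// instead of silently returning a blank response.
error_reporting(E_ALL);
ini_set('display_errors', '1');

// Placeholder path assumed not to exist, so the call fails visibly.
$html = file_get_contents('/path/that/does/not/exist.html');

// file_get_contents() returns false on failure, so check before using the result.
if ($html === false) {
    echo "Download failed\n";
}
```

Note that a parse error in the same file will still produce a blank page, because the script never starts running and the two calls at the top are never reached.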
Jul 11 '07 #17

P: 59
Heya, Olie.

A blank page means that your script is probably generating errors.

Check out this article.

Sorry for the very late reply, but with echo $matches the script returns 'Array', and with echo $html it returns the whole page.
How can I fix this?

[PHP]<?php
$html = file_get_contents('http://pricerunner.co.uk/search?q=ipod');
preg_match_all('/(?<=<span class="listprice">)\d+\.\d{2}/', $html, $matches);

echo $html;
?>[/PHP]
Thanks
Jun 27 '08 #18

pbmods
Expert 5K+
P: 5,821
Try
    print_r($matches);
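The reason echo prints 'Array' is that echo only works on strings and numbers, while `$matches` is a nested array. A small sketch, using a hand-written `$matches` value in place of the real preg_match_all() result from the script above:

```php
<?php
// Hand-written stand-in for what preg_match_all() would fill in.
$matches = [['184.99', '179.00']];

// print_r() dumps the whole nested structure, which is handy for debugging.
print_r($matches);

// $matches[0] holds the full-pattern matches, one entry per price found,
// so loop over it to print each price individually.
foreach ($matches[0] as $price) {
    echo "Price: $price\n";
}
```

Once print_r() shows that the prices are in `$matches[0]`, the foreach loop is usually what you want for real output.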
Jun 27 '08 #19
