Bytes IT Community

What is the best way to turn a local link into a complete URL?

I am using curl and DOMDocument
to extract the links from my website.

This is my script:

<?php
require("my_functions.php");

$target_url = "";
$userAgent = 'Googlebot/2.1 (';

echo "<br>Starting<br>Target_url: $target_url";

// make the cURL request to $target_url
$ch = curl_init();
curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);
curl_setopt($ch, CURLOPT_URL, $target_url);
curl_setopt($ch, CURLOPT_FAILONERROR, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_AUTOREFERER, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_TIMEOUT, 10);
$page = curl_exec($ch);
if (!$page) {
    echo "<br />cURL error number:" . curl_errno($ch);
    echo "<br />cURL error:" . curl_error($ch);
    exit;
}

// parse the html into a DOMDocument
$doc = new DOMDocument();
$doc->loadHTML($page);

//echo $doc->saveHTML();

$links = $doc->getElementsByTagName('a'); // find the <a> elements
foreach ($links as $link) { // visit each link in turn
    // $link is the current element, so no separate counter is needed
    echo "Section Attribute :-> " . $link->getAttribute('href') . "<br>";
}
?>
As you can see the target page is this one:

customer service software

and the output is:


Section Attribute :-> index.php
Section Attribute :-> works.php
Section Attribute :-> pricing.php
Section Attribute :-> special.php
Section Attribute :-> contact.php
Section Attribute :-> login.php
Section Attribute :-> Customer-Service-Software.php
Section Attribute :-> articles.php
Section Attribute :-> Why-Get-An-Internet-Security-Seal.php
Section Attribute :-> The-Fantastic-Return-on-Investment-from-Trust-Seals.php
Section Attribute :-> Turn-Browsers-Into-Buyers-Increase-Your-Sales-Conversion.php
Section Attribute :-> Selecting-The-Best-Trust-Seal-To-Boost-Your-Sales-Conversions.php
Section Attribute :-> Give-Great-Customer-Service-And-Get-A-Trust-Seal-to-Prove-It.php
Section Attribute :-> Customer-Service-Software-Solutions-For-Online-Business.php
Section Attribute :-> 73-Per-Cent-Of-Buyers-Abort-Their-Purchases-How-To-Change-It.php
Section Attribute :-> Why-Are-Your-Visitors-Not-Buying-Your-Products.php
Section Attribute :->
Section Attribute :->
Section Attribute :-> terms.php
Section Attribute :-> privacy.php
Section Attribute :-> earnings_disclaimer.php
Section Attribute :-> articles.php

This works quite well, but some of the links are local and some are full URLs.

Given the code I am already using, what is the best way to get
all these links shown as complete URLs?

Is there a DOMDocument method to do this?

I also want to extract and store the website address,
i.e. just the "" part.

I realize that I could do it with preg_match.

It could also be done with strpos and substr, but that would be a bit messy.

I just thought that if something in the DOM classes can do the job, it may be quicker and more efficient.

What do you professionals think is the best way to get the data that I want?
And how should I construct it?
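
For the "website address" part, PHP's built-in parse_url() may be tidier than preg_match or strpos/substr. The following is only a minimal sketch: site_root() is a hypothetical helper name, and http://www.example.com is a placeholder, not the site from this post.

```php
<?php
// Hypothetical helper: returns "scheme://host[:port]" for an absolute URL,
// or null when the URL has no scheme or no host (e.g. a local link).
function site_root(string $url): ?string
{
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['scheme'], $parts['host'])) {
        return null;
    }
    $root = $parts['scheme'] . '://' . $parts['host'];
    if (isset($parts['port'])) {
        $root .= ':' . $parts['port'];
    }
    return $root;
}

// Placeholder URL, assumed for illustration only.
echo site_root('http://www.example.com/articles/index.php?id=3'), "\n";
```

The same call with $target_url from the script above would give the value to store, provided $target_url is a full URL.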
Oct 24 '09 #1
1 Reply

I would simply define your domain name and then, with an if statement, prepend it to any link where it is not already present.
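
A rough sketch of that idea follows. It assumes every local link is a plain file name relative to the site root, as in the output shown in the question. absolutize_simple() is a hypothetical name and http://www.example.com is a placeholder domain.

```php
<?php
// Hypothetical helper: prefix $base to hrefs that are not already absolute.
function absolutize_simple(string $href, string $base): string
{
    // Leave empty hrefs alone, and anything that already carries a scheme
    // (http:, https:, mailto:, ...). parse_url() returns null for the
    // scheme component when the href has none.
    if ($href === '' || parse_url($href, PHP_URL_SCHEME) !== null) {
        return $href;
    }
    // Join with exactly one slash between the domain and the local path.
    return rtrim($base, '/') . '/' . ltrim($href, '/');
}

$base = 'http://www.example.com'; // placeholder, not the real site
foreach (['index.php', 'http://www.example.com/terms.php', ''] as $href) {
    echo absolutize_simple($href, $base), "\n";
}
```

In the loop from the question, the getAttribute('href') value would be passed through this function before being echoed. Links such as "#top" or "//cdn.example.com/x.js" are not handled by this simple version.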

If the links sit in different folders, it gets more complicated, and I have yet to think of an efficient way.
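
One possible approach to the folder case is to resolve each href against the URL of the page it was found on. This is a simplified sketch, not a full RFC 3986 resolver: resolve_url() is a hypothetical name, the page URL is assumed to be absolute, and hrefs consisting only of "#fragment" or "?query" are not treated specially.

```php
<?php
// Hypothetical helper: resolve $href against $pageUrl (assumed absolute).
function resolve_url(string $href, string $pageUrl): string
{
    if ($href === '' || parse_url($href, PHP_URL_SCHEME) !== null) {
        return $href; // empty, or already a complete URL
    }
    $base = parse_url($pageUrl);
    $root = $base['scheme'] . '://' . $base['host']
          . (isset($base['port']) ? ':' . $base['port'] : '');

    if (strpos($href, '//') === 0) {
        return $base['scheme'] . ':' . $href; // protocol-relative link
    }
    if ($href[0] === '/') {
        $path = $href; // relative to the site root
    } else {
        // relative to the folder of the current page
        $dir = isset($base['path'])
            ? substr($base['path'], 0, strrpos($base['path'], '/') + 1)
            : '/';
        $path = $dir . $href;
    }
    // Collapse "." and ".." segments.
    $out = [];
    foreach (explode('/', ltrim($path, '/')) as $segment) {
        if ($segment === '..') {
            array_pop($out);
        } elseif ($segment !== '.') {
            $out[] = $segment;
        }
    }
    return $root . '/' . implode('/', $out);
}

// Placeholder page URL, assumed for illustration only.
$page = 'http://www.example.com/articles/page.php';
echo resolve_url('../pricing.php', $page), "\n";
echo resolve_url('img/logo.png', $page), "\n";
```

Passing the final URL of the fetched page (after redirects) as $pageUrl matters here, because relative links are relative to that page's folder, not to the domain.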
Oct 25 '09 #2
