How to extract domain from string with regex?

I'm sure someone has passed this way before...

I want to check whether a domain name is contained in a string, and if one is,
I want to extract it. In these strings, domains are always preceded by
"http://" or "http : //www" (without the spaces).

In pseudocode, I thought it might look like this:

if (eregi("http: //", $mystring))
{
    $domain = explode("http: //", $mystring);
    $domain = array_reverse($domain);
    $parts = explode(".", $domain[0]);
    if ($parts[0] == "www")
    {
        $extracted = $parts[1].".".$parts[2];
    }
    else
    {
        $extracted = $parts[0].".".$parts[1];
    }
}

Does this look about right?

Thanks in advance.

Aug 26 '06 #1
Here's a cleaner example:

if (eregi("http://", $mystring))
{
    $mystring = explode("http://", $mystring);
    $mystring = array_reverse($mystring);
    $domain = $mystring[0];
    $domain = explode(".", $domain);
    if ($domain[0] == "www")
    {
        $extracted = $domain[1].".".$domain[2];
    }
    else
    {
        $extracted = $domain[0].".".$domain[1];
    }
}

Can I eregi on "http://"? Or do I need to escape the "/"?
Aug 26 '06 #2
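
On the escaping question: "/" has no special meaning in a regular expression by itself. It only needs escaping in the preg_* functions when "/" is also chosen as the pattern delimiter. Note too that eregi() was deprecated in PHP 5.3 and removed in PHP 7.0. A minimal sketch of two alternatives follows; the function names contains_http() and contains_http_regex() are made up for illustration and are not part of PHP.

```php
<?php
// contains_http() is a hypothetical helper name, not a built-in.
function contains_http($text)
{
    // stripos() is a case-insensitive substring search,
    // so no regex escaping is involved at all.
    return stripos($text, "http://") !== false;
}

// contains_http_regex() is also hypothetical.
// "#" is used as the delimiter, so the slashes can stay unescaped.
function contains_http_regex($text)
{
    return preg_match('#http://#i', $text) === 1;
}

var_dump(contains_http("see HTTP://www.php.net"));
var_dump(contains_http_regex("no link in here"));
```

For a fixed prefix like this, the substring search is probably the simpler choice; the regex form only pays off once the pattern has to capture something.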
*** deko wrote (Fri, 25 Aug 2006 23:09:28 -0700):
> In these strings, domains are always preceded by
> "http://" or "http : //www" (without the spaces).

Without the spaces? Then, why do you add the spaces?

Given that precondition, I wouldn't use regex:

parse_url Parse a URL and return its components

usage:
array parse_url ( string url )

Parameters
url
The URL to parse

Return Values
On seriously malformed URLs, parse_url() may return FALSE and emit an
E_WARNING. Otherwise an associative array is returned, containing at least
one of the following components:

scheme - e.g. http
host
port
user
pass
path
query - after the question mark ?
fragment - after the hashmark #
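
A minimal sketch of how the call described above might be used to get the host. The helper name host_of() and the sample URL are assumptions for illustration; only parse_url() itself is a PHP built-in.

```php
<?php
// host_of() is a hypothetical helper name, not part of PHP.
function host_of($url)
{
    $parts = parse_url($url);
    // parse_url() returns false on seriously malformed URLs, and the
    // 'host' key is absent when the string has no host part.
    if ($parts === false || !isset($parts['host'])) {
        return null;
    }
    return $parts['host'];
}

echo host_of("http://www.php.net/index.html"), "\n";
```

Note that this expects the argument to be a URL already; it does not search for one inside a longer string.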
> in pseudo code, I thought it might look like this:
>
> if (eregi("http: //", $mystring))

Sorry, but I just can't understand all that story about spaces/not spaces
:-?

--
-+ http://alvaro.es - Álvaro G. Vicario - Burgos, Spain
++ My site about web programming: http://bits.demogracia.com
+- My humor site with UVA rays: http://www.demogracia.com
--
Aug 26 '06 #3
Thanks for the tip on parse_url.

But I still have to find the URL (which could be anywhere) in the string.

> Without the spaces? Then, why do you add the spaces?

Here's a pseudocode example without the spaces:

if (eregi("http://", $mystring))
{
    $mystring = explode("http://", $mystring);
    $mystring = array_reverse($mystring);
    $domain = $mystring[0];
    $domain = explode(".", $domain);
    if ($domain[0] == "www")
    {
        $extracted = $domain[1].".".$domain[2];
    }
    else
    {
        $extracted = $domain[0].".".$domain[1];
    }
}

Would it be better to use preg_match here?

preg_match('@^(?:http://)?([^/]+)@i',
"http://www.php.net/index.html", $matches);
$host = $matches[1];

// get last two segments of host name
preg_match('/[^.]+\.[^.]+$/', $host, $matches);
echo "domain name is: {$matches[0]}\n";

But would this work if the URL is buried in a string?

Aug 26 '06 #4
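
On the last question: as written, the first pattern is anchored with ^, so it only matches when the URL starts the string. A rough sketch of one way to handle a buried URL is to drop the anchor and require the "http://" prefix. The helper name extract_domain() and the set of characters that end the host are assumptions, not something established in this thread.

```php
<?php
// extract_domain() is a hypothetical helper, not a built-in.
function extract_domain($text)
{
    // No ^ anchor, so the match may start anywhere in the string.
    // The host ends at a slash, whitespace, quote, angle bracket or colon.
    if (!preg_match('@http://([^/\s"\'<>:]+)@i', $text, $m)) {
        return null;
    }
    // Drop a trailing dot, e.g. from a URL that ends a sentence.
    $host = rtrim($m[1], '.');
    // Keep the last two dot-separated labels of the host name.
    if (preg_match('/[^.]+\.[^.]+$/', $host, $m2)) {
        return $m2[0];
    }
    return $host;
}

echo extract_domain('registered NYSE 943 <a href="http://www.php.net/index.html">x</a>'), "\n";
```

Two caveats with this sketch: it returns the first URL in the string, whereas the explode()/array_reverse() version picks the last, and the "last two labels" rule gives "co.uk" for a host such as www.domain.co.uk.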
