php extract variable from a string using regexp

Hi, can someone give me a hand with this please?
What's the best way to extract the extension from a URL?

Example:
$string="http://www.domain.co.uk/anypage.html";
In this example, I'd be looking for "co.uk", but it could be "com",
"net", or any other extension.

Thanks
Ciarán

Jul 17 '07 #1
Rik
Why? And there is no easy way to allow for 'double extensions' like co.uk
without you giving it a list of accepted doubles. There's no 'logical'
way to work them out from the hostname alone.

Well, to get the TLD:

$string = "http://www.domain.co.uk/anypage.html";

//NON REGEX WAY:
$urlinfo = parse_url($string);
$domaincomponents = explode('.', $urlinfo['host']);
$extension = end($domaincomponents);

//REGEX
preg_match('%
^                 #match at start
(?:[a-z]+://)?    #possible protocol
[^/]*?            #domainstring
([^/.]+)          #TLD
(?:/|$)           #start of path or end of string
%six', $string, $match);
$extension = $match[1];

If you want to allow for doubles, you'll have to provide a list of
acceptable 'doubles', and match it to the end of the hostname.
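
A minimal sketch of that idea, assuming a short hand-written list of doubles. The function name url_extension and the three-entry list are made up for illustration; a real list would be much longer. It checks each accepted double against the end of the hostname and otherwise falls back to the last label, as in the non-regex way:

```php
<?php
// Hypothetical helper: returns the "extension" of a URL, preferring a known
// double such as co.uk over the bare TLD.
function url_extension(string $url, array $doubles): ?string
{
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        return null; // no host found, e.g. the URL has no scheme
    }
    $host = strtolower(rtrim($host, '.'));

    // Check each accepted double against the end of the hostname.
    foreach ($doubles as $double) {
        $suffix = '.' . $double;
        if (substr($host, -strlen($suffix)) === $suffix) {
            return $double;
        }
    }

    // No double matched: fall back to the last dot-separated label.
    $parts = explode('.', $host);
    return end($parts);
}

// Illustrative list only, not a complete set of doubles.
$doubles = array('co.uk', 'org.uk', 'com.au');

echo url_extension('http://www.domain.co.uk/anypage.html', $doubles), "\n"; // co.uk
echo url_extension('http://www.domain.com/anypage.html', $doubles), "\n";   // com
```

Note that parse_url() only reports a host when the URL has a scheme, so a bare "www.domain.co.uk/anypage.html" gives null here.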

--
Rik Wasmus
Jul 17 '07 #2
Rik wrote:
If you want to allow for doubles, you'll have to provide a list of
acceptable 'doubles', and match it to the end of the hostname.
http://publicsuffix.org/list/

If I get bored this evening I might have a go.

--
Toby A Inkster BSc (Hons) ARCS
[Geek of HTML/SQL/Perl/PHP/Python/Apache/Linux]
[OS: Linux 2.6.12-12mdksmp, up 27 days, 17:51.]

PHP Linkifier
http://tobyinkster.co.uk/blog/2007/07/18/linkify/
Jul 18 '07 #3
Hey, thanks a lot Rik, this helps a lot!

Jul 19 '07 #4
Toby A Inkster wrote:
http://publicsuffix.org/list/
If I get bored this evening I might have a go.
OK -- I've written a PHP class that is capable of extracting whichever
components you like out of a hostname, including its "effective TLD"
that is:

Host                       ETLD
--------------------------------------------
groups.google.co.uk.       co.uk.
www.ealing.nhs.uk.         nhs.uk.
www.ipart.nsw.gov.au.      nsw.gov.au.
www.google.com.            com.
www.last.fm.               fm.
www.british-library.uk.    uk.
del.icio.us.               us.

The syntax for this is:

<?php
include "Domain.class.php";
$x = new Domain('www.ealing.nhs.uk');
echo $x->get_reg_domain()."\n"; // ealing.nhs.uk.
echo $x->get_etld()."\n";       // nhs.uk.
echo $x."\n";                   // www.ealing.nhs.uk.
?>

Link to download is in my signature below...
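
For anyone curious how such a lookup can work in principle, here is a rough sketch. It is not the Domain class itself: the function names effective_tld and registered_domain and the tiny suffix list are made up for illustration, and the real Public Suffix List is far longer and also has wildcard and exception rules, which this ignores. The sketch picks the longest listed suffix that matches the end of the hostname, and takes the registered domain to be that suffix plus one more label. Results are returned without the trailing dot.

```php
<?php
// Hypothetical stand-in for a few entries of the Public Suffix List.
$suffixes = array('uk', 'co.uk', 'nhs.uk', 'au', 'gov.au', 'nsw.gov.au', 'com', 'fm', 'us');

// Returns the longest listed suffix matching the end of $host.
function effective_tld(string $host, array $suffixes): string
{
    $labels = explode('.', strtolower(rtrim($host, '.')));
    // Longest candidate first: drop labels from the left one at a time.
    for ($i = 1; $i < count($labels); $i++) {
        $candidate = implode('.', array_slice($labels, $i));
        if (in_array($candidate, $suffixes, true)) {
            return $candidate;
        }
    }
    return end($labels); // unknown suffix: fall back to the last label
}

// Returns the effective TLD plus the one label in front of it.
function registered_domain(string $host, array $suffixes): string
{
    $labels = explode('.', strtolower(rtrim($host, '.')));
    $etld_len = count(explode('.', effective_tld($host, $suffixes)));
    return implode('.', array_slice($labels, -($etld_len + 1)));
}

echo effective_tld('www.ealing.nhs.uk', $suffixes), "\n";     // nhs.uk
echo registered_domain('www.ealing.nhs.uk', $suffixes), "\n"; // ealing.nhs.uk
echo effective_tld('www.ipart.nsw.gov.au', $suffixes), "\n";  // nsw.gov.au
```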

--
Toby A Inkster BSc (Hons) ARCS
[Geek of HTML/SQL/Perl/PHP/Python/Apache/Linux]
[OS: Linux 2.6.12-12mdksmp, up 28 days, 13:08.]

PHP Domain Class
http://tobyinkster.co.uk/blog/2007/0...-domain-class/
Jul 19 '07 #5
