
Eregi pattern matching - bit of a challenge I thinks

Hi, I'm trying to detect any links that are contained within an HTML page
using eregi pattern matching. I was wondering if there are any pattern
matching geniuses out there who could write a pattern that covers all the
different ways in which a link could be written.

Current patterns I can think of include:

<a href=x.com> no spaces between href, equals and url, no quotation marks
around url
<a href =x.com> space between href and equals, no space between equals and
url, no quotation marks around url
<a href= x.com> no space between href and equals, space between equals and
url, no quotation marks around url
<a href = x.com> space between href and equals, space between equals and
url, no quotation marks around url
<a href='x.com'> no spaces between href, equals and url, single quotation
marks around url
<a href ='x.com'> space between href and equals, no space between equals and
url, single quotation marks around url
<a href= 'x.com'> no space between href and equals, space between equals and
url, single quotation marks around url
<a href = 'x.com'> space between href and equals, space between equals and
url, single quotation marks around url

<a href="x.com"> no spaces between href, equals and url, double quotation
marks around url
<a href ="x.com"> space between href and equals, no space between equals and
url, double quotation marks around url
<a href= "x.com"> no space between href and equals, space between equals and
url, double quotation marks around url
<a href = "x.com"> space between href and equals, space between equals and
url, double quotation marks around url

<a href='x.com"> no spaces between href, equals and url, mismatched quotation
marks around url - single open, double to close
<a href ='x.com"> space between href and equals, no space between equals and
url, mismatched quotation marks around url - single open, double to close
<a href= 'x.com"> no space between href and equals, space between equals and
url, mismatched quotation marks around url - single open, double to close
<a href = 'x.com"> space between href and equals, space between equals and
url, mismatched quotation marks around url - single open, double to close

<a href="x.com'> no spaces between href, equals and url, mismatched quotation
marks around url - double open, single to close
<a href ="x.com'> space between href and equals, no space between equals and
url, mismatched quotation marks around url - double open, single to close
<a href= "x.com'> no space between href and equals, space between equals and
url, mismatched quotation marks around url - double open, single to close
<a href = "x.com'> space between href and equals, space between equals and
url, mismatched quotation marks around url - double open, single to close

I guess what's needed is something more advanced than

eregi("href=\"/(.*)\">", $string, $array_holding_results);

I'd appreciate any help you could give,

Thanks
NimP


Jul 17 '05 #1
1 Reply


"NimP" <st*@sturobbie.co.uk> wrote:
Hi, I'm trying to detect any links that are contained within an HTML
page using eregi pattern matching. I was wondering if there are any
pattern matching geniuses out there who could write a pattern that
covers all the different ways in which a link could be written.

I'm sure there is an easier solution out there somewhere, but by going
through your examples I came up with this (it won't validate a URL,
though):

// The + sits inside the URL group, so $matches[1] holds the whole URL
preg_match("/<a\s+href\s*=\s*['\"]?([a-z0-9_\-.]+)['\"]?>/i",
$string, $matches);

echo htmlentities($matches[0]); // $matches[0] is the full <a ...> tag
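
For what it's worth, all of the variants listed in the question can probably be covered by one PCRE pattern. What follows is only a sketch under a few assumptions: `extract_links` is a made-up helper name, the URL is taken to be any run of characters other than quotes, whitespace and `>`, and `href` is assumed to be the first attribute of the tag. It uses preg_match_all rather than eregi, since the ereg family was deprecated in PHP 5.3 and removed in PHP 7.

```php
<?php
// Hypothetical helper (not from the thread): collect href values from <a> tags.
function extract_links(string $html): array
{
    // \s* on both sides of "=" covers the four spacing variants.
    // ['"]? before and after the URL accepts no quotes, single quotes,
    // double quotes, or a mismatched pair.
    // [^'"\s>]+ captures the URL itself, so "/", ":" and "?" are allowed.
    // [^>]* tolerates further attributes after href (an assumption).
    $pattern = '/<a\s+href\s*=\s*[\'"]?([^\'"\s>]+)[\'"]?[^>]*>/i';
    preg_match_all($pattern, $html, $matches);
    return $matches[1]; // one entry per link found, in document order
}

// A few of the forms from the question; each sample should yield x.com.
$samples = [
    '<a href=x.com>',
    '<a href = \'x.com\'>',
    '<a href ="x.com">',
    '<a href= \'x.com">',
    '<a href = "x.com\'>',
];
foreach ($samples as $tag) {
    echo implode(', ', extract_links($tag)), "\n";
}
```

Tags that put other attributes before href are not matched by this sketch, and like the pattern above it makes no attempt to validate the URL.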

JOn
Jul 17 '05 #2
