Parsing content for links

I have a content management system that has links within the content
field in the database, and I need to verify that those links are correct.
What I need is a PHP script that queries the database and then parses
the content field to find all the <a href> tags and get the href
attribute value and the link text.

Does anyone have a way of doing this or a regex to do this?

Thanks,
Tony

Feb 21 '07 #1
Tony wrote:
> I have a content management system that has links within the content
> field in the database, and I need to verify that those links are correct.
> What I need is a PHP script that queries the database and then parses
> the content field to find all the <a href> tags and get the href
> attribute value and the link text.
>
> Does anyone have a way of doing this or a regex to do this?

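// Group 1 captures the href value, group 2 the link text.
// Pass $matches plainly: call-time &$matches is a fatal error since PHP 5.4.
preg_match_all('/<a\s+[^>]*?href\s*=\s*["\']([^"\']*)["\']' .
    '[^>]*>(.*?)<\/a>/is',
    $html, $matches);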
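
A minimal sketch of reading the results, assuming the content is already in a string (the sample `$html` below is made up, and the pattern is a tidied variant of the one above). With the default flag, `preg_match_all` fills `$matches[1]` with every href value and `$matches[2]` with the corresponding link text, so the two arrays can be walked in parallel:

```php
<?php
// Hypothetical sample content; in practice this would come from the database.
$html = '<p>See <a href="http://example.com/a">first</a> and '
      . '<a class="x" href=\'http://example.com/b\'>second</a>.</p>';

// Group 1 = href value, group 2 = link text.
$pattern = '/<a\s+[^>]*?href\s*=\s*["\']([^"\']*)["\'][^>]*>(.*?)<\/a>/is';

$links = [];
if (preg_match_all($pattern, $html, $matches)) {
    // Default PREG_PATTERN_ORDER: $matches[1] and $matches[2] are parallel arrays.
    foreach ($matches[1] as $i => $href) {
        $links[] = ['href' => $href, 'text' => $matches[2][$i]];
    }
}

foreach ($links as $link) {
    echo $link['href'], ' => ', $link['text'], "\n";
}
```

Unquoted href values (for example `<a href=page.html>`) are not matched by this pattern, so it is only a starting point for content whose markup is reasonably regular.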
--
Arjen
http://www.hondenpage.com - My site about dogs
Feb 22 '07 #2
Tony wrote:
> I have a content management system that has links within the content
> field in the database, and I need to verify that those links are correct.
> What I need is a PHP script that queries the database and then parses
> the content field to find all the <a href> tags and get the href
> attribute value and the link text.
>
> Does anyone have a way of doing this or a regex to do this?
>
> Thanks,
> Tony

Yeah, regex would be easiest, and there should be plenty out there,
but I might do something like this:

$re = '%
    <a\s[^<>]*?             # href may or may not come first
    href=([\'"])            # capture single/double quote

    # match an absolute URI (relative links are skipped)
    (
        [\w.-]+:(?://)?     # scheme
        [^?\\#\'"<>\s]+     # authority and path

        # possible query string and fragment
        (?: \\? [^\\#\'"<>]* )?
        (?: \\# [^\'"<>]* )?
    )

    \1                      # captured quote from above
    [^<>]*                  # possible remaining attributes
    >( .*? )                # allow for nested tags
    </a>                    # closing </a> tag
%xis';

The match for the URI would be in $match[2] and the text for the <a>
tag is in $match[3].

Just use this $re var in the preg_* functions.
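
As a rough sketch of what that could look like with `preg_match_all`: the pattern is repeated here, with the URI character classes tightened so a match cannot run past the closing quote, so that the snippet runs on its own. The helper name `extract_links()` and the sample `$content` string are made up for illustration.

```php
<?php
// Same idea as the pattern above, repeated so this snippet runs on its own.
$re = '%
    <a\s[^<>]*?
    href=([\'"])
    (
        [\w.-]+:(?://)?
        [^?\\#\'"<>\s]+
        (?: \\? [^\\#\'"<>]* )?
        (?: \\# [^\'"<>]* )?
    )
    \1
    [^<>]*
    >( .*? )
    </a>
%xis';

// extract_links() is a hypothetical helper name, not part of PHP.
function extract_links(string $re, string $content): array
{
    $links = [];
    // PREG_SET_ORDER yields one array per <a> tag:
    // [0] whole tag, [1] quote character, [2] URI, [3] inner HTML.
    preg_match_all($re, $content, $matches, PREG_SET_ORDER);
    foreach ($matches as $match) {
        $links[] = [
            'href' => $match[2],
            // The link text may contain nested tags, so strip them for display.
            'text' => trim(strip_tags($match[3])),
        ];
    }
    return $links;
}

// Made-up sample; real input would be the content field from the database.
$content = '<a title="t" href="http://example.com/page?id=1#top"><b>Home</b> page</a>';
$found = extract_links($re, $content);
print_r($found);
```

Each entry in `$found` then holds one link's URI and plain text, which can be fed to whatever check decides whether the link is correct.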

Hope this helps,
Curtis
Feb 22 '07 #3
