Regular Expression help - Extract links from certain tag

Hi Guys,

I was wondering if someone could help me out with the following
requirement:
<mydocument>
<div id="other">
<a href="linkother ">linkother </a>
</div>

<div id="hello">
<a href="link1url" >link1</a>
<a href="link2url" >link2</a>
</div>
</mydocument>

If I wanted to extract all links from the div tag with id="hello", how
would I go about it?
The desired result would be:
link1url
link2url

So far I'm extracting links like this: <a href="[^"]+">[^<]+</a> but
how do I make sure they come only from a particular tag
group?

Regards DotnetShadow

Jun 29 '06 #1

You could load it into a DOM tree (possibly an XML DOM if the
document is well-formed). Then you just navigate the tree looking for
div tags whose id attribute is hello, and fetch all the <a> tags under
that node (this should be possible with XPath).
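
To make the DOM/XPath idea concrete, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree, which supports a limited XPath subset. It assumes the document is well-formed XML, as in the sample from the question. The function name extract_links and the SAMPLE constant are made up for illustration, and the id value is assumed not to contain quote characters.

```python
import xml.etree.ElementTree as ET

# Sample document copied from the question (assumed to be well-formed XML).
SAMPLE = """<mydocument>
<div id="other">
<a href="linkother ">linkother </a>
</div>

<div id="hello">
<a href="link1url" >link1</a>
<a href="link2url" >link2</a>
</div>
</mydocument>"""


def extract_links(xml_text, div_id):
    """Return the href of every <a> found under <div id=div_id>.

    Hypothetical helper: div_id is interpolated into the path expression,
    so it is assumed not to contain quote characters.
    """
    root = ET.fromstring(xml_text)
    # ElementTree's XPath subset: any <div> with a matching id attribute,
    # then any <a> element at any depth below it.
    anchors = root.findall(f".//div[@id='{div_id}']//a")
    return [a.get("href") for a in anchors if a.get("href") is not None]


print(extract_links(SAMPLE, "hello"))  # prints ['link1url', 'link2url']
```

The same path expression, .//div[@id='hello']//a, should carry over to other XPath-capable DOM APIs. If the page is real-world HTML rather than well-formed XML, an XML parser will likely reject it and an HTML-tolerant parser would be needed instead.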
Jun 29 '06 #2
