Regex problem - please help.

My problem is simple, but I have spent a lot of time playing with regex and I am
going nuts.

I need to automatically (many times per day) extract HEADING and
DESCRIPTION from the HTML code below.
HTML CODE:

<a href="http://www.mylink.com">HEADING</a><br>DESCRIPTION<br>
I am able to get the link already using this regex command:
"a.*href\s*=\s*(?:""(?<1>[^""]*)""|(?<1>\S+))"

Can someone write a regex to get HEADING and DESCRIPTION?
It would be really appreciated.

K.
Nov 20 '05 #1
Hi,

Check out regexlib.com. It lets you search for existing regexes and links
to The Regulator, a regex tester.

http://www.regexlib.com/Default.aspx
http://www.regexlib.com/Search.aspx?k=html
Ken
Nov 20 '05 #2
I have already been there... hm.
I guess there is no one who can fix the code for me just by looking at it.
K.
Nov 20 '05 #3
Krakatioison,
This one should work; I tested it against the sample you provided. You
may want to include the ignore-case option (RegexOptions.IgnoreCase). Let me
know how that works out for you.
Jared

(?:<a\s+href=[\"\'](?<Link>.+?)[\"\'>]+(?<Heading>(\w+))</a>(<(\w+)>)(?<Description>.*)\2)
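
For readers who want to try the idea outside .NET, here is a minimal Python sketch of the same approach. It is not the pattern above verbatim: Python spells named groups as (?P<name>...), and it numbers named and unnamed groups together, so the \2 backreference would point at a different group than it does in .NET. The sketch therefore names the separator tag (Sep) and refers back to it by name. It also relaxes the heading to allow spaces and makes the description non-greedy so that several entries in one string are matched separately. The names LINK_ENTRY, Sep and extract_entries are made up for this illustration.

```python
import re

# Assumption: every entry looks like the sample in the question, i.e.
# <a href="URL">HEADING</a><tag>DESCRIPTION<tag>
LINK_ENTRY = re.compile(
    r"""<a\s+href=["'](?P<Link>[^"']+)["'][^>]*>   # opening tag and quoted URL
        (?P<Heading>[^<]+)</a>                      # link text up to the closing tag
        (?P<Sep><\w+>)                              # tag after the link, e.g. <br>
        (?P<Description>.*?)(?P=Sep)                # text up to the same tag again
    """,
    re.IGNORECASE | re.VERBOSE | re.DOTALL,
)


def extract_entries(html):
    """Return a list of (link, heading, description) tuples found in html."""
    return [
        (m.group("Link"), m.group("Heading").strip(), m.group("Description").strip())
        for m in LINK_ENTRY.finditer(html)
    ]


sample = '<a href="http://www.mylink.com">HEADING</a><br>DESCRIPTION<br>'
print(extract_entries(sample))
```

On the sample from the question this yields one tuple holding the link, HEADING and DESCRIPTION. Like any regex over HTML, it only holds up while the markup keeps this exact shape; extra attributes before href or nested tags inside the heading would need a looser pattern or a real HTML parser.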
Nov 20 '05 #4
Jared,
Thanks a lot for the time you spent on this.
I'll test it and get back to you.
k.
Nov 20 '05 #5
