regexp question

I have a regular expression to extract an HTML link from a page:

href=([\"']?)([^>\\1]*\.html)\\1(?: [^>]*)?>

After "href=" it looks for an optional quote and then for a run of
characters that are neither that quote nor the closing ">".

The problematic part is [^>\\1]*. It should exclude the quote character
captured by group 1, but somehow that doesn't work. Maybe \\1 is not allowed
inside brackets? I would like some advice on how to handle this.

Thanks,
Wim
Jul 17 '05 #1
Wim Roffal wrote:
> I have a regular expression to extract an HTML link from a page:

Use an HTML parser.

> href=([\"']?)([^>\\1]*\.html)\\1(?: [^>]*)?>
>
> After "href=" it looks for an optional quote and then for a run of
> characters that are neither that quote nor the closing ">".
>
> The problematic part is [^>\\1]*. It should exclude the quote character
> captured by group 1, but somehow that doesn't work. Maybe \\1 is not allowed
> inside brackets?


Backreferences aren't recognised inside character classes, so the \\1 in
[^>\\1] is not treated as a reference to the quote captured by group 1.
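
One common workaround, sketched here in Python's `re` module as an assumption (the thread does not say which language the pattern is written for), is to replace the character class with a negative lookahead that can use the backreference. The name `extract_links` is hypothetical. Note that an optional quote group would break this approach: with an empty capture, `(?!\1)` can never succeed, so the unquoted case gets its own alternative.

```python
import re

# Inside a character class Python reads \1 as the character chr(1),
# not as "whatever group 1 captured", so [^>\1] does not exclude the quote.
#
# Quoted value:   group 1 is the quote; (?!\1) refuses to consume that same quote.
# Unquoted value: there is no quote to track, so a plain negated class is enough.
LINK_RE = re.compile(
    r"""href=(?:(["'])((?:(?!\1)[^>])*\.html)\1|([^\s"'>]*\.html))(?:\s[^>]*)?>"""
)


def extract_links(html):
    """Return the .html targets of all href attributes found in the text."""
    links = []
    for match in LINK_RE.finditer(html):
        # Group 2 holds a quoted value, group 3 an unquoted one.
        links.append(match.group(2) or match.group(3))
    return links


if __name__ == "__main__":
    page = '<a href="a.html">A</a> <a href=d.html>D</a>'
    print(extract_links(page))
```

In a raw string the backreference is written `\1`; the doubled `\\1` in the post looks like escaping for a regular string literal.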

[ ... ]
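
As for the parser suggestion, here is a minimal sketch using Python's standard `html.parser` module. The language choice, the class name `HtmlLinkCollector`, the helper `collect_html_links`, and the restriction to `<a>` tags are all assumptions made for illustration; the original pattern matches an href on any tag.

```python
from html.parser import HTMLParser


class HtmlLinkCollector(HTMLParser):
    """Collect href values of <a> tags whose target ends in .html."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Assumption: only anchor tags are of interest.
        if tag != "a":
            return
        for name, value in attrs:
            # value is None for attributes written without a value.
            if name == "href" and value and value.endswith(".html"):
                self.links.append(value)


def collect_html_links(page):
    """Feed the page text to the parser and return the collected links."""
    collector = HtmlLinkCollector()
    collector.feed(page)
    collector.close()
    return collector.links


if __name__ == "__main__":
    print(collect_html_links('<a href="a.html">A</a> <a href=c.html>C</a>'))
```

The parser deals with double-quoted, single-quoted and unquoted attribute values itself, so no quote matching is needed in user code.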

HAGW!

--
Jock
Jul 17 '05 #2
