regexp question

I have a regular expression to extract an html link from a page:

href=([\"']?)([^>\\1]*\.html)\\1(?: [^>]*)?>

After the "href=" it looks for an optional quote, and then for a run of
characters that are neither that quote nor the closing ">".

The problematic part is [^>\\1]*. It should exclude anything with the quote,
but somehow that doesn't work. Maybe \\1 is not allowed inside brackets?
I would like some advice on how to handle this.

Thanks,
Wim
Jul 17 '05 #1

Wim Roffal wrote:
> I have a regular expression to extract an html link from a page:

Use an HTML parser.
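
As a rough illustration of the parser route, here is a minimal sketch using Python's standard `html.parser` module. The thread does not say which language is in use, so Python is an assumption here, and the names `HtmlLinkCollector` and `collect_html_links` are made up for the example. Like the regex, it keeps any `href` value ending in `.html`, whatever tag carries it.

```python
from html.parser import HTMLParser


class HtmlLinkCollector(HTMLParser):
    """Collects href values that end in .html (hypothetical helper class)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; the parser has already
        # stripped the quotes, whichever kind the page used (or none).
        for name, value in attrs:
            if name == "href" and value and value.endswith(".html"):
                self.links.append(value)


def collect_html_links(page):
    """Return every href ending in .html found in the given markup."""
    parser = HtmlLinkCollector()
    parser.feed(page)
    parser.close()
    return parser.links
```

The parser deals with single quotes, double quotes and unquoted values itself, so the quoting problem from the question never comes up.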
> href=([\"']?)([^>\\1]*\.html)\\1(?: [^>]*)?>
>
> After the "href=" it looks for an optional quote, and then for a run of
> characters that are neither that quote nor the closing ">".
>
> The problematic part is [^>\\1]*. It should exclude anything with the quote,
> but somehow that doesn't work. Maybe \\1 is not allowed inside brackets?


Back references aren't recognised in character classes.
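
If a regex is still wanted, one possible workaround is to move the "not the opening quote" test out of the character class and into a negative lookahead, where `\1` does work as a back reference. The sketch below assumes Python's `re` module (the thread does not name a language), and `LINK_RE` and `extract_links` are hypothetical names. The unquoted case gets its own alternative, because with an empty `\1` the lookahead `(?!\1)` could never succeed.

```python
import re

# Quoted case:   (["']) captures the quote, and (?:(?!\1)[^>])* consumes one
#                character at a time, but only while the next character is
#                not that same quote (and not ">").
# Unquoted case: a separate alternative, since there is no quote to refer to.
LINK_RE = re.compile(
    r"""href=(?:(["'])((?:(?!\1)[^>])*\.html)\1|([^\s"'>]*\.html))(?: [^>]*)?>"""
)


def extract_links(page):
    """Return the .html targets matched by LINK_RE, in document order."""
    # Group 2 holds a quoted target, group 3 an unquoted one.
    return [m.group(2) or m.group(3) for m in LINK_RE.finditer(page)]


# The pattern from the question, for comparison: inside [...] Python reads
# \1 as the character with code 1, so the quote is never excluded.
ORIGINAL_RE = re.compile(r"""href=(["']?)([^>\1]*\.html)\1(?: [^>]*)?>""")
```

On a tag such as `<a href="a.html" title="b.html">`, the original pattern runs past the first closing quote and captures `a.html" title="b.html`, while the lookahead version stops at `a.html`.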

[ ... ]

HAGW!

--
Jock
Jul 17 '05 #2
