Retrieving hyperlinks from a web page in code

Hi folks,

I am retrieving a web page from a site using HttpWebRequest. What I want to
do with the retrieved page is list all the hyperlinks in it. If I do a
simple regex search for <a then I also get links that are commented out in
the code, and I don't want that. I want only the links that are actually
active. This is for a reciprocal link check.

Can someone please point me in the right direction?

Thanks.

--
<a href="http://1pakistangifts.com">Send Gifts to Pakistan at #Pakistan Gifts
Store</a> | <a href="http://dotspecialists.com">Leading Software offshoring
and outsourcing service provider</a> | <a
href="http://websitedesignersrus.com">Professional Websites at affordable
prices</a>

Aug 14 '07 #1
On Aug 14, 8:01 am, "Enigma Boy" <enigma...@pp.newsgroups.user> wrote:
> Hi folks,
>
> I am retrieving a web page from a site using HttpWebRequest. What I want to
> do with the retrieved page is list all the hyperlinks in it. If I do a
> simple regex search for <a then I also get links that are commented out in
> the code, and I don't want that. I want only the links that are actually
> active. This is for a reciprocal link check.

Hi, I think you can try to clean the text before you get the links.
For example:

html_code = Regex.Replace(html_code, "<!--((.|\n)*?)-->", "");

This replaces every HTML comment with an empty string, and then you
can extract the links from what is left.
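
To make the two steps concrete, here is a minimal sketch that strips the comments and then collects the href values. The class name LinkExtractor, the method name GetActiveLinks and the anchor pattern are my own illustration, not something from the original post. The anchor pattern is a simplification: it only handles quoted href values and will not cope with every piece of real-world markup.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; the names are not from the thread.
public static class LinkExtractor
{
    // Matches HTML comments. Singleline lets "." match newlines, so
    // comments that span several lines are removed as well.
    private static readonly Regex CommentPattern =
        new Regex(@"<!--.*?-->", RegexOptions.Singleline);

    // Matches an <a ...> tag with a double- or single-quoted href and
    // captures the URL. Assumption: unquoted href values are ignored.
    private static readonly Regex AnchorPattern =
        new Regex(@"<a\b[^>]*?\bhref\s*=\s*(?:""(?<url>[^""]*)""|'(?<url>[^']*)')",
                  RegexOptions.IgnoreCase | RegexOptions.Singleline);

    public static List<string> GetActiveLinks(string html)
    {
        // Step 1: drop commented-out markup so its links are never matched.
        string withoutComments = CommentPattern.Replace(html, "");

        // Step 2: collect the href of every remaining anchor tag.
        var links = new List<string>();
        foreach (Match m in AnchorPattern.Matches(withoutComments))
        {
            links.Add(m.Groups["url"].Value);
        }
        return links;
    }
}
```

You would call LinkExtractor.GetActiveLinks(html_code) on the string you got back from the request. For a reciprocal link check you can then test whether your own URL appears in the returned list.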

Aug 14 '07 #2
Hello Enigma,
> Hi folks,
>
> I am retrieving a web page from a site using HttpWebRequest. What I
> want to do with the retrieved page is list all the hyperlinks in it.
> If I do a simple regex search for <a then I also get links that are
> commented out in the code, and I don't want that. I want only the
> links that are actually active. This is for a reciprocal link check.
>
> Can someone please point me in the right direction?
>
> Thanks.

Have a look at the HTML Agility Pack. It allows you to treat the HTML as
if it were XML.

http://www.codeplex.com/Wiki/View.as...tmlagilitypack
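
As a rough sketch of what that could look like, assuming the HtmlAgilityPack package is referenced in your project (it is a third-party library, not part of the .NET Framework): the parser turns comments into comment nodes, so an XPath query for anchor elements should only see links that are really in the page. The class name AgilityLinkExtractor and the method name GetActiveLinks are made up for this example.

```csharp
using System.Collections.Generic;
using HtmlAgilityPack; // third-party package; assumed to be installed

// Hypothetical helper for illustration; the names are not from the thread.
public static class AgilityLinkExtractor
{
    public static List<string> GetActiveLinks(string html)
    {
        // Parse the raw markup into a DOM-like tree.
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var links = new List<string>();

        // Select every <a> element that carries an href attribute.
        // SelectNodes returns null, not an empty collection, when
        // nothing matches, so that case is checked explicitly.
        HtmlNodeCollection anchors = doc.DocumentNode.SelectNodes("//a[@href]");
        if (anchors == null)
        {
            return links;
        }

        foreach (HtmlNode anchor in anchors)
        {
            // The second argument is the default used if href is missing.
            links.Add(anchor.GetAttributeValue("href", ""));
        }
        return links;
    }
}
```

Compared with a regex, this leaves the handling of comments, odd quoting and malformed tags to the parser, which is usually more robust on pages you do not control.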

--
Jesse Houwing
jesse.houwing at sogeti.nl
Aug 14 '07 #3
