Finding Broken Links in a Website

3 New Member
Hi everyone,
I want .NET (VB or C#) code for finding broken links in a website.
The requirement is that the user types a URL into a text box
and, once the button is clicked, the page shows
whether there are any broken links on that particular page.
Please help me out with this.


Thanks
Sridhar.S
Feb 11 '08 #1
wimpos
19 New Member
You can download the page whose address was entered into the text box.
Try using the WebClient class.
Now you have the HTML source of the web page.
Then find all the links in the page, <a href="**"></a> (maybe there is a method in WebClient, otherwise use a regex),
extract the ** and try to download that address as you did before. If it succeeds the link is alive, otherwise it is dead.

This is a guideline you can follow. Try it, and if you run into trouble don't hesitate to ask for more detailed info.
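
A minimal sketch of the "try to download it" check, assuming the extracted links are already absolute URLs. The names `LinkProbe` and `CanDownload` are made up for illustration; `WebClient.DownloadData` is the same call that fetches the page itself, and it throws a `WebException` when the server answers with an error status or cannot be reached.

```csharp
using System;
using System.Net;

public static class LinkProbe
{
    // Hypothetical helper: returns true when the resource at 'url' can be
    // downloaded, false when the request fails (404, 500, DNS error, ...).
    // Assumes 'url' is an absolute address.
    public static bool CanDownload(string url)
    {
        try
        {
            using (var client = new WebClient())
            {
                // The content is discarded; only success or failure matters.
                client.DownloadData(url);
            }
            return true;
        }
        catch (WebException)
        {
            // WebClient reports HTTP error statuses and network failures
            // through WebException.
            return false;
        }
    }
}
```

Calling `LinkProbe.CanDownload(link)` for each extracted link gives a simple alive/dead answer. Downloading the full body of every link is slow, so this is only a starting point.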

regards
W.
Feb 11 '08 #2
kenobewan
4,871 Recognized Expert Specialist
I find this question interesting because a link isn't really broken until you click it and get a 404 error. Also, a link that works locally may be broken on the web. Third-party tools may help, but my assumption is that this may be a case of "physician, heal thyself".

A site map may also help, but you would need to customize it to cover all dynamic links. Here is a great resource on TDD, applied mainly to software, but if web programming were done this way...
My Uber-Test-Driven Development ("TDD") Links Listing
Feb 11 '08 #3
Plater
7,872 Recognized Expert Expert
W3.org offers testing for broken links.
All they do is check to see if a good status is returned when they attempt to navigate to the address inside an anchor tag.
Feb 11 '08 #4
sristhrashguy
3 New Member
Hi all,

I'm storing a set of "<a href" tags in a list. Now I want to check whether all these links are valid, i.e. whether the list contains any broken links.

This is the code:

The code lets the user enter a URL, downloads the HTML of that page,
and displays only the href tags. Now I want to check the validity of all these links
and display the result, i.e. whether any of the links are dead.

// Requires: using System; using System.Collections; using System.Net; using System.Text;

// Download the raw bytes of the page whose URL was typed into the text box.
WebClient objWebClient = new WebClient();
byte[] aRequestHTML = objWebClient.DownloadData(TextBox1.Text);

// Decode the bytes as UTF-8 (assumes the page is UTF-8 encoded).
string myString = new UTF8Encoding().GetString(aRequestHTML);

// Collect every "<a href=..." tag fragment, skipping javascript links.
ArrayList list = new ArrayList();
int curindex = 0;
while (true)
{
    int index = myString.IndexOf("<a href=", curindex, StringComparison.OrdinalIgnoreCase);
    if (index == -1) { break; }
    int end = myString.IndexOf(">", index);
    if (end == -1) { break; }   // unterminated tag: stop instead of throwing
    string ancordata = myString.Substring(index, end - index);
    if (ancordata.ToLower().IndexOf("javascript") < 0)
    {
        list.Add(ancordata);
    }
    curindex = end;             // continue searching after this tag
}

// Bind the collected fragments to the grid.
GridView1.DataSource = list;
GridView1.DataBind();
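
The list holds raw tag fragments, and many href values on a real page are relative, so each one has to be turned into an absolute address before it can be requested. A rough sketch of that step, assuming the href value has already been cut out of the fragment. `LinkResolver` and `TryResolve` are hypothetical names, not part of the code above.

```csharp
using System;

public static class LinkResolver
{
    // Hypothetical helper: turns an href value into an absolute http/https
    // Uri, resolving relative links against the address of the page itself.
    // Returns false for links that cannot be checked over HTTP
    // (empty values, "#fragment" links, mailto:, javascript:, ...).
    public static bool TryResolve(string pageUrl, string href, out Uri absolute)
    {
        absolute = null;
        if (string.IsNullOrWhiteSpace(href)) return false;

        href = href.Trim();
        if (href.StartsWith("#")) return false;   // same-page anchor

        Uri baseUri;
        if (!Uri.TryCreate(pageUrl, UriKind.Absolute, out baseUri)) return false;

        Uri candidate;
        if (!Uri.TryCreate(baseUri, href, out candidate)) return false;

        // Only http and https links can be tested with a web request.
        if (candidate.Scheme != Uri.UriSchemeHttp &&
            candidate.Scheme != Uri.UriSchemeHttps)
        {
            return false;
        }

        absolute = candidate;
        return true;
    }
}
```

For a page at http://example.com/docs/index.html, the href "../about.html" resolves to http://example.com/about.html, while a mailto: link is skipped.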
Feb 13 '08 #5
sristhrashguy
3 New Member
You can download the page whose address was entered into the text box.
Try using the WebClient class.
Now you have the HTML source of the web page.
Then find all the links in the page, <a href="**"></a> (maybe there is a method in WebClient, otherwise use a regex),
extract the ** and try to download that address as you did before. If it succeeds the link is alive, otherwise it is dead.

This is a guideline you can follow. Try it, and if you run into trouble don't hesitate to ask for more detailed info.

regards
W.
Hi, thanks for your reply.

I am halfway across the ocean.
Now I have the full list of "<a href" tags in an ArrayList.
I'm totally stuck here. Please help me.
Feb 13 '08 #6
Plater
7,872 Recognized Expert Expert
Try using an HttpWebRequest to see whether each link returns a good status.
Feb 14 '08 #7
jallred
6 New Member
Try using an HttpWebRequest to see whether each link returns a good status.
Spot on. For each URL, use the code at http://msdn2.microsoft.com/en-us/library/system.net.httpwebrequest.getresponse.aspx and check the status code of the response. 200 is OK, while 404 is the typical broken link. Other status codes will require some judgement.
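
A sketch of that check, assuming each link is already an absolute http or https `Uri`. The names `LinkState`, `LinkChecker`, `Classify` and `Check` are invented for illustration. Sending a HEAD request, using a 10 second timeout, and counting an unreachable server as broken are assumptions of this sketch, not something the MSDN sample prescribes. Newer .NET versions mark `WebRequest.Create` as obsolete, but it is the API current for this thread.

```csharp
using System;
using System.Net;

public enum LinkState { Ok, Broken, NeedsReview }

public static class LinkChecker
{
    // 200 is OK and 404 is broken; every other status is left for a
    // human to judge.
    public static LinkState Classify(HttpStatusCode status)
    {
        if (status == HttpStatusCode.OK) return LinkState.Ok;
        if (status == HttpStatusCode.NotFound) return LinkState.Broken;
        return LinkState.NeedsReview;
    }

    // Assumes 'url' uses the http or https scheme.
    public static LinkState Check(Uri url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "HEAD";   // assumption: headers are enough, skip the body
        request.Timeout = 10000;   // assumption: 10 seconds, in milliseconds

        try
        {
            using (var response = (HttpWebResponse)request.GetResponse())
            {
                return Classify(response.StatusCode);
            }
        }
        catch (WebException ex)
        {
            // Error statuses such as 404 arrive as exceptions that still
            // carry the server's response.
            var response = ex.Response as HttpWebResponse;
            if (response == null)
            {
                // No HTTP response at all (DNS failure, timeout, refused
                // connection). Treated as broken here, which is an assumption.
                return LinkState.Broken;
            }
            using (response)
            {
                return Classify(response.StatusCode);
            }
        }
    }
}
```

Some servers reject HEAD requests, so a `NeedsReview` result may be worth retrying with GET before reporting it.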

John
http://blogs.msdn.com/usisvde/
Feb 18 '08 #8
wimpos
19 New Member
Tip:
It might not be that important, but I still recommend using a regular expression to search for the <a href=""> tags.

It's cleaner, faster, and more reliable.
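
A possible version of that regular expression, as a sketch rather than a full HTML parser: it handles double-quoted, single-quoted and unquoted href values on <a> tags and skips javascript: links, but it will not cope with every malformed page. `HrefExtractor` and `Extract` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class HrefExtractor
{
    // Matches an <a ...> tag and captures the value of its href attribute,
    // whether it is double-quoted, single-quoted or unquoted.
    private static readonly Regex AnchorHref = new Regex(
        @"<a\s+(?:[^>]*?\s+)?href\s*=\s*(?:""(?<url>[^""]*)""|'(?<url>[^']*)'|(?<url>[^\s>""']+))",
        RegexOptions.IgnoreCase | RegexOptions.Compiled);

    // Returns the href values found in 'html', in document order,
    // leaving out empty values and javascript: links.
    public static List<string> Extract(string html)
    {
        var urls = new List<string>();
        foreach (Match match in AnchorHref.Matches(html))
        {
            string url = match.Groups["url"].Value.Trim();
            if (url.Length > 0 &&
                !url.StartsWith("javascript:", StringComparison.OrdinalIgnoreCase))
            {
                urls.Add(url);
            }
        }
        return urls;
    }
}
```

Unlike a search for the literal text "<a href=", this also finds tags where another attribute comes before href, and it returns the bare URL instead of the whole tag fragment.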

regards
Feb 20 '08 #9
