A broken link preventer

I have a tool which tells me the number of times that visitors attempt
to access a link from my site to an external site and what the response
code received was. In the event of the remote site returning an error
code, they are not sent to the remote site - why bother, it wouldn't
work!
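
A minimal sketch of that idea in Python, not the tool's actual code: check the external URL first and only forward the visitor when the remote site did not answer with an error. The function names, the 5 second timeout and the use of a HEAD request are all assumptions made for illustration.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=5.0):
    """Return the HTTP status code for url, or None if the host is unreachable."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx and 5xx responses are raised as exceptions; the code is on the exception.
        return err.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout and similar.
        return None


def should_forward(status):
    """Forward the visitor only when the remote site answered without an error."""
    return status is not None and status < 400


def handle_click(url, fetch=fetch_status):
    """Decide what to do with a click on an external link."""
    status = fetch(url)
    action = "forward" if should_forward(status) else "show_alternatives"
    return {"url": url, "status": status, "action": action}
```

Passing the fetch function in as a parameter keeps the decision logic testable without network access. Some servers reject HEAD requests, so retrying with GET would be a reasonable refinement.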

Since I have over 1000 external links, this allows me to locate the
broken links that people see the most often and fix those first.
Conventional link checkers offer a complementary service: they detect
which links are broken rather than how often visitors run into them.
The output from the program can generate reports based on time, link
accessed, page on my site where the link occurred and so on.
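
As a rough illustration of the "most often seen first" report, here is a short Python sketch. It assumes each logged click is a (source page, target URL, status) tuple; that layout and the sample data are hypothetical and not taken from the tool itself.

```python
from collections import Counter


def rank_broken_links(log_entries):
    """Return (url, hits) pairs for failing links, most frequently seen first.

    Each entry is a (source_page, target_url, status) tuple; status is None
    when the remote host could not be reached at all.
    """
    hits = Counter(
        target
        for _source, target, status in log_entries
        if status is None or status >= 400
    )
    return hits.most_common()


# Hypothetical log data, for illustration only.
sample_log = [
    ("/page_a.html", "http://example.com/gone", 404),
    ("/page_b.html", "http://example.com/gone", 404),
    ("/page_a.html", "http://example.org/ok", 200),
    ("/page_c.html", "http://example.net/down", None),
]

print(rank_broken_links(sample_log))
```

Grouping on the source page or on a timestamp instead of the target URL would give the other report types in the same way.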

This means that on my site, I now have much better control over what
happens if the visitor would see a 404 on an external link and I can
offer them more options.

Try it out here
http://www.siliconglen.com/Scotland/2_2.html

Whilst accepting that broken links are a generally bad thing, this tool
at least helps me to manage them more effectively.

Comments and feedback welcome. This is an early release so there may be
bugs but I hope not :-)

--
Craig Cockburn ("coburn"). http://www.SiliconGlen.com/
Home to the first online guide to Scotland, founded 1994.
Scottish FAQ, weddings, website design, stop spam and more!
Dec 6 '05 #1
Craig,

Seems to work here and the suggestions provided to the user are helpful.

1. Rather than depend on UCSD, I'd suggest you provide your own
explanation of the error, showing only the one appropriate to the
immediate situation.

2. I used Netscape 7.1. When I see a list of links like those in your
example, I tend to keep the page with the list open in one tab, then
right click on each link I'm interested in and select "Open in new tab"
from the resulting popup menu. But something in your code prevents that
option (and several others) from appearing in the popup.

Chris Beall

Dec 6 '05 #2
Krustov wrote:
TMK, if a website uses custom 404 pages then it won't show up as a broken
link.


Sometimes yes. But well configured web servers return 404 headers even
when displaying a custom 404 page. There are, of course, many badly
configured web servers out there.
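
One way a checker might cope with the badly configured ones, sketched in Python under stated assumptions: ask the same host for a page that cannot exist, and distrust any success reported for the real link if the nonsense page also succeeds. The function names are hypothetical, and `fetch` stands for any callable that maps a URL to a status code (or None when the host is unreachable).

```python
import uuid
from urllib.parse import urlsplit, urlunsplit


def probe_url_for(url):
    """Build a URL on the same host that almost certainly does not exist."""
    parts = urlsplit(url)
    bogus_path = "/" + uuid.uuid4().hex + ".html"
    return urlunsplit((parts.scheme, parts.netloc, bogus_path, "", ""))


def link_looks_alive(url, fetch):
    """Return True, False, or None when the answer cannot be trusted."""
    status = fetch(url)
    if status is None or status >= 400:
        return False
    probe_status = fetch(probe_url_for(url))
    if probe_status is not None and probe_status < 400:
        # The host reports success even for a page that cannot exist,
        # so a success for the real link proves nothing.
        return None
    return True
```

A None result only flags the link for a manual look; it does not say the link is broken.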

Steve

Dec 6 '05 #3
"Steve Pugh" wrote:
Sometimes yes. But well configured web servers return 404 headers even
when displaying a custom 404 page. There are, of course, many badly
configured web servers out there.


I think the most common mistake is to use a fully qualified URL in the
ErrorDocument directive. For example:

ErrorDocument 404 http://example.com/error-docs/not_found.html

will cause the server to issue a 302 redirect header to the error page when
it can't find the requested document. The error page will then be served with
a "200 OK" header.

It's all explained in the Apache documentation.
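
To show what this looks like from a link checker's side, here is a small Python sketch that uses simulated responses rather than a real server. The `trace_statuses` helper, the `fetch_once` callable and the example URLs are all hypothetical, and the redirect is modelled as a 302. For comparison, the same directive with a local path, `ErrorDocument 404 /error-docs/not_found.html`, lets the original 404 reach the client.

```python
from urllib.parse import urljoin

REDIRECT_CODES = (301, 302, 303, 307, 308)


def trace_statuses(url, fetch_once, max_hops=5):
    """Follow redirects by hand and return every status code seen.

    fetch_once is an assumed helper: it takes a URL and returns a
    (status, location) pair without following redirects itself.
    """
    statuses = []
    for _ in range(max_hops):
        status, location = fetch_once(url)
        statuses.append(status)
        if status not in REDIRECT_CODES or not location:
            break
        url = urljoin(url, location)
    return statuses


# Simulated server with a full URL in ErrorDocument: the 404 is never seen.
misconfigured = {
    "http://example.com/missing.html": (302, "http://example.com/error-docs/not_found.html"),
    "http://example.com/error-docs/not_found.html": (200, None),
}

# Simulated server with a local path in ErrorDocument: the 404 comes through.
well_configured = {
    "http://example.com/missing.html": (404, None),
}

print(trace_statuses("http://example.com/missing.html", lambda u: misconfigured[u]))
print(trace_statuses("http://example.com/missing.html", lambda u: well_configured[u]))
```

From status codes alone, the first case cannot be told apart from a page that has genuinely moved, which is why the misconfiguration defeats link checkers.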

--
phil [dot] ronan @ virgin [dot] net
http://vzone.virgin.net/phil.ronan/

Dec 6 '05 #4
On Tue, 06 Dec 2005 14:15:50 GMT, Philip Ronan
<in*****@invalid.invalid> wrote:
It's all explained in the Apache documentation.


Explained? That's an interesting term to use with regard to the Apache
documentation! I find the Apache documentation to be slightly less
intelligible than if it were written in Ancient Greek.

And as this is being widely cross-posted, perhaps a challenge could go
out for another technical author - one who can decipher the Apache
documentation - to produce a version which can be widely understood.

Matt
--
The Probert Encyclopaedia - Beyond Britannica
http://www.probertencyclopaedia.com
Dec 6 '05 #5
"Matt Probert" wrote:
Explained? That's an interesting term to use with regard to the Apache
documentation! I find the Apache documentation to be slightly less
intelligible than if it were written in Ancient Greek.


This seems perfectly clear to me:
Note that when you specify an ErrorDocument that points to a remote URL
(ie. anything with a method such as "http" in front of it), Apache will
send a redirect to the client to tell it where to find the document,
even if the document ends up being on the same server. This has several
implications, the most important being that the client will not receive
the original error status code, but instead will receive a redirect
status code. This in turn can confuse web robots and other clients
which try to determine if a URL is valid using the status code.

<http://httpd.apache.org/docs/1.3/mod/core.html#errordocument>

Where's the problem?

--
phil [dot] ronan @ virgin [dot] net
http://vzone.virgin.net/phil.ronan/

Dec 6 '05 #6
Matt Probert wrote:
Explained? That's an interesting term to use with regard to the Apache
documentation! I find the Apache documentation to be slightly less
intelligible than if it were written in Ancient Greek.


OK, here's a simple challenge. Find another complex product with
documentation that's more readable than Apache's, while not being
misleading or downright wrong.

--
Nick Kew
Dec 6 '05 #7
Nick Kew wrote:
OK, here's a simple challenge. Find another complex product with
documentation that's more readable than Apache's, while not being
misleading or downright wrong.


MySQL
Microsoft's Visual Studio products
AutoCad
Websphere
Exim

To start.

Apache's documentation is some of the worst I've ever seen.

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
js*******@attglobal.net
==================
Dec 7 '05 #8
Jerry Stuckle wrote:

MySQL

Hmmm, that's a very readable manual, too.

Microsoft's Visual Studio products

You must be joking! Where do you find anything that isn't just a
long-winded explanation of how to use GUI menus? It certainly
never told me anything that wasn't bleedin' obvious.

Unlike back in the 1980s, when a Microsoft manual was somewhat
helpful in learning C.

AutoCad

Never used it.

Websphere

Put off even looking by the webpages and ambiguous license
(not sure if that's changed since IBM started to get more
serious about open source).

Exim

Well, I chose postfix in preference when I last changed MTA,
and find postfix's documentation much harder than Apache's -
though nevertheless adequately workable.

To start.

Apache's documentation is some of the worst I've ever seen.


How so? Instead of whinging, how about some constructive criticism
that might offer some ideas for improving it?

--
Nick Kew
Dec 7 '05 #9
Nick Kew wrote:
How so? Instead of whinging, how about some constructive criticism
that might offer some ideas for improving it?


Let's see...

More examples on how to do things. More information on how different
commands interrelate. How to effectively use .htaccess (or place those
commands in your httpd.conf file if you have access to it).

And how about some developer documentation? There isn't anything other
than an old Apache 1.x book mainly written for Perl with C as a second
thought.

If the documentation is so good, why are there so many messages on
usenet by people trying to figure out how to do things?

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
js*******@attglobal.net
==================
Dec 7 '05 #10
