Bad links looking like good ones, why?: Link Checkers

Hi;

I had a big link-checking job to do, and it has been years since I have
done anything like that, so I found a test page that I knew had bad
links on it (a friend's site) and decided to try out the various free
services.

I tried about 5 different link checkers on the test page,
including Xenu and NetMechanic. I got 5 sets of identical results.

All of these link checkers reported a bad link as good. It is the
first link on this sample of the test page I used. The link is called
"Mini Pigs":

http://beforewisdom.com/dcorgs.html

My question is: what would cause a link checker to report a bad link as
good, and aside from manual checks, is there any way around this problem?

It looks like the link is being redirected to a porn site and then
redirected to a 404 page. Do redirects bollocks up link checkers?

I also used a link checker extension for Firefox that deals with
"forwarded or forbidden" links, which also reported the aforementioned
link as good.

I am curious :).

Aug 20 '06 #1
Steve wrote:
> All of these link checkers reported a bad link as good. It is the
> first link on this sample of the test page I used. The link is called
> "Mini Pigs":
>
> http://beforewisdom.com/dcorgs.html
>
> My question is: what would cause a link checker to report a bad link as
> good, and aside from manual checks, is there any way around this problem?
>
> It looks like the link is being redirected to a porn site and then
> redirected to a 404 page. Do redirects bollocks up link checkers?

When I tried it, that link went to a page claiming to have "DC
Vegetarian Organizations", which in turn had two links, one of which was
"404 Not Found". Not a very useful page, but it was an actual Web page,
so link checkers properly reported it as a good link. No redirects (to
porn or otherwise) were present when I looked.

--
== Dan ==
Dan's Mail Format Site: http://mailformat.dan.info/
Dan's Web Tips: http://webtips.dan.info/
Dan's Domain Site: http://domains.dan.info/
Aug 20 '06 #2
On Sun, 20 Aug 2006 11:33:59 -0400, "Daniel R. Tobias" <da*@tobias.name>
wrote:

> Steve wrote:
>> All of these link checkers reported a bad link as good. It is the
>> first link on this sample of the test page I used. The link is called
>> "Mini Pigs":
>>
>> http://beforewisdom.com/dcorgs.html
>>
>> My question is: what would cause a link checker to report a bad link as
>> good, and aside from manual checks, is there any way around this problem?
>>
>> It looks like the link is being redirected to a porn site and then
>> redirected to a 404 page. Do redirects bollocks up link checkers?
>
> When I tried it, that link went to a page claiming to have "DC
> Vegetarian Organizations", which in turn had two links, one of which was
> "404 Not Found". Not a very useful page, but it was an actual Web page,
> so link checkers properly reported it as a good link. No redirects (to
> porn or otherwise) were present when I looked.

If you look at the link history in Opera, there is indeed a page there
with a title suggesting porn. All a bit odd but, as you say, there
is a page there, so the link checkers are correct.

--
Stephen Poley

http://www.xs4all.nl/~sbpoley/webmatters/
Aug 20 '06 #3

Daniel R. Tobias wrote:
> Steve wrote:
>> All of these link checkers reported a bad link as good. It is the
>> first link on this sample of the test page I used. The link is called
>> "Mini Pigs":
>>
>> http://beforewisdom.com/dcorgs.html
>>
>> My question is: what would cause a link checker to report a bad link as
>> good, and aside from manual checks, is there any way around this problem?
>>
>> It looks like the link is being redirected to a porn site and then
>> redirected to a 404 page. Do redirects bollocks up link checkers?
>
> When I tried it, that link went to a page claiming to have "DC
> Vegetarian Organizations", which in turn had two links, one of which was
> "404 Not Found". Not a very useful page, but it was an actual Web page,
> so link checkers properly reported it as a good link. No redirects (to
> porn or otherwise) were present when I looked.

It is not a real page. I pared it down from a real page, figuring it
would be easier to tell people to go to the "mini pigs" link at the top
of a page with only 3 links.

Aug 20 '06 #4
On 20 Aug 2006 09:18:07 -0700, "Steve" <st**********@yahoo.com> wrote:

> It is not a real page. I pared it down from a real page, figuring it
> would be easier to tell people to go to the "mini pigs" link at the top
> of a page with only 3 links.

It is most certainly a real page. The source starts as follows (I'll
spare you the rest):

<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<title>Arabic sex story in arabic in real sex action.</title>
</head>
<SCRIPT language=JavaScript src="/menu2.js"></SCRIPT>
....
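
The title in that source is what gives the page away, and it is something a script can look at. As a minimal sketch (not a feature of any checker named in this thread), the Python below pulls the <title> out of fetched HTML and tests whether it mentions any word you expect for that link. The names TitleParser, extract_title and title_matches, and the idea of keeping a list of expected words per link, are assumptions made for illustration.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text found inside the <title> element."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_title(html_text):
    """Return the page title with surrounding whitespace removed."""
    parser = TitleParser()
    parser.feed(html_text)
    return parser.title.strip()


def title_matches(html_text, expected_words):
    """True if any expected word appears in the title (case-insensitive)."""
    title = extract_title(html_text).lower()
    return any(word.lower() in title for word in expected_words)
```

A title that contains none of the words you expect is only a hint that the target has changed hands; a person still has to decide whether the link is "bad".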
--
Stephen Poley

http://www.xs4all.nl/~sbpoley/webmatters/
Aug 20 '06 #5
In article <11*********************@b28g2000cwb.googlegroups.com>,
"Steve" <st**********@yahoo.com> wrote:

> Hi;
>
> I had a big link-checking job to do, and it has been years since I have
> done anything like that, so I found a test page that I knew had bad
> links on it (a friend's site) and decided to try out the various free
> services.
>
> I tried about 5 different link checkers on the test page,
> including Xenu and NetMechanic. I got 5 sets of identical results.
>
> All of these link checkers reported a bad link as good. It is the
> first link on this sample of the test page I used. The link is called
> "Mini Pigs":
>
> http://beforewisdom.com/dcorgs.html
>
> My question is: what would cause a link checker to report a bad link as
> good, and aside from manual checks, is there any way around this problem?
>
> It looks like the link is being redirected to a porn site and then
> redirected to a 404 page. Do redirects bollocks up link checkers?

Steve,
Your definition of "good" and "bad" for links is different from that of
the average link checker. All of the link checkers that I'm aware of
merely test that a GET or HEAD request to the destination of a link
generates a successful response. It'd be a pretty special link checker
if it evaluated the content of the link target. How is the link checker
supposed to know what you consider "good" content? For all it knows,
maybe you *want* to link to a site featuring live nude zebras. To each
his own.
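
The check described above can be sketched in a few lines of Python using only the standard library. This is an illustration of the general idea, not the code of any checker mentioned in the thread; the function names fetch_final_status and is_successful are made up for the example, and the 10-second timeout is an arbitrary choice.

```python
import urllib.error
import urllib.request


def fetch_final_status(url, timeout=10.0):
    """Issue a HEAD request and return the status code of the final response.

    urllib follows redirects by default, so a chain of redirects that ends
    on an ordinary page comes back as 200. Network failures such as DNS
    errors are not handled here and propagate as urllib.error.URLError.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx and 5xx responses are raised as exceptions; the code is on the error.
        return err.code


def is_successful(status):
    """A basic checker's whole notion of a 'good' link: any 2xx status."""
    return 200 <= status < 300
```

Calling is_successful(fetch_final_status(url)) says nothing about what the page contains, which is why a hijacked or repurposed page still passes.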

My site crawler (see my sig) primarily performs HTML validation but it
also does link checks. As you can see on this page, I'm very specific
about which response codes I consider "bad" and "good":
http://nikitathespider.com/reports/sample/HotLinks.html

Note that 301 (Moved Permanently) is on the "bad" list but 302 (Moved
Temporarily) is not. That's a judgment call on my part; other link
checkers may have a different opinion.
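
That judgment call can be written down as a small classification function. In the Python sketch below, only the 301 and 302 rules come from the post; treating every 2xx as fine and everything else as needing attention is an assumption, and the real list is whatever the sample report linked above shows. The name classify_status and the labels "hot" and "ok" are illustrative.

```python
def classify_status(status):
    """Return "hot" for responses worth a human look, "ok" otherwise."""
    if status == 301:
        return "hot"  # Moved Permanently: on the "bad" list, per the post
    if status == 302:
        return "ok"   # Moved Temporarily: not on the "bad" list, per the post
    if 200 <= status < 300:
        return "ok"   # assumption: any 2xx response is fine
    return "hot"      # assumption: anything else deserves attention
```

Keeping the policy in one function makes it easy to change your mind later, for example to start flagging 302 as well.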

HTH

--
Philip
http://NikitaTheSpider.com/
Whole-site HTML validation, link checking and more
Aug 20 '06 #6
Nikita the Spider wrote:
> Steve,
> Your definition of "good" and "bad" for links is different from that of
> the average link checker. All of the link checkers that I'm aware of
> merely test that a GET or HEAD request to the destination of a link
> generates a successful response. It'd be a pretty special link checker
> if it evaluated the content of the link target. How is the link checker
> supposed to know what you consider "good" content? For all it knows,
> maybe you *want* to link to a site featuring live nude zebras. To each
> his own.

Yep. But what a link checker can do (and what Valet does) is to detect
updated links, so it alerts you if the target of a link has been updated
since you last checked. It's up to you whether you use that information
to check that the link is still what you thought it was.
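
One way to detect that a link target has changed between runs is to store a fingerprint per URL and compare it on the next run. The Python below is only a sketch of that idea; how Valet actually does it is not stated in this thread. Preferring ETag, then Last-Modified, then a SHA-256 hash of the body is an assumption, as are the names fingerprint and changed_links.

```python
import hashlib


def fingerprint(headers, body):
    """Build a comparable fingerprint from response headers and body bytes.

    Assumed order of preference: ETag, then Last-Modified, then a hash of
    the body when the server sends neither header.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    if lowered.get("etag"):
        return "etag:" + lowered["etag"]
    if lowered.get("last-modified"):
        return "modified:" + lowered["last-modified"]
    return "sha256:" + hashlib.sha256(body).hexdigest()


def changed_links(previous, current):
    """Return the URLs whose fingerprint differs from the previous run.

    Both arguments map URL -> fingerprint. URLs that are new in this run
    are not reported, since there is nothing to compare them against.
    """
    return sorted(
        url
        for url, value in current.items()
        if url in previous and previous[url] != value
    )
```

Pages that embed dates or rotating ads will change their body hash on every run, so the hash fallback is the noisiest of the three signals.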

> Note that 301 (Moved Permanently) is on the "bad" list but 302 (Moved
> Temporarily) is not. That's a judgment call on my part; other link
> checkers may have a different opinion.

I'd say that's an oversimplistic approach. Redirects are not
inherently either good or bad. Valet flags them as redirects,
with the suggestion that you might want to update them.
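
Flagging redirects rather than silently following them means recording every hop. The Python sketch below illustrates that under stated assumptions and is not Valet's implementation: trace_redirects takes a fetch callable so the logic can be exercised without a network, and http_fetch is one possible fetcher built on http.client, which does not follow redirects on its own. Both names and the 10-hop limit are hypothetical.

```python
import http.client
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = {301, 302, 303, 307, 308}


def http_fetch(url, timeout=10.0):
    """Send one HEAD request and return (status, Location header or None)."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def trace_redirects(url, fetch=http_fetch, max_hops=10):
    """Follow redirects one hop at a time and return [(url, status), ...]."""
    hops = []
    seen = set()
    while len(hops) < max_hops:
        if url in seen:
            break  # redirect loop: stop rather than spin
        seen.add(url)
        status, location = fetch(url)
        hops.append((url, status))
        if status in REDIRECT_CODES and location:
            url = urljoin(url, location)  # Location may be relative
        else:
            break
    return hops
```

A result with more than one entry is a redirected link, and the last entry shows where the chain actually ends, whether that is a 200 page or a 404.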

--
Nick Kew
Aug 21 '06 #7
Nick Kew wrote:
> Nikita the Spider wrote:
>> Steve,
>> Your definition of "good" and "bad" for links is different from that
>> of the average link checker. All of the link checkers that I'm aware
>> of merely test that a GET or HEAD request to the destination of a
>> link generates a successful response. It'd be a pretty special link
>> checker if it evaluated the content of the link target. How is the
>> link checker supposed to know what you consider "good" content? For
>> all it knows, maybe you *want* to link to a site featuring live nude
>> zebras. To each his own.
>
> Yep. But what a link checker can do (and what Valet does) is to
> detect updated links, so it alerts you if the target of a link has
> been updated since you last checked. It's up to you whether you use
> that information to check that the link is still what you thought it
> was.
>
>> Note that 301 (Moved Permanently) is on the "bad" list but 302 (Moved
>> Temporarily) is not. That's a judgment call on my part; other link
>> checkers may have a different opinion.
>
> I'd say that's an oversimplistic approach. Redirects are not
> inherently either good or bad. Valet flags them as redirects,
> with the suggestion that you might want to update them.

Well, that's exactly the case here: if you test your link for
validity and get an answer that it has permanently moved, the link
is no longer valid and should be updated. A "temporarily moved" answer
means your specific link is still valid.

Grtz,
--
Rik Wasmus
Aug 21 '06 #8
In article <bc************@asgard.webthing.com>,
Nick Kew <ni**@asgard.webthing.com> wrote:

> Nikita the Spider wrote:
>> Steve,
>> Your definition of "good" and "bad" for links is different from that of
>> the average link checker. All of the link checkers that I'm aware of
>> merely test that a GET or HEAD request to the destination of a link
>> generates a successful response. It'd be a pretty special link checker
>> if it evaluated the content of the link target. How is the link checker
>> supposed to know what you consider "good" content? For all it knows,
>> maybe you *want* to link to a site featuring live nude zebras. To each
>> his own.
>
> Yep. But what a link checker can do (and what Valet does) is to detect
> updated links, so it alerts you if the target of a link has been updated
> since you last checked. It's up to you whether you use that information
> to check that the link is still what you thought it was.

That's a nice feature, although I think it would be noisy for some sites.

>> Note that 301 (Moved Permanently) is on the "bad" list but 302 (Moved
>> Temporarily) is not. That's a judgment call on my part; other link
>> checkers may have a different opinion.
>
> I'd say that's an oversimplistic approach. Redirects are not
> inherently either good or bad. Valet flags them as redirects,
> with the suggestion that you might want to update them.

I agree. I don't use the good/bad terminology myself, I was just trying
to stay consistent in the context of this posting. I just call them
"hot" links which "are the links on your site that are most likely to
need attention". One gets the information one needs to decide whether or
not the link is "broken" by one's own standards.

--
Philip
http://NikitaTheSpider.com/
Whole-site HTML validation, link checking and more
Aug 21 '06 #9
