Bad links looking like good ones, why?: Link Checkers

Hi;

I had a big link-checking job to do, and it has been years since I have
done anything like that, so I found a test page that I knew had bad
links on it (a friend's site) and decided to try out the various free
services.

I tried about 5 different link checkers on the test page,
including Xenu and NetMechanic. I got 5 sets of identical results.

All of these link checkers reported a bad link as good. It is the
first link on this sample of the test page I used. The link is called
"Mini Pigs":

http://beforewisdom.com/dcorgs.html

My question is: what would cause a link checker to report a bad link as
good, and aside from manual checks, is there any way around this problem?

It looks like the link is being redirected to a porn site and then
redirected to a 404 page. Do redirects bollocks up link checkers?

I also used a link checker extension for Firefox that deals with
"forwarded or forbidden" links, which also reported the aforementioned
link as good.

I am curious :).

Aug 20 '06 #1
Steve wrote:
> All of these link checkers reported a bad link as good. It is the
> first link on this sample of the test page I used. The link is called
> "Mini Pigs":
>
> http://beforewisdom.com/dcorgs.html
>
> My question is, what would cause a link checker to report a bad link as
> good and aside from manual checks is there any way around this problem?
>
> It looks like the link is being redirected to a porn site and then
> redirected to a 404 page. Do redirects bollocks up link checkers?

When I tried it, that link went to a page claiming to have "DC
Vegetarian Organizations", which in turn had two links, one of which was
"404 Not Found". Not a very useful page, but it was an actual Web page,
so link checkers properly reported it as a good link. No redirects (to
porn or otherwise) were present when I looked.

--
== Dan ==
Dan's Mail Format Site: http://mailformat.dan.info/
Dan's Web Tips: http://webtips.dan.info/
Dan's Domain Site: http://domains.dan.info/
Aug 20 '06 #2
On Sun, 20 Aug 2006 11:33:59 -0400, "Daniel R. Tobias" <da*@tobias.nam e>
wrote:
> Steve wrote:
>> All of these link checkers reported a bad link as good. It is the
>> first link on this sample of the test page I used. The link is called
>> "Mini Pigs":
>>
>> http://beforewisdom.com/dcorgs.html
>>
>> My question is, what would cause a link checker to report a bad link as
>> good and aside from manual checks is there any way around this problem?
>>
>> It looks like the link is being redirected to a porn site and then
>> redirected to a 404 page. Do redirects bollocks up link checkers?
>
> When I tried it, that link went to a page claiming to have "DC
> Vegetarian Organizations", which in turn had two links, one of which was
> "404 Not Found". Not a very useful page, but it was an actual Web page,
> so link checkers properly reported it as a good link. No redirects (to
> porn or otherwise) were present when I looked.

If you look at the link history in Opera, there is indeed a page there
with a title suggesting porn. All a bit odd but, as you say, there
is a page there so the link checkers are correct.

--
Stephen Poley

http://www.xs4all.nl/~sbpoley/webmatters/
Aug 20 '06 #3

Daniel R. Tobias wrote:
> Steve wrote:
>> All of these link checkers reported a bad link as good. It is the
>> first link on this sample of the test page I used. The link is called
>> "Mini Pigs":
>>
>> http://beforewisdom.com/dcorgs.html
>>
>> My question is, what would cause a link checker to report a bad link as
>> good and aside from manual checks is there any way around this problem?
>>
>> It looks like the link is being redirected to a porn site and then
>> redirected to a 404 page. Do redirects bollocks up link checkers?
>
> When I tried it, that link went to a page claiming to have "DC
> Vegetarian Organizations", which in turn had two links, one of which was
> "404 Not Found". Not a very useful page, but it was an actual Web page,
> so link checkers properly reported it as a good link. No redirects (to
> porn or otherwise) were present when I looked.

It is not a real page. I pared it down from a real page, figuring it
would be easier to tell people to go to the "Mini Pigs" link at the top
of a page with only 3 links.

Aug 20 '06 #4
On 20 Aug 2006 09:18:07 -0700, "Steve" <st**********@yahoo.com> wrote:

> It is not a real page. I pared it down from a real page, figuring it
> would be easier to tell people to go to the "Mini Pigs" link at the top
> of a page with only 3 links.

It is most certainly a real page. The source starts as follows (I'll
spare you the rest):

<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<title>Arabic sex story in arabic in real sex action.</title>
</head>
<SCRIPT language=JavaScript src="/menu2.js"></SCRIPT>
....
--
Stephen Poley

http://www.xs4all.nl/~sbpoley/webmatters/
Aug 20 '06 #5
In article <11*********************@b28g2000cwb.googlegroups.com>,
"Steve" <st**********@yahoo.com> wrote:
> Hi;
>
> I had a big link-checking job to do, and it has been years since I have
> done anything like that, so I found a test page that I knew had bad
> links on it (a friend's site) and decided to try out the various free
> services.
>
> I tried about 5 different link checkers on the test page,
> including Xenu and NetMechanic. I got 5 sets of identical results.
>
> All of these link checkers reported a bad link as good. It is the
> first link on this sample of the test page I used. The link is called
> "Mini Pigs":
>
> http://beforewisdom.com/dcorgs.html
>
> My question is, what would cause a link checker to report a bad link as
> good and aside from manual checks is there any way around this problem?
>
> It looks like the link is being redirected to a porn site and then
> redirected to a 404 page. Do redirects bollocks up link checkers?

Steve,
Your definition of "good" and "bad" for links is different from that of
the average link checker. All of the link checkers that I'm aware of
merely test that a GET or HEAD request to the destination of a link
generates a successful response. It'd be a pretty special link checker
if it evaluated the content of the link target. How is the link checker
supposed to know what you consider "good" content? For all it knows,
maybe you *want* to link to a site featuring live nude zebras. To each
his own.
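
As a rough illustration of that kind of check, here is a minimal Python sketch. The names `fetch_status` and `link_is_good` are made up for this example and are not taken from any of the checkers mentioned in this thread; counting only 2xx as success and retrying with GET after a 405 or 501 are assumptions as well.

```python
import urllib.error
import urllib.request


def fetch_status(url, method="HEAD", timeout=10):
    """Return the final HTTP status for url (urllib follows redirects itself)."""
    request = urllib.request.Request(url, method=method)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # 4xx and 5xx responses are raised as exceptions; keep the code.
        return error.code


def link_is_good(url, fetch=fetch_status):
    """Treat a link as 'good' if HEAD, or failing that GET, yields a 2xx status."""
    try:
        status = fetch(url, "HEAD")
        if status in (405, 501):  # assumption: server does not support HEAD
            status = fetch(url, "GET")
    except OSError:
        # DNS failure, refused connection, timeout: there was no response at all.
        return False
    return 200 <= status < 300
```

Nothing here looks at what the page says. Since `urllib` follows redirects on its own, a link that bounces elsewhere and ends on any page served with a 200 status would still come back as good.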

My site crawler (see my sig) primarily performs HTML validation but it
also does link checks. As you can see on this page, I'm very specific
about which response codes I consider "bad" and "good":
http://nikitathespider.com/reports/sample/HotLinks.html

Note that 301 (Moved Permanently) is on the "bad" list but 302 (Moved
Temporarily) is not. That's a judgment call on my part; other link
checkers may have a different opinion.
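
To make the judgment call concrete, here is a small Python sketch of such a policy. It is not the crawler's actual code: the function name `classify`, the `flagged_redirects` parameter, and the rule that every 4xx/5xx status counts as "bad" are assumptions for illustration. Only the 301-versus-302 split comes from the description above.

```python
def classify(status, flagged_redirects=frozenset({301})):
    """Label a status code 'bad' or 'good' under a configurable policy.

    By default 301 is flagged and 302 is not, as described in the post.
    Treating all 4xx/5xx codes as bad is an assumption of this sketch.
    """
    if status >= 400 or status in flagged_redirects:
        return "bad"
    return "good"


# A checker with a different opinion only has to pass a different set.
strict = frozenset({301, 302})
print(classify(302), classify(302, flagged_redirects=strict))
```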

HTH

--
Philip
http://NikitaTheSpider.com/
Whole-site HTML validation, link checking and more
Aug 20 '06 #6
Nikita the Spider wrote:
> Steve,
> Your definition of "good" and "bad" for links is different from that of
> the average link checker. All of the link checkers that I'm aware of
> merely test that a GET or HEAD request to the destination of a link
> generates a successful response. It'd be a pretty special link checker
> if it evaluated the content of the link target. How is the link checker
> supposed to know what you consider "good" content? For all it knows,
> maybe you *want* to link to a site featuring live nude zebras. To each
> his own.

Yep. But what a link checker can do (and what Valet does) is to detect
updated links, so it alerts you if the target of a link has been updated
since you last checked. It's up to you whether you use that information
to check that the link is still what you thought it was.
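
How Valet does this internally is not covered here, so the following Python sketch is only one plausible way to detect that a link target has changed between runs. The function names, the choice of `ETag` and `Last-Modified` as preferred validators, and the SHA-256 fallback are all assumptions made for illustration.

```python
import hashlib


def fingerprint(headers, body):
    """Summarise a response so two runs can be compared.

    headers should be a mapping such as the case-insensitive message object
    that http.client and urllib return; a plain dict works if the keys are
    spelled exactly as below.
    """
    etag = headers.get("ETag")
    if etag:
        return "etag:" + etag
    modified = headers.get("Last-Modified")
    if modified:
        return "modified:" + modified
    # No validator from the server: fall back to hashing the content.
    return "sha256:" + hashlib.sha256(body).hexdigest()


def changed_links(previous, current):
    """Return the URLs seen in both runs whose fingerprint differs."""
    return sorted(
        url for url, mark in current.items()
        if url in previous and previous[url] != mark
    )
```

A changed fingerprint does not say the link went bad, only that the page is no longer what it was at the last check, so a person still has to look.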

> Note that 301 (Moved Permanently) is on the "bad" list but 302 (Moved
> Temporarily) is not. That's a judgment call on my part; other link
> checkers may have a different opinion.

I'd say that's an oversimplistic approach. Redirects are not
inherently either good or bad. Valet flags them as redirects,
with the suggestion that you might want to update them.

--
Nick Kew
Aug 21 '06 #7
Rik
Nick Kew wrote:
> Nikita the Spider wrote:
>> Steve,
>> Your definition of "good" and "bad" for links is different from that
>> of the average link checker. All of the link checkers that I'm aware
>> of merely test that a GET or HEAD request to the destination of a
>> link generates a successful response. It'd be a pretty special link
>> checker if it evaluated the content of the link target. How is the
>> link checker supposed to know what you consider "good" content? For
>> all it knows, maybe you *want* to link to a site featuring live nude
>> zebras. To each his own.
>
> Yep. But what a link checker can do (and what Valet does) is to
> detect updated links, so it alerts you if the target of a link has
> been updated since you last checked. It's up to you whether you use
> that information to check that the link is still what you thought it
> was.
>
>> Note that 301 (Moved Permanently) is on the "bad" list but 302 (Moved
>> Temporarily) is not. That's a judgment call on my part; other link
>> checkers may have a different opinion.
>
> I'd say that's an oversimplistic approach. Redirects are not
> inherently either good or bad. Valet flags them as redirects,
> with the suggestion that you might want to update them.

Well, that's exactly the case here: if you test your link for validity
and get the answer that it has moved permanently, the link is no longer
valid and should be updated. A temporary move means your specific link
is still valid.
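
That rule can be sketched in a few lines of Python. This is an illustration only: `first_response` and `advise` are hypothetical names, the grouping of status codes is an assumption (308 did not exist when this thread was written), and `http.client` is used because it does not follow redirects, so the first hop stays visible.

```python
import http.client
from urllib.parse import urlsplit

PERMANENT = {301, 308}      # assumption: 308 added for modern servers
TEMPORARY = {302, 303, 307}


def first_response(url, timeout=10):
    """Return (status, Location header) for url without following redirects."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        connection = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        connection = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    try:
        connection.request("HEAD", target)
        response = connection.getresponse()
        return response.status, response.getheader("Location")
    finally:
        connection.close()


def advise(url, status, location=None):
    """Turn the first response for a link into a suggested action."""
    if status in PERMANENT and location:
        return ("update", location)   # moved for good: point at the new URL
    if status in TEMPORARY or 200 <= status < 300:
        return ("keep", url)          # still the right address to link to
    return ("investigate", url)       # errors and anything unexpected
```

Calling `advise(url, *first_response(url))` ties the two together for a live link.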

Grtz,
--
Rik Wasmus
Aug 21 '06 #8
In article <bc************@asgard.webthing.com>,
Nick Kew <ni**@asgard.webthing.com> wrote:
> Nikita the Spider wrote:
>> Steve,
>> Your definition of "good" and "bad" for links is different from that of
>> the average link checker. All of the link checkers that I'm aware of
>> merely test that a GET or HEAD request to the destination of a link
>> generates a successful response. It'd be a pretty special link checker
>> if it evaluated the content of the link target. How is the link checker
>> supposed to know what you consider "good" content? For all it knows,
>> maybe you *want* to link to a site featuring live nude zebras. To each
>> his own.
>
> Yep. But what a link checker can do (and what Valet does) is to detect
> updated links, so it alerts you if the target of a link has been updated
> since you last checked. It's up to you whether you use that information
> to check that the link is still what you thought it was.

That's a nice feature, although I think it would be noisy for some sites.

>> Note that 301 (Moved Permanently) is on the "bad" list but 302 (Moved
>> Temporarily) is not. That's a judgment call on my part; other link
>> checkers may have a different opinion.
>
> I'd say that's an oversimplistic approach. Redirects are not
> inherently either good or bad. Valet flags them as redirects,
> with the suggestion that you might want to update them.

I agree. I don't use the good/bad terminology myself; I was just trying
to stay consistent in the context of this posting. I just call them
"hot" links which "are the links on your site that are most likely to
need attention". One gets the information one needs to decide whether or
not the link is "broken" by one's own standards.

--
Philip
http://NikitaTheSpider.com/
Whole-site HTML validation, link checking and more
Aug 21 '06 #9
