Link Checkers and the "www"

Hi;

A friend of mine is publishing a book that includes 3000 citations,
many with urls. When he made his bibliography he chopped the
"http://"s off of his urls.

I wrote a program to parse out his urls and put them into a dummy HTML
page to take advantage of the free online link checkers.

I made two pages. One where I made sure every url had "http://www."
and one where I just used "http://"

All of the urls on both pages were properly formed ( no double "www"s
or anything like that ).
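
As an illustration only (the original program isn't shown in the thread), here is a minimal Python sketch of building both variants from a citation without ever doubling the "www.". The function names are hypothetical, and treating the plain variant as the one with any leading "www." removed is an assumption.

```python
def strip_scheme(url: str) -> str:
    """Remove a leading http:// or https:// if one is present."""
    for scheme in ("http://", "https://"):
        if url.lower().startswith(scheme):
            return url[len(scheme):]
    return url


def url_variants(citation_url: str) -> tuple[str, str]:
    """Return (plain, www) variants of a citation URL.

    Any existing scheme or leading "www." is removed first, so the
    www variant can never end up as "www.www.".
    """
    rest = strip_scheme(citation_url.strip())
    if rest.lower().startswith("www."):
        rest = rest[len("www."):]
    return "http://" + rest, "http://www." + rest
```

Each variant can then be written into its own test page for the link checker.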

Interestingly there were a few links that were broken using
"http://www", but that worked with only "http://" prepended.

I was under the impression that the "www" is necessary and that if
people don't type it in browsers just put it in.

So, how can putting "www" into a link break it as a link?

Is it a peculiarity of server side processing?

Thanks in advance for any clues for satisfying my curiosity

Steve

Aug 20 '06 #1
Els
Steve wrote:
> Interestingly there were a few links that were broken using
> "http://www", but that worked with only "http://" prepended.
>
> I was under the impression that the "www" is necessary and that if
> people don't type it in browsers just put it in.
>
> So, how can putting "www" into a link break it as a link?
It's the webmaster's choice. Personally I only use my domains without
'www.', but I do set up a redirect on the www-versions, to catch those
that automatically start typing www.

--
Els http://locusmeus.com/
accessible web design: http://locusoptimus.com/

Now playing: 2 Unlimited - Here I Go
Aug 20 '06 #2
Steve wrote:
> Hi;
>
> A friend of mine is publishing a book that includes 3000 citations,
> many with urls. When he made his bibliography he chopped the
> "http://"s off of his urls.
>
> I wrote a program to parse out his urls and put them into a dummy HTML
> page to take advantage of the free online link checkers.
>
> I made two pages. One where I made sure every url had "http://www."
> and one where I just used "http://"
>
> All of the urls on both pages were properly formed ( no double "www"s
> or anything like that ).
>
> Interestingly there were a few links that were broken using
> "http://www", but that worked with only "http://" prepended.
>
> I was under the impression that the "www" is necessary and that if
> people don't type it in browsers just put it in.
It isn't necessary at all. These prefixes don't do anything. They're
just names.

An organization will generally have a domain, like example.com, to which
their servers will belong. Then they name the servers anything they want:

fidelio.example.com
pagliacci.example.com
tosca.example.com

For servers that have specific purposes, particularly the ones being
exposed to the public (in other words, not only accessible from the
internal LAN), they'll typically assign server names that denote the use
of the server:

www.example.com
mail.example.com
webservices.example.com

That's all. It's all a matter of convention. If you defined
mail.example.com to point to your web server, everything would work. If
you have three different web sites and you call them

general.example.com
shop.example.com
returns.example.com

everything is fine. You can also use example.com, naked, as a server
name. Since the different services (web, mail, etc.) use different
default ports, you can even use one name for all of them, and skip the
prefix altogether. You can also assign multiple names to the same
server, so that example.com and www.example.com, typed into a web
browser, both lead to the same web site. It's all completely flexible.

> So, how can putting "www" into a link break it as a link?
It isn't breaking anything. This is what will happen if the name
example.com has been defined to denote a web server, and the name
www.example.com hasn't been defined. Each name has to be configured to
exist.
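
To make that last point concrete, here is a small Python sketch that asks the system resolver whether a name has been configured. `resolves` is a hypothetical helper; `socket.getaddrinfo` is the standard-library lookup, and the `lookup` parameter is there only so the check can be tried without a network.

```python
import socket


def resolves(hostname, lookup=socket.getaddrinfo):
    """Return True if the hostname resolves to at least one address."""
    try:
        # Port 80 is passed only because getaddrinfo requires a port argument.
        return len(lookup(hostname, 80)) > 0
    except socket.gaierror:
        # The resolver could not find the name (or the lookup failed).
        return False
```

Calling `resolves("example.com")` and `resolves("www.example.com")` separately shows whether each name exists on its own; neither result says anything about the other.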
Aug 20 '06 #3
<snipped good info about "www" is not necessary and may not be defined
for use with a domain name>

Hey Harlan and Els. Thanks a ton for improving my education on this
point and satisfying my curiosity.

Thanks a ton!

Steve

Aug 20 '06 #4
Harlan Messinger wrote:
>> I was under the impression that the "www" is necessary and that if
>> people don't type it in browsers just put it in.
*Some* browsers put it in, unless you disable their 'Attempt telepathy'
misfeature. I find it intensely annoying when a browser re-interprets
the URL that I type - for example assuming that instead of a URL I
intended a search-term, and further that the search engine I had in mind
was somewhere.silly.msn.com.
> It isn't necessary at all. These prefixes don't do anything. They're
> just names.
I'm sure Harlan knows what he means, but I think what he's said above is
misleading.

If the host providing the webservice is www.example.org, and there is no
CNAME record aliasing that name to some other name, then the www. *is*
necessary. The 'prefix' isn't a prefix; it's the name of the host.

A location in the domain-name tree is just that - a named node. The name
may correspond to a host, if there is an A record in DNS, or if there is
a CNAME record that aliases that name to another name that does have an
A record. And the same name may have child nodes, which have the same
status - they may or may not represent hosts, and they may or may not
have child nodes.

So a node may represent both a host and a 'domain' in the colloquial
sense. In fact a 'domain' in DNS-speak is simply a location in the tree
of names - i.e. a node. I think this may be why people find the term
"FQDN" (standing for "fully-qualified domain name") confusing - in this
context a "domain name" means a name in the domain name system. For some
time I used to think it referred to the name of the domain in which a
host resided - I didn't realise a FQDN could mean specifically the
fully-qualified name of a host.

It is common practice to create an A record for the parent domain (e.g.
example.com), and to also create a CNAME for www.example.com that
points to example.com. If you do this, then both names will resolve to
the same host. In the address-bar of your browser, a request to
www.example.com will continue to point to that domain as you navigate
the site, and similarly a request to the naked example.com will preserve
that name in the address-bar.
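
The arrangement described above can be modelled in a few lines of Python. This is a toy in-memory table, not a real resolver: `ZONE`, `resolve`, and the address 192.0.2.10 (taken from the documentation range) are made up for illustration, and real DNS allows several records per name.

```python
# One record per name: ("A", address) or ("CNAME", target name).
ZONE = {
    "example.com": ("A", "192.0.2.10"),
    "www.example.com": ("CNAME", "example.com"),
}


def resolve(name, zone, max_hops=8):
    """Follow CNAME aliases until an A record is found.

    Returns the address, or None if the name (or an alias target)
    is not defined or the alias chain is longer than max_hops.
    """
    for _ in range(max_hops):
        record = zone.get(name.lower().rstrip("."))
        if record is None:
            return None
        record_type, value = record
        if record_type == "A":
            return value
        name = value  # CNAME: continue the lookup with the target name
    return None
```

Both `resolve("example.com", ZONE)` and `resolve("www.example.com", ZONE)` give the same address, while a name with no record, such as `mail.example.com`, gives `None`.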
> An organization will generally have a domain, like example.com, to
> which their servers will belong. Then they name the servers anything
> they want:
>
> fidelio.example.com
> pagliacci.example.com
> tosca.example.com
>
> For servers that have specific purposes, particularly the ones being
> exposed to the public (in other words, not only accessible from the
> internal LAN), they'll typically assign server names that denote the
> use of the server:
>
> www.example.com
> mail.example.com
> webservices.example.com
>
> That's all. It's all a matter of convention. If you defined
> mail.example.com to point to your web server, everything would work.
> If you have three different web sites and you call them
>
> general.example.com
> shop.example.com
> returns.example.com
>
> everything is fine. You can also use example.com, naked, as a server
> name.
It is important to note that if the DNS zone is example.com, and you
wish to establish a host at that node, then you MUST use an A record.
You may not use a CNAME to identify the host that shares a name with the
zone, and have it point to another name (e.g. www.example.com) that has
the A record. Doing it that way round causes problems.
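
A rough way to check a zone for this mistake, sketched in Python. The record layout (name, type, value) and the function name are assumptions made for the example. The usual reason given for the rule is that the zone's own name must carry SOA and NS records, and a CNAME may not share a name with other records.

```python
def apex_cname_violations(zone_name, records):
    """Return the CNAME records that sit on the zone's own name.

    records is a list of (name, type, value) tuples; names are compared
    case-insensitively and a trailing dot is ignored.
    """
    apex = zone_name.lower().rstrip(".")
    return [
        record
        for record in records
        if record[0].lower().rstrip(".") == apex and record[1].upper() == "CNAME"
    ]
```

An empty result means the zone name itself is not an alias; a www CNAME pointing at the zone name is fine and is not reported.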
> Since the different services (web, mail, etc.) use different default
> ports, you can even use one name for all of them, and skip the prefix
> altogether. You can also assign multiple names to the same server, so
> that example.com and www.example.com, typed into a web browser, both
> lead to the same web site. It's all completely flexible.
This would be the case if the two names had A records pointing to the
same address. This can cause problems in certain cases; it is a very good
idea to have reverse DNS on a mailserver that matches the server's A
record, for example. Since you can only have one rDNS record for one
address, it is impossible to guarantee a match, if the server has
multiple A records.
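
The mismatch can be shown with two small lookup tables in Python. The tables, the helper name, and the 192.0.2.25 address are invented for the example, and the sketch assumes one reverse name per address, as described above.

```python
def reverse_matches(hostname, forward, reverse):
    """True if the address of hostname has a reverse name equal to hostname."""
    address = forward.get(hostname)
    return address is not None and reverse.get(address) == hostname


# Two names share one address, but the address has a single reverse name.
FORWARD = {
    "mail.example.com": "192.0.2.25",
    "example.com": "192.0.2.25",
}
REVERSE = {"192.0.2.25": "mail.example.com"}
```

Here `reverse_matches("mail.example.com", FORWARD, REVERSE)` holds, but the same check for `example.com` fails, because the single reverse entry can only name one of them.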

Where a single host has multiple IP addresses, e.g. multiple network
interfaces, the situation is more complicated, and I understand there is
some controversy as to whether it is good or bad practice to give the
interfaces different names.
>> So, how can putting "www" into a link break it as a link?
>
> It isn't breaking anything. This is what will happen if the name
> example.com has been defined to denote a web server, and the name
> www.example.com hasn't been defined. Each name has to be configured
> to exist.

--
Jack.
http://www.jackpot.uk.net/
Aug 20 '06 #5
Steve wrote:
> Hi;
>
> A friend of mine is publishing a book that includes 3000 citations,
> many with urls. When he made his bibliography he chopped the
> "http://"s off of his urls.
>
> I wrote a program to parse out his urls and put them into a dummy HTML
> page to take advantage of the free online link checkers.
>
> I made two pages. One where I made sure every url had "http://www."
> and one where I just used "http://"
>
> All of the urls on both pages were properly formed ( no double "www"s
> or anything like that ).
>
> Interestingly there were a few links that were broken using
> "http://www", but that worked with only "http://" prepended.
>
> I was under the impression that the "www" is necessary and that if
> people don't type it in browsers just put it in.
>
> So, how can putting "www" into a link break it as a link?
>
> Is it a peculiarity of server side processing?
>
> Thanks in advance for any clues for satisfying my curiosity
>
> Steve
I have actually seen a situation in which <http://www.xxx.com> was a
different Web page from <http://xxx.com>. Both were for the same
company, but there were indeed differences.

--

David E. Ross
<http://www.rossde.com/>

Concerned about someone (e.g., Pres. Bush) snooping
into your E-mail? Use PGP.
See my <http://www.rossde.com/PGP/>
Aug 20 '06 #6
David E. Ross wrote:
> I have actually seen a situation in which <http://www.xxx.com> was a
> different Web page from <http://xxx.com>. Both were for the same
> company, but there were indeed differences.
I've come across one recently where they were completely different - one
was for the company, and the other was for a motorcycle club or
something. Very odd.

--
My email address is valid but not monitored. Use my first name at the
same domain instead.
Aug 20 '06 #7
Jack wrote:
> Harlan Messinger wrote:
>>> I was under the impression that the "www" is necessary and that if
>>> people don't type it in browsers just put it in.
>
> *Some* browsers put it in, unless you disable their 'Attempt telepathy'
> misfeature. I find it intensely annoying when a browser re-interprets
> the URL that I type - for example assuming that instead of a URL I
> intended a search-term, and further that the search engine I had in mind
> was somewhere.silly.msn.com.
>
>> It isn't necessary at all. These prefixes don't do anything. They're
>> just names.
>
> I'm sure Harlan knows what he means, but I think what he's said above is
> misleading.
It shouldn't have been in the context of the question, where the premise
was that www.example.org *didn't exist*. The issue wasn't whether it was
necessary when the version with www is the *only* name assigned to the
server.
> If the host providing the webservice is www.example.org, and there is no
> CNAME record aliasing that name to some other name, then the www. *is*
> necessary. The 'prefix' isn't a prefix; it's the name of the host.
Aug 21 '06 #8

Harlan Messinger wrote:
> It isn't necessary at all. These prefixes don't do anything.
That's such a partial statement that it goes beyond misleading and into
"downright wrong" for anyone who doesn't already understand the whole
situation.
> They're just names.
They're "just names", but they're important names. They're names that
may or may not already be set up -- and if the site is using
exactly those names, then you have to make correct and appropriate use
of them, to match what the site is doing.

You certainly can't make a blanket statement that "All links will work
if you use the www. together with the http://" and then expect to apply
this global rule across all links. It probably will work, but
"probably" isn't good enough for a checker or QA process.

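For a checker, that means testing each form of a link on its own rather than assuming one implies the other. A minimal Python sketch follows; the function names are hypothetical, the citation is assumed to arrive with neither a scheme nor a leading "www.", and a HEAD request is used for brevity even though some servers reject it.

```python
import urllib.error
import urllib.request


def is_reachable(url, timeout=10):
    """Return True if a HEAD request to url succeeds with a status below 400."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status < 400
    except (urllib.error.URLError, OSError, ValueError):
        # DNS failure, refused connection, HTTP error status, or a malformed URL.
        return False


def check_citation(host_and_path, probe=is_reachable):
    """Probe both forms of a citation and report each result separately."""
    plain = "http://" + host_and_path
    with_www = "http://www." + host_and_path
    return {plain: probe(plain), with_www: probe(with_www)}
```

The `probe` parameter lets the reporting logic be tried with a stand-in function before pointing it at 3000 live citations.
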
Aug 21 '06 #9
In article <4k************@individual.net>,
Harlan Messinger <hm*******************@comcast.net> wrote:
>> I was under the impression that the "www" is necessary and that if
>> people don't type it in browsers just put it in.
>
> It isn't necessary at all. These prefixes don't do anything. They're
> just names.
They are necessary for certain sites. Try www.usps.gov without the www,
for example.

-A
Aug 21 '06 #10
axlq wrote:
> In article <4k************@individual.net>,
> Harlan Messinger <hm*******************@comcast.net> wrote:
>>> I was under the impression that the "www" is necessary and that if
>>> people don't type it in browsers just put it in.
>> It isn't necessary at all. These prefixes don't do anything. They're
>> just names.
>
> They are necessary for certain sites. Try www.usps.gov without the www,
> for example.
You missed the point, which was that it isn't necessary for a web site's
address to have "www." in it.
Aug 21 '06 #11
Harlan Messinger wrote:
> axlq wrote:
>> In article <4k************@individual.net>,
>> Harlan Messinger <hm*******************@comcast.net> wrote:
>>>> I was under the impression that the "www" is necessary and that if
>>>> people don't type it in browsers just put it in.
>>> It isn't necessary at all. These prefixes don't do anything. They're
>>> just names.
>>
>> They are necessary for certain sites. Try www.usps.gov without the www,
>> for example.
>
> You missed the point, which was that it isn't necessary for a web site's
> address to have "www." in it.
I'm not sure that was the clearest way for me to clarify what I'd said.
"www.usps.gov" is the site's address, *and* it has "www" in it. They
could just as well have chosen not to make "www.usps.gov" the address.
They could just as well have made it "usps.gov" or "website.usps.gov" or
"oldfashionedmailservice.usps.gov". Whatever address they chose,
obviously that's what you'd have to type into your browser to get there,
but that wasn't the source of the OP's confusion.
Aug 21 '06 #12

Harlan Messinger wrote:
> You missed the point, which was that it isn't necessary for a web site's
> address to have "www." in it.
We weren't talking about addresses though, we were talking about links.
It's essential that the link matches a workable name. What's
"workable" depends on what the webmaster set up, you can't just point
at a default and guarantee it works (although www. almost always
does).

Aug 21 '06 #13
Andy Dingley wrote:
> Harlan Messinger wrote:
>> You missed the point, which was that it isn't necessary for a web site's
>> address to have "www." in it.
>
> We weren't talking about addresses though, we were talking about links.
> It's essential that the link matches a workable name. What's
> "workable" depends on what the webmaster set up, you can't just point
> at a default and guarantee it works (although www. almost always
> does).
At the risk of beating a dead horse ...

The OP wrote:
> Interestingly there were a few links that were broken using
> "http://www", but that worked with only "http://" prepended.
>
> I was under the impression that the "www" is necessary and that if
> people don't type it in browsers just put it in.
>
> So, how can putting "www" into a link break it as a link?
The answer is that the addresses of these web sites don't have www in
them. Pure and simple.
Also,
> I was under the impression that the "www" is necessary and that if
> people don't type it in browsers just put it in.
My response, expanded: It's necessary if only a host name with www has
been configured to point to the site. It isn't necessary if both www and
non-www versions have been configured. It won't work at all if only a
non-www version has been configured.
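
Those three cases can be written down as a small decision function in Python. The function name and the wording of the labels are made up for the sketch; the two inputs say whether the non-www and www names have each been configured.

```python
def www_status(plain_configured, www_configured):
    """Classify a site by which of its two possible names has been set up."""
    if www_configured and not plain_configured:
        return "www is necessary"
    if www_configured and plain_configured:
        return "www is optional"
    if plain_configured:
        return "adding www breaks the link"
    return "neither name is configured"
```

The links that failed only on the "http://www." page fall into the third case.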

At this point a typical Web user is likely to think, "Huh?" because he
assumed that the specific string "www" had an implicit functional
purpose. So I continued with the details that there is no intrinsic
significance to "www", and that a host name does not have to include it.
Aug 22 '06 #14
