Redirect / new website: how to redirect old (Google) links to the new site?

John

Hi,

I updated a site and changed the file extensions from .html to .php.

Now I noticed that Google still finds the old .html pages, but since
they're not there anymore... they can't be found.

Is there any way of (easily, without messing up the site ;)) redirecting
those links to the main site?

Thanks!

Nov 7 '06 #1

Jerry Stuckle

John wrote:

Hi,

I updated a site and changed the file extensions from .html to .php.

Now i noticed that the google does find the old .html pages but since
they're not there anymore... they can't be found.

Are there any way of (easily, without messing the site ;)) redirecting
those links to the main site?

Thanks!

John,

This is an Apache (or whatever webserver you're using) question, not a
PHP one.

If you're using Apache, try alt.apache.configuration.

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
js*******@attglobal.net
==================

Nov 7 '06 #2

Michael Fesser

.oO(John)

>I updated a site and changed the file extensions from .html to .php.

Read

Cool URIs don't change
http://www.w3.org/Provider/Style/URI

>Now i noticed that the google does find the old .html pages but since
they're not there anymore... they can't be found.

Are there any way of (easily, without messing the site ;)) redirecting
those links to the main site?

Don't redirect. Configure your server to parse .html files for PHP.
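
On Apache with mod_php, "parse .html files for PHP" usually comes down to a single directive. This is only a sketch under those assumptions (the thread never says which server or PHP setup John has); with PHP running as CGI/FastCGI the handler setup looks different.

```apache
# Assumes mod_php is loaded. In .htaccess this also needs
# "AllowOverride FileInfo"; otherwise put it in httpd.conf or the vhost.
# Hand both .php and .html files to the PHP interpreter, so the old
# .html URLs keep working without any redirect.
AddType application/x-httpd-php .php .html
```

With this in place the renamed files would go back to their .html names; plain HTML files without any PHP tags are passed through unchanged by the interpreter.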

Micha

Nov 7 '06 #3

John

Thanks!!

Michael Fesser wrote:

.oO(John)

>I updated a site and changed the file extensions from .html to .php.

Read

Cool URIs don't change
http://www.w3.org/Provider/Style/URI

>Now i noticed that the google does find the old .html pages but since
they're not there anymore... they can't be found.

Are there any way of (easily, without messing the site ;)) redirecting
those links to the main site?

Don't redirect. Configure your server to parse .html files for PHP.

Micha

Nov 7 '06 #4

Jerry Stuckle

Michael Fesser wrote:

.oO(John)

>>I updated a site and changed the file extensions from .html to .php.

Read

Cool URIs don't change
http://www.w3.org/Provider/Style/URI

>>Now i noticed that the google does find the old .html pages but since
they're not there anymore... they can't be found.

Are there any way of (easily, without messing the site ;)) redirecting
those links to the main site?

Don't redirect. Configure your server to parse .html files for PHP.

Micha

It's unnecessary overhead to parse static html files for PHP code. Now
what if you also want server side includes? And maybe another language
or two? A 301 redirect is recognized by all search engines and they
will replace the old URL with the new one.

New users will get the new URI and old ones will get redirected (and
most will also quickly learn the new URI).
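
On Apache, a 301 redirect of this kind could be sketched with the RedirectMatch directive, which is provided by mod_alias. The host name example.com is a placeholder, and the pattern assumes every old .html page now exists under the same path with a .php extension; pages that are still served as .html would have to be excluded.

```apache
# mod_alias: answer every request for an old .html URL with a
# permanent (301) redirect to the same path ending in .php.
# Works in httpd.conf, a vhost, or .htaccess (AllowOverride FileInfo).
RedirectMatch 301 ^/(.*)\.html$ http://example.com/$1.php
```

To send all old links to the front page instead, as the original question asks, the target can simply be the site root, e.g. `RedirectMatch 301 \.html$ http://example.com/`.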

After a period of time, you can replace the 301 redirect with another
page which indicates "The page has moved...". That way the few
left-over people who haven't changed the URI will do so.

And BTW - I wouldn't classify a URI as "cool" just because it had an
html extension.

Nov 7 '06 #5

John Dunlop

Jerry Stuckle:

And BTW - I wouldn't classify a URI as "cool" just because it had an
html extension.

No, neither would I, but did you actually read /Cool URIs don't
change/? Read it and you'll see that that's not what Michael meant.

--
Jock

Nov 8 '06 #6

Michael Fesser

.oO(Jerry Stuckle)

>Michael Fesser wrote:
>>
Don't redirect. Configure your server to parse .html files for PHP.

It's unnecessary overhead to parse static html files for PHP code.

His files are not static, they contain PHP code. It's the URL that
should be static to avoid link rot and inconveniences for your visitors.

Now
what if you also want server side includes?

If you have PHP, you don't need SSI anymore.

But of course you can also use SSI if you like - just configure the
server to parse .html files for SSI directives. And if you want to use
it all at the same time, you can do that as well:

http://example.com/static.html
http://example.com/phpscript.html
http://example.com/perlscript.html
http://example.com/ssi.html

It just depends on the server configuration (in this case for example
with content negotiation and MultiViews).
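
As a sketch of the SSI part, assuming Apache 2.x with mod_include loaded, parsing .html files for SSI directives takes two lines:

```apache
# mod_include: allow server-side includes in this directory and run
# every .html file through the INCLUDES output filter.
Options +Includes
AddOutputFilter INCLUDES .html
```

Both directives can go into a `<Directory>` block or, if the server allows the overrides, into .htaccess.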

>A 301 redirect is recognized by all search engines and they
will replace the old URL with the new one.

New users will get the new URI and old ones will get redirected (and
most will also quickly learn the new URI).

That's a broken design. There are many valid reasons to keep all such
technical stuff out of URLs.

>After a period of time, you can replace the 301 redirect with another
page which indicates "The page has moved...". That way the few
left-over people who haven't change the URI will do so.

Completely unnecessary, if you do it right from the beginning. If you
really think you need a filename extension in URLs, then use 'html'.
Of course no extension at all is even better.

Micha

Nov 8 '06 #7

Jerry Stuckle

John Dunlop wrote:

Jerry Stuckle:

>>And BTW - I wouldn't classify a URI as "cool" just because it had an
html extension.

No, neither would I, but did you actually read /Cool URIs don't
change/? Read it and you'll see that that's not what Michael meant.

Yep, I read it. And I disagree with a lot of what it said.

It's ok to have a "cool" url such as www.example.com/john/...

That I wouldn't necessarily want to change. But just changing .html
extensions to .php? Not even two seconds thinking about it.

Nov 9 '06 #8

Jerry Stuckle

Michael Fesser wrote:

.oO(Jerry Stuckle)

>>Michael Fesser wrote:

>>>Don't redirect. Configure your server to parse .html files for PHP.

It's unnecessary overhead to parse static html files for PHP code.

His files are not static, they contain PHP code. It's the URL that
should be static to avoid link rot and inconveniences for your visitors.

Not necessarily. I have customers with a lot of PHP files which are
pretty much 'static'. Many of them depend on stuff in databases which
might change once a year or less. Not a lot of difference there between
a static page which is updated once a year and one which pulls from a
database.

Also, I have some customers with .php pages which really are static.
But they use PHP for some other function such as processing form data.
The output doesn't change at all. In that sense, they are 'static'.

Also, he never said *all* his pages were static. Telling the web server
to process *all* .html files as php files is an unnecessary overhead.

>
>Now
what if you also want server side includes?

If you have PHP, you don't need SSI anymore.

And if you don't need any PHP code in the page other than maybe an
include, you're doing completely unnecessary work.

But of course you can also use SSI if you like - just configure the
server to parse .html files for SSI directives. And if you want to use
it all at the same time, you can do that as well:

http://example.com/static.html
http://example.com/phpscript.html
http://example.com/perlscript.html
http://example.com/ssi.html

It just depends on the server configuration (in this case for example
with content negotiation and MultiViews).

Yep, and the more you tell the server it has to parse, the more CPU time
it takes.

>
>>A 301 redirect is recognized by all search engines and they
will replace the old URL with the new one.

New users will get the new URI and old ones will get redirected (and
most will also quickly learn the new URI).

That's a broken design. There are many valid reasons to keep all such
technical stuff out of URLs.

Not at all. No "technical stuff" in the url at all. You're just
redirecting index.html to actually retrieve index.php instead. And the
latter will be processed by the php interpreter.

>
>>After a period of time, you can replace the 301 redirect with another
page which indicates "The page has moved...". That way the few
left-over people who haven't change the URI will do so.

Completely unnecessary, if you do it right from the beginning. If you
really think you need a filename extension in URLs, then use 'html'.
Of course no extension at all is even better.

Micha

No extension? UGH - no, DOUBLE UGH! Vomit! Wash your mouth out with soap!

Extensions were created for a purpose - to let the server know what
needs to be handled by which processors. To bypass that creates a
completely unnecessary load on the server.

Nov 9 '06 #9

Michael Fesser

.oO(Jerry Stuckle)

>Michael Fesser wrote:

Also, I have some customers with .php pages which really are static.
But they use PHP for some other function such as processing form data.

There are many more nice things I use PHP for, even if the output may be
static (for a while). If I can use a tool and get some benefits from it,
I use it.

>Also, he never said *all* his pages were static. Telling the web server
to process *all* .html files as php files is an unnecessary overhead.

Who cares? That's what a server is for. You won't be able to notice a
difference in time between a plain HTML file delivered as-is and another
HTML file, parsed by PHP. Even with PHP-(Fast)CGI the difference is too
small. The transfer over the network takes much more time.

And BTW: Unnecessary 301 redirects are no overhead? Not much for the
server, but for all the clients and the network.

>But of course you can also use SSI if you like - just configure the
server to parse .html files for SSI directives. And if you want to use
it all at the same time, you can do that as well:

http://example.com/static.html
http://example.com/phpscript.html
http://example.com/perlscript.html
http://example.com/ssi.html

It just depends on the server configuration (in this case for example
with content negotiation and MultiViews).

Yep, and the more you tell the server it has to parse, the more CPU time
it takes.

That's his job.

>That's a broken design. There are many valid reasons to keep all such
technical stuff out of URLs.

Not at all. No "technical stuff" in the url at all. You're just
redirecting index.html to actually retrieve index.php instead.

A 'php' in the URL is technical stuff that doesn't belong there. A URL
describes a resource, not details about the way it is generated.

>No extension?

Exactly. Reliable, long-living (in other words: "cool") URLs don't need
an extension, simply because it avoids a lot of problems.

>UGH - no, DOUBLE UGH! Vomit! Wash your mouth out with soap!

Extensions were created for a purpose - to let the server know what
needs to be handled by which processors.

Concepts like "directory", "file", "extension" don't exist in URLs.
A URL doesn't describe a file, but a resource.

You only need an extension if you directly map a URL onto the server's
filesystem. That's the most common, but not the only way. Of course on
the server the files still have their extension, but there's no need to
show it in a URL.

>To bypass that creates a
completely unnecessary load on the server.

If your server gets into trouble because of some little lookups and
simple pattern matching then you have a _real_ problem.

You should care more about your clients (users, search engines) than
about the server and make things as easy as possible for them. Using the
right tools at the right time to satisfy the clients, that's the whole
point. Ignoring these tools just because they may cause some more CPU
load now and then is - sorry - stupid (no offense intended).

Micha

Nov 9 '06 #10

Jerry Stuckle

Michael Fesser wrote:

.oO(Jerry Stuckle)

>>Michael Fesser wrote:

Also, I have some customers with .php pages which really are static.
But they use PHP for some other function such as processing form data.

There are many more nice things I use PHP for, even if the output may be
static (for a while). If I can use a tool and get some benefits from it,
I use it.

>>Also, he never said *all* his pages were static. Telling the web server
to process *all* .html files as php files is an unnecessary overhead.

Who cares? That's what a server is for. You won't be able to notice a
difference in time between a plain HTML file delivered as-is and another
HTML file, parsed by PHP. Even with PHP-(Fast)CGI the difference is too
small. The transfer over the network takes much more time.

Your hosting company, for one, unless you're on a dedicated server.
You're needlessly taking cpu cycles away from other sites on the server.

And BTW: Unnecessary 301 redirects are no overhead? Not much for the
server, but for all the clients and the network.

Very little compared to unnecessarily parsing .html files. mod_redirect
is quite short and quick in its operation - especially if the redirect
is in the httpd.conf file. But even in .htaccess it's quite fast.

>

>>>But of course you can also use SSI if you like - just configure the
server to parse .html files for SSI directives. And if you want to use
it all at the same time, you can do that as well:

http://example.com/static.html
http://example.com/phpscript.html
http://example.com/perlscript.html
http://example.com/ssi.html

It just depends on the server configuration (in this case for example
with content negotiation and MultiViews).

Yep, and the more you tell the server it has to parse, the more CPU time
it takes.

That's his job.

And it's just plain sloppy programming or laziness to force it to do
more than is called for in a case like this.

>

>>>That's a broken design. There are many valid reasons to keep all such
technical stuff out of URLs.

Not at all. No "technical stuff" in the url at all. You're just
redirecting index.html to actually retrieve index.php instead.

A 'php' in the URL is technical stuff that doesn't belong there. A URL
describes a resource, not details about the way it is generated.

That may be your opinion. The extension just allows the server to do
the most efficient processing of the file. You could call it .xyz for
all I care - just set up the correct file type in Apache.

>
>>No extension?

Exactly. Reliable, long-living (in other words: "cool") URLs don't need
an extension, simply because it avoids a lot of problems.

Then your definition of "cool" varies from almost all of the rest of the
world. How many sites do you see with no extensions, for instance
(other than your own, of course).

>
>>UGH - no, DOUBLE UGH! Vomit! Wash your mouth out with soap!

Extensions were created for a purpose - to let the server know what
needs to be handled by which processors.

Concepts like "directory", "file", "extension" don't exist in URLs.
A URL doesn't describe a file, but a resource.

Yep, it describes a resource. One of the types of resource it describes
is a file. And when it's referring to a file on a server, the extension
is important. Not only .php, but things like .gif, .png, and others
come to mind.

You only need an extension if you directly map a URL onto the server's
filesystem. That's the most common, but not the only way. Of course on
the server the files still have their extension, but there's no need to
show it in a URL.

No, it's not the *only* way. But it's the most common - AND THE MOST
EFFICIENT.

>
>>To bypass that creates a
completely unnecessary load on the server.

If your server gets into trouble because of some little lookups and
simple pattern matching then you have a _real_ problem.

Yep. You have a problem because you create unnecessary load on the server.

You should care more about your clients (users, search engines) than
about the server and make things as easy as possible for them. Using the
right tools at the right time to satisfy the clients, that's the whole
point. Ignoring these tools just because they may cause some more CPU
load now and then is - sorry - stupid (no offense intended).

Micha

I do care about my clients - especially the users. And the search
engines handle 301 redirects quite handily - as I have mentioned before.

And I do use the right tools at the right time. That's why I tell
Apache to parse .php files for PHP code, and not to parse .html files
for it.

And expecting the server to do your work for you is just plain lazy.
Sorry if it offends you - but that's how I see it.

Now - one other thing. You may get by with this on a site with 20 pages
and 1K hits/day. But try sites with 10K+ pages, and over 1M hits/hr.

There is a huge difference in the processing time required.

Nov 9 '06 #11

John Dunlop

Jerry Stuckle:

[A URL] describes a resource.

Well, URLs by definition _locate_ network locatable resources,
specifically 'information resources', as one W3C Recommendation calls
them; whether or not the string of characters in a URL describes in
some human-recognisable way the resource it points to is up to you the
URL owner.

One of the types of resource it describes is a file.

The crux of the matter is that if a URL points to a particular
representation of a resource - e.g., HTML, XHTML, plain text, RTF - and
this is reflected in the URL in the shape of suffixes, when you change
the representation or add another representation and negotiate between
them you have to change the URL to keep it meaningful. For example, if
you originally publish a restaurant menu in plain text, ending its URL
path with '.txt', but later publish an HTML version and negotiate
between the two, the '.txt' at the end of the URL path is counter
intuitive for the HTML version.

You could solve this in one of two ways. One, publish two different
URLs, one ending in '.txt', the other in '.html'. However, this
undermines the value of the resource since it divides the community
into those who refer to the plain text version and those who refer to
the HTML version while for everyone the resource is conceptually a
single entity; cf. Metcalfe's Law. Two, instead of seeing the URL as
pointing to a particular representation of the resource, see the URL as
pointing to the resource itself - the menu in the example above. That
way, when you change or add representations, you don't need to change
the URL because it identifies the resource itself rather than any
particular representation. Which way you choose depends on what you
see the URL as pointing to.

If you choose the second way, URL suffixes have no place in URLs
because they add nothing to the identification of the resource and they
run afoul of the principles of length (shortness), meaningfulness, and
persistency. I regard as weak the counter argument that if URL
suffixes are a de facto standard, then all URLs should include them.
The BBC, for example, publishes URLs without suffixes, and URLs without
suffixes occur in traditional media. Even if URL suffixes are a de
facto standard, the suffix-less ones are more user friendly.

--
Jock

Nov 10 '06 #12

Jerry Stuckle

John Dunlop wrote:

Jerry Stuckle:

>>[A URL] describes a resource.

Well, URLs by definition _locate_ network locatable resources,
specifically 'information resources', as one W3C Recommendation calls
them; whether or not the string of characters in a URL describes in
some human-recognisable way the resource it points to is up to you the
URL owner.

>>One of the types of resource it describes is a file.

The crux of the matter is that if a URL points to a particular
representation of a resource - e.g., HTML, XHTML, plain text, RTF - and
this is reflected in the URL in the shape of suffixes, when you change
the representation or add another representation and negotiate between
them you have to change the URL to keep it meaningful. For example, if
you originally publish a restaurant menu in plain text, ending its URL
path with '.txt', but later publish an HTML version and negotiate
between the two, the '.txt' at the end of the URL path is counter
intuitive for the HTML version.

You could solve this in one of two ways. One, publish two different
URLs, one ending in '.txt', the other in '.html'. However, this
undermines the value of the resource since it divides the community
into those who refer to the plain text version and those who refer to
the HTML version while for everyone the resource is conceptually a
single entity; cf. Metcalfe's Law. Two, instead of seeing the URL as
pointing to a particular representation of the resource, see the URL as
pointing to the resource itself - the menu in the example above. That
way, when you change or add representations, you don't need to change
the URL because it identifies the resource itself rather than any
particular representation. Which way you choose depends on what you
see the URL as pointing to.

If you choose the second way, URL suffixes have no place in URLs
because they add nothing to the identification of the resource and they
run afoul of the principles of length (shortness), meaningfulness, and
persistency. I regard as weak the counter argument that if URL
suffixes are a de facto standard, then all URLs should include them.
The BBC, for example, publishes URLs without suffixes, and URLs without
suffixes occur in traditional media. Even if URL suffixes are a de
facto standard, the suffix-less ones are more user friendly.

As I said - I see it pointing to a resource. But one size does not fit
all. There are different types of resources on the internet.

For instance, all of my printers are tcp/ip ready. All of them have
URI's associated with them, and I print to a URI. But I don't try to
load one in my browser - it's the wrong type. I also have system
backups on an internet server. These are also URI's - but I wouldn't
load one in a browser. And my email servers are another type of URI.

When dealing with file URI's, the file extension is meaningful. Maybe
not to the user, but definitely to the server. Servers make different
decisions on how to handle different file extensions for performance
reasons. You wouldn't want to try to run a .asp file through a .php
parser, for instance. And you wouldn't want to try to run everything
(including static pages) through both. It would bring any reasonably
active server to its knees.

Theory is good. And it even works in low volume sites on low activity
servers. But the overhead quickly becomes more unmanageable in more
heavily used sites. I find ignoring this fact to stick to an ideal is a
very weak argument.

Nov 10 '06 #13

John Dunlop

Jerry Stuckle:

When dealing with file URI's, the file extension is meaningful.

Frankly, Jerry, I don't know whether to laugh or cry.

--
Jock

Nov 10 '06 #14

Jerry Stuckle

John Dunlop wrote:

Jerry Stuckle:

>>When dealing with file URI's, the file extension is meaningful.

Frankly, Jerry, I don't know whether to laugh or cry.

John, when you figure it out, please let the rest of us know.

Theory is great. And if there were some way for the web server to
determine what type of file it is, it would be a different story.

But right now all we have (on Linux) is an executable flag. The file
can either be executed by the OS or not. There is no way for the web
server to determine what it needs to parse for various languages.

Let me give you an example from one site. Most of the site is written
in VBScript (.asp). However, we have a discussion forum written in
Perl. We have other packages written in PHP. And we're looking at
adding another package which requires Python.

Now - do you expect the webserver to parse every one of those files,
including the static pages, for VBScript, PHP, Perl and Python?

Right now the only way the web server can tell is by the file extension.
Of course this is on IIS, so there's no .htaccess. But I guess if you
dug deeply enough there might be a way to tell the server to parse
index.html as .asp code, but blog.html as PHP code and discussion.html
as Perl code.

Can you imagine the trouble trying to keep up with a couple of thousand
files like that?

This is the real world, not some theoretical Utopia.

Nov 10 '06 #15

Michael Fesser

.oO(Jerry Stuckle)

>Let me give you an example from one site. Most of the site is written
in VBScript (.asp). However, we have a discussion forum written in
Perl. We have other packages written in PHP. And we're looking at
adding another package which requires Python.

Everything is possible, at least on Apaches.

Of course this is on IIS, so there's no .htaccess.

There's not even a webserver ... SCNR

>But I guess if you
dug deeply enough there might be a way to tell the server to parse
index.html as .asp code, but blog.html as PHP code and discussion.html
as Perl code.

Apache: Options +MultiViews

http://example.com/index.html.asp
http://example.com/blog.html.php
http://example.com/discussion.html.pl

That's it. You could even remove the '.html' from the filenames and
still call them how it's supposed to be in the most userfriendly way:

http://example.com/
http://example.com/blog
http://example.com/discussion

That's the whole point of a URL - it describes (OK, it _locates_) a
resource on a network. It doesn't have to reveal technical details about
how that resource is created or which variant of it is returned on
request. Whether it is created from a static file, a PHP script, a Perl
script or anything else - it simply doesn't matter and is of absolutely
no interest for the user agent, so it doesn't have to appear in the URL.
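
A minimal sketch of the extensionless variant, assuming Apache with mod_negotiation and mod_php. The directory path is hypothetical, and the files on disk would be named blog.php, discussion.php and so on:

```apache
# Hypothetical document root; adjust the path to your own layout.
<Directory "/var/www/example">
    # mod_negotiation: when /blog does not exist as a file, look for
    # blog.* in the same directory and serve the best match.
    Options +MultiViews
    # By default MultiViews only considers files whose extensions map
    # to a media type, language, charset or encoding, so give .php a
    # media type (this is the mod_php convention).
    AddType application/x-httpd-php .php
</Directory>
```

If PHP is attached with AddHandler or SetHandler instead of AddType, the MultiviewsMatch directive may be needed as well; treat that as something to check against the documentation for your Apache version.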

Micha

Nov 11 '06 #16

Michael Fesser

.oO(Jerry Stuckle)

>Michael Fesser wrote:
>>
Who cares? That's what a server is for. You won't be able to notice a
difference in time between a plain HTML file delivered as-is and another
HTML file, parsed by PHP. Even with PHP-(Fast)CGI the difference is too
small. The transfer over the network takes much more time.

Your hosting company, for one, unless you're on a dedicated server.
You're needlessly taking cpu cycles away from other sites on the server.

No. I just use what I pay for. My scripts are limited to a certain
amount of memory and CPU time. Being able to use these resources is part
of my contract with the hoster. If it would affect other sites on the
same machine it would be a violation of the contract.

>And BTW: Unnecessary 301 redirects are no overhead? Not much for the
server, but for all the clients and the network.

Very little compared to unnecessarily parsing .html files.

No. Parsing files is done on the server side, a redirection is handled
client-side. So you're just moving all the work from the server to your
clients. Every access to the old resource wastes network resources.

>mod_redirect
is quite short and quick in its operation - especially if the redirect
is in the httpd.conf file. But even in .htaccess it's quite fast.

Using mod_rewrite while trying to spare some CPU cycles is ... strange.

>And it's just plain sloppy programming or laziness to force it to do
more than is called for in a case like this.

My servers don't do more work than necessary. They simply do what is
needed to satisfy my clients.

>A 'php' in the URL is technical stuff that doesn't belong there. A URL
describes a resource, not details about the way it is generated.

That may be your opinion. The extension just allows the server to do
the most efficient processing of the file. You could call it .xyz for
all I care - just set up the correct file type in Apache.

Sure, but there's still no need to show it in the URL.

>Exactly. Reliable, long-living (in other words: "cool") URLs don't need
an extension, simply because it avoids a lot of problems.

Then your definition of "cool" varies from almost all of the rest of the
world. How many sites do you see with no extensions, for instance
(other than your own, of course).

Nearly every big company uses URLs like <http://example.com/coolThing>,
even if it's often just used for redirecting the user to another page
inside a CMS. But it's a beginning.

>Yep, it describes a resource. One of the types of resource it describes
is a file. And when it's referring to a file on a server, the extension
is important. Not only .php, but things like .gif, .png, and others
come to mind.

In this context omitting the extension provides another nice benefit. We
all know that IE has its problems with alpha transparency in PNGs. Using
server-side content negotiation you can easily deliver a fallback-JPEG
to IE and an alpha-PNG to real browsers - with the same URL. No need for
client-side hacks or switches, just let the server decide which is the
most appropriate variant to send back to the client.

>You only need an extension if you directly map a URL onto the server's
filesystem. That's the most common, but not the only way. Of course on
the server the files still have their extension, but there's no need to
show it in a URL.

No, it's not the *only* way. But it's the most common - AND THE MOST
EFFICIENT.

Others are much more flexible.

>And expecting the server to do your work for you is just plain lazy.
Sorry if it offends you - but that's how I see it.

Now - one other thing. You may get by with this on a site with 20 pages
and 1K hits/day. But try sites with 10K+ pages, and over 1M hits/hr.

There is a huge difference in the processing time required.

We are not talking about high-traffic sites here. On such a beast there
are many other toys available for use - load balancers, local proxies,
bytecode caches etc. Even your highly optimized server could not handle
such a site on its own.

Micha

Nov 11 '06 #17

Jerry Stuckle

Michael Fesser wrote:

.oO(Jerry Stuckle)

>>Let me give you an example from one site. Most of the site is written
in VBScript (.asp). However, we have a discussion forum written in
Perl. We have other packages written in PHP. And we're looking at
adding another package which requires Python.

Everything is possible, at least on Apaches.

>Of course this is on IIS, so there's no .htaccess.

There's not even a webserver ... SCNR

Bullshit. It may not be as good as Apache, but it still is a perfectly
good webserver.

>
>>But I guess if you
dug deeply enough there might be a way to tell the server to parse
index.html as .asp code, but blog.html as PHP code and discussion.html
as Perl code.

Apache: Options +MultiViews

http://example.com/index.html.asp
http://example.com/blog.html.php
http://example.com/discussion.html.pl

That's it. You could even remove the '.html' from the filenames and
still call them how it's supposed to be in the most userfriendly way:

And I would love to see you parse asp on Apache. Obviously you've never
tried it or you wouldn't even attempt to make this statement.

There still is not a solid plugin for ASP files on Apache. People have
been working on them, but they're still not ready for a serious server.

And no, don't think I'm a Microsoft fan. Most of my servers are
Linux/Apache. However, none of those are using asp pages.

http://example.com/
http://example.com/blog
http://example.com/discussion

That's the whole point of a URL - it describes (OK, it _locates_) a
resource on a network. It doesn't have to reveal technical details about
how that resource is created or which variant of it is returned on
request. Whether it is created from a static file, a PHP script, a Perl
script or anything else - it simply doesn't matter and is of absolutely
no interest for the user agent, so it doesn't have to appear in the URL.

Micha

Sorry, Micha. Your theoretical ideas don't work in practice.

Try parsing every file (including static html files) for three or
four different languages. Now try to do this on a server which averages
100K hits per hour, 24/7, and peaks at close to 1M hits/hr.

IF it works, you'll have a tremendous amount of additional overhead and
you'll be slowed to a crawl. Even in Apache.

Of course, your average "Mom and Pop grocery" which uses 50MB of disk
and 500MB of bandwidth per month won't see this. But a serious site
will have a lot of problems.

So before you go touting your great theories, I suggest you try running
a site on dual 3GHz+ processors, 2GB RAM, averaging 20-25% CPU and
peaking close to 100%. Get your no-file-extension idea working. And
I'd love to see the server keep it up.

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
js*******@attglobal.net
==================

Nov 11 '06 #18

Jerry Stuckle

Michael Fesser wrote:

.oO(Jerry Stuckle)

>>Michael Fesser wrote:

>>>Who cares? That's what a server is for. You won't be able to notice a
difference in time between a plain HTML file delivered as-is and another
HTML file, parsed by PHP. Even with PHP-(Fast)CGI the difference is too
small. The transfer over the network takes much more time.

Your hosting company, for one, unless you're on a dedicated server.
You're needlessly taking cpu cycles away from other sites on the server.

No. I just use what I pay for. My scripts are limited to a certain
amount of memory and CPU time. Being able to use these resources is part
of my contract with the hoster. If it would affect other sites on the
same machine it would be a violation of the contract.

And when you get shut down for overusing the CPU, don't come crying to me.

>

>>>And BTW: Unnecessary 301 redirects are no overhead? Not much for the
server, but for all the clients and the network.

Very little compared to unnecessarily parsing .html files.

No. Parsing files is done on the server side, a redirection is handled
client-side. So you're just moving all the work from the server to your
clients. Every access to the old resource wastes network resources.

Yes and no. The redirection is initiated server-side. There is very
little overhead doing this. Then the client requests the new page.
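
To make the server-side part of a redirect concrete, here is a minimal sketch in Python. It is not how Apache implements it; the function name redirect_for and the blanket ".html becomes .php" rule are assumptions made purely for illustration. The point is that the server only rewrites a path and sends back a status code and a Location header; the client then makes a second request for the new URL.

```python
import re

# Hypothetical rule: every path ending in .html has moved to the same
# path ending in .php. Real sites may need a per-page mapping instead.
OLD_SUFFIX = re.compile(r"\.html$")


def redirect_for(path):
    """Return (status, headers) for a request path.

    Old .html paths get a 301 pointing at the .php equivalent;
    anything else is reported as 200 with no extra headers.
    """
    if OLD_SUFFIX.search(path):
        new_path = OLD_SUFFIX.sub(".php", path)
        return 301, {"Location": new_path}
    return 200, {}


status, headers = redirect_for("/products/list.html")
print(status, headers["Location"])  # 301 /products/list.php
```

The cost on the server is one pattern match per request; the cost on the client is one extra round trip for each hit on an old URL, which is the trade-off being argued about here.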

>
>>mod_redirect
is quite short and quick in its operation - especially if the redirect
is in the httpd.conf file. But even in .htaccess it's quite fast.

Using mod_rewrite while trying to spare some CPU cycles is ... strange.

And why is that? It's much cheaper than needlessly parsing even a
single 2K html file.

>
>>And it's just plain sloppy programming or laziness to force it to do
more than is called for in a case like this.

My servers don't do more work than necessary. They simply do what is
needed to satisfy my clients.

If you program like you indicate, your servers are doing a lot more work
than necessary. But that's OK. You aren't dealing with any serious
sites - just Mom and Pop outfits. So it really doesn't matter to them.
They don't know the difference anyway.

>

>>>A 'php' in the URL is technical stuff that doesn't belong there. A URL
describes a resource, not details about the way it is generated.

That may be your opinion. The extension just allows the server to do
the most efficient processing of the file. You could call it .xyz for
all I care - just set up the correct file type in Apache.

Sure, but there's still no need to show it in the URL.

And pray tell how are you going to do that without parsing every single
file for php (and possibly other languages)?

Face it - for files, the file extension IS meaningful to Apache, IIS and
any other webserver on the market.

>

>>>Exactly. Reliable, long-living (in other words: "cool") URLs don't need
an extension, simply because it avoids a lot of problems.

Then your definition of "cool" varies from almost all of the rest of the
world. How many sites do you see with no extensions, for instance
(other than your own, of course).

Nearly every big company uses URLs like <http://example.com/coolThing>,
even if it's often just used for redirecting the user to another page
inside a CMS. But it's a beginning.

Yep, and if you check, that's normally the index.html or index.php file
in a directory. IOW the server is picking up the default page for that
directory.

>
>>Yep, it describes a resource. One of the types of resource it describes
is a file. And when it's referring to a file on a server, the extension
is important. Not only .php, but things like .gif, .png, and others
come to mind.

In this context omitting the extension provides another nice benefit. We
all know that IE has its problems with alpha transparency in PNGs. Using
server-side content negotiation you can easily deliver a fallback-JPEG
to IE and an alpha-PNG to real browsers - with the same URL. No need for
client-side hacks or switches, just let the server decide which is the
most appropriate variant to send back to the client.

So? If that's a problem on your site, just use JPEG in the first place.
Then you don't even need to worry about the negotiation.

>

>>>You only need an extension if you directly map a URL onto the server's
filesystem. That's the most common, but not the only way. Of course on
the server the files still have their extension, but there's no need to
show it in a URL.

No, it's not the *only* way. But it's the most common - AND THE MOST
EFFICIENT.

Others are much more flexible.

Not at all. 301 redirects allow the most efficient way to be just as
flexible when necessary.

>
>>And expecting the server to do your work for you is just plain lazy.
Sorry if it offends you - but that's how I see it.

Now - one other thing. You may get by with this on a site with 20 pages
and 1K hits/day. But try sites with 10K+ pages, and over 1M hits/hr.

There is a huge difference in the processing time required.

We are not talking about high-traffic sites here. On such a beast there
are many other toys available for use - load balancers, local proxies,
bytecode caches etc. Even your highly optimized server could not handle
such a site on its own.

Micha

No, YOU AREN'T. But that is NOT TRUE of the entire internet. And not
true of some of the other programmers here.

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
js*******@attglobal.net
==================

Nov 11 '06 #19

Andy Hassall

On Fri, 10 Nov 2006 21:53:16 -0500, Jerry Stuckle <js*******@attglobal.net>
wrote:

>>Of course this is on IIS, so there's no .htaccess.

There's not even a webserver ... SCNR

Bullshit. It may not be as good as Apache, but it still is a perfectly
good webserver.

On this point, agreed - IIS 6 is a reasonable basic webserver. I run one at
work in a mixed ASP (legacy code) and PHP (new code) environment, and it's
noticeably better than IIS 5, and Apache on Windows is historically not
brilliant. Having said that, I'm still working (slowly) towards Apache on a
UNIX variant (probably RHEL), since all the new code is PHP and is gradually
replacing the ASP.

>Sorry, Micha. Your theoretical ideas don't work in practice.

Try parsing every file (including static html files) for three or
four different languages.

You've missed the point - I don't believe anyone has suggested that you do
that.

--
Andy Hassall :: an**@andyh.co.uk :: http://www.andyh.co.uk
http://www.andyhsoftware.co.uk/space :: disk and FTP usage analysis tool

Nov 11 '06 #20

Jerry Stuckle

Andy Hassall wrote:

On Fri, 10 Nov 2006 21:53:16 -0500, Jerry Stuckle <js*******@attglobal.net>
wrote:

>>>>Of course this is on IIS, so there's no .htaccess.

There's not even a webserver ... SCNR

Bullshit. It may not be as good as Apache, but it still is a perfectly
good webserver.

On this point, agreed - IIS 6 is a reasonable basic webserver. I run one at
work in a mixed ASP (legacy code) and PHP (new code) environment, and it's
noticeably better than IIS 5, and Apache on Windows is historically not
brilliant. Having said that, I'm still working (slowly) towards Apache on a
UNIX variant (probably RHEL), since all the new code is PHP and is gradually
replacing the ASP.

>>Sorry, Micha. Your theoretical ideas don't work in practice.

Try parsing every file (including static html files) for three or
four different languages.

You've missed the point - I don't believe anyone has suggested that you do
that.

Andy, that's exactly what Micha is suggesting. Remove the extension
from the file and have Apache parse all files the same way.

In fact, his first suggestion was to tell Apache to parse all .html
files with PHP, whether they included PHP code or not. Then I brought
up other conditions - like the site I mentioned which currently uses
asp, php and perl - and will soon probably be using python.

And I'm supposed to parse every page for each of these?

All to match his Utopian ideal of having URIs which are completely
independent of technical details.

Good in theory. But not in practice.

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
js*******@attglobal.net
==================

Nov 11 '06 #21

Andy Hassall

On Fri, 10 Nov 2006 22:12:47 -0500, Jerry Stuckle <js*******@attglobal.net>
wrote:

> You've missed the point - I don't believe anyone has suggested that you do
that.

Andy, that's exactly what Micha is suggesting. Remove the extension
from the file and have Apache parse all files the same way.

In fact, his first suggestion was to tell Apache to parse all .html
files with PHP, whether they included PHP code or not.

That was as a solution for the OP that already had URLs with .html suffixes.

Then I brought
up other conditions - like the site I mentioned which currently uses
asp, php and perl - and will soon probably be using python.

And I'm supposed to parse every page for each of these?

 The discussion then went on to extension-less URIs, and implicitly into the
Apache MultiViews option. This option does not run all files through every
possible processing option.

The files on the filesystem still have extensions to indicate what processing
the webserver should do with them (such as .php for PHP processing, or .gif for
none, etc.).

But the URI space doesn't have to map directly onto the filesystem space. You
can omit the extension when Apache is configured appropriately. For example:

* You access: http://example.com/something
* The document root directory may be /var/www/
* There may be a file named /var/www/something.php
* There would typically be no other files with the base filename
'/var/www/something'.
* Apache's MultiViews selects something.php as the suitable file to serve for
the URI, and parses it with PHP due to a previous AddType declaration for the
.php filesystem extension.
* End of story - it's not parsed for SSI or Python or Perl or whatever.

This is the simplest case of content negotiation.
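
 The steps above can be sketched in a few lines of Python. This is only a model of the idea, not Apache's code: the HANDLERS table is a hypothetical stand-in for the server's AddType declarations, and resolve is an illustrative name. It shows that the extension-less URI is matched to exactly one file, and only that file's handler is chosen.

```python
import os

# Hypothetical handler table, standing in for the server's
# AddType/AddHandler declarations. Extensions not listed are static.
HANDLERS = {".php": "php", ".pl": "perl", ".py": "python"}


def resolve(uri_path, files):
    """Pick the one file that serves an extension-less URI path.

    `files` is the list of filenames in the document root.
    Returns (filename, handler), or None when there is no match or
    when several variants exist (that needs real negotiation).
    The handler is None for files served as-is.
    """
    base = uri_path.lstrip("/")
    matches = [f for f in files if os.path.splitext(f)[0] == base]
    if len(matches) != 1:
        return None
    filename = matches[0]
    ext = os.path.splitext(filename)[1]
    return filename, HANDLERS.get(ext)


docroot = ["something.php", "logo.gif", "index.html"]
print(resolve("/something", docroot))  # ('something.php', 'php')
print(resolve("/logo", docroot))       # ('logo.gif', None)
```

 Nothing in this lookup runs the file through PHP, Perl and Python in turn; the extension on disk still decides the single handler.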

If there were /var/www/something.pl, /var/www/something.py,
/var/www/something.html, /var/www/something.shtml, /var/www/something.png,
/var/www/something.jpeg and /var/www/something.cgi all together - and even
/var/www/something.en.html and /var/www/something.fr.html etc. then things get
more complicated, but it still would only run one of the processing options
anyway depending on which was selected as most suitable to serve the URI
http://example.com/something.
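
 As a rough illustration of that more complicated case, the Python sketch below picks one variant from several using the client's Accept header. It is heavily simplified compared with what Apache really does (no language or encoding dimensions, no server-side quality values, no "image/*" style wildcards), and the MEDIA_TYPES table and function names are assumptions for the example.

```python
# Hypothetical mapping from filesystem extension to media type.
MEDIA_TYPES = {".html": "text/html", ".png": "image/png", ".jpeg": "image/jpeg"}


def parse_accept(header):
    """Parse an Accept header into {media_type: q}, ignoring other parameters."""
    prefs = {}
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        q = 1.0  # a media range without an explicit q counts as 1.0
        for param in parts[1:]:
            if param.startswith("q="):
                q = float(param[2:])
        prefs[parts[0]] = q
    return prefs


def choose_variant(variants, accept_header):
    """Return the single variant the client prefers most, or None."""
    prefs = parse_accept(accept_header)
    best, best_q = None, 0.0
    for name in variants:
        ext = name[name.rfind("."):]
        media = MEDIA_TYPES.get(ext)
        # Fall back to the */* entry if the exact type is not listed.
        q = prefs.get(media, prefs.get("*/*", 0.0))
        if q > best_q:
            best, best_q = name, q
    return best


variants = ["something.png", "something.jpeg"]
print(choose_variant(variants, "image/jpeg, image/png;q=0.5"))  # something.jpeg
print(choose_variant(variants, "image/png"))                    # something.png
```

 However many variants sit on disk, the function returns one of them, so only one processing option is ever applied to a given request.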

You should read: http://httpd.apache.org/docs/2.2/con...gotiation.html

--
Andy Hassall :: an**@andyh.co.uk :: http://www.andyh.co.uk
http://www.andyhsoftware.co.uk/space :: disk and FTP usage analysis tool

Nov 11 '06 #22
