Wanted: PHP Site Map Generator!

Hi,

Having spent my free time over the last few months converting several
hundred pages of mainly static (s)html into eight pages of data driven php
loveliness and a whopping MySQL database I'm faced with a bit of a dilemma.

Will search engines be able to crawl all the index.php?query paths so
people can find stuff? I'm currently at the top of the heap on Google as my
site has been around for years and is pretty niche and I don't want to lose
that advantage.

Which leads to the main part of the question: is there a 'site map'
generator which will create a single crawlable HTML page listing all my
possible pages, which I can then link to from my home page so that the
search spiders can do their thing?
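
For what it's worth, a page like that can be generated from the same data that drives the dynamic pages. Below is a minimal sketch, written in Python only to show the logic (the site in question is PHP). The function name `build_site_map`, the sample labels and the `prodnum` values are all hypothetical; in practice the list of pages would come from a query against the site's own database.

```python
from html import escape


def build_site_map(pages, title="Site Map"):
    """Return one static HTML page linking to every (label, url) pair."""
    items = "\n".join(
        f'  <li><a href="{escape(url, quote=True)}">{escape(label)}</a></li>'
        for label, url in pages
    )
    return (
        "<html>\n"
        f"<head><title>{escape(title)}</title></head>\n"
        "<body>\n<ul>\n"
        f"{items}\n"
        "</ul>\n</body>\n</html>\n"
    )


# Hypothetical rows, standing in for a query against the site's database.
pages = [
    ("Apples", "/products/index.php?prodnum=1"),
    ("Cheese", "/products/index.php?prodnum=2"),
]
site_map_html = build_site_map(pages)
print(site_map_html)
```

Writing the result out as a plain file and linking it from the home page gives crawlers one static page from which every dynamic URL is reachable.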

Or am I left with the unenviable task of having to put response/redirects
on all the old pages pointing to the equivalent dynamic pages?

Many thanks,

Mik Foggin
--
Remove capitals to reply :)
Jul 17 '05 #1
5 Replies


Mik Foggin wrote:
> Having spent my free time over the last few months converting several
> hundred pages of mainly static (s)html into eight pages of data driven php
> loveliness and a whopping MySQL database I'm faced with a bit of a dilemma.
>
> Will search engines be able to crawl all the index.php?query paths so
> people can find stuff? I'm currently at the top of the heap on Google as my
> site has been around for years and is pretty niche and I don't want to lose
> that advantage.

Well, you don't actually have to change the URIs of the pages on your
site. In fact, doing so may cause you a bigger problem, since all
your inbound links and such will no longer be valid.

You can use apache's mod_rewrite (or isapi_rewrite for IIS) to make your
changes transparent to the client. For instance, say you have products
that were once accessed via the following:

/products/prodnum1.shtml
/products/prodnum2.shtml
/products/prodnum3.shtml

The way it sounds like you have it now is more like:

/products.php?prodnum=prodnum1
/products.php?prodnum=prodnum2
/products.php?prodnum=prodnum3

If you use mod_rewrite, you can take care of that by putting the
following in a .htaccess file:

RewriteEngine On
RewriteBase /
RewriteRule ^products/([a-zA-Z0-9_-]+)\.shtml$ prod.php?prodnum=$1 [L]

The URI /products/number.shtml will then translate to the page
prod.php?prodnum=number
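
The pattern can be tried out away from the server before it goes live. The following is a minimal sketch using Python's `re` module, on the assumption that it treats this particular pattern the same way as Apache's regex engine does. The helper name `rewrite` is hypothetical. Two details are worth checking: in a .htaccess file the path is matched without its leading slash, and the `+` has to sit inside the parentheses for `$1` to carry the whole product name.

```python
import re

# Same pattern as the RewriteRule, with the "+" inside the capture group.
PATTERN = re.compile(r"^products/([a-zA-Z0-9_-]+)\.shtml$")


def rewrite(path):
    """Return the internal target for an old path, or None if no match."""
    match = PATTERN.match(path)
    if match is None:
        return None
    return "prod.php?prodnum=" + match.group(1)


print(rewrite("products/prodnum1.shtml"))  # prod.php?prodnum=prodnum1
print(rewrite("products/index.html"))      # None

# With "+" outside the parentheses the group keeps only its last repetition.
last_only = re.match(
    r"^products/([a-zA-Z0-9_-])+\.shtml$", "products/prodnum1.shtml"
)
print(last_only.group(1))  # 1
```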

*NOTE:* the target php file name should not be the same as part of the
URI you want to translate. Depending on the order that the modules are
loaded, this may not give you the results you expect.

Added bonus: no need to have webmasters of other sites fix links. No
need to worry about search engines finding the new content. No need to
worry about losing points with Google's PR.

Just remember, sites that have URIs that don't change over time will
have a better chance at keeping higher rankings in the search engines.
> Which leads to the main part of the question; is there a 'site map'
> generator which will create a single crawlable HTML page with all my
> possible pages on it which I can then link to from my home page which will
> enable the search spiders to do their thing?

http://home.snafu.de/tilman/xenulink.html is a good piece of software...
very handy!

> Or am I left with the unenviable task of having to put response/redirects
> on all the old pages pointing to the equivalent dynamic pages?


You should do this anyway, but use mod_rewrite or equivalent if you had
a strong structure to begin with. Creating new files to handle each one
of these is a pain in the arse.

There also used to be a place on the php website where you could
download their URL search code. The concept was simple. Say you enter
a URI that doesn't exist on the site:

http://www.php.net/preg_replace

you are automatically redirected to:

http://www.php.net/manual/en/function.preg-replace.php

I'm sure someone here still has the code or the url to get it from... ;)

--
Justin Koivisto - sp**@koivi.com
PHP POSTERS: Please use comp.lang.php for PHP related questions,
alt.php* groups are not recommended.

Jul 17 '05 #2

> Mik Foggin wrote:
> > Having spent my free time over the last few months converting several
> > hundred pages of mainly static (s)html into eight pages of data driven php
> > loveliness and a whopping MySQL database I'm faced with a bit of a dilemma.
> >
> > Will search engines be able to crawl all the index.php?query paths so
> > people can find stuff? I'm currently at the top of the heap on Google as my
> > site has been around for years and is pretty niche and I don't want to lose
> > that advantage.
>
> Well, you don't actually have to change the URIs of the pages on your
> site. In fact, doing so may get you into a bit larger problem since all
> your inbound links and such will no longer be valid.
>
> You can use apache's mod_rewrite (or isapi_rewrite for IIS) to make your
> changes transparent to the client. For instance, say you have products
> that were once accessed via the following:
>
> /products/prodnum1.shtml
> /products/prodnum2.shtml
> /products/prodnum3.shtml
>
> The way it sounds like you have it now is more like:
>
> /products.php?prodnum=prodnum1
> /products.php?prodnum=prodnum2
> /products.php?prodnum=prodnum3
>
> If you used mod_rewrite, you can take care of that by putting the
> following in a .htaccess file:
>
> RewriteEngine On
> RewriteBase /
> RewriteRule ^products/([a-zA-Z0-9_-]+)\.shtml$ prod.php?prodnum=$1 [L]
>
> The uri /products/number.shtml will then translate to the page
> prod.php?prodnum=number

[snip]

This is exactly what I did (or rather, had to do) when converting our
well-indexed site from static to dynamic. Can't say it was the most fun I
ever had, but it was worth it.

http://www.devarticles.com/art/1/506 gives a nice intro to mod_rewrite if
you're on Apache. Also note if you find the page is coming up fine but your
images and links are broken, the <base href> tag is your friend.
Jul 17 '05 #3

Justin Koivisto <sp**@koivi.com> wrote in
news:Kg*****************@news7.onvoy.net:

> You can use apache's mod_rewrite (or isapi_rewrite for IIS) to make
> your changes transparent to the client. For instance, say you have
> products that were once accessed via the following:
>
> /products/prodnum1.shtml
> /products/prodnum2.shtml
> /products/prodnum3.shtml
>
> The way it sounds like you have it now is more like:
>
> /products.php?prodnum=prodnum1
> /products.php?prodnum=prodnum2
> /products.php?prodnum=prodnum3
>
> If you used mod_rewrite, you can take care of that by putting the
> following in a .htaccess file:
>
> RewriteEngine On
> RewriteBase /
> RewriteRule ^products/([a-zA-Z0-9_-]+)\.shtml$ prod.php?prodnum=$1 [L]
>
> The uri /products/number.shtml will then translate to the page
> prod.php?prodnum=number


Thanks for the response. Unfortunately the original structure wasn't as
well defined as your suggestion may require; I have 7 sub directories but
there is no simple structure to the file names, so rather than as in your
example above it's more like:

/products/apples.shtml
/products/cheese.shtml
/products/cornedbeef.shtml

need mapping to /products/index.php?prodnum=x

so I expect I can't write a simple rule but will have to set a rewrite
for each file. Laborious but undoubtedly worth it!
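
Writing one rule per file need not be done by hand if the old-name-to-new-id mapping exists somewhere as data. Here is a minimal sketch in Python that prints one `RewriteRule` per entry; the function name `rewrite_rules` and the `prodnum` values 1 to 3 are hypothetical placeholders, since the real ids are not given in the thread.

```python
def rewrite_rules(mapping, target="/products/index.php?prodnum={}"):
    """Build one RewriteRule line per old path in the mapping."""
    rules = ["RewriteEngine On"]
    for old_path, prodnum in mapping.items():
        # Escape the dot so it matches a literal "." rather than any character.
        pattern = old_path.replace(".", r"\.")
        rules.append(f"RewriteRule ^{pattern}$ {target.format(prodnum)} [L]")
    return rules


# Hypothetical mapping from old file names to new product ids.
mapping = {
    "products/apples.shtml": 1,
    "products/cheese.shtml": 2,
    "products/cornedbeef.shtml": 3,
}
rules = rewrite_rules(mapping)
print("\n".join(rules))
```

The output can be pasted into a .htaccess file, assuming the host allows mod_rewrite there.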

I'm checking with my web hosts to see whether or not I can actually use
the mod_rewrite function at all so we'll see how we go!

Mik.

--
Remove capitals to reply :)
Jul 17 '05 #4

sk
Sounds like it might be too late now, but if you'd planned ahead you
could have imported into your database both the page content itself and
its old URI (that is, the directory name and filename) as a separate
field. Then you could do a one-time query from your database to
generate a list of static redirects that you could paste into your
httpd.conf, e.g.:

/food/cabbage.html /article.php?id=3
/drink/hot/tea.htm /article.php?id=4
....

That's what I did years ago in the days before fancy geegaws like
mod_rewrite.
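
The one-time query described above could look something like the sketch below. It is written in Python with an in-memory SQLite database standing in for MySQL, and it assumes the redirects are mod_alias `Redirect` lines; the table name `articles`, the column `old_uri`, the function name `redirect_lines` and the host `www.example.com` are all hypothetical.

```python
import sqlite3


def redirect_lines(conn, site="http://www.example.com"):
    """Build one mod_alias Redirect line per stored page."""
    rows = conn.execute("SELECT old_uri, id FROM articles ORDER BY id")
    return [
        f"Redirect permanent {old_uri} {site}/article.php?id={page_id}"
        for old_uri, page_id in rows
    ]


# In-memory SQLite stands in for the real MySQL database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, old_uri TEXT)")
conn.executemany(
    "INSERT INTO articles (id, old_uri) VALUES (?, ?)",
    [(3, "/food/cabbage.html"), (4, "/drink/hot/tea.htm")],
)
lines = redirect_lines(conn)
print("\n".join(lines))
```

`Redirect permanent` sends a 301, which tells browsers and crawlers that the old address has moved for good.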

--
Steve Koppelman

Jul 17 '05 #5

sk <st***********@hatless-dot-com-without-the-spam.com> wrote in
news:Kgsdb.606368$o%2.285348@sccrnsc02:

> Sounds like it might be too late now, but if you'd planned ahead you
> could have imported into your database both the page content itself
> and its old URI (that is, the directory name and filename) as a
> separate field. Then you could do a one-time query from your database
> to generate a list of static redirects that you could paste into your
> httpd.conf, e.g.:
>
> /food/cabbage.html /article.php?id=3
> /drink/hot/tea.htm /article.php?id=4
> ...
>
> That's what I did years ago in the days before fancy geegaws like
> mod_rewrite.
>
> --
> Steve Koppelman


Seems like mod_rewrite is a no-go so the httpd.conf could be a real life
saver,

Many thanks,

Mik
Jul 17 '05 #6
