I've created a PHP page which is optimized for search engine indexing: no
images, tables or CSS, just plain HTML with relevant meta tags etc.
The page contains a list of records pulled from a database, and for each record
there is a link to the detail view for that record in this form: <a
href="<?=$_SERVER['PHP_SELF'] ?>?rec=<?=$recordId ?>">
I've made sure the urlstring doesn't contain a variable called 'id' or
something, since I heard some bots will assume that is a session id and not
follow the link.
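A minimal sketch of how such a link might be generated with the record id URL-encoded and the result HTML-escaped. The helper name `detailLink`, the script path `/list.php` and the sample records are hypothetical; only the `rec` parameter comes from the link above.

```php
<?php
// Hypothetical helper: builds the detail-view link described above.
// $script stands in for $_SERVER['PHP_SELF']; 'rec' is the parameter from the post.
function detailLink(string $script, $recordId, string $label): string
{
    // http_build_query URL-encodes the value; htmlspecialchars then escapes
    // the URL and the label so they are safe inside HTML.
    $url = $script . '?' . http_build_query(['rec' => $recordId]);
    return '<a href="' . htmlspecialchars($url, ENT_QUOTES, 'UTF-8') . '">'
         . htmlspecialchars($label, ENT_QUOTES, 'UTF-8') . '</a>';
}

// Sample data, made up for illustration.
foreach ([12 => 'First record', 13 => 'Second record'] as $id => $title) {
    echo detailLink('/list.php', $id, $title), "\n";
}
```

Escaping matters here because `$_SERVER['PHP_SELF']` can contain user-supplied path data, so printing it raw into an attribute is best avoided.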
From my server stats I can see that the "master view" page is being indexed
fine by Google, but the bot apparently doesn't follow the links to the
detail pages.
Does anyone have any idea why?
..soma
"somaboy mx" <no****@fakemail.fk> wrote in message news:
42**********@x-privat.org...
> From my server stats I can see that the "master view" page is being
> indexed fine by Google, but the bot apparently doesn't follow the links to
> the detail pages.
> Does anyone have any idea why?
Be patient...
*** ECRIA Public Mail Buffer wrote (Thu, 30 Jun 2005 11:54:05 -0400):
> Bots don't like dynamic URLs.
Since nowadays most interesting info comes from dynamic URLs, that doesn't
say much about bots.
--
-- Álvaro G. Vicario - Burgos, Spain
-- http://bits.demogracia.com - My site about web programming
-- Don't e-mail me your questions, post them to the group
--
Alvaro G Vicario wrote:
> *** ECRIA Public Mail Buffer wrote (Thu, 30 Jun 2005 11:54:05 -0400):
>> Bots don't like dynamic URLs.
> Since nowadays most interesting info comes from dynamic URLs, that doesn't
> say much about bots.
What would you have the bots do, fill out forms, click buttons? Even if
they could, would you want a bunch of webcrawler bots firing your CGIs and
PHP stuff all the time?
Brian
ECRIA Public Mail Buffer wrote:
> Bots don't like dynamic URLs.
They don't seem to mind mine.
de***********@yahoo.com wrote:
> Alvaro G Vicario wrote:
>> *** ECRIA Public Mail Buffer wrote (Thu, 30 Jun 2005 11:54:05 -0400):
>>> Bots don't like dynamic URLs.
>> Since nowadays most interesting info comes from dynamic URLs, that doesn't
>> say much about bots.
> What would you have the bots do, fill out forms, click buttons? Even if
> they could, would you want a bunch of webcrawler bots firing your CGIs and
> PHP stuff all the time?
I would... And I wouldn't mind them firing my CGIs and PHP stuff a
few times a day. Wouldn't really hurt anything.
And they don't have to fill out forms most of the time, just follow the
given links. As long as there are links from the home page.
--
Contact Us Script: http://www.douglassdavis.com
"ECRIA Public Mail Buffer" <ng***********@ecria.com> wrote in message
news: da**********@murdoch.acc.Virginia.EDU...
> Bots don't like dynamic URLs.
Completely wrong
"ECRIA Public Mail Buffer" <ng***********@ecria.com> wrote in message
news:da**********@murdoch.acc.Virginia.EDU...
> Bots don't like dynamic URLs.
I'd be surprised if that were the case. I've had pages with a couple of
variables in the URL string which got excellent indexing.
That said, I believe there might be some benefit in "clean" URLs, if only
for human readability.
..soma
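For illustration, the "clean" URLs mentioned above could be handled on the PHP side roughly as follows. This is only a sketch: the `/records/` prefix and the function name `recordIdFromPath` are assumptions, and it presumes the web server already routes such paths to the script (for example via a rewrite rule, not shown).

```php
<?php
// Hypothetical mapping: /records/42 is treated like ?rec=42.
// The "/records/" prefix is an assumption for illustration, not from the thread.
function recordIdFromPath(string $path): ?int
{
    // Match the whole path: prefix, digits, optional trailing slash.
    if (preg_match('#^/records/(\d+)/?$#', $path, $m) === 1) {
        return (int) $m[1];
    }
    return null; // Not a record URL.
}

var_dump(recordIdFromPath('/records/42')); // int(42)
var_dump(recordIdFromPath('/about'));      // NULL
```

In a real page the path would typically come from `parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH)`.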
Google for one generally only indexes dynamic URLs for news sites and
forums. The reason is simple: Dynamic content changes unpredictably and
cannot be indexed reliably without additional credible information about the
content.
This is a discussion-based newsgroup, so disagree if you must. Feel free to
give us an example of your dynamic non-forum/news/blog URL that appears in
web search results. Doing so is a much better way to make your point than
anonymously posting "You don't know what you're talking about".
We have been in the Design/SEO business for years, and we know exactly what
we are talking about.
ECRIA http://www.ecria.com
ECRIA Public Mail Buffer <ng***********@ecria.com> wrote:
> Google for one generally only indexes dynamic URLs for news sites and
> forums. The reason is simple: Dynamic content changes unpredictably and
> cannot be indexed reliably without additional credible information about
> the content.
> We have been in the Design/SEO business for years, and we know exactly what
> we are talking about.
Then you should know that there is no way to determine if a URL is
dynamically generated or not other than parsing its contents. And even
that doesn't mean anything (an unchanged page can still be dynamically
generated and a changed page could be manually updated).
If a spider/indexer decides to mark something as dynamic based on the
form of a URL the developers should be fired at once.
"Then you should know that there is no way to determine if a URL is
dynamically generated or not other than parsing its contents. And even that
doesn't mean anything (an unchanged page can still be dynamically generated
and a changed page could be manually updated)."
Agreed. Furthermore, a static page URL may actually be a dynamically
generated page. For example, there is not a single HTML file on http://www.ecria.com - but HTML is all anyone will see.
However, the point is that if there are variables in a URL, robots assume
that the page is generated dynamically - which is a fair assumption. We're
not condoning this - it's just what happens.
There are ways to get around it, but they don't change the fact that this is
the way robots work.
Other than blog/forum/news pages, how many web sites containing URL
variables show up in, say, a Google search?
See what I mean?
ECRIA http://www.ecria.com
ECRIA Public Mail Buffer <ng***********@ecria.com> wrote:
> However, the point is that if there are variables in a URL, robots assume
> that the page is generated dynamically - which is a fair assumption.
Bad assumption, they could be used clientside.
> There are ways to get around it, but they don't change the fact that this
> is the way robots work.
That's why I say they should be fired. They are too lazy to fix bad
assumptions and even go out of their way to insert this kind of silly
behavior.
> Other than blog/forum/news pages, how many web sites containing URL
> variables show up in, say, a Google search?
> See what I mean?
No, all pages of e.g. http://www.amsterdamchinafestival.nl/ appear to be
listed in Google. All dynamically generated by hideous URLs without
trying to hide that they're dynamic, and it's no blog, forum or news page.
Even if it was, how would a spider know it's one of the "special" sites?
There's a difference between being LISTED and being RANKED -
http://www.amsterdamchinafestival.nl/ does not show up in Google results for
"Amsterdam china festival" - its own title!
How does Google know it's a news/blog site? Beats me - ask them. They know.
Probably something to do with the Google Groups technology.
I think that will have to be my final word... this discussion is not going
anywhere.
ECRIA http://www.ecria.com
ECRIA Public Mail Buffer <ng***********@ecria.com> wrote:
> There's a difference between being LISTED and being RANKED -
Goal post shifting detected! From indexing to ranking.
> http://www.amsterdamchinafestival.nl/ does not show up in Google results
> for "Amsterdam china festival" - its own title!
In the results I get, it's at a pathetic 10th place (most probably due to
only having 2 incoming links).
> How does Google know it's a news/blog site? Beats me - ask them. They know.
> Probably something to do with the Google Groups technology.
> I think that will have to be my final word... this discussion is not going
> anywhere.
I have given you the requested example of the "dynamic
non-forum/news/blog URL that appears in web search results". Now please
enlighten us with knowledge of "the Design/SEO business for years"...
ECRIA Public Mail Buffer wrote:
> Google for one generally only indexes dynamic URLs for news sites and forums.
And how does Google tell if a URL is a news site or a forum?
> The reason is simple: Dynamic content changes unpredictably and cannot be
> indexed reliably without additional credible information about the content.
Don't news sites & forums change unpredictably?
I've had more problems with Google being out-of-date on forums and news
sites than on other dynamic sites.
> This is a discussion-based newsgroup, so disagree if you must. Feel free to
> give us an example of your dynamic non-forum/news/blog URL that appears in
> web search results.
http://www.mrbreakfast.com/superdisp...?recipeid=1325
cinnamon roll pull apart, #22
http://www.911cheferic.com/main/drec...fle&recipe=532
grand marnier souffle, #10
http://www.mrbreakfast.com/superdisp...p?recipeid=267
grand marnier souffle, #2
I'd say #10 and #2 is pretty good indexing, wouldn't you?
> Doing so is a much better way to make your point than anonymously posting
> "You don't know what you're talking about".
True. It's generally better to prove your point, rather than making
unfounded assertions, right?
> We have been in the Design/SEO business for years, and we know exactly what
> we are talking about.
:)
--
Tony Garcia
Web Right! Development
Daniel Tryba wrote:
> ECRIA Public Mail Buffer <ng***********@ecria.com> wrote:
>> There's a difference between being LISTED and being RANKED -
> Goal post shifting detected! From indexing to ranking.
OK, so we're saying that Google doesn't RANK dynamic URLs, now.
Interesting, since http://www.mrbreakfast.com/superdisp...p?recipeid=267
shows up as #2 under a search for "grand marnier souffle"
>> How does Google know it's a news/blog site? Beats me - ask them. They know.
>> Probably something to do with the Google Groups technology.
>> I think that will have to be my final word... this discussion is not going
>> anywhere.
> I have given you the requested example of the "dynamic non-forum/news/blog
> URL that appears in web search results". Now please enlighten us with
> knowledge of "the Design/SEO business for years"...
As I said before:
:)
--
Tony Garcia
Web Right! Development
http://www.google.com/webmasters/2.html
FAQ: "My webpages have never been included in the Google index."
Google: "Your pages are dynamically generated. We're able to index
dynamically generated pages. However, because our web crawler could
overwhelm and crash sites that serve dynamic content, we limit the number of
dynamic pages we index. In addition, our crawlers may suspect that a URL
with many dynamic parameters might be the same page as another URL with
different parameters. For that reason, we recommend using fewer parameters
if possible. Typically, URLs with 1-2 parameters are more easily crawlable
than those with many parameters."
ECRIA: "Bots don't like dynamic URLs."
I think you're right - dynamic sites are indexed, but ecria does have a
point (ducking and covering)...
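As a rough illustration of the "fewer parameters" advice in the Google FAQ quoted above, one way to keep links down to the parameters that actually select content is to build them from a whitelist. The helper name `canonicalQuery` and the `sort` parameter are hypothetical; `rec` is the parameter from the original post and `PHPSESSID` is PHP's default session name.

```php
<?php
// Hypothetical helper: keep only parameters that select content, in a fixed
// order, so that one page is reachable under one URL.
function canonicalQuery(array $params, array $allowed): string
{
    $kept = [];
    foreach ($allowed as $name) { // iterate in whitelist order
        if (isset($params[$name]) && $params[$name] !== '') {
            $kept[$name] = $params[$name];
        }
    }
    // Pass the separator explicitly so the result does not depend on php.ini.
    return http_build_query($kept, '', '&');
}

echo canonicalQuery(
    ['sort' => 'asc', 'rec' => '42', 'PHPSESSID' => 'abc123'],
    ['rec']
), "\n"; // rec=42
```

Whether a crawler treats the resulting URLs any differently is exactly what this thread disputes; the sketch only shows how to emit one short URL per page.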
On Fri, 1 Jul 2005 14:40:20 -0400, "ECRIA Public Mail Buffer"
<ng***********@ecria.com> wrote:
> How does Google know it's a news/blog site? Beats me - ask them. They know.
> Probably something to do with the Google Groups technology.
I heard that certain words (like 'blog' or 'forum') when detected are
like death to your site - a bot will either drop the links or the
search engine just won't rank you very highly.
Chris
ECRIA Public Mail Buffer wrote:
> We have been in the Design/SEO business for years,
The SEO business, eh? That's nice to know.
> and we know exactly what we are talking about.
I'm glad somebody does, cos I haven't the foggiest!
--
Jock
Somebody wrote:
> I heard that certain words (like 'blog' or 'forum') when detected are like
> death to your site - a bot will either drop the links or the search engine
> just won't rank you very highly.
Do you believe that?
--
Jock
ECRIA Public Mail Buffer wrote:
> Google for one generally only indexes dynamic URLs for news sites and
> forums. The reason is simple: Dynamic content changes unpredictably and
> cannot be indexed reliably without additional credible information about
> the content.
> This is a discussion-based newsgroup, so disagree if you must. Feel free to
> give us an example of your dynamic non-forum/news/blog URL that appears in
> web search results. Doing so is a much better way to make your point than
> anonymously posting "You don't know what you're talking about".
> We have been in the Design/SEO business for years, and we know exactly what
> we are talking about.
> ECRIA http://www.ecria.com
http://www.google.com/search?hs=ec8&...rg&btnG=Search
For one. Completely dynamically generated. Not a news site or a forum. Many
of my sites have dynamic pages. And all get spidered eventually. Just not
necessarily all at once.
I can come up with plenty more because even though you've "been in the business
for several years" your statement is hogwash. Even Google indicates they spider
dynamic pages: http://www.google.com/intl/en/webmasters/2.html
--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp. js*******@attglobal.net
==================
ch*********@hartslock.org.uk wrote:
> On Fri, 1 Jul 2005 14:40:20 -0400, "ECRIA Public Mail Buffer"
> <ng***********@ecria.com> wrote:
>> How does Google know it's a news/blog site? Beats me - ask them. They know.
>> Probably something to do with the Google Groups technology.
> I heard that certain words (like 'blog' or 'forum') when detected are like
> death to your site - a bot will either drop the links or the search engine
> just won't rank you very highly.
> Chris
Don't believe everything you hear - especially on the internet.
--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp. js*******@attglobal.net
==================
"ECRIA Public Mail Buffer" <ng***********@ecria.com> wrote ...
> This is a discussion-based newsgroup, so disagree if you must. Feel free to
> give us an example of your dynamic non-forum/news/blog URL that appears in
> web search results. Doing so is a much better way to make your point than
> anonymously posting "You don't know what you're talking about".
Here's a list of dynamic URLs from a site I created a while back, all
indexed:
http://www.google.be/search?q=allinu....krikri.be+aID
> We have been in the Design/SEO business for years, and we know exactly what
> we are talking about.
Apparently you need to re-evaluate your assumptions...
..s