/robots.txt at end of URL?

I'm noticing that web requests are coming in with /robots.txt appended at the
end:

http://www.domain.com/ProductDetails...527/robots.txt

I can correct these one by one for each page, but I'd like to find a way to
have ASP.NET 2.0 strip this invalid /robots.txt off the end of any URL for me.

Is this possible?
--
Thanks in advance, Les Caudle
Jul 20 '07 #1
Let me ask: why do you want to strip it?
robots.txt is requested by automated robots that collect information, like
Google or Yahoo.

But in your case it's apparently some lame robot that cannot parse the URL
correctly; it simply adds /robots.txt to the end of the URL and requests that
from the server.
So why would you worry about what kind of garbage it gets in return from your
page?
All the important robots you should be worrying about, like Google or Yahoo,
parse the URL correctly.

So I am saying just ignore it. It's not human.

George.

Jul 20 '07 #2
Well, it creates an event in my event log that would distract me from real
events caused by users who had problems.

Would be nice to be able to globally deal with it.

Regards, Les Caudle
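
For readers who do want to handle this globally, the core logic is small. The sketch below is framework-neutral Python, not ASP.NET code: it removes a trailing /robots.txt from any URL except the legitimate site-root /robots.txt. The function name and the example.com URLs are hypothetical. In ASP.NET 2.0 the same check would presumably live in an Application_BeginRequest handler in Global.asax and call HttpContext.RewritePath, but that wiring is an assumption and is not shown or tested here.

```python
from urllib.parse import urlsplit, urlunsplit

# Suffix that the misbehaving robot appends to page URLs.
SUFFIX = "/robots.txt"


def strip_robots_suffix(url: str) -> str:
    """Return url without a trailing /robots.txt, unless it is the site-root file.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(url)
    path = parts.path
    # Only strip when something precedes the suffix, so that the real
    # http://host/robots.txt request is left untouched.
    if path.lower().endswith(SUFFIX) and len(path) > len(SUFFIX):
        path = path[: -len(SUFFIX)]
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


if __name__ == "__main__":
    print(strip_robots_suffix("http://www.example.com/ProductDetails/527/robots.txt"))
    print(strip_robots_suffix("http://www.example.com/robots.txt"))
```

Whether to rewrite the request or simply let it fail is a judgment call; rewriting keeps the event log quiet but also serves a product page to a client that asked for a robots file.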

Jul 20 '07 #3
Well, the problem is that this time it's robots.txt. Next time (with another
bad robot) it will be something else.
You cannot fix it for every bad robot that's out there.
Override Application_OnError and send an email to yourself every time
something bad happens. Then you can use your email client's rules to filter
out the most annoying ones.

George.
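
The filtering idea above can be sketched as follows. This is a minimal, framework-neutral Python illustration, not ASP.NET code: it tags the subject line of each error notification so that an email rule can sort robot noise away from real user errors. The function names, the tag strings, and the list of suffixes are all hypothetical, and the actual sending of mail from the error handler is left out.

```python
# Suffixes that known-bad robots append to page URLs. Extend the tuple as
# new patterns show up in the error mail (hypothetical starting list).
NOISE_SUFFIXES = ("/robots.txt",)


def is_robot_noise(path: str) -> bool:
    """True when path ends with a known noise suffix but is not that file itself."""
    lowered = path.lower()
    return any(
        lowered.endswith(suffix) and len(lowered) > len(suffix)
        for suffix in NOISE_SUFFIXES
    )


def error_subject(path: str, error: str) -> str:
    """Build a mail subject whose leading tag an email rule can match on."""
    tag = "[bot-noise]" if is_robot_noise(path) else "[site-error]"
    return f"{tag} {error} at {path}"


if __name__ == "__main__":
    print(error_subject("/ProductDetails/527/robots.txt", "HTTP 404"))
    print(error_subject("/Checkout.aspx", "NullReferenceException"))
```

Tagging rather than dropping the message keeps every error visible somewhere, which matches the advice to filter on the mail side instead of trying to fix each bad robot in code.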
Jul 20 '07 #4
Hi Les,

I agree with George: as long as your web site is publicly accessible, you
cannot guarantee that every URL request is valid or arrives in the expected
form.
Regards,
Walter Wang (wa****@online.microsoft.com, remove 'online.')
Microsoft Online Community Support

This posting is provided "AS IS" with no warranties, and confers no rights.

Jul 22 '07 #5
