HttpBrowserCapabilities - recognized Crawlers

Hi,
Would someone know where I could get a list of the supported crawlers for
the HttpBrowserCapabilities?
Is there a way to add new ones/modify the list?

I have a web site on which I want to show different content to search
engine bots. I was planning on relying on HttpBrowserCapabilities.Crawler,
but what if a bot's signature changes, or a new bot appears?

Thanks,
Zolt
Aug 20 '08 #1
Zolt wrote:
> I have a web site on which I want to show different content to
> search engine bots.

Rather than try to get the site blacklisted by search engines, why not just
use a robots.txt file to exclude them?

Andrew
Aug 21 '08 #2
Andrew,

What I want to do is show the search engines different content, not
prevent them from coming to my site.
The problem is that I have pages that contain text in 2 languages, which is
shown depending on the browser's preferred language and/or the selected
language saved in a cookie.
Doing it this way, I don't have to show URLs with ugly query strings like
http://www.mysite.com/default.aspx?lang=en
The problem with search engines is that they only see the default language
and can't switch languages to index the content in the other one.
My goal is to detect whether the requester is a web crawler and, if it is,
show both languages. If not, continue the normal way.
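
A minimal sketch of that rule, not the site's actual code. The module name, the helper LanguagesToRender, and the language pair "en"/"fr" are all assumptions made for illustration (only "en" appears in the URL above). It also assumes the cookie choice takes priority over the browser preference, and that browserLangs has the shape of Request.UserLanguages.

```vbnet
Imports System
Imports System.Collections.Generic

Module LanguageSelection
    ' Assumption: the site offers exactly these two languages; the first is the default.
    Private ReadOnly Supported As String() = {"en", "fr"}

    ' Hypothetical helper: decides which language blocks to render.
    ' cookieLang is the saved choice (may be Nothing); browserLangs mirrors
    ' Request.UserLanguages (may be Nothing), most preferred first.
    Public Function LanguagesToRender(ByVal isCrawler As Boolean, _
                                      ByVal cookieLang As String, _
                                      ByVal browserLangs As String()) As List(Of String)
        Dim result As New List(Of String)

        ' Crawlers get every language so that both versions can be indexed.
        If isCrawler Then
            result.AddRange(Supported)
            Return result
        End If

        ' Assumed priority: an explicit choice saved in the cookie wins.
        If IsSupported(cookieLang) Then
            result.Add(cookieLang.ToLowerInvariant())
            Return result
        End If

        ' Otherwise take the first supported browser language,
        ' reducing entries such as "fr-CA;q=0.8" to "fr".
        If browserLangs IsNot Nothing Then
            For Each entry As String In browserLangs
                If String.IsNullOrEmpty(entry) Then Continue For
                Dim code As String = entry.Split(";"c)(0).Split("-"c)(0).Trim().ToLowerInvariant()
                If IsSupported(code) Then
                    result.Add(code)
                    Return result
                End If
            Next
        End If

        ' Nothing matched: fall back to the default language.
        result.Add(Supported(0))
        Return result
    End Function

    Private Function IsSupported(ByVal code As String) As Boolean
        If String.IsNullOrEmpty(code) Then Return False
        Return Array.IndexOf(Supported, code.ToLowerInvariant()) >= 0
    End Function

    Sub Main()
        ' Crawler: both languages.
        Console.WriteLine(String.Join(",", LanguagesToRender(True, Nothing, Nothing).ToArray()))
        ' Visitor with a cookie: the cookie language only.
        Console.WriteLine(String.Join(",", LanguagesToRender(False, "fr", New String() {"en-US"}).ToArray()))
        ' Visitor without a cookie: first supported browser language.
        Console.WriteLine(String.Join(",", LanguagesToRender(False, Nothing, New String() {"fr-CA;q=0.8", "en"}).ToArray()))
    End Sub
End Module
```

In a page, the isCrawler argument could come from Request.Browser.Crawler or from a user-agent check, and browserLangs from Request.UserLanguages.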

I have found an interesting post, which I believe I will be able to use
(http://forums.asp.net/p/908519/1012090.aspx#1012090).

I should be able to modify it to detect the major search engines, which are
the only ones I am interested in.

Thanks for the suggestion anyway,
Zolt
Aug 21 '08 #3
Zolt wrote:
> Andrew,
>
> What I want to do is show the search engines different content, not
> prevent them from coming to my site.
> The problem is that I have pages that contain text in 2 languages
> which is shown depending on the browser's preferred language and/or
> selected language saved in a cookie.

Ahh - it sounded like you might want to do something referred to as web site
cloaking.

> ...
> My goal is to detect if the requester is a web crawler, and if it is,
> show both languages. If not, continue the normal way.

A regular expression which catches the crawlers which visit our sites is:

Imports System.Text.RegularExpressions

' Case-insensitive match against known crawler user-agent fragments.
Dim re As New Regex( _
    "bot|spider|slurp|crawler|teoma|DMOZ|;1813|findlinks|tellbaby|ia_archiver|nutch|voyager|wwwster|3dir|scooter|appie|exactseek|feedfetcher|freedir|holmes|panscient|yandex|alef|cfnetwork|kalooga", _
    RegexOptions.Compiled Or RegexOptions.IgnoreCase)

applied to the user-agent string, of course. You could use the Sub
Session_Start in Global.asax.vb as the location to check it.
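
As a rough sketch of how that check might be wrapped up (the module name and the helper LooksLikeCrawler are made up for illustration, and the pattern is a shortened version of the one above), assuming a missing user-agent should be treated as an ordinary visitor:

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module CrawlerDetection
    ' Shortened version of the pattern from the post; extend as needed.
    Private ReadOnly CrawlerPattern As New Regex( _
        "bot|spider|slurp|crawler|teoma|ia_archiver|nutch|yandex", _
        RegexOptions.Compiled Or RegexOptions.IgnoreCase)

    ' Hypothetical helper: True when the user-agent string looks like a crawler.
    Public Function LooksLikeCrawler(ByVal userAgent As String) As Boolean
        ' Some clients send no User-Agent header; treat them as ordinary visitors.
        If String.IsNullOrEmpty(userAgent) Then Return False
        Return CrawlerPattern.IsMatch(userAgent)
    End Function

    Sub Main()
        ' Contains "bot", so this prints True.
        Console.WriteLine(LooksLikeCrawler("Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"))
        ' No crawler fragment, so this prints False.
        Console.WriteLine(LooksLikeCrawler("Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.1) Gecko/2008070208 Firefox/3.0.1"))
    End Sub
End Module
```

Inside Session_Start the call would be along the lines of Session("IsCrawler") = LooksLikeCrawler(Request.UserAgent), where "IsCrawler" is just an example key. The result could also be combined with Request.Browser.Crawler.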

Then if you can find a URL in the UA string, you can check its TLD for .com,
.fr, .whatever.
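
One possible way to pull the TLD out, as a sketch only. The module name, the helper TldFromUserAgent, and the URL pattern are assumptions, and the second user-agent in Main is invented for the example. It returns only the last label of the host name, so "co.uk" comes back as "uk", and an IP-address host would give a meaningless result.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module UserAgentTld
    ' Matches the first http(s) URL, stopping at whitespace or at the
    ' ")" and ";" characters that usually delimit user-agent tokens.
    Private ReadOnly UrlPattern As New Regex( _
        "https?://[^\s);]+", _
        RegexOptions.Compiled Or RegexOptions.IgnoreCase)

    ' Hypothetical helper: returns the last label of the host name of the
    ' first URL in the user-agent (e.g. "com"), or "" if no URL is found.
    Public Function TldFromUserAgent(ByVal userAgent As String) As String
        If String.IsNullOrEmpty(userAgent) Then Return ""

        Dim m As Match = UrlPattern.Match(userAgent)
        If Not m.Success Then Return ""

        ' Let System.Uri do the host parsing rather than slicing the string by hand.
        Dim parsed As Uri = Nothing
        If Not Uri.TryCreate(m.Value, UriKind.Absolute, parsed) Then Return ""

        Dim host As String = parsed.Host
        Dim lastDot As Integer = host.LastIndexOf("."c)
        If lastDot < 0 Then Return ""
        Return host.Substring(lastDot + 1).ToLowerInvariant()
    End Function

    Sub Main()
        ' Prints "com".
        Console.WriteLine(TldFromUserAgent("Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"))
        ' Invented user-agent for illustration; prints "fr".
        Console.WriteLine(TldFromUserAgent("ExampleBot/1.0 (+http://www.example.fr/bot)"))
    End Sub
End Module
```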

(You might want to take out the ";1813" - I put that in to filter out the
AVG link checker, which was distorting the real user stats on our sites.)

HTH

Andrew
Aug 21 '08 #4
Thanks a lot Andrew!
Your solution seems to give more choices than the one I found.
I will probably go that route.

Really appreciated,
Zolt
Aug 21 '08 #5
