How to keep a site out of the search engines?

Hi;

I have a site that I do not want the search engines to pick up
on. It attracts people and problems I do not want.

Is there a tag (or some other means) of preventing this?

Thanks

Steve

May 24 '06 #1
Steve wrote:
> I have a site that I do not want the search engines to pick up
> on

http://www.robotstxt.org/wc/exclusion-admin.html

> .....attracts people and problems I do not want.

If you want a private website, then use some form of password
protection on it.

May 24 '06 #2
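
A minimal sketch of what the robots.txt convention linked above asks of crawlers, using Python's standard `urllib.robotparser`. The file contents, the `may_fetch` helper, the bot name "ExampleBot", and the example.com URL are all assumptions made for illustration; the check only describes what a compliant crawler would do and does not stop anyone from fetching the page.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt asking every crawler to stay out of the whole site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a compliant crawler with this name may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# "ExampleBot" and the URL are made-up values for the demonstration.
blocked = may_fetch(ROBOTS_TXT, "ExampleBot", "http://example.com/private/page.html")
print("allowed" if blocked else "disallowed")
```

The file would have to be served at the site root as /robots.txt for crawlers to find it.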
David Dorward wrote:
> Steve wrote:
>> I have a site that I do not want the search engines to pick up
>> on
>
> http://www.robotstxt.org/wc/exclusion-admin.html

Does this actually work? Or is it like <meta name="robots"
content="noindex,nofollow">, which Google still crawls but doesn't index?
Though I guess that's the same difference.

--
Brian O'Connor (ironcorona)
May 24 '06 #3
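
The difference the question points at can be sketched as follows: a robots meta tag lives inside the page, so a crawler has to fetch the page before it can see the tag, whereas robots.txt asks the crawler not to fetch at all. The sketch below, assuming a made-up page and hypothetical helper names (`RobotsMetaParser`, `robots_directives`), shows how a crawler might read those directives with Python's standard `html.parser`.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta robots>), so default to "".
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for part in attr.get("content", "").split(","):
                if part.strip():
                    self.directives.add(part.strip().lower())


def robots_directives(html: str) -> set:
    """Return the set of robots directives found in the page's meta tags."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Made-up page for the demonstration.
PAGE = '<html><head><meta name="robots" content="noindex,nofollow"></head><body>Hi</body></html>'
directives = robots_directives(PAGE)
print(sorted(directives))
```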
ironcorona wrote:
> David Dorward wrote:
>> Steve wrote:
>>> I have a site that I do not want the search engines to pick up
>>> on
>>
>> http://www.robotstxt.org/wc/exclusion-admin.html
>
> Does this actually work?

Reputable bots obey it.
--
David Dorward <http://blog.dorward.me.uk/> <http://dorward.me.uk/>
Home is where the ~/.bashrc is
May 24 '06 #4
On 24 May 2006 06:52:41 -0700, "Steve" <st**********@yahoo.com>
wrote:
> Hi;
>
> I have a site that I do not want the search engines to pick up
> on.....attracts people and problems I do not want.
>
> Is there a tag ( or some other means ) of preventing this?

Don't put clickable links to it in a publicly accessible web
page.
May 24 '06 #5
> Is there a tag ( or some other means ) of preventing this?

You could use robots.txt. However, a better solution
may be to use .htaccess, perhaps with a user/password system.

Do you have access to the web server? Is it Apache?
What about PHP (or ASP if Windows)?
--
best regards
Thomas Schulz
http://www.micro-sys.dk/products/sitemap-generator/
http://www.micro-sys.dk/products/website-analyzer/
May 26 '06 #6

Si Ballenger wrote:
> Don't put clickable links to it in a publicly accessible web
> page.


Not reliable. Google Toolbar (for just one) is a backchannel that
feeds the URLs of "hidden" web sites back to Google, where they then
get spidered.

You also have no control over other people linking to your site.
If you only want to avoid indexing (maybe your site isn't released
yet), then just use robots.txt.

If you want to keep your content hidden, then disallow anyone and
everyone from accessing it (by web server config, such as .htaccess).
Then specifically _allow_ content to be visible to a small set of
permitted users, such as by password access.

There is no practical way to identify "a spider" as distinct from "a
user". So any attempt to make content generally available and
_disallow_ spiders is always doomed to be unreliable and susceptible to
some level of leakage.

May 26 '06 #7
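
The "deny everyone, then allow a few" approach described above can be sketched in Python as a check on the HTTP `Authorization` header (Basic scheme). This is an illustration under stated assumptions, not the poster's setup: the user table, the names `ALLOWED_USERS`, `is_authorized` and `basic_header`, and the plain-text password are all hypothetical, and a real site would normally let the web server handle this and store password hashes.

```python
import base64
import hmac

# Hypothetical allow-list; real deployments should store password hashes.
ALLOWED_USERS = {"steve": "correct horse"}


def is_authorized(authorization_header, users=ALLOWED_USERS) -> bool:
    """Deny by default; allow only a valid Basic credential for a listed user."""
    if not authorization_header:
        return False  # no credentials at all: refuse
    scheme, _, encoded = authorization_header.partition(" ")
    if scheme.lower() != "basic":
        return False
    try:
        decoded = base64.b64decode(encoded.strip(), validate=True).decode("utf-8")
    except ValueError:  # covers malformed base64 and invalid UTF-8
        return False
    username, sep, password = decoded.partition(":")
    if not sep or username not in users:
        return False
    # Constant-time comparison avoids leaking how much of the password matched.
    return hmac.compare_digest(password.encode("utf-8"), users[username].encode("utf-8"))


def basic_header(username: str, password: str) -> str:
    """Build the header value a browser would send (helper for the demo)."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return "Basic " + token


print(is_authorized(None))
print(is_authorized(basic_header("steve", "correct horse")))
```

Because a spider has no credentials, it is refused the same way as any other anonymous visitor, which is why this does not depend on telling spiders and users apart.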
di*****@codesmiths.com <di*****@codesmiths.com> scripsit:
> There is no practical way to identify "a spider" as distinct from "a
> user".

There is. At least search engine spiders obey the Robots Exclusion Standard
in practice. Occasionally, there might be a misbehaving spider, but such
spiders are rare and they serve odd purposes. You can't beat them, or
separate them from users in any reasonable way, but there's no need to do
that either. They don't make your page findable using commonly used search
engines.

> So any attempt to make content generally available and
> _disallow_ spiders is always doomed to be unreliable and susceptible
> to some level of leakage.

That's correct if you mean complete control. Generally, complete control
does not work on the WWW.

--
Yucca, http://www.cs.tut.fi/~jkorpela/

May 26 '06 #8

Jukka K. Korpela wrote:
> di*****@codesmiths.com <di*****@codesmiths.com> scripsit:
>> There is no practical way to identify "a spider" as distinct from "a
>> user".
>
> There is. At least search engine spiders obey the Robots Exclusion Standard
> in practice.

That is requesting a behaviour from a spider, not identifying it.

> Occasionally, there might be a misbehaving spider,

Such as Google. Google's spiders have Been Evil of late, hammering on
some sites excessively and also ignoring the robot exclusion protocol
in the case of deep URLs obtained through the Google Toolbar.

> Generally, complete control does not work on the WWW.

It does, provided you begin by forbidding _everything_ to _everyone_,
and then relax this rule only for the very small domain where you can
control things (generally a crypto or password based subset of accepted
user agents). The "lack of control on the web" is a result of the web
being broadly accessible to a broad range of agents (and generally a
good thing too).

May 26 '06 #9
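
To illustrate the "requesting, not identifying" point: the only thing a server sees is what the client chooses to send, so a check on the User-Agent header can only tell you what a client claims to be. The sketch below is an assumption-laden example; the marker list and the function name `claims_to_be_bot` are hypothetical, and real crawlers use many other names.

```python
# Hypothetical substrings; real crawlers use many different names.
KNOWN_BOT_MARKERS = ("googlebot", "bingbot", "slurp")


def claims_to_be_bot(user_agent: str) -> bool:
    """True if the User-Agent string *claims* to be a known crawler."""
    ua = user_agent.lower()
    return any(marker in ua for marker in KNOWN_BOT_MARKERS)


# A crawler that announces itself honestly.
honest_bot = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
# Any program can send a browser-like string instead, so the check misses it.
disguised_bot = "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"

print(claims_to_be_bot(honest_bot))
print(claims_to_be_bot(disguised_bot))
```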
In article <11**********************@i40g2000cwc.googlegroups.com>,
"Andy Dingley <di*****@codesmiths.com>" <di*****@codesmiths.com> writes
> Google Toolbar (for just one) is a backchannel that feeds the URLs of
> "hidden" web sites back to Google, where they then get spidered.

Also, Google has taken to looking at newly registered domain names to
see if there is a web site there. This means that even if your site
doesn't have any links to it and you don't use the Google toolbar,
Google could still find it!!

--
Alan Silver
(anything added below this line is nothing to do with me)
May 29 '06 #10
