In our last episode, <g7**********@a ioe.org>, the lovely and talented Nad
broadcast on alt.html:
I have a very large site with valuable information. Is there any way to
prevent downloading a large number of articles? Some people want to
download the entire site.
It depends upon what you mean by 'articles.' If you put HTML documents on a
web server, you are pretty much inviting the public to view or download as
many of them as they want. If it is 'valuable', why are you giving it away?
And if you are giving away valuable stuff, what did you expect? What is your
real concern here?
If you are only worried about server load, why not zip or tar and gzip it up
and put it on an FTP server? This is most practical for related documents,
such as parts of a tutorial or parts of a spec. If you are a philanthropist
who is giving away valuable stuff, you can give it away in big chunks so
the nickel and dime requests don't bug you.
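As a rough sketch of the "big chunks" idea, here is one way to pack a directory of related documents into a single gzip-compressed tar archive with Python's standard tarfile module. The function name and the directory and archive names are made up for illustration; putting the result on an FTP server is left to you.

```python
import tarfile
from pathlib import Path


def bundle_documents(source_dir: str, archive_path: str) -> list:
    """Pack everything under source_dir into one .tar.gz file.

    Returns the member names stored in the archive so the result
    can be checked before it is uploaded anywhere.
    """
    source = Path(source_dir)
    with tarfile.open(archive_path, "w:gz") as archive:
        # arcname keeps the stored paths relative to the directory name
        # instead of recording the full path on the local disk.
        archive.add(source, arcname=source.name)
    with tarfile.open(archive_path, "r:gz") as archive:
        return archive.getnames()
```

Something like bundle_documents("tutorial", "tutorial.tar.gz") would then give visitors one file to fetch instead of one request per page.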
Well-behaved download-the-whole-site spiders will obey robots.txt, but that
is pretty much a courtesy thing, and it won't stop anyone who is manually
downloading a page at a time, and it won't stop rogue or altered spiders.
Likewise, you can block nice spiders that send a true user-agent ID, but
not-so-nice spiders can spoof their ID. That's kind of pointless, because
most of the nice spiders will obey robots.txt anyway.
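To make both points concrete, here is a small sketch using Python's standard urllib.robotparser. The robots.txt rules, the /articles/ path, the spider name, and the blocklist are all hypothetical. The first function is the check a well-behaved spider performs on itself; the second is the kind of user-agent test a server might apply, which only looks at whatever header the client chooses to send.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt asking all spiders to stay out of /articles/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /articles/
"""

# Hypothetical blocklist of spider names, matched case-insensitively.
BLOCKED_AGENTS = {"examplespider"}


def is_allowed(user_agent: str, url: str) -> bool:
    """The check a polite spider makes before fetching a URL."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


def blocked_by_user_agent(user_agent_header: str) -> bool:
    """Server-side test that trusts the User-Agent header as sent."""
    header = user_agent_header.lower()
    return any(name in header for name in BLOCKED_AGENTS)
```

Nothing forces a spider to call is_allowed at all, and a spider that sends a browser-like header passes blocked_by_user_agent untouched, which is why neither measure stops anyone who does not want to be stopped.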
You can make pages available through a PHP or CGI script that keeps track of
the number of documents requested, using hidden form controls. This is easily
defeated by anyone determined to do so and, like a cheap lock, will only keep
the honest people out. Beyond that, you can go to various user account
schemes, up to putting your documents on a secure server.
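A minimal sketch of the counting idea, in Python rather than PHP, might look like the following. It assumes the serving script receives some token from the client (say, the value of a hidden form field or a cookie) and counts documents per token; the class name, the token, and the limit are all invented for illustration.

```python
from collections import defaultdict


class DownloadCounter:
    """Count documents served per client token and refuse past a limit."""

    def __init__(self, limit: int):
        self.limit = limit
        self.counts = defaultdict(int)

    def allow(self, client_token: str) -> bool:
        """Return True and record the request if the token is under the limit."""
        if self.counts[client_token] >= self.limit:
            return False
        self.counts[client_token] += 1
        return True


# Usage: the third request with the same token is refused, but a client
# that simply sends a fresh token starts again from zero.
counter = DownloadCounter(limit=2)
results = [counter.allow("token-a") for _ in range(3)]
fresh = counter.allow("token-b")
```

The weakness is visible in the last line: the token comes from the client, so discarding it resets the count. That is the cheap-lock problem in a nutshell.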
But I think what you are asking is 'Can I keep my documents public and still
limit public access?' And the answer to that is, of course not, because
there is a fundamental contradiction in what you want.
Any hints or pointers would be appreciated.
--
Lars Eighner <http://larseighner.com/>
us****@larseigh ner.com
War hath no fury like a noncombatant.
- Charles Edward Montague