How to limit the number of web pages downloaded from a site?

Nad
I have a very large site with valuable information.
Is there any way to prevent downloading a large number
of articles? Some people want to download the entire site.

Any hints or pointers would be appreciated.

Aug 8 '08 #1
In article <g7**********@aioe.org>, na*@invalid.com (Nad) wrote:
> I have a very large site with valuable information.
> Is there any way to prevent downloading a large number
> of articles? Some people want to download the entire site.
>
> Any hints or pointers would be appreciated.
Password protect folders or pages, make users register to get the
passwords; that would slow them down a bit. But really, if you make
stuff available publicly...
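To password protect a folder on an Apache server, a .htaccess file along these
lines would do it (only a sketch; the AuthUserFile path is a placeholder for a
.htpasswd file you create yourself):

    AuthType Basic
    AuthName "Registered readers only"
    AuthUserFile /home/example/.htpasswd
    Require valid-user

Visitors would then have to log in before any page in that folder is served.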

--
dorayme
Aug 8 '08 #2
Gazing into my crystal ball I observed na*@invalid.com (Nad) writing in
news:g7**********@aioe.org:
> I have a very large site with valuable information.
> Is there any way to prevent downloading a large number
> of articles? Some people want to download the entire site.
>
> Any hints or pointers would be appreciated.

You could store their IP address in a session, and check to see the length
of time between requests.
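Something along these lines at the top of every article script would do it
(just a sketch, assuming PHP with sessions enabled on the server; the
10-requests-per-minute threshold is made up):

    <?php
    // Throttle a visitor by the time between requests, tracked in the session.
    session_start();

    if (!isset($_SESSION['hits'])) {
        $_SESSION['hits'] = array();
        $_SESSION['ip']   = $_SERVER['REMOTE_ADDR'];  // remember who started the session
    }

    $now = time();

    // Forget requests older than 60 seconds.
    foreach ($_SESSION['hits'] as $k => $t) {
        if ($now - $t > 60) {
            unset($_SESSION['hits'][$k]);
        }
    }
    $_SESSION['hits'][] = $now;

    // More than 10 pages a minute looks like an automated download.
    if (count($_SESSION['hits']) > 10) {
        header('HTTP/1.1 503 Service Unavailable');
        exit('Too many requests. Please slow down.');
    }
    // ...otherwise fall through and serve the article as usual.
    ?>

A downloader can always discard the session cookie, of course, so this only
slows the casual ones down.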

--
Adrienne Boswell at Home
Arbpen Web Site Design Services
http://www.cavalcade-of-coding.info
Please respond to the group so others can share

Aug 9 '08 #3
On 08 Aug 2008, na*@invalid.com (Nad) wrote:
> I have a very large site with valuable information.
> Is there any way to prevent downloading a large number
> of articles? Some people want to download the entire site.
>
> Any hints or pointers would be appreciated.
Change the articles' text to Olde Englishe.

--
Neredbojias
http://www.neredbojias.net/
Public Website
Aug 9 '08 #4
Nad
In article <Xn*****************************@69.16.185.247>, Adrienne Boswell
<ar****@yahoo.com> wrote:
> Gazing into my crystal ball I observed na*@invalid.com (Nad) writing in
> news:g7**********@aioe.org:
>> I have a very large site with valuable information.
>> Is there any way to prevent downloading a large number
>> of articles? Some people want to download the entire site.
>>
>> Any hints or pointers would be appreciated.
>
> You could store their IP address in a session, and check to see the length
> of time between requests.
Well, something along those lines.
The problem is server-side support.
Some servers do not allow CGI, PHP, JavaScript, or even SSI
executable commands, and I'd like it to work on ANY server.
Aug 9 '08 #5
Nad
In article <Xn*****************************@194.177.96.78>, Neredbojias
<Sc********@gmail.com> wrote:
> On 08 Aug 2008, na*@invalid.com (Nad) wrote:
>> I have a very large site with valuable information.
>> Is there any way to prevent downloading a large number
>> of articles? Some people want to download the entire site.
>>
>> Any hints or pointers would be appreciated.
>
> Change the articles' text to Olde Englishe.
:--}

I like that!!!
Aug 9 '08 #6
In our last episode, <g7**********@aioe.org>, the lovely and talented Nad
broadcast on alt.html:
> I have a very large site with valuable information. Is there any way to
> prevent downloading a large number of articles? Some people want to
> download the entire site.
It depends upon what you mean by 'articles.' If you put your HTML documents on a
web server, you are pretty much inviting the public to view/download as much
of it as they want. If it is 'valuable', why are you giving it away? And
if you are giving away valuable stuff, what did you expect? What is your
real concern here?

If you are only worried about server load, why not zip or tar-and-gzip it up
and put it on an FTP server? This is most practical for related documents,
such as parts of a tutorial or parts of a spec. If you are a philanthropist
who is giving away valuable stuff, you can give it away in big chunks so
the nickel-and-dime requests don't bug you.
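For instance (on a Unix-ish host; the file and directory names here are only
placeholders):

    tar czf articles.tar.gz articles/

and then drop the resulting archive in the FTP area.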

Well-behaved download-the-whole-site spiders will obey robots.txt, but that
is pretty much a courtesy thing, and it won't stop anyone who is manually
downloading a page at a time, and it won't stop rogue or altered spiders.
Likewise, you can block nice spiders which send a true user-agent ID, but
not so nice spiders can spoof their ID. That's kind of pointless, because
most of the nice spiders will obey robots.txt anyway.
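For what it is worth, a robots.txt at the site root along these lines asks
polite crawlers to pace themselves (a sketch only; Crawl-delay is non-standard
and only some crawlers honor it):

    User-agent: *
    Crawl-delay: 10

None of this binds a rogue spider, as said.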

You can make pages available through php or cgi which keeps track of the
number of documents with hidden controls. This is easily defeated by
anyone determined to do so, and like a cheap lock, will only keep the honest
people out. Beyond that, you can go to various user account schemes up to
putting your documents on a secure server.
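A sketch of that kind of php front end (the 50-page cap and the articles/
directory are invented for illustration):

    <?php
    // Serve articles through one script and count pages per visitor session.
    session_start();

    if (!isset($_SESSION['pages_served'])) {
        $_SESSION['pages_served'] = 0;
    }
    $_SESSION['pages_served']++;

    if ($_SESSION['pages_served'] > 50) {
        exit('You have reached the article limit for this session.');
    }

    // basename() keeps the request from escaping the articles/ directory.
    $name = isset($_GET['a']) ? basename($_GET['a']) : 'index.html';
    $path = 'articles/' . $name;
    if (is_file($path)) {
        readfile($path);
    } else {
        header('HTTP/1.1 404 Not Found');
        echo 'No such article.';
    }
    ?>

Clearing cookies resets the counter, which is the cheap-lock problem just
mentioned.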

But I think what you are asking is 'Can I keep my documents public and still
limit public access?' And the answer to that is, of course, no, because
there is a fundamental contradiction in what you want.
> Any hints or pointers would be appreciated.
--
Lars Eighner <http://larseighner.com/> us****@larseighner.com
War hath no fury like a noncombatant.
- Charles Edward Montague
Aug 9 '08 #7
On 08 Aug 2008, na*@invalid.com (Nad) wrote:
> In article <Xn*****************************@194.177.96.78>, Neredbojias
> <Sc********@gmail.com> wrote:
>> On 08 Aug 2008, na*@invalid.com (Nad) wrote:
>>> I have a very large site with valuable information.
>>> Is there any way to prevent downloading a large number
>>> of articles? Some people want to download the entire site.
>>>
>>> Any hints or pointers would be appreciated.
>>
>> Change the articles' text to Olde Englishe.
>
> :--}
>
> I like that!!!
<grin>

Seriously, I don't think there's much you can do that is practical. With
server-side support, you could implement some kind of time limit and/or
password, but you indicated you didn't want to rely on that. An off-the-wall
"non-solution" would be to use reasonably long meta page redirects, but the
user could always come back with a new time limit.

--
Neredbojias
http://www.neredbojias.net/
Public Website
Aug 9 '08 #8
Nad
In article <do*************************************@news-vip.optusnet.com.au>,
dorayme <do************@optusnet.com.au> wrote:
> In article <g7**********@aioe.org>, na*@invalid.com (Nad) wrote:
>> I have a very large site with valuable information.
>> Is there any way to prevent downloading a large number
>> of articles? Some people want to download the entire site.
>>
>> Any hints or pointers would be appreciated.
>
> Password protect folders or pages, make users register to get the
> passwords; that would slow them down a bit.
It doesn't work. For example, Teleport Pro (a program for downloading
entire sites) allows you to specify a login/password.
So, once they register, they can enter this info and boom...
> But really, if you make
> stuff available publicly...
Well, the site is 150 megs, over 20k articles.
And there are plenty of people who would LOVE to have
the entire site on their own box.
Then you have a problem. Providers usually charge for
the amount of traffic. In one month, you'd have to shell
out some bucks, just to give the information to the
"gimme free Coke" zombies.
That does not make sense.

Aug 9 '08 #9
Nad
In article <sl*******************@debranded.larseighner.com>, Lars Eighner
<us****@larseighner.com> wrote:
> In our last episode, <g7**********@aioe.org>, the lovely and talented Nad
> broadcast on alt.html:
>> I have a very large site with valuable information. Is there any way to
>> prevent downloading a large number of articles? Some people want to
>> download the entire site.
>
> It depends upon what you mean by 'articles.' If you put your HTML documents on a
> web server, you are pretty much inviting the public to view/download as much
> of it as they want. If it is 'valuable', why are you giving it away? And
> if you are giving away valuable stuff, what did you expect? What is your
> real concern here?
Downloading the entire 150+ meg site, which translates into
all sorts of things.
> If you are only worried about server load, why not zip or tar-and-gzip it up
> and put it on an FTP server? This is most practical for related documents,
> such as parts of a tutorial or parts of a spec. If you are a philanthropist
> who is giving away valuable stuff, you can give it away in big chunks so
> the nickel-and-dime requests don't bug you.
>
> Well-behaved download-the-whole-site spiders will obey robots.txt,
That doesn't work. Some random user may come and download the
entire site. By the time you put him into robots.txt, it is too late.
> but that
> is pretty much a courtesy thing, and it won't stop anyone who is manually
> downloading a page at a time,
That is not a problem. They can manually download as much as they want.
But no automated downloads.
> and it won't stop rogue or altered spiders.
> Likewise, you can block nice spiders which send a true user-agent ID, but
> not so nice spiders can spoof their ID. That's kind of pointless, because
> most of the nice spiders will obey robots.txt anyway.
> You can make pages available through php or cgi which keeps track of the
> number of documents with hidden controls. This is easily defeated by
> anyone determined to do so,
How?
> and like a cheap lock, will only keep the honest
> people out. Beyond that, you can go to various user account schemes up to
> putting your documents on a secure server.
Well, no account schemes, no user verification; pretty much no limits beyond
stopping attempts to automatically download the entire site.
> But I think what you are asking is 'Can I keep my documents public and still
> limit public access?'
Not really. AUTOMATED download.
> And the answer to that is, of course, no, because
> there is a fundamental contradiction in what you want.
I do not see it at the moment.
> Any hints or pointers would be appreciated.
Aug 9 '08 #10

