Following on from Will Woodhull's message. . .
I agree that "professional" harvester programs aren't going to be
stopped by session lock-ups when some bandwidth limit is exceeded (lock
up the session if more than X page requests are sent per second,
etc).
However I'm pretty sure this approach would effectively stall casual
site ripping with packages like WebLeech or Web Stripper. Haven't tried
it though-- not yet at least.
Much depends on what Dave Turner (the OP) actually needs. So far it
isn't clear whether he's trying to protect the content of a commercial
site from being undercut by an unscrupulous competitor or if he's trying
to keep his wawoo-neat design ideas from showing up on half of Yahoo's
free web sites.
THOUGHT!
Obviously if a plain browser can see all the stuff on your pages they
are in the wild and fair game.
You could /try/ the following but beware of the effect of caching.
The object is to put a poison pill into the pages which triggers when it
isn't being shown live and on your site. I can think of two ways of
doing this - both purely theoretical and both require you to dynamically
tweak some javascript.
1. Your javascript (which might be loading images or doing other
OnLoad() activities) tests the client's date and time against the 'now'
on your server. The 'now' on your server is hard-coded into the
javascript - hence the need for dynamic creation. Your js code might
say: if client time is two weeks ahead of the time the page was
created, then open a 'this site has been ripped' window.
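A minimal sketch of idea 1 - the timestamp, function name and threshold below are invented for illustration; on a real site the server would bake its own 'now' into the generated js:

```javascript
// Hard-coded by the server each time it generates the page (example value).
const PAGE_CREATED_MS = 1052000000000;

// Returns true when the page is being viewed suspiciously long after it
// was generated -- e.g. from a ripped, static copy.
function looksRipped(clientNowMs, maxAgeDays = 14) {
  const ageMs = clientNowMs - PAGE_CREATED_MS;
  return ageMs > maxAgeDays * 24 * 60 * 60 * 1000;
}

// On a live page you might wire this to the load event, e.g.:
//   if (looksRipped(Date.now())) alert("This page was ripped from my.site");
```

Beware that a viewer with a badly set clock trips the same test, so the "poison" should probably be a polite notice rather than anything destructive.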
2. Have js ask for some resource from your server which has to be
dynamic. For example an image that gives today's date or a news feed
extract. This will look a bit weird when looked at statically.
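Idea 2 might look like this - the hostname and image name are placeholders; the point is just that the request can only be answered sensibly by the live server:

```javascript
// Build the URL for a server resource that renders today's date, with a
// date-derived query string so caches and proxies don't hide staleness.
function liveResourceUrl(baseUrl, now) {
  const stamp = now.toISOString().slice(0, 10); // e.g. "2003-05-04"
  return `${baseUrl}?d=${stamp}`;
}

// In the page you might then do something like:
//   document.getElementById("datestamp").src =
//     liveResourceUrl("http://my.site/today.png", new Date());
// On a static rip the request fails, or shows a stale date -- which is
// the "looks a bit weird" give-away.
```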
So my conclusion is: you can't stop ripping, but you might be able to
flag it to an unsuspecting viewer of the HTML.
In a slightly different vein.
How about getting js to call some resources in a way that makes it hard
for an automated look at the code to deduce what is a URL? Basically
some low-level encryption. I don't know how rippers work, but surely
one of the things they would do is try to redirect all <a
href="my.site/page.htm"> to <a href="ripped.site/page.htm">. Your js
could be written to trap this sort of thing by:
(a) not fetching stuff for a 'real time' (as in 1 above) load, but
having the URL clearly present in the js along the lines of "ha ha,
this will break the page 'cos now you're trying to link to a resource
on ripped.site which doesn't exist"; or
(b) (say) ROT13-ing complete URLs, including the http bit, and calling
them with an OnClick(). This brings visitors onto your site and what
you do then is up to you. Perhaps it is a page that has a lifetime of a
week and then gets filled with poisonous content. Or just examine the
referrer in the header to discover that the come-from page was
somewhere out in cyberspace, and then you could decide what to do.
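The ROT13 part of (b) is simple enough to sketch - the example URL is the one from above, scrambled:

```javascript
// ROT13 every letter; digits and punctuation (including "://") pass
// through untouched, so a ripper scanning for href="http..." patterns
// won't spot or rewrite the scrambled URL.
function rot13(s) {
  return s.replace(/[a-zA-Z]/g, (c) => {
    const base = c <= "Z" ? 65 : 97;
    return String.fromCharCode(((c.charCodeAt(0) - base + 13) % 26) + base);
  });
}

// The page carries only the scrambled form and decodes on click, e.g.:
//   <span onclick="location.href = rot13('uggc://zl.fvgr/cntr.ugz')">...</span>
```

ROT13 is its own inverse, so the same function scrambles and unscrambles - no "decrypt" counterpart to maintain.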
I'm trying to think of a way to call a style sheet with variable (js -
date based) parameters. Then you could really upset the page layout
after a fortnight.
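One way to do the stylesheet trick - hostname, filename and parameter name are all invented; the server would decide what CSS to return for a given age:

```javascript
// Baked in by the server when it generates the page (example value).
const PAGE_CREATED_MS = 1052000000000;

// Build a stylesheet URL carrying the page's age in days, so the server
// can serve layout-wrecking CSS once the page is a fortnight old.
function styleSheetHref(nowMs) {
  const ageDays = Math.floor((nowMs - PAGE_CREATED_MS) / 86400000);
  return `http://my.site/style.css?age=${ageDays}`;
}

// The page would write its own <link> tag via document.write (or set
// link.href from js), passing styleSheetHref(Date.now()).
```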
All the above is just thoughts.
--
PETER FOX Not the same since the bookshop idea was shelved
pe******@eminent.demon.co.uk.not.this.bit.no.html
2 Tees Close, Witham, Essex.
Gravity beer in Essex <http://www.eminent.demon.co.uk>