Bytes IT Community

bots hitting AJAX scripts

I was wondering how people are able to create AJAX applications like
star-rating scripts without having bots hit the scripts. I have a
star-rating script on my site which was getting hit by bots. After I
added rel="nofollow" to the links, that cut down on the well-behaved
bots, but I just started getting hit by some bots again.

I have had a good amount of experience dealing with bots in the past,
with them submitting my forms that were open to all visitors. After I
added a "What is 1+1?" check to the forms, the problem was solved. But
I like my AJAX scripts to conform to the AJAX philosophy of keeping
everything as quick and simple as can be, so having a confirmation
step on a star-rating script would be kind of weird. When you look at
sites like bash.org with their + and - voting, Craigslist with their
flagging, or the tons of sites that now use AJAX star-rating scripts,
I am wondering how they are able to keep bots from contaminating the
input. The only thing I can think of is that they ban bots totally,
but this seems like it would be hard.
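The "What is 1+1?" check described above can be sketched as a pair of server-side functions; the names and the idea of storing the expected answer in the session are illustrative assumptions, not from the thread:

```javascript
// Sketch of a simple arithmetic challenge. Assumes the generated answer
// is kept server-side (e.g. in the session) between rendering the form
// and checking the submission. Function names are illustrative.
function makeChallenge() {
  const a = 1 + Math.floor(Math.random() * 9); // 1..9
  const b = 1 + Math.floor(Math.random() * 9);
  return { question: `What is ${a}+${b}?`, answer: a + b };
}

function checkChallenge(challenge, submitted) {
  // Compare the visitor's submitted answer against the stored one.
  return Number(submitted) === challenge.answer;
}
```

A trivial challenge like this stops dumb form-filling bots, though (as discussed below) not a bot written specifically for one site.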

I should mention that I know about the various ways to deal with bots:
1. site wide - IP banning & robots.txt
2. page specific - meta tags
3. link specific - rel="nofollow"

But the problems are:
1. Site-wide: It would be a pain to constantly ban every non-good
bot. Or am I wrong about this?
2. Page-specific: For a site that has a couple hundred static PHP
pages, it would be hard to manage the creation and editing of meta
tags.
3. Link-specific: This doesn't keep bad bots, and sometimes even good
bots, from following the link. (Some people even say the good bots
still hit the link but just give it no PageRank, but it doesn't really
matter.)
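For reference, the three mechanisms in the list above look like this (the `rate.php` path and the parameters are made-up examples); note that all three are purely advisory and only honoured by well-behaved bots:

```
# 1. Site-wide: robots.txt
User-agent: *
Disallow: /rate.php

<!-- 2. Page-specific: robots meta tag in the page head -->
<meta name="robots" content="noindex, nofollow">

<!-- 3. Link-specific: the rel attribute on the link itself -->
<a href="rate.php?id=42&amp;stars=5" rel="nofollow">5 stars</a>
```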
__________________
http://www.WebpageBlueprint.com - learn to make web sites

Apr 21 '07 #1
3 Replies


ra************@aol.com wrote:
I was wondering how people are able to create ajax applications like
star ratings scripts without having bots hit the scripts.
Bots generally don't touch anything relating to JavaScript.
I have a
star rating script on my site which was getting hit by bots. After I
put in rel="nofollow" into the link part that cut down on the nice
bots but I just started getting hit by some bots again.
So you have a non-JavaScript fallback. This is good.

It is a link that causes a significant change (i.e. something more than an
entry in access.log) on the server. This is bad.

GET is for getting data, and POST is for posting data. GET should be safe
for robots to follow. Switch to a form that uses POST as the fallback.
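The GET/POST rule David describes can be sketched server-side as a handler that only mutates state on POST; the function and store names here are illustrative, not from the thread:

```javascript
// Sketch: GET stays safe/read-only, so crawlers following links can never
// change the rating. Only a POSTed form submission is counted.
// `store` and `handleRatingRequest` are illustrative names.
const store = { stars: 0, votes: 0 };

function handleRatingRequest(method, stars) {
  if (method !== 'POST') {
    // A robot following a link issues GET; refuse to record anything.
    return { status: 405, body: 'Use the rating form (POST) to vote' };
  }
  store.stars += stars;
  store.votes += 1;
  return { status: 200, body: `average ${(store.stars / store.votes).toFixed(1)}` };
}
```

The AJAX layer then simply submits the same form data with POST, so the script and the non-JavaScript fallback share one endpoint.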

BTW, "nofollow" is not widely supported and generally means "DO follow, but
don't count this link as an endorsement of what is at the other end".
Google managed to come up with a name that doesn't follow the conventions
of the rel attribute OR describe what it actually means.
--
David Dorward <http://blog.dorward.me.uk/> <http://dorward.me.uk/>
Home is where the ~/.bashrc is
Apr 21 '07 #2

VK
On Apr 21, 11:04 am, randommale9...@aol.com wrote:
I was wondering how people are able to create ajax applications like
star ratings scripts without having bots hit the scripts.
AFAIK there is no way to do it from within the script itself.

The only reliable solution found so far - the one that has prevented
total havoc - is pictures containing confirmation keywords; this is
what all the major portals are using. In response, bot makers keep
improving the text-recognition capabilities of their software, so on
many portals the pictures have become so obfuscated that they are
barely readable even by human eyes. Even this time-proved measure is
in danger now because of accessibility requirements for visually
impaired people, which require providing a link to an audio narration
of the confirmation keyword. With the modern level of
speech-recognition technology it is rather easy to make a bot that
attacks the audio data instead of the images, and unlike visual data,
audio data gives much less room for obfuscation. So everyone is in a
continuous search - both site owners and bot makers - to make each
other's life as difficult as possible :-)

It is not reasonable to force your user to decrypt some twisted image
each and every time, so all portals use a session-based approach.
First the user has to log in and prove that she is not a bot by typing
the confirmation keyword properly. After that, for the duration of the
session, she is allowed to use any services without further
confirmations. IMO this is the way you should go as well.
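The session-based approach described above can be sketched as follows; the in-memory session set and all names are illustrative assumptions (a real site would use its existing session storage and expire tokens):

```javascript
// Sketch: the visitor solves the confirmation keyword once, receives a
// session token, and every later action only checks the token.
// `sessions`, `confirmHuman` and `rate` are illustrative names.
const sessions = new Set();

function confirmHuman(typedKeyword, expectedKeyword) {
  if (typedKeyword !== expectedKeyword) return null; // failed the check
  const token = 's-' + Math.random().toString(36).slice(2);
  sessions.add(token); // remember this session as "confirmed human"
  return token;
}

function rate(token, stars) {
  if (!sessions.has(token)) return 'rejected'; // no confirmed session
  return `recorded ${stars} stars`;
}
```

As VK notes next, this is not bulletproof: nothing stops a human from confirming a session manually and then handing the token to a bot.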

Is it possible that someone will manually log a bot in and then let it
do the rest automatically? Yes, it is perfectly possible and already
being done. The confirmation keyword / session protection is far from
perfect - it is simply the only one that works at all and that
prevents total chaos. This spring I studied the first real "nested
bot war" on the Russian blog site livejournal.com.
One user wrote a bot called Top30 to post ads in the most popular
blogs. Another user wrote an AI-capable bot called Kondopoogle to
dispropagate posts made by Top30. A few days later someone wrote a
third AI-capable bot, Roboalksnis, to comment on posts made by
Kondopoogle while insisting on "freedom of speech". The authors logged
them in manually, and then for the duration of the session these bots
hunted each other and posted on their own. In a few days many blogs
became a disaster, with a few "human" posts buried in sometimes rather
sophisticated discussions between Top30, Kondopoogle and Roboalksnis.
I had to learn Russian earlier for my linguistic studies, so I took
the time to translate all the relevant materials for myself, because
IMO this rather small "bot war" is a sign of what is coming soon. The
original can be found at
http://lenta.ru/news/2007/03/21/kondopoogle/
The Google translation is disastrous but better than nothing:
http://66.249.91.104/translate_c?hl=...language_tools

Eventually all the bot accounts were blocked, but for the history some
really futuristic threads remain on livejournal.com, with the only
real human discussing with Top30, Kondopoogle and Roboalksnis
various issues of freedom of speech, netiquette, online advertisement
and such - of course unaware that all of his/her opponents are bots.

With all the necessary building blocks available, and mostly free of
charge, it is only a question of time before we come to bot wars of
the next generation. So not the old primitive auto-submissions, but,
say, one bot moving a rating down whenever it was moved up by some
other specific bot, bots appealing to your feelings and to netiquette,
bots filing abuse reports to Internet providers, or even bots
collecting signatures for a letter to the Congress ;-)

For instance, a well-made and well-trained bot built on the A.L.I.C.E.
AI engine may require a professional psychologist to decide - from a
written conversation alone - whether it's a human or a program.

Apr 21 '07 #3

dd
On Apr 21, 9:04 am, randommale9...@aol.com wrote:
I was wondering how people are able to create ajax applications like
star ratings scripts without having bots hit the scripts. I have a
Wouldn't a cookie do the job? I've implemented something like
this (not for a star-rating system) where I keep a database of
which pages were visited in a cookie. The date and a number
identifying the page are recorded, along with a count of how many
times it was hit. Every time any page is hit, I load the cookie,
check whether the unique number identifying this page matches any
of the fields in the cookie, and increment the counter. You could
use the same logic to determine whether to enable the rating
call. Once I've updated the cookie information in memory, I then
re-write the cookie. Before I do that, though, I sort all of the
records into date order (most recent first). When it comes to writing
back the cookie, I start with the most recent record and loop until
the cookie is full (4KB) or I run out of records. Doing this lets me
throw away the oldest records when the cookie gets full.
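The record-keeping dd describes can be sketched as two small functions; the record shape, field separator, and names are illustrative assumptions, not dd's actual format:

```javascript
// Sketch of the visited-pages cookie: one record per page (id, last-visit
// date, hit count), sorted most-recent-first and trimmed so the serialized
// string fits in a 4KB cookie. All names and formats are illustrative.
function recordHit(records, pageId, now) {
  const rec = records.find(r => r.id === pageId);
  if (rec) { rec.hits += 1; rec.date = now; } // seen before: bump counter
  else records.push({ id: pageId, hits: 1, date: now });
  return records;
}

function serializeForCookie(records, maxBytes = 4096) {
  // Most recent first, then keep records until the cookie would overflow,
  // so the oldest records are the ones that fall off.
  const sorted = [...records].sort((a, b) => b.date - a.date);
  let out = '';
  for (const r of sorted) {
    const prefix = out ? out + '|' : '';
    const field = `${r.id}:${r.date}:${r.hits}`;
    if ((prefix + field).length > maxBytes) break;
    out = prefix + field;
  }
  return out;
}
```

Note that a cookie only deters casual repeat votes: a bot that discards cookies (or never sends them) sails straight past this check, so it complements rather than replaces the server-side measures discussed earlier.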

Apr 22 '07 #4
