Bytes IT Community

html forms and spam-bots

P: n/a
I am getting a lot of phony form submissions. I assume they are not
being done manually because of the volume, but by some sort of spam-bot.
I thought of putting some code in the php processing script to trash
the bot generated responses, but was surprised to see the referrer is my
form, not some clone or such. Any ideas on a quick way to eliminate
these time wasters early in the process?
Sep 27 '08 #1
28 Replies


P: n/a
William Gill wrote:
I am getting a lot of phony form submissions. I assume they are not
being done manually because of the volume, but by some sort of spam-bot.
I thought of putting some code in the php processing script to trash
the bot generated responses, but was surprised to see the referrer is my
form, not some clone or such. Any ideas on a quick way to eliminate
these time wasters early in the process?
Some type of CAPTCHA routine (it doesn't necessarily have to be an image).

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
js*******@attglobal.net
==================

Sep 27 '08 #2

P: n/a
>I am getting a lot of phony form submissions. I assume they are not
>being done manually because of the volume, but by some sort of spam-bot.
I thought of putting some code in the php processing script to trash
the bot generated responses, but was surprised to see the referrer is my
form, not some clone or such. Any ideas on a quick way to eliminate
these time wasters early in the process?
Don't assume that submissions are really coming from your form;
referer is trivially easy to fake. It's a fixed value and it's not
even necessary to fetch your form to fake it.

If you include some hidden fields with random but checked values, the
spambot may not have the ability to fully parse the form (and may not
even fetch your form) so they won't include those variables.

Randomly changing up the field names in the form (with each set
usable only once a week or so) requires the spambot to parse your form.

For really advanced filtering, use CAPTCHA or instructions that
humans can read and interpret but simple bots can't (Uncheck box
3 and check box 4 before submitting) that change each time.
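
The "hidden field with a random but checked value" idea above could be sketched roughly like this. This is only an illustration, not anyone's production code: the function names and the `$secret` parameter are made up for the example, and it assumes PHP 7.1 or later.

```php
<?php
// Build the value for a hidden form field: a random nonce plus an HMAC
// signature of that nonce. $secret is a hypothetical server-side key.
function make_form_token(string $secret): string
{
    $nonce = bin2hex(random_bytes(16));
    return $nonce . ':' . hash_hmac('sha256', $nonce, $secret);
}

// Return true only if the posted value carries a signature made with $secret.
// A missing field (null) or a malformed value is rejected.
function check_form_token(?string $token, string $secret): bool
{
    if ($token === null || strpos($token, ':') === false) {
        return false;
    }
    [$nonce, $signature] = explode(':', $token, 2);
    return hash_equals(hash_hmac('sha256', $nonce, $secret), $signature);
}
```

A bot that posts without fetching the form has no valid value to send. A bot that fetches and parses the form will still get through, and the token here is not single-use, so treat this as one filter among several.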

Sep 27 '08 #3

P: n/a
Gordon Burditt wrote:
>I am getting a lot of phony form submissions. I assume they are not
being done manually because of the volume, but by some sort of spam-bot.
I thought of putting some code in the php processing script to trash
the bot generated responses, but was surprised to see the referrer is my
form, not some clone or such. Any ideas on a quick way to eliminate
these time wasters early in the process?

Don't assume that submissions are really coming from your form;
referer is trivially easy to fake. It's a fixed value and it's not
even necessary to fetch your form to fake it.

If you include some hidden fields with random but checked values, the
spambot may not have the ability to fully parse the form (and may not
even fetch your form) so they won't include those variables.
That's what I was looking to do, but when it looked like the bot was
actually filling the form it seemed to defeat the idea.
>
Randomly changing up the field names in the form (with each set
usable only once a week or so) requires the spambot to parse your form.

For really advanced filtering, use CAPTCHA or instructions that
humans can read and interpret but simple bots can't (Uncheck box
3 and check box 4 before submitting) that change each time.
Another possibility.

thanks
Sep 27 '08 #4

P: n/a
Jerry Stuckle wrote:
William Gill wrote:
>I am getting a lot of phony form submissions. I assume they are not
being done manually because of the volume, but by some sort of
spam-bot. I thought of putting some code in the php processing script
to trash the bot generated responses, but was surprised to see the
referrer is my form, not some clone or such. Any ideas on a quick way
to eliminate these time wasters early in the process?

Some type of CAPTCHA routine (it doesn't necessarily have to be an image).
I hadn't thought about a non-image CAPTCHA; this is easy enough to do.

Thanks
Sep 27 '08 #5

P: n/a
On 27 Sep, 19:25, William Gill <nore...@example.com> wrote:
I am getting a lot of phony form submissions. I assume they are not
being done manually because of the volume, but by some sort of spam-bot.
I thought of putting some code in the php processing script to trash
the bot generated responses, but was surprised to see the referrer is my
form, not some clone or such.
Did you check your logs for the previous hit from the client or just
check the referer? The latter is easily faked - but if there is no
protection on the form then they don't need to bother.
Any ideas on a quick way to eliminate
these time wasters early in the process?
Use a CAPTCHA or require a validated email address before publishing
posts. This can be automated, e.g. by requiring the email address at
the time of the post and quarantining the post until the user releases
it via a cryptic URL sent to that address; quarantined posts
expire after 2 days. Neither the CAPTCHA nor email validation will
prevent abuse entirely, but both should cut it down.
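
A rough sketch of the quarantine-and-release idea, assuming PHP 7.1 or later. The record layout, the function names and the `QUARANTINE_TTL` constant are invented for the example; where the record is stored and how the mail is sent are left out.

```php
<?php
// Quarantine period from the suggestion above: two days, in seconds.
const QUARANTINE_TTL = 2 * 24 * 60 * 60;

// Create a quarantine record for a new post. The token is what goes into
// the "cryptic URL" emailed to the poster.
function quarantine_post(string $email, string $body, int $now): array
{
    return [
        'email'    => $email,
        'body'     => $body,
        'token'    => bin2hex(random_bytes(16)), // unguessable release key
        'expires'  => $now + QUARANTINE_TTL,
        'released' => false,
    ];
}

// Release the post if the token from the emailed URL matches and the
// record has not expired; otherwise return the record unchanged.
function release_post(array $record, string $token, int $now): array
{
    if ($now <= $record['expires'] && hash_equals($record['token'], $token)) {
        $record['released'] = true;
    }
    return $record;
}
```

The current time is passed in rather than read with time() so the expiry logic can be checked without waiting two days.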

C.
Sep 28 '08 #6

P: n/a
William Gill wrote:
Jerry Stuckle wrote:
>William Gill wrote:
>>I am getting a lot of phony form submissions. I assume they are not
being done manually because of the volume, but by some sort of
spam-bot. I thought of putting some code in the php processing
script to trash the bot generated responses, but was surprised to see
the referrer is my form, not some clone or such. Any ideas on a
quick way to eliminate these time wasters early in the process?

Some type of CAPTCHA routine (it doesn't necessarily have to be an
image).
I hadn't thought about non-image CAPTCHA this is easy enough to do.


As long as you do something that no one else is doing you will
probably be OK. I have a dumb little bit that killed the SPAM from my
forms, but it would be easy to figure out if you took a bit of time.

Jeff
Thanks
Sep 28 '08 #7

P: n/a
Hello,

on 09/27/2008 03:25 PM William Gill said the following:
I am getting a lot of phony form submissions. I assume they are not
being done manually because of the volume, but by some sort of spam-bot.
I thought of putting some code in the php processing script to trash
the bot generated responses, but was surprised to see the referrer is my
form, not some clone or such. Any ideas on a quick way to eliminate
these time wasters early in the process?
You may want to try this forms class that comes with a CAPTCHA
validation plug-in.

http://www.phpclasses.org/formsgeneration

Here is a live example:

http://www.meta-language.net/forms-e..._captcha_input

Here you can watch a tutorial video:

http://www.phpclasses.org/browse/vid...n-captcha.html

This CAPTCHA plug-in also avoids Cross-site request forgery (CSRF)
attacks because the CAPTCHA text is only valid for a given period of time.
--

Regards,
Manuel Lemos

Find and post PHP jobs
http://www.phpclasses.org/jobs/

PHP Classes - Free ready to use OOP components written in PHP
http://www.phpclasses.org/
Sep 29 '08 #8

P: n/a
On Sep 27, 9:25 pm, William Gill <nore...@example.com> wrote:
I am getting a lot of phony form submissions. I assume they are not
being done manually because of the volume, but by some sort of spam-bot.
I thought of putting some code in the php processing script to trash
the bot generated responses, but was surprised to see the referrer is my
form, not some clone or such. Any ideas on a quick way to eliminate
these time wasters early in the process?
You can also use JavaScript, as most spam bots cannot parse it.
For example:
<form method="post" action="iamspammer" name="myForm">
[..]
</form>
<script type="text/javascript">
document.myForm.action = "therealaction";
</script>
You can randomize the action URL with the help of mod_rewrite.

If it doesn't help, you can use reCAPTCHA, which provides both audio
and image CAPTCHAs.
I recommend Zend_Captcha with the reCAPTCHA adapter.
http://framework.zend.com/manual/en/...ters.recaptcha
Sep 29 '08 #9

P: n/a
Manuel Lemos wrote:
Hello,

on 09/27/2008 03:25 PM William Gill said the following:
>I am getting a lot of phony form submissions. I assume they are not
being done manually because of the volume, but by some sort of spam-bot.
I thought of putting some code in the php processing script to trash
the bot generated responses, but was surprised to see the referrer is my
form, not some clone or such. Any ideas on a quick way to eliminate
these time wasters early in the process?

You may want to try this forms class that comes with a CAPTCHA
validation plug-in.

http://www.phpclasses.org/formsgeneration

Here is a live example:

http://www.meta-language.net/forms-e..._captcha_input

Here you can watch a tutorial video:

http://www.phpclasses.org/browse/vid...n-captcha.html

This CAPTCHA plug-in also avoids Cross-site request forgery (CSRF)
attacks because the CAPTCHA text is only valid for a given period of time.

That is one of the worst CAPTCHA routines on the internet. He will be much
better off with almost anything else.

You should at least be truthful and tell people it's YOUR crap you're
trying to push on them, rather than lead them to think you are a
disinterested third party.
--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
js*******@attglobal.net
==================

Sep 29 '08 #10

P: n/a
..oO(Kristaps KŻlis)
>On Sep 27, 9:25 pm, William Gill <nore...@example.com> wrote:
>I am getting a lot of phony form submissions. I assume they are not
being done manually because of the volume, but by some sort of spam-bot.
I thought of putting some code in the php processing script to trash
the bot generated responses, but was surprised to see the referrer is my
form, not some clone or such. Any ideas on a quick way to eliminate
these time wasters early in the process?

You can also use javascript, as most of spam bots cannot parse it.
Many visitors can't use it either. Really bad idea.

Micha
Sep 29 '08 #11

P: n/a
William Gill <no*****@example.com> wrote in
news:dUuDk.1497$073.28@fe99.usenetserver.com:
I am getting a lot of phony form submissions. I assume they are not
being done manually because of the volume, but by some sort of spam-bot.
I thought of putting some code in the php processing script to trash
the bot generated responses, but was surprised to see the referrer is my
form, not some clone or such. Any ideas on a quick way to eliminate
these time wasters early in the process?
CAPTCHA is the way. I use a simple method which stores a 4-digit number in
the database. Then, using 4 drop-down boxes, visitors have to select the
corresponding digits. On submit, the digits are checked. Simple but so
effective.
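
For what it's worth, the four-drop-down check described above might look something like the sketch below (PHP 7.1 or later assumed). The field names d0..d3 and the function names are made up, and where the expected number is kept (database, session) is left to the caller.

```php
<?php
// Generate the four-digit challenge as a string so leading zeros survive.
function make_challenge(): string
{
    return str_pad((string) random_int(0, 9999), 4, '0', STR_PAD_LEFT);
}

// Render one <select> per digit; the names d0..d3 are hypothetical.
function render_selects(): string
{
    $html = '';
    for ($i = 0; $i < 4; $i++) {
        $html .= '<select name="d' . $i . '">';
        for ($n = 0; $n <= 9; $n++) {
            $html .= '<option value="' . $n . '">' . $n . '</option>';
        }
        $html .= '</select>';
    }
    return $html;
}

// Compare the stored challenge with the four posted digits.
function check_challenge(string $expected, array $post): bool
{
    $given = '';
    for ($i = 0; $i < 4; $i++) {
        $digit = $post['d' . $i] ?? '';
        // Each field must be exactly one ASCII digit.
        if (!is_string($digit) || preg_match('/\A[0-9]\z/', $digit) !== 1) {
            return false;
        }
        $given .= $digit;
    }
    return hash_equals($expected, $given);
}
```

The expected number has to be shown to the visitor in some form a simple bot won't read straight from the markup, otherwise the check adds little.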
Sep 29 '08 #12

P: n/a
..oO(Tree*Rat)
>William Gill <no*****@example.com> wrote in
news:dUuDk.1497$073.28@fe99.usenetserver.com:
>I am getting a lot of phony form submissions. I assume they are not
being done manually because of the volume, but by some sort of spam-bot.
I thought of putting some code in the php processing script to trash
the bot generated responses, but was surprised to see the referrer is my
form, not some clone or such. Any ideas on a quick way to eliminate
these time wasters early in the process?

CAPTCHA is the way.
Not really.
>I use a simple method which stores a 4 digit number in
the database. Then using 4 drop down boxes they have to select the
coresponding numbers. on submit the numbers are checked. simple but so
effective.
Every CAPTCHA creates obstacles for the visitors. One could even say
that you abuse _them_ to solve _your_ spam problem. There are other
totally transparent methods to fight spam without forcing the visitors
to do your work. Google "honeypot" for example.

Micha
Sep 29 '08 #13

P: n/a
C. (http://symcbean.blogspot.com/) wrote:
Did you check your logs for the previous hit from the client or just
check the referer? The latter is easily faked - but if there is no
protection on the form then they don't need to bother.
A quick scan shows a GET followed by a POST, so it looks (at least to
me) like their bots are actually "filling out the form"
Sep 29 '08 #14

P: n/a
William Gill wrote:
C. (http://symcbean.blogspot.com/) wrote:
>Did you check your logs for the previous hit from the client or just
check the referer? The latter is easily faked - but if there is no
protection on the form then they don't need to bother.
A quick scan shows a GET followed by a POST, so it looks (at least to
me) like their bots are actually "filling out the form"
I may have spoken too soon. A closer look reveals some oddities, like a
GET followed by a POST several minutes later from a different IP. I
will have to watch closely and compare timestamps with actual spammed
submissions.
Sep 29 '08 #15

P: n/a
William Gill wrote:
William Gill wrote:
>C. (http://symcbean.blogspot.com/) wrote:
>>Did you check your logs for the previous hit from the client or just
check the referer? The latter is easily faked - but if there is no
protection on the form then they don't need to bother.
A quick scan shows a GET followed by a POST, so it looks (at least to
me) like their bots are actually "filling out the form"

I may have spoke too soon. A closer look reveals some oddities like a
GET followed by a POST several minutes later from a different IP. I
will have to watch closely and compare timepoints w/actual spammed
submissions.
Which is entirely possible whether the submission is via a bot or a human.

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
js*******@attglobal.net
==================

Sep 29 '08 #16

P: n/a
Jerry Stuckle wrote:
Which is entirely possible whether the submission is via a bot or a human.
Yes, I know. However, the incidence of the mismatched GET/POST pairs
seems (at least on the surface) to more closely match the ratio of
spammed submissions to real submissions. As I said, I will need to look
a little closer.
Sep 29 '08 #17

P: n/a
>Which is entirely possible whether the submission is via a bot or a human.
>>
Yes, I know. However the incidence of the mismatched GET/POST pairs
seems (at least on the surface) to more closely math the ratio of
spammed submissions to real submissions. As I said, I will need to look
a little closer.
Consider this method of determining whether it's a spambot or a human:

1. Start a session
2. If they GET your form, set a flag in the session data.
3. If they POST your form, check the flag. If it's not set,
it's likely a spambot. If it is set, it might be a legit
user, assuming the spambot behavior of different IPs indicates
something other than a spambot running a browser through a
proxy, such as independent machines doing the GET and POST.

In any case, reset the flag unless it's allowed to fetch the form
once and submit it multiple times.

This might be totally useless, or it might be effective and it doesn't
bother the user at all if they have session cookies enabled.
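
The three steps above can be sketched as follows, assuming PHP 7.1 or later. The function names and the 'form_served' key are made up for illustration; the functions take the session array by reference so they can be tried without a live session.

```php
<?php
// Call when serving the form (GET): remember that this session fetched it.
function mark_form_served(array &$session): void
{
    $session['form_served'] = true;
}

// Call when handling the POST: returns true if the form was fetched first.
// The flag is cleared either way, so one fetch allows one submission.
function consume_form_flag(array &$session): bool
{
    $served = !empty($session['form_served']);
    unset($session['form_served']);
    return $served;
}
```

In a real script you would call session_start() first and pass $_SESSION to both functions. A false result only means "likely a bot"; a visitor with cookies disabled would look the same.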

Sep 29 '08 #18

P: n/a
Michael Fesser <ne*****@gmx.de> wrote in
news:60********************************@4ax.com:
.oO(Tree*Rat)
>>William Gill <no*****@example.com> wrote in
news:dUuDk.1497$073.28@fe99.usenetserver.com:
>>I am getting a lot of phony form submissions. I assume they are not
being done manually because of the volume, but by some sort of
spam-bot.
I thought of putting some code in the php processing script to
trash
the bot generated responses, but was surprised to see the referrer
is my form, not some clone or such. Any ideas on a quick way to
eliminate these time wasters early in the process?

CAPTCHA is the way.

Not really.
>>I use a simple method which stores a 4 digit number in
the database. Then using 4 drop down boxes they have to select the
coresponding numbers. on submit the numbers are checked. simple but so
effective.

Every CAPTCHA creates obstacles for the visitors. One could even say
that you abuse _them_ to solve _your_ spam problem. There are other
totally transparent methods to fight spam without forcing the visitors
to do your work. Google "honeypot" for example.

Micha
Someone else who relies on Google stuff. Read between the lines with them
and you will notice Google doesn't do things for free: with Chrome, for
example, they get info about your activity while you use it. So why rely on
others? Do it yourself and keep it simple. Yes, there are other ways, but
"keep it simple, stupid" is a phrase that I stick by.
Sep 30 '08 #19

P: n/a
Message-ID: <Xn**********************************@69.16.176.253> from
Tree*Rat contained the following:
>Every CAPTCHA creates obstacles for the visitors. One could even say
that you abuse _them_ to solve _your_ spam problem. There are other
totally transparent methods to fight spam without forcing the visitors
to do your work. Google "honeypot" for example.

Micha

Someone else who relies on google stuff, read between the line with them
and you will notice google don't do things for free, like chrome, they
get info about your activity while using chrome. SO why rely on others.
Do it yourself and keep it simple. Yes there are other ways but "keep it
simple stupid" is a phrase that I stick by.
Whoosh!
--
Geoff Berrow 0110001001101100010000000110
001101101011011001000110111101100111001011
100110001101101111001011100111010101101011
http://slipperyhill.co.uk - http://4theweb.co.uk
Sep 30 '08 #20

P: n/a
On Sep 27, 9:25 pm, William Gill <nore...@example.com> wrote:
I am getting a lot of phony form submissions. I assume they are not
being done manually because of the volume, but by some sort of spam-bot.
I thought of putting some code in the php processing script to trash
the bot generated responses, but was surprised to see the referrer is my
form, not some clone or such. Any ideas on a quick way to eliminate
these time wasters early in the process?
Check $_SERVER['REMOTE_ADDR'] and if it's static, ban it!
I use a CAPTCHA after 2 submissions, and more than 5 posts per IP in a
day are prohibited.
Here is the GD manual, which would be helpful for creating your custom CAPTCHA
functions: http://tr2.php.net/manual/en/book.image.php
Sep 30 '08 #21

P: n/a
On 30 Sep, 10:58, Betikci Boris <pard...@gmail.com> wrote:
On Sep 27, 9:25 pm, William Gill <nore...@example.com> wrote:
I am getting a lot of phony form submissions. I assume they are not
being done manually because of the volume, but by some sort of spam-bot.
I thought of putting some code in the php processing script to trash
the bot generated responses, but was surprised to see the referrer is my
form, not some clone or such. Any ideas on a quick way to eliminate
these time wasters early in the process?

Check $_SERVER['REMOTE_ADDR'] and if it's static, ban it!..
I use captcha after 2 submissions and more than 5 posts per ip in a
day are prohibited.
Proxies?
Sep 30 '08 #22

P: n/a
On 30 Sep, 10:58, Betikci Boris <pard...@gmail.com> wrote:
On Sep 27, 9:25 pm, William Gill <nore...@example.com> wrote:
I am getting a lot of phony form submissions. I assume they are not
being done manually because of the volume, but by some sort of spam-bot.
I thought of putting some code in the php processing script to trash
the bot generated responses, but was surprised to see the referrer is my
form, not some clone or such. Any ideas on a quick way to eliminate
these time wasters early in the process?

Check $_SERVER['REMOTE_ADDR'] and if it's static, ban it!..
I use captcha after 2 submissions and more than 5 posts per ip in a
day are prohibited.
Here is gd manual which would be helpful to create your custom captcha
functions http://tr2.php.net/manual/en/book.image.php
Clicked Send too quick!

The ISP Freeserve, for example, used to route all their users through a
proxy, as do many large organisations.

So your prohibition would, in this case, stop real honest unique
users.

Remind me not to employ you when I need a web site built!
Sep 30 '08 #23

P: n/a
On 30 Sep, 12:10, Captain Paralytic <paul_laut...@yahoo.com> wrote:
On 30 Sep, 10:58, Betikci Boris <pard...@gmail.com> wrote:
On Sep 27, 9:25 pm, William Gill <nore...@example.com> wrote:
I am getting a lot of phony form submissions. I assume they are not
being done manually because of the volume, but by some sort of spam-bot.
I thought of putting some code in the php processing script to trash
the bot generated responses, but was surprised to see the referrer is my
form, not some clone or such. Any ideas on a quick way to eliminate
these time wasters early in the process?
Check $_SERVER['REMOTE_ADDR'] and if it's static, ban it!..
I use captcha after 2 submissions and more than 5 posts per ip in a
day are prohibited.
Here is gd manual which would be helpful to create your custom captcha
functions http://tr2.php.net/manual/en/book.image.php

Clicked Send too quick!

the ISP Freeserve for example used to route all their users through a
proxy, as do many large organisations.

So your prohibition would, in this case, stop real honest unique
users.

Remind me not to employ you when I need a web site built!
Agreed - this is a dumb idea

C.
Sep 30 '08 #24

P: n/a
Michael Fesser wrote:
.oO(Tree*Rat)
Every CAPTCHA creates obstacles for the visitors. One could even say
that you abuse _them_ to solve _your_ spam problem. There are other
totally transparent methods to fight spam without forcing the visitors
to do your work. Google "honeypot" for example.

Micah
Micah,

I agree, but I'm not getting much that is helpful trying to research
honeypot. Can you give me (generically) some ideas on transparent
methods? I can generate my own algorithms, but I'm at a loss where to
start.
Sep 30 '08 #25

P: n/a
Message-ID: <Tu***************@fe101.usenetserver.com> from William Gill
contained the following:
>I agree, but I'm not getting much that is helpful trying to research
honeypot. Can you give me (generically) some ideas on transparent
methods? I can generate my own algorithms, but I'm at a loss where to
start.

Bots that spam forms generally add random text to all fields. One
honeypot method involves creating a field and then hiding it using CSS.

The field should be present in the $_POST array, but empty for a valid
submission.

HTML:

<label for='secret_field' style='display:none'>Please leave blank
<input name='secret_field' id='secret_field'></label>

PHP:

// 'secret_field' must arrive with the POST and must still be empty.
if (!isset($_POST['secret_field']) || $_POST['secret_field'] != "") {
    // dump form contents
}
else {
    // send mail
}
--
Geoff Berrow 0110001001101100010000000110
001101101011011001000110111101100111001011
100110001101101111001011100111010101101011
http://slipperyhill.co.uk - http://4theweb.co.uk
Sep 30 '08 #26

P: n/a
..oO(William Gill)
>Michael Fesser wrote:
>.oO(Tree*Rat)
>Every CAPTCHA creates obstacles for the visitors. One could even say
that you abuse _them_ to solve _your_ spam problem. There are other
totally transparent methods to fight spam without forcing the visitors
to do your work. Google "honeypot" for example.

Micah

Micah,
Sometimes I can't even write my own name ... *LOL*
>I agree, but I'm not getting much that is helpful trying to research
honeypot. Can you give me (generically) some ideas on transparent
methods? I can generate my own algorithms, but I'm at a loss where to
start.
Here are some explanations of different methods to fight bots:

http://nedbatchelder.com/text/stopbots.html

Honeypots are described somewhere down the page.

There are three other methods described as well, but at least random
field names might have unwanted and user-unfriendly side effects (e.g.
prevent auto-formfill in browsers), which is why I don't use them.

The timestamp thing is also described in more detail, including
implementation notes, in the thread "An unobtrusive partial alternative
to CAPCHA" starting with msg-ID <news:87************@bsb.me.uk>.

Here's the link to the archived version:

http://groups.google.com/group/comp....2329c126cb94db

HTH
Micha
Sep 30 '08 #27

P: n/a
Michael Fesser wrote:
>>Micah
Micah,
Micha,

Sorry, I just cut-n-pasted w/o looking.
>
Here are some explanations of different methods to fight bots:

http://nedbatchelder.com/text/stopbots.html

Honeypots are described somewhere down the page.

There are three other methods described as well, but at least random
field names might have unwanted and user-unfriendly side effects (e.g.
prevent auto-formfill in browsers), which is why I don't use them.
Thanks, this is what I was looking for.

Sep 30 '08 #28

P: n/a
..oO(William Gill)
>Michael Fesser wrote:
>>>Micah
Micah,

Micha,

Sorry, I just cut-n-pasted w/o looking.
No problem. ;-)
>Here are some explanations of different methods to fight bots:

http://nedbatchelder.com/text/stopbots.html

Honeypots are described somewhere down the page.

There are three other methods described as well, but at least random
field names might have unwanted and user-unfriendly side effects (e.g.
prevent auto-formfill in browsers), which is why I don't use them.

Thanks, this is what I was looking for.
You're welcome.

Micha
Sep 30 '08 #29

This discussion thread is closed

Replies have been disabled for this discussion.