I have e-mail addresses in HTML source files both as commented
information and also as part of an e-mail link. Naturally,
I'd like to hide them from harvesters to discourage more spam.
The standard approach simply makes the address human readable, but not
machine readable. For example, my address would be:
brownh @hartford-hwp.com
Which I assume is really no more secure than
brownh at hartford-hwp dot com
because when harvesting software begins to cope with these tricks, it
will have no more problem with the second than the first. Because
it's only a matter of time before harvesters are not so easily
fooled, I'm looking for a better method.
One technique is to use JavaScript to encrypt the address. I've tried
Hivewire Enkoder and it works fine, except that a significant
percentage of browsers have JavaScript disabled. So this option is
out.
An alternative is to make the address a graphical file. I suppose I
could define the dimensions of this graphic in terms of em, so that
when a user changes his font size, it will change as well. However,
the addresses of concern to me are only in the webpage source, and so
a graphical substitute is no help.
Is there any method (I'm running Debian) that will not depend on
javascript and yet is likely to block a harvester for quite some time
to come? It would be nice to have it controlled by a style sheet, so
that the thousands of instances do not have to be updated one by one
as need arises.
--
Haines Brown
>>>>> "Haines" == Haines Brown <br****@teufel.hartford-hwp.com> writes:
Haines> One technique is to use JavaScript to encrypt the address. I've tried
Haines> Hivewire Enkoder and it works fine, except that a significant
Haines> percentage of browsers have JavaScript disabled. So this option is
Haines> out.
Yes, don't do that.
Haines> An alternative is to make the address a graphical file. I suppose I
Haines> could define the dimensions of this graphic in terms of em, so that
Haines> when a user changes his font size, it will change as well. However,
Haines> the addresses of concern to me are only in the webpage source, and so
Haines> a graphical substitute is no help.
No, don't do that either. Blind users will sue your ass.
So far, I've not seen any harvesters even bother with
HTML-de-entitizing, since there are so many "low hanging fruits" that
don't require much CPU processing. If someone could tell me if
they've been harvested as:
Send mail to me at
<a href="mailto:merlyn@stonehenge.com">merlyn@stonehenge.com</a>!
then I'll stop recommending that. But seriously, this is enough to
thwart everyone out there so far. To be doubly safe, encode some of
"mailto:" as well.
When there are 10 million fewer addresses that *aren't* written
that way, I'll change my recommendation. :)
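
For anyone who wants to generate such links rather than type entities by hand, here is a minimal sketch in Python. The function names and the placeholder address user@example.com are assumptions made for illustration, not something from the posts above; it simply writes every character of "mailto:" and the address as a decimal numeric character reference.

```python
def entity_encode(text: str) -> str:
    """Replace every character with its decimal numeric character reference."""
    return "".join(f"&#{ord(ch)};" for ch in text)


def obfuscated_mailto(address: str) -> str:
    """Build a mailto link whose scheme and address are written as entities."""
    encoded_scheme = entity_encode("mailto:")
    encoded_address = entity_encode(address)
    return f'<a href="{encoded_scheme}{encoded_address}">{encoded_address}</a>'


if __name__ == "__main__":
    # user@example.com is a placeholder address, not one from the thread.
    print(obfuscated_mailto("user@example.com"))
```

A browser decodes the references before it handles the link, so the reader sees and clicks an ordinary address, while the page source contains no literal "@" or "mailto:".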
--
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<me****@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!
"Haines Brown" <br****@teufel.hartford-hwp.com> wrote in message
news:87************@teufel.hartford-hwp.com...
> I have e-mail addresses in HTML source files both as commented
> information and also as part of an e-mail link. Naturally, I'd like
> to hide them from harvesters to discourage more spam.
I have a simple and effective solution. Use an email form, with some
standard formmail procedure; in the field where you would normally specify
the recipient's email address, put in a code; then modify the formmail
procedure to map the code into your desired email address.
This completely removes the email address from your HTML, making it
impossible for harvesters to find it. This also prevents your formmail
procedure from being hijacked by spammers.
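
A minimal sketch of the code-to-address mapping, written in Python rather than in any particular formmail script. The table contents, the codes, and the function name are hypothetical; the point is only that the form carries a short code and the server refuses anything that is not in its own table.

```python
# Hypothetical recipient table: the form carries only the short code,
# and the real addresses live on the server, never in the HTML.
RECIPIENTS = {
    "webmaster": "webmaster@example.com",
    "sales": "sales@example.com",
}


def resolve_recipient(code: str) -> str:
    """Return the address for a form-supplied code, rejecting anything unknown."""
    try:
        return RECIPIENTS[code]
    except KeyError:
        raise ValueError(f"unknown recipient code: {code!r}") from None
```

Because the lookup rejects any value that is not a known code, a spammer who submits an arbitrary address in the recipient field gets an error instead of a relayed message.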
The downside is that some hosts don't let you customize their formmail
procedure. (This means YOU, Tina, and it's the only thing which is
preventing me from switching my clients to you! Sigh.)
me****@stonehenge.com (Randal L. Schwartz) writes:
> So far, I've not seen any harvesters even bother with
> HTML-de-entitizing, since there are so many "low hanging fruits" that
> don't require much CPU processing. If someone could tell me if
> they've been harvested as:
> Send mail to me at
> <a href="mailto:merlyn@stonehenge.com">merlyn@stonehenge.com</a>!
> then I'll stop recommending that. But seriously, this is enough to
> thwart everyone out there so far. To be doubly safe, encode some of
> "mailto:" as well.
Elegant solution! Probably the reason it didn't occur to me ;-) I'm
only waiting for a lurker to jump in and explain why your assumptions
are wrong, but so far, so good (I haven't tried it yet under IE).
While I can't use this for the e-mail address that is placed in a
comment in the page source, I figure anyone who is savvy enough to
snoop there is sophisticated enough to understand any tricks I might
play with the address.
--
Haines Brown
On Mon, 21 Jun 2004 17:38:52 GMT, Haines Brown
<br****@teufel.hartford-hwp.com> wrote:
> I have e-mail addresses in HTML source files both as commented
> information and also as part of an e-mail link. Naturally, I'd like
> to hide them from harvesters to discourage more spam.
At a friend's suggestion well over a year ago I started using "Email
Address Encoder" found at: http://www.wbwip.com/wbw/emailencoder.html
It's quick and easy and since implementing it on only one email
address that's posted on one web page I haven't gotten any unwanted
garbage. Maybe I've just been lucky, but it's worked for me. You can
see it in use at the URL below.
Leslie
Leslie's Audio Trivia http://www.BessieBee.com/Trivia/
"I refuse to have a battle of wits with an unarmed person."
"C A Upsdell" <cupsdell0311XXX@-> wrote in
comp.infosystems.www.authoring.html:
> I have a simple and effective solution. Use an email form, with some
> standard formmail procedure; in the field where you would normally
> specify the recipient's email address, put in a code; then modify the
> formmail procedure to map the code into your desired email address.
> This completely removes the email address from your HTML, making it
> impossible for harvesters to find it. This also prevents your formmail
> procedure from being hijacked by spammers.
And it also irritates the heck out of people like me who would
rather use our regular mailer because we like its editor, its spell
checker, and its ability to retain a copy of what we sent.
--
Stan Brown, Oak Road Systems, Cortland County, New York, USA http://OakRoadSystems.com/
HTML 4.01 spec: http://www.w3.org/TR/html401/
validator: http://validator.w3.org/
CSS 2 spec: http://www.w3.org/TR/REC-CSS2/
2.1 changes: http://www.w3.org/TR/CSS21/changes.html
validator: http://jigsaw.w3.org/css-validator/
Leslie <bo****@rocketmail.com> writes:
> At a friend's suggestion well over a year ago I started using "Email
> Address Encoder" found at: http://www.wbwip.com/wbw/emailencoder.html
> It's quick and easy and since implementing it on only one email
> address that's posted on one web page I haven't gotten any unwanted
> garbage. Maybe I've just been lucky, but it's worked for me. You can
> see it in use at the URL below.
I might add that this url points to a utility that simply converts all
the characters in an address to character entities. Leslie's mailto,
which uses it, works just fine (my galeon browser brings up my
emacs/rmail reader).
However, this page to which he refers also has a link to a page
( http://www.neokraft.net/sottises/mailencoder/ ) that offers a
service to convert an address into hexadecimal notation. I tried it on
my galeon browser, and it works.
I don't see in principle any advantage of hexadecimal over character
entities, or representing an entire address as character entities
rather than just doing a character or two in "mailto" and your
address, but suspect that if harvesters eventually go to the trouble
to skip empty spaces, convert the word "dot" and "@", and even convert
any character entities to plain ASCII, there is less likelihood they
will also think to convert from hex.
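
To make the comparison concrete, here is a rough Python sketch of two things "hexadecimal notation" might mean. Which of the two the neokraft service actually produces is an assumption on my part, and the function names are invented for illustration: one writes hexadecimal character references, the other percent-encodes the address for use inside the href.

```python
def hex_entity_encode(text: str) -> str:
    """Write each character as a hexadecimal character reference, e.g. &#x40;."""
    return "".join(f"&#x{ord(ch):x};" for ch in text)


def percent_encode(text: str) -> str:
    """Percent-encode every byte, e.g. %40. Valid inside a URL only, so it can
    hide the address part of an href but not the visible link text, and the
    "mailto:" scheme itself must stay unencoded."""
    return "".join(f"%{byte:02X}" for byte in text.encode("utf-8"))


if __name__ == "__main__":
    # user@example.com is a placeholder address.
    print(hex_entity_encode("user@example.com"))
    print("mailto:" + percent_encode("user@example.com"))
```

Both forms are mechanical, one-line transformations, which fits the observation that neither has an advantage in principle over decimal entities.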
Let me go back to one of my original questions: is it possible to
represent an address in a source page, such as placing in its head:
<!-- contact Haines Brown: br***@hartford-hwp.com -->
in a way that is transparent to a human reader, but opaque to a
harvester?
--
Haines Brown
Haines Brown wrote:
<snip>
> I don't see in principle any advantage of hexadecimal over character
> entities, or representing an entire address as character entities
> rather than just doing a character or two in "mailto" and your
> address, but suspect that if harvesters eventually go to the trouble
> to skip empty spaces, convert the word "dot" and "@", and even convert
> any character entities to plain ASCII, there is less likelihood they
> will also think to convert from hex.
Eventually someone will write an e-mail address harvester that uses IE
embedded as a component (it is not that difficult). When they do it will
be able to read IE's interpretation of link URLs straight from the DOM
and character and hex encoding, or JavaScript tricks, will not help
conceal anything from it.
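
Even without embedding a browser, undoing entity and hex encoding takes very little work. The sketch below uses only Python's standard library; the regular expression is a deliberately loose assumption about what an address looks like, and the function name is made up for illustration.

```python
import html
import re
from urllib.parse import unquote

# Loose, illustrative pattern for something that looks like an address.
ADDRESS_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def harvest(markup: str) -> list[str]:
    """Decode character references and percent-escapes, then match addresses."""
    decoded = unquote(html.unescape(markup))
    return ADDRESS_RE.findall(decoded)


if __name__ == "__main__":
    # Placeholder link with the "@" and "." written as decimal entities.
    print(harvest('<a href="mailto:user&#64;example&#46;com">mail</a>'))
```

Two library calls recover the address from decimal entities, hex entities, and percent-encoding alike, so these encodings only help for as long as harvesters do not bother to decode.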
The observation that currently there are plenty of e-mail addresses
available for little or no effort explains why such a system has not
been created (combined with technical ignorance on the part of people
who are interested in harvesting e-mail addresses). But in the end it
might just be the case that if you put an e-mail address in a public
place you should expect to get spammed.
Richard.
On Mon, 21 Jun 2004, Stan Brown wrote:
> "C A Upsdell" <cupsdell0311XXX@-> wrote in
> comp.infosystems.www.authoring.html:
>> I have a simple and effective solution. Use an email form, with some
>> standard formmail procedure; in the field where you would normally
>> specify the recipient's email address, put in a code; then modify the
>> formmail procedure to map the code into your desired email address.
>> This completely removes the email address from your HTML, making it
>> impossible for harvesters to find it. This also prevents your formmail
>> procedure from being hijacked by spammers.
> And it also irritates the heck out of people like me who would rather
> use our regular mailer because we like its editor, its spell checker,
> and its ability to retain a copy of what we sent.
Perhaps so, but you get to use your regular mail program when the person you
are making first contact with via their form REPLIES and thus gives you a
(hopefully valid) return mailbox.
Personally, I use a PHP script to generate and process the form that mails me.
No matter how much someone "munges" an address, that will NEVER stop spam that
finds mailboxes by using a random word attack (including dictionary attacks; a
subset). Smarter spambots will eventually read things like &#40; and such, and
therefore, those constructs only delay the inevitable harvesting.
Using images to display the mailbox, besides being obnoxious to the blind,
won't work where someone is using a non-graphical browser, or has "load images"
turned off.
On Tue, 22 Jun 2004, Haines Brown wrote:
> Let me go back to one of my original questions: is it possible to
> represent an address in a source page, such as placing in its head:
> <!-- contact Haines Brown: br***@hartford-hwp.com -->
> in a way that is transparent to a human reader, but opaque to a
> harvester?
That IS transparent to the human reader, viewing the page's source
notwithstanding, and probably opaque to a spambot. The fact that it's in a
comment construct doesn't matter; they'll take it.
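
To illustrate why the comment construct makes no difference, here is a small Python sketch of a naive harvester that runs a pattern over the raw page source. The pattern and the placeholder address author@example.com are assumptions for illustration only; real harvesters may work differently.

```python
import re

# Loose, illustrative pattern; it knows nothing about HTML structure.
ADDRESS_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def find_addresses(source: str) -> list[str]:
    """Scan raw page source, comments included, for address-like strings."""
    return ADDRESS_RE.findall(source)


if __name__ == "__main__":
    # A plain address inside a comment is matched like any other text.
    print(find_addresses("<!-- contact Jane Doe: author@example.com -->"))
    # The spelled-out form is not matched by this particular pattern.
    print(find_addresses("<!-- contact Jane Doe: author at example dot com -->"))
```

A scanner of this kind never parses the HTML, so an address in a comment is as exposed as one in the body text; only a form the pattern does not recognize escapes it, and only until the pattern is extended.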
What is of concern is that you think that the REAL mailbox should be there, not
that of a spamtrap or honeypot address....
"D. Stussy" <kd****@bde-arc.ampr.org> writes:
> On Tue, 22 Jun 2004, Haines Brown wrote:
>> Let me go back to one of my original questions: is it possible to
>> represent an address in a source page, such as placing in its head:
>> <!-- contact Haines Brown: br***@hartford-hwp.com -->
>> in a way that is transparent to a human reader, but opaque to a
>> harvester?
> That IS transparent to the human reader, viewing the page's source
> notwithstanding, and probably opaque to a spambot. The fact that it's
> in a comment construct doesn't matter; they'll take it.
> What is of concern is that you think that the REAL mailbox should be
> there, not that of a spamtrap or honeypot address....
I don't want to drag this out, but I'm not following your point.
The example is of the following line placed in a web page header:
<!-- contact Haines Brown: br***@hartford-hwp.com -->
You seem to agree with my presumption that this is no problem for a
human (viewing the source page) to read. However, I don't understand
why you suggest that a spambot harvester would probably not be able to
read it. Why not?
Are you saying that I am wrong to want my real address to appear in
the header? Why? I started to include it years ago when people would
actually study each other's pages, but maintain it today just because I
feel I should admit responsibility for my work in the source, much as
I would in programming source code.
Haines Brown
On Tue, 22 Jun 2004 12:11:02 +0100, "Richard Cornford"
<Ri*****@litotes.demon.co.uk> wrote:
> Eventually someone will write an e-mail address harvester that uses IE
> embedded as a component (it is not that difficult).
Indeed, it was around 2002 when I did mine...
Jim.
--
comp.lang.javascript FAQ - http://jibbering.com/faq/