pop3 email header classifier?

Hi, I'm getting vast numbers of fake upgrade emails containing some kind
of virus. My rather old client can be made to reject these based on some
patterns in the subject line. They're nearly all based on the word
'New', 'Latest', 'Microsoft', 'Patch', 'Pack', ... etc etc.

Is there a python tool that can be made to delete these from my POP3
mail box rather than let my client reject? Quite a few seem to have
semi-valid return addresses so I get postmaster rejects from
xx*@microsoft.com etc.

I know about spam-bayes etc, but these things are over 120k each and it
seems pretty pointless to download them (as well as taking about an
hour).
--
Robin Becker
Jul 18 '05 #1

[Robin]
> Hi, I'm getting vast numbers of fake upgrade emails containing some kind
> of virus. My rather old client can be made to reject these based on some
> patterns in the subject line. They're nearly all based on the word
> 'New', 'Latest', 'Microsoft', 'Patch', 'Pack', ... etc etc.
>
> Is there a python tool that can be made to delete these from my POP3
> mail box rather than let my client reject?

I have a webmail application that can be made to delete messages based on
regular expressions, at http://entrian.com/cgi-bin/pop3.py

I wrote it in response to a similar problem, whereby a spammer used my
address as his From address, and I received a couple of thousand bounce
messages a day.

You can set up regular expression filters on To, From and Subject, and set
it to either mark messages for deletion (so you get to review them before
deleting them) or delete them straight away (via the "I'm either brave or
stupid" checkbox, TM 8-) You can save your filters for later use.

Take EXTREME CARE with this, particularly if you check the "I'm either
brave or stupid" box. 8-) There is no way to recover a deleted message.
Don't sue me if it eats your hamster's emails.

You probably need something like (untested):

From: microsoft|ms\b
Subject: patch|latest|microsoft|update|upgrade|pack
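
For anyone who would rather run the same kind of filter locally, here is a
minimal sketch of how those two patterns could be applied to a message's
headers in Python. It is not the code behind the webmail tool. The function
name, the case-insensitive matching, and the choice to require BOTH headers
to match are assumptions made for illustration.

```python
import re
from email.parser import Parser

# Patterns copied from the post above; IGNORECASE is an assumption.
FROM_RE = re.compile(r"microsoft|ms\b", re.IGNORECASE)
SUBJECT_RE = re.compile(r"patch|latest|microsoft|update|upgrade|pack", re.IGNORECASE)


def looks_like_fake_upgrade(raw_headers: str) -> bool:
    """Return True when both the From and the Subject header match.

    Hypothetical helper: requiring both headers to match is an assumption,
    chosen to reduce the risk of deleting legitimate mail.
    """
    msg = Parser().parsestr(raw_headers, headersonly=True)
    sender = msg.get("From", "")
    subject = msg.get("Subject", "")
    return bool(FROM_RE.search(sender) and SUBJECT_RE.search(subject))
```

Note that `ms\b` matches any From value containing a word that ends in "ms",
so it is worth testing the patterns against real mail before deleting on them.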

There's no SSL version of this, so your POP3 account details will pass in
plain text over the internet (in theory my provider has a scheme whereby
you can access the site over SSL using their certificate, but it doesn't
work for some reason - if there's any interest I'll see whether I can make
it work).

(And no, I'm not going to harvest your POP3 account details. They never
even hit the hard drive.)

--
Richie Hindle
ri****@entrian.com
Jul 18 '05 #2
In message <6c********************************@4ax.com>, Richie Hindle
<ri****@entrian.com> writes
> I have a webmail application that can be made to delete messages based on
> regular expressions, at http://entrian.com/cgi-bin/pop3.py

Someone has posted a poplib command-line thing on much the same lines in
another thread.

--
Robin Becker

Jul 18 '05 #3
Robin Becker <ro***@jessikat.fsnet.co.uk> wrote:

> Hi, I'm getting vast numbers of fake upgrade emails containing some kind
> of virus. My rather old client can be made to reject these based on some
> patterns in the subject line. They're nearly all based on the word
> 'New', 'Latest', 'Microsoft', 'Patch', 'Pack', ... etc etc.
>
> Is there a python tool that can be made to delete these from my POP3
> mail box rather than let my client reject? Quite a few seem to have
> semi-valid return addresses so I get postmaster rejects from
> xx*@microsoft.com etc.

Is your e-mail client actually set up to send a RESPONSE when you receive a
virus attachment? If so, can you please STOP IT AT ONCE?

ALL viruses released in the last 3 years choose random names for both the
sender AND recipient. It is not possible to automatically extract the
infected individual's e-mail address from a virus message. You can find
the address of their e-mail server, but that's all.

By sending a polite "you sent me a virus" message, you are doing NOTHING to
stop the viruses, you are ANNOYING an innocent person, and you are DOUBLING
the e-mail volume damage caused by the virus script kiddies.

I got close to 10,000 helpful and completely bogus "you sent me a virus"
messages during the "SoBig" fiasco.
--
- Tim Roberts, ti**@probo.com
Providenza & Boekelheide, Inc.
Jul 18 '05 #4
In article <r8********************************@4ax.com>, Tim Roberts
<ti**@probo.com> writes
> Robin Becker <ro***@jessikat.fsnet.co.uk> wrote:
>
>> Hi, I'm getting vast numbers of fake upgrade emails containing some kind
>> of virus. My rather old client can be made to reject these based on some
>> patterns in the subject line. They're nearly all based on the word
>> 'New', 'Latest', 'Microsoft', 'Patch', 'Pack', ... etc etc.
>>
>> Is there a python tool that can be made to delete these from my POP3
>> mail box rather than let my client reject? Quite a few seem to have
>> semi-valid return addresses so I get postmaster rejects from
>> xx*@microsoft.com etc.
>
> Is your e-mail client actually set up to send a RESPONSE when you receive a
> virus attachment? If so, can you please STOP IT AT ONCE?

I have no virus detection in the client and am deliberately not
rejecting. That was the whole point of my question: I wanted to do
better.

As a point of fact with this SWEN worm, it does seem possible to kill by
a combination of the subject, from address and attachment size. The
spambayes approach would certainly work, but it wouldn't improve my
download times. I estimate I had about 50 MB of these things to download
yesterday (i.e. 3-4 hours @ 56k). By employing a kill script I could keep
up fairly easily.

I'm certainly not sending any response or rejecting; I'm using DELE,
which should be a sink.
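
A rough sketch of such a kill script, assuming the standard library's poplib:
LIST gives each message's size, TOP with zero body lines fetches only the
headers, and DELE marks the message for deletion. The patterns, the
100,000-byte threshold, the function name and the `dry_run` switch are all
assumptions for illustration, not something taken from a tested script.

```python
import poplib
import re

# Assumed patterns, loosely based on the subject words mentioned in this thread.
SUBJECT_RE = re.compile(
    rb"^Subject:.*\b(new|latest|microsoft|patch|pack)\b",
    re.IGNORECASE | re.MULTILINE,
)
FROM_RE = re.compile(rb"^From:.*microsoft", re.IGNORECASE | re.MULTILINE)
MIN_SIZE = 100_000  # bytes; assumed cut-off, since the worm mails are "over 120k"


def kill_candidates(conn: poplib.POP3, min_size=MIN_SIZE, dry_run=True):
    """Return numbers of messages that are large AND match both header patterns.

    With dry_run=False each match is also marked with DELE; the server only
    removes marked messages when the session ends with QUIT.
    """
    doomed = []
    _resp, listing, _octets = conn.list()
    for entry in listing:
        number, size = (int(field) for field in entry.split())
        if size < min_size:
            continue  # small messages are left for the normal mail client
        _resp, header_lines, _octets = conn.top(number, 0)  # headers only
        headers = b"\n".join(header_lines)
        if SUBJECT_RE.search(headers) and FROM_RE.search(headers):
            doomed.append(number)
            if not dry_run:
                conn.dele(number)
    return doomed


# Usage sketch (hypothetical host and credentials, left commented out):
# conn = poplib.POP3("pop.example.net")
# conn.user("robin")
# conn.pass_("secret")
# print(kill_candidates(conn))   # review first, then rerun with dry_run=False
# conn.quit()
```

Only the headers of the large messages cross the wire, so the 120k bodies are
never downloaded.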
> ALL viruses released in the last 3 years choose random names for both the
> sender AND recipient. It is not possible to automatically extract the
> infected individual's e-mail address from a virus message. You can find
> the address of their e-mail server, but that's all.
>
> By sending a polite "you sent me a virus" message, you are doing NOTHING to
> stop the viruses, you are ANNOYING an innocent person, and you are DOUBLING
> the e-mail volume damage caused by the virus script kiddies.
>
> I got close to 10,000 helpful and completely bogus "you sent me a virus"
> messages during the "SoBig" fiasco.


--
Robin Becker
Jul 18 '05 #5
<posted & mailed>

Robin Becker wrote:
> Hi, I'm getting vast numbers of fake upgrade emails containing some kind
> of virus. My rather old client can be made to reject these based on some
> patterns in the subject line. They're nearly all based on the word
> 'New', 'Latest', 'Microsoft', 'Patch', 'Pack', ... etc etc.
>
> Is there a python tool that can be made to delete these from my POP3
> mail box rather than let my client reject? Quite a few seem to have
> semi-valid return addresses so I get postmaster rejects from
> xx*@microsoft.com etc.
>
> I know about spam-bayes etc, but these things are over 120k each and it
> seems pretty pointless to download them (as well as taking about an
> hour).

I posted an "emergency script" to be used for the purpose -- it
triggers SOLELY on mail size. I have now enhanced it with lots of
options etc, but the basic idea remains that of size-only triggering --
risky, but it IS an emergency. BTW, the "postmaster rejects" are
likely not connected to what you do with the "fake upgrade emails",
alas -- rather, virus senders are now faking "From:" &c addresses,
so everybody's getting lots of bounce msgs for mails they never sent.
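
The emergency script itself is not reproduced in this thread. As a hedged
illustration of size-only triggering, the sketch below parses the POP3 LIST
response and deletes anything over a limit. The function names, the
100,000-byte limit and the connection details are assumptions.

```python
import poplib

SIZE_LIMIT = 100_000  # bytes; an assumed threshold, tune it to the worm's size


def oversized(listing, limit=SIZE_LIMIT):
    """Map LIST lines such as b'3 125000' to the message numbers over the limit."""
    result = []
    for entry in listing:
        number, size = entry.split()[:2]
        if int(size) > limit:
            result.append(int(number))
    return result


def delete_oversized(host, user, password, limit=SIZE_LIMIT):
    """Connect, mark every oversized message with DELE, and return their numbers."""
    conn = poplib.POP3(host)
    conn.user(user)
    conn.pass_(password)
    doomed = oversized(conn.list()[1], limit)
    for number in doomed:
        conn.dele(number)
    conn.quit()  # marked messages are actually removed when the session ends
    return doomed
```

Because nothing but the size is checked, a legitimate large attachment would
be deleted too, which is exactly the risk described above.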
Alex

Jul 18 '05 #6
Robin Becker <ro***@jessikat.fsnet.co.uk> wrote previously:
|Is there a python tool that can be made to delete these from my POP3
|mail box rather than let my client reject?
|I know about spam-bayes etc, but these things are over 120k each and it
|seems pretty pointless to download them (as well as taking about an
|hour).

I do exactly this myself. For my article (about a year ago now) on Spam
filtering, for IBM developerWorks, I developed my own little custom
tool. I've refined it over time, but it remains kinda hackerish and
un(der)documented. Still, I'd be happy to share with anyone
interested... especially if anyone wants to make something nice out of
it for distribution.

What I do is a hodgepodge, but the general idea is that I use poplib to
download ONLY the headers. Those messages that are convincingly spam
based on the headers alone get deleted without me ever needing to
download the bodies.
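
A minimal sketch of the headers-only download, assuming poplib's `top()` and
the standard email parser. It is not taken from the spamfilter tool, and the
function name is made up for illustration. TOP is an optional POP3 command,
so this also assumes the server supports it.

```python
import poplib
from email.parser import BytesParser


def fetch_headers(conn: poplib.POP3):
    """Return {message_number: parsed_headers} without downloading any bodies.

    Assumes a fresh session, so message numbers run from 1 to the STAT count.
    """
    count, _mailbox_size = conn.stat()
    parser = BytesParser()
    headers = {}
    for number in range(1, count + 1):
        # TOP n 0 asks for the headers plus zero lines of the body.
        _resp, lines, _octets = conn.top(number, 0)
        headers[number] = parser.parsebytes(b"\r\n".join(lines), headersonly=True)
    return headers
```

Each value is an `email.message.Message`, so `headers[1]["Subject"]` gives the
subject of the first message.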

As a first line of defense, I have a collection of blacklist and
whitelist patterns (I only use strings and globs, not regexen; though
the latter would be easy to add). These look at specific header fields
in which patterns might occur (or at the whole header, if I wish).
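
As an illustration of the glob-based first line of defense, here is a small
sketch using the standard fnmatch module. The rule format, the example rules,
and the decision to let the whitelist win over the blacklist are all
assumptions, not a description of the actual tool.

```python
from fnmatch import fnmatchcase

# Hypothetical rules: (header field, glob). "*" as the field means any header.
WHITELIST = [("From", "*@example.org*")]
BLACKLIST = [("Subject", "*microsoft*patch*"), ("*", "*upgrade*")]


def _matches(rules, headers):
    """True if any (field, glob) rule matches any (name, value) header pair."""
    for field, pattern in rules:
        for name, value in headers:
            if field != "*" and field.lower() != name.lower():
                continue
            # Lowercase both sides so matching is case-insensitive everywhere.
            if fnmatchcase(value.lower(), pattern.lower()):
                return True
    return False


def verdict(headers, whitelist=WHITELIST, blacklist=BLACKLIST):
    """Return 'keep', 'delete' or 'unknown'; the whitelist is checked first."""
    if _matches(whitelist, headers):
        return "keep"
    if _matches(blacklist, headers):
        return "delete"
    return "unknown"
```

Messages that come back as 'unknown' are the ones that would be handed on to
the statistical filter.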

But the next line of defense is the usual naive Bayesian style. The
wrinkle here is that I do not use "words" in the headers for analysis,
but rather trigrams (sequences of three characters). I believe that for
headers-only, this is more accurate, although I have not rigorously
tested this. Things like routing IPs and spam mail clients are hard to
pick out by whole words, but trigrams do some magic.
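
To make the trigram idea concrete, here is a toy naive Bayes scorer that uses
overlapping three-character sequences as features. The class name, the add-one
smoothing and the log-odds scoring are assumptions chosen for a short example.
The real tool may well combine probabilities differently.

```python
import math
from collections import Counter


def trigrams(text):
    """Return every overlapping three-character sequence, lowercased."""
    text = text.lower()
    return [text[i:i + 3] for i in range(len(text) - 2)]


class TrigramBayes:
    """Toy two-class naive Bayes over header trigrams (illustrative only)."""

    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.docs = {"spam": 0, "ham": 0}

    def train(self, label, header_text):
        self.counts[label].update(trigrams(header_text))
        self.docs[label] += 1

    def spam_log_odds(self, header_text):
        """Positive means more spam-like; uses add-one (Laplace) smoothing."""
        vocab = len(set(self.counts["spam"]) | set(self.counts["ham"])) or 1
        totals = {label: sum(c.values()) for label, c in self.counts.items()}
        # Start from the smoothed prior odds of spam versus ham.
        score = math.log((self.docs["spam"] + 1) / (self.docs["ham"] + 1))
        for gram in trigrams(header_text):
            p_spam = (self.counts["spam"][gram] + 1) / (totals["spam"] + vocab)
            p_ham = (self.counts["ham"][gram] + 1) / (totals["ham"] + vocab)
            score += math.log(p_spam / p_ham)
        return score
```

Trained on a few hundred headers of each kind, a scorer like this can pick up
fragments of routing IPs or mailer names that never appear as whole words.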

The other feature of my 'spamfilter' tool is that it knows nothing at
all about specific mail clients. It just sits daemon-like, and
periodically deletes stuff it doesn't like. I check mail from a lot of
different clients, on a lot of different machines; so for me it would be
inconvenient to have the filtering tied to one particular mail
client/machine. My thing just runs and kills, even when I'm out of
town and checking from internet cafes.

Yours, David...

--
mertz@ | The specter of free information is haunting the `Net! All the
gnosis | powers of IP- and crypto-tyranny have entered into an unholy
..cx | alliance...ideas have nothing to lose but their chains. Unite
| against "intellectual property" and anti-privacy regimes!
-------------------------------------------------------------------------
Jul 18 '05 #7
