473,379 Members | 1,260 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,379 software developers and data experts.

pop3 email header classifier?

Hi, I'm getting vast numbers of fake upgrade emails containing some kind
of virus. My rather old client can be made to reject these based on some
patterns in the subject line. They're nearly all based on the word
'New', 'Latest', 'Microsoft', 'Patch', 'Pack', ... etc etc.

Is there a python tool that can be made to delete these from my POP3
mail box rather than let my client reject? Quite a few seem to have
semi-valid return addresses so I get postmaster rejects from
xx*@microsoft.com etc.

I know about spam-bayes etc, but these things are over 120k each and it
seems pretty pointless to download them (as well as taking about an
hour).
--
Robin Becker
Jul 18 '05 #1
6 2449

[Robin]
Hi, I'm getting vast numbers of fake upgrade emails containing some kind
of virus. My rather old client can be made to reject these based on some
patterns in the subject line. They're nearly all based on the word
'New', 'Latest', 'Microsoft', 'Patch', 'Pack', ... etc etc.

Is there a python tool that can be made to delete these from my POP3
mail box rather than let my client reject?


I have a webmail application that can be made to delete messages based on
regular expressions, at http://entrian.com/cgi-bin/pop3.py

I wrote it in response to a similar problem, whereby a spammer used my
address as his From address, and I received a couple of thousand bounce
messages a day.

You can set up regular expression filters on To, From and Subject, and set
it to either mark messages for deletion (so you get to review them before
deleting them) or delete them straight away (via the "I'm either brave or
stupid" checkbox, TM 8-) You can save your filters for later use.

Take EXTREME CARE with this, particularly if you check the "I'm either
brave or stupid" box. 8-) There is no way to recover a deleted message.
Don't sue me if it eats your hamster's emails.

You probably need something like (untested):

From: microsoft|ms\b
Subject: patch|latest|microsoft|update|upgrade|pack

There's no SSL version of this, so your POP3 account details will pass in
plain text over the internet (in theory my provider has a scheme whereby
you can access the site over SSL using their certificate, but it doesn't
work for some reason - if there's any interest I'll see whether I can make
it work).

(And no, I'm not going to harvest your POP3 account details. They never
even hit the hard drive.)

--
Richie Hindle
ri****@entrian.com
Jul 18 '05 #2
In message <6c********************************@4ax.com>, Richie Hindle
<ri****@entrian.com> writes

someone has posted a poplib command line thing on much the same lines in
another thread.
[Robin]
Hi, I'm getting vast numbers of fake upgrade emails containing some kind
of virus. My rather old client can be made to reject these based on some
patterns in the subject line. They're nearly all based on the word
'New', 'Latest', 'Microsoft', 'Patch', 'Pack', ... etc etc.

Is there a python tool that can be made to delete these from my POP3
mail box rather than let my client reject?


I have a webmail application that can be made to delete messages based on
regular expressions, at http://entrian.com/cgi-bin/pop3.py

I wrote it in response to a similar problem, whereby a spammer used my
address as his From address, and I received a couple of thousand bounce
messages a day.

You can set up regular expression filters on To, From and Subject, and set
it to either mark messages for deletion (so you get to review them before
deleting them) or delete them straight away (via the "I'm either brave or
stupid" checkbox, TM 8-) You can save your filters for later use.

Take EXTREME CARE with this, particularly if you check the "I'm either
brave or stupid" box. 8-) There is no way to recover a deleted message.
Don't sue me if it eats your hamster's emails.

You probably need something like (untested):

From: microsoft|ms\b
Subject: patch|latest|microsoft|update|upgrade|pack

There's no SSL version of this, so your POP3 account details will pass in
plain text over the internet (in theory my provider has a scheme whereby
you can access the site over SSL using their certificate, but it doesn't
work for some reason - if there's any interest I'll see whether I can make
it work).

(And no, I'm not going to harvest your POP3 account details. They never
even hit the hard drive.)


--
Robin Becker

Jul 18 '05 #3
Robin Becker <ro***@jessikat.fsnet.co.uk> wrote:

Hi, I'm getting vast numbers of fake upgrade emails containing some kind
of virus. My rather old client can be made to reject these based on some
patterns in the subject line. They're nearly all based on the word
'New', 'Latest', 'Microsoft', 'Patch', 'Pack', ... etc etc.

Is there a python tool that can be made to delete these from my POP3
mail box rather than let my client reject? Quite a few seem to have
semi-valid return addresses so I get postmaster rejects from
xx*@microsoft.com etc.


Is your e-mail client actually set up to send a RESPONSE when you receive a
virus attachment? If so, can you please STOP IT AT ONCE?

ALL viruses released in the last 3 years choose random names for both the
sender AND recipient. It is not possible to automatically extract the
infected individual's e-mail address from a virus message. You can find
the address of their e-mail server, but that's all.

By sending a polite "you sent me a virus" message, you are doing NOTHING to
stop the viruses, you are ANNOYING an innocent person, and you are DOUBLING
the e-mail volume damage caused by the virus script kiddies.

I got close to 10,000 helpful and completely bogus "you sent my a virus"
messages during the "SoBig" fiasco.
--
- Tim Roberts, ti**@probo.com
Providenza & Boekelheide, Inc.
Jul 18 '05 #4
In article <r8********************************@4ax.com>, Tim Roberts
<ti**@probo.com> writes
Robin Becker <ro***@jessikat.fsnet.co.uk> wrote:

Hi, I'm getting vast numbers of fake upgrade emails containing some kind
of virus. My rather old client can be made to reject these based on some
patterns in the subject line. They're nearly all based on the word
'New', 'Latest', 'Microsoft', 'Patch', 'Pack', ... etc etc.

Is there a python tool that can be made to delete these from my POP3
mail box rather than let my client reject? Quite a few seem to have
semi-valid return addresses so I get postmaster rejects from
xx*@microsoft.com etc.
Is your e-mail client actually set up to send a RESPONSE when you receive a
virus attachment? If so, can you please STOP IT AT ONCE?


I have no virus detection in the client and am deliberately not
rejecting. That was the whole point of my question I wanted to do
better.

As a point of fact with this SWEN worm, it does seem possible to kill by
a combination of the subject, from address and attachment size. The
spambayes approach would certainly work, but it wouldn't improve my
download times. I estimate I had about 50Mb of these things to download
yesterday (ie 3-4 hours @ 56k). By employing a kill script I could keep
up fairy easily.

I'm certainly not sending any response or rejecting, I'm using DELE
which should be a sink.
ALL viruses released in the last 3 years choose random names for both the
sender AND recipient. It is not possible to automatically extract the
infected individual's e-mail address from a virus message. You can find
the address of their e-mail server, but that's all.

By sending a polite "you sent me a virus" message, you are doing NOTHING to
stop the viruses, you are ANNOYING an innocent person, and you are DOUBLING
the e-mail volume damage caused by the virus script kiddies.

I got close to 10,000 helpful and completely bogus "you sent my a virus"
messages during the "SoBig" fiasco.


--
Robin Becker
Jul 18 '05 #5
<posted & mailed>

Robin Becker wrote:
Hi, I'm getting vast numbers of fake upgrade emails containing some kind
of virus. My rather old client can be made to reject these based on some
patterns in the subject line. They're nearly all based on the word
'New', 'Latest', 'Microsoft', 'Patch', 'Pack', ... etc etc.

Is there a python tool that can be made to delete these from my POP3
mail box rather than let my client reject? Quite a few seem to have
semi-valid return addresses so I get postmaster rejects from
xx*@microsoft.com etc.

I know about spam-bayes etc, but these things are over 120k each and it
seems pretty pointless to download them (as well as taking about an
hour).


I posted an "emergency script" to be used for the purpose -- it
triggers SOLELY on mail size. I have now enhanced it with lots of
options etc, but the basic idea remains that of size-only triggering --
risky but, it IS an emergency. BTW, the "postmaster rejects" are
likely not connected to what you do with the "fake upgrade emails",
alas -- rather, virus senders are now faking "From:" &c addresses,
so everybody's getting lots of bounce msgs for mails they never sent.
Alex

Jul 18 '05 #6
Robin Becker <ro***@jessikat.fsnet.co.uk> wrote previously:
|Is there a python tool that can be made to delete these from my POP3
|mail box rather than let my client reject?
|I know about spam-bayes etc, but these things are over 120k each and it
|seems pretty pointless to download them (as well as taking about an
|hour).

I do exactly this myself. For my article (about a year ago now) on Spam
filtering, for IBM developerWorks, I developed my own little custom
tool. I've refined it over time, but it remains kinda hackerish and
un(der)documented. Still, I'd be happy to share with anyone
interested... especially if anyone wants to make something nice out of
it for distribution.

The idea of what I do is a hodgepodge. But the general idea is that I
use [poplib] to download ONLY the headers. Those messages that are
convincingly spam based on that get deleted without me ever needing to
download bodies.

As a first line of defense, I have a collection of blacklist and
whitelist patterns (I only use strings and globs, not regexen; though
the latter would be easy to add). These look at specific headers fields
in which patterns might occur (or at the whole header, if I wish).

But the next line of defense is the usual naive Bayesian style. The
wrinkle here is that I do not use "words" in the headers for analysis,
but rather trigrams (sequences of three characters). I believe that for
headers-only, this is more accurate, although I have not rigorously
tested this. Things like routing IPs and spam mail clients are hard to
pick out by whole words, but trigrams do some magic.

The other feature of my 'spamfilter' tool is that it knows nothing at
all about specific mail clients. It just sits daemon-like, and
periodically deletes stuff it doesn't like. I check mail from a lot of
different clients, on a lot of different machines; so for me it would be
inconvenient to have the filtering tied to one particular mail
client/machine. My thing just runs and kills, even when I'm out of
town, and checking for internet cafes.

Yours, David...

--
mertz@ | The specter of free information is haunting the `Net! All the
gnosis | powers of IP- and crypto-tyranny have entered into an unholy
..cx | alliance...ideas have nothing to lose but their chains. Unite
| against "intellectual property" and anti-privacy regimes!
-------------------------------------------------------------------------
Jul 18 '05 #7

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

4
by: Paul Schmidt | last post by:
Dear list: I am new to python, and I am trying to figure out the short answer on something. I want to open a POP3 mailbox, read the enclosed mail using the POP3 module, , and then process it...
4
by: crystal1 | last post by:
Not sure if this has been done... Has anyone created a python script that listens on the default POP3 port for incoming mail, kills certain messages based on a criteria, and forwards the output...
2
by: Mike Brearley | last post by:
I need to write a script that will check a catch-all mailbox (pop3) and send a non delivery report back to the sender of the email. Background info: I have a domain hosted on a site that offers...
1
by: bobano | last post by:
Hi everyone, I am writing a POP3 Client program in Perl. You connect to a POP3 Server and have a running conversation with the mail server using commands from the RFC 1939 Post Office Protocol....
4
by: bill | last post by:
I am in preliminary design of a contact management program. The user would like to use his current mail server (POP3 - remote). I have done some reading on the IMAP functions but am unclear if...
4
by: =?Utf-8?B?QWxwYW5h?= | last post by:
I am making a thin email client and want to get emails from a pop3 server...Is there any built in support in C# to get emails from a pop3 server and parse the email to show up on the UI ?
0
by: =?Utf-8?B?Q2hhcmxlcw==?= | last post by:
Like many people, I normally use Yahoo! Mail via the web and like to keep all my emails stored on the Yahoo! server. However sometimes I can’t get access to a PC/the web and I download my emails...
11
by: mp- | last post by:
I want to be able to allow people to check their email from my PHP online application. Given only the users 1) email address, 2) username (if applicable) and 3) password - how can I auto detect...
0
by: Daljeet Hanspal | last post by:
hi guys i got this code in c#.net to retrieve mails using pop3 protocal, i'm using MERCURY MAIL SERVER ........ I create a user acccount in mercury by name honey@localhost the problem is ...
0
by: Faith0G | last post by:
I am starting a new it consulting business and it's been a while since I setup a new website. Is wordpress still the best web based software for hosting a 5 page website? The webpages will be...
0
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 3 Apr 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome former...
0
by: ryjfgjl | last post by:
In our work, we often need to import Excel data into databases (such as MySQL, SQL Server, Oracle) for data analysis and processing. Usually, we use database tools like Navicat or the Excel import...
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
0
by: aa123db | last post by:
Variable and constants Use var or let for variables and const fror constants. Var foo ='bar'; Let foo ='bar';const baz ='bar'; Functions function $name$ ($parameters$) { } ...
0
by: ryjfgjl | last post by:
In our work, we often receive Excel tables with data in the same format. If we want to analyze these data, it can be difficult to analyze them because the data is spread across multiple Excel files...
1
by: nemocccc | last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.