Tom Lane heads up

P: n/a
Just dropping a quick note for Tom Lane. I sent a personal message
today, but I wasn't sure if you'd get it after I remembered all of the
spam filters you've got set up.

Sorry for the off topic post.

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faqs/FAQ.html

Nov 23 '05 #1
69 Replies

P: n/a
DeJuan Jackson wrote:
Just dropping a quick note for Tom Lane. I sent a personal message
today, but I wasn't sure if you'd get it after I remembered all of the
spam filters you've got set up.

Sorry for the off topic post.


That's ok. He is only filtering me :-)

Actually, you get a rejection notice if his spam filters catch you. If
you didn't get one, you're ok.

Shachar

--
Shachar Shemesh
Lingnu Open Systems Consulting
http://www.lingnu.com/

Nov 23 '05 #2

P: n/a
On Fri, Mar 05, 2004 at 01:19:13PM +0200, Shachar Shemesh wrote:
DeJuan Jackson wrote:
I sent a personal message today, but I wasn't sure if you'd
get it after I remembered all of the spam filters you've got
set up.


Actually, you get a rejection notice if his spam filters catch
you. If you didn't get one, you're ok.


is there some way of getting a look at tom's or marc's filters? i could
sure use a bit of help there. lordy, we're close to drowning in the
stuff!

if it's in the archives already, i apparently didn't hit the
right search string. a quickie pointer is all i need...

thanks in advance!

--
"Why did they hard code that value into the program?".
"My only guess would be to maximize suckage."
http://suso.suso.org/docs/apache_and.../part4-2.phtml

Nov 23 '05 #3

P: n/a
On Mon, 19 Apr 2004, Will Trillich wrote:
On Fri, Mar 05, 2004 at 01:19:13PM +0200, Shachar Shemesh wrote:
DeJuan Jackson wrote:
I sent a personal message today, but I wasn't sure if you'd
get it after I remembered all of the spam filters you've got
set up.


Actually, you get a rejection notice if his spam filters catch
you. If you didn't get one, you're ok.


is there some way of getting a look at tom's or marc's filters? i could
sure use a bit of help there. lordy, we're close to drowning in the
stuff!


Huh? I just use Spamassassin myself, with Razor/Pyzor/DCC and Bayes all
enabled ...

----
Marc G. Fournier Hub.Org Networking Services (http://www.hub.org)
Email: sc*****@hub.org Yahoo!: yscrappy ICQ: 7615664

Nov 23 '05 #4

P: n/a
Marc G. Fournier wrote:
Huh? I just use Spamassassin myself, with Razor/Pyzor/DCC and Bayes all
enabled ...


I use exactly the same setup. But recently I've noticed that the
spammers are getting smarter -- I think 20% of it is slipping by the
filters. I'm going to need something better.

Joe

Nov 23 '05 #5

P: n/a
Joe Conway wrote:
Marc G. Fournier wrote:
Huh? I just use Spamassassin myself, with Razor/Pyzor/DCC and Bayes all
enabled ...


I use exactly the same setup. But recently I've noticed that the
spammers are getting smarter -- I think 20% of it is slipping by the
filters. I'm going to need something better.


Here is what I use:

http://candle.pha.pa.us/main/writings/spam/

I get 98% blockage with no false positives, or at least only 1-2 a year
(that folks tell me about). :-)

--
Bruce Momjian | http://candle.pha.pa.us
pg***@candle.pha.pa.us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073

Nov 23 '05 #6

P: n/a
Tom Lane said:
Will Trillich <wi**@serensoft.com> writes:
is there some way of getting a look at tom's or marc's filters? i could
sure use a bit of help there. lordy, we're close to drowning in the
stuff!


Tell me about it :-(

I currently use four levels of filtering:

1. DNSBL lists: blackholes.five-ten-sg.com, bl.spamcop.net, relays.ordb.org
(there are others out there, but these seem to have a good impedance
match to my personal spam load).

2. Private blacklist of IP ranges that have sent me too much spam.
sendmail has a pretty easy mechanism to support this, although it
only seems to support /8 /16 or /24 ranges which is a bit coarse.
(If you've gotten a "Go away spammer" bounce from me, you were caught
by this filter --- let me know and I'll tighten the ranges.)

3. I have noticed that bouncing any machine that sends "HELO
sss.pgh.pa.us" gets rid of a ton of spam and viruses. I don't know of
any real clean way to do this, but I have a sendmail.cf hack for it.

4. Very long list of procmail filters on header and body patterns.

#2 and #4 are fairly personal, in the sense that they have a decent
success/failure ratio for the junk mail I get. I wouldn't recommend
that someone else try my lists, and in any case they take a heck of a
lot of hand maintenance. I've been looking into more automated methods
such as CRM114 but haven't made the jump yet.


Yes they sure are. I tried my personal blacklist on a client's server one
time after they complained of seeing dozens a minute slipping by. It did just
about nothing, but it got them started on their own. #3 looks interesting
though...

Best regards,

Jim Wilson

Nov 23 '05 #7

P: n/a
Will Trillich <wi**@serensoft.com> writes:
is there some way of getting a look at tom's or marc's filters? i could
sure use a bit of help there. lordy, we're close to drowning in the
stuff!


Tell me about it :-(

I currently use four levels of filtering:

1. DNSBL lists: blackholes.five-ten-sg.com, bl.spamcop.net, relays.ordb.org
(there are others out there, but these seem to have a good impedance
match to my personal spam load).

2. Private blacklist of IP ranges that have sent me too much spam.
sendmail has a pretty easy mechanism to support this, although it
only seems to support /8 /16 or /24 ranges which is a bit coarse.
(If you've gotten a "Go away spammer" bounce from me, you were caught
by this filter --- let me know and I'll tighten the ranges.)

3. I have noticed that bouncing any machine that sends "HELO
sss.pgh.pa.us" gets rid of a ton of spam and viruses. I don't know of
any real clean way to do this, but I have a sendmail.cf hack for it.

4. Very long list of procmail filters on header and body patterns.

#2 and #4 are fairly personal, in the sense that they have a decent
success/failure ratio for the junk mail I get. I wouldn't recommend
that someone else try my lists, and in any case they take a heck of a
lot of hand maintenance. I've been looking into more automated methods
such as CRM114 but haven't made the jump yet.

regards, tom lane
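
For anyone unfamiliar with how level 1 works: a DNSBL check is just a DNS lookup on the connecting IP address with its octets reversed and the list's zone appended. The following is a minimal sketch of that convention, not Tom's actual sendmail configuration; the function names are made up for illustration, the sample address comes from a documentation range, and a lookup only means something against a zone that is still in operation.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name a DNSBL client looks up for an IPv4 address.

    The octets are reversed and the list's zone is appended, so
    192.0.2.1 checked against bl.spamcop.net becomes
    1.2.0.192.bl.spamcop.net.
    """
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on bad input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip: str, zone: str) -> bool:
    """Return True if the zone answers for this address (needs network access).

    A listed address resolves to some A record; an unlisted one
    fails to resolve, which surfaces here as socket.gaierror.
    """
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        return False
```

A mail server would normally do this inside the MTA at connection time rather than in a script; the sketch only shows what is being asked of the DNS.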

Nov 23 '05 #8

P: n/a
Quoting Tom Lane <tg*@sss.pgh.pa.us>:
Will Trillich <wi**@serensoft.com> writes:
is there some way of getting a look at tom's or marc's filters? i could
sure use a bit of help there. lordy, we're close to drowning in the
stuff!


Tell me about it :-(

I currently use four levels of filtering:

1. DNSBL lists: blackholes.five-ten-sg.com, bl.spamcop.net, relays.ordb.org
(there are others out there, but these seem to have a good impedance
match to my personal spam load).

2. Private blacklist of IP ranges that have sent me too much spam.
sendmail has a pretty easy mechanism to support this, although it
only seems to support /8 /16 or /24 ranges which is a bit coarse.
(If you've gotten a "Go away spammer" bounce from me, you were caught
by this filter --- let me know and I'll tighten the ranges.)


There is a sendmail script for this called "cidrexpand" that lets you put
CIDR blocks, e.g. 216.185.96.0/19, into the sendmail
access file.
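
To show what such an expansion amounts to, here is a rough Python sketch. It is not the cidrexpand script itself, just the same idea under the assumption that the access file wants keys on octet boundaries: a block is split into the next-coarser /8, /16, /24 or /32 pieces, and each piece is written as its leading octets. The function name and the REJECT action in the usage note are illustrative choices.

```python
import ipaddress
from typing import List


def expand_cidr(block: str) -> List[str]:
    """Expand an IPv4 CIDR block into octet-boundary prefixes.

    The prefix length is rounded up to the next multiple of 8
    (minimum 8), the block is split into subnets of that size, and
    each subnet is reduced to its leading octets.
    """
    net = ipaddress.ip_network(block)
    if net.version != 4:
        raise ValueError("this sketch only handles IPv4")
    # Ceiling division to the next octet boundary: 19 -> 24, 24 -> 24.
    target = max(8, -(-net.prefixlen // 8) * 8)
    octets = target // 8
    return [
        ".".join(str(sub.network_address).split(".")[:octets])
        for sub in net.subnets(new_prefix=target)
    ]
```

With the block from the post, `expand_cidr("216.185.96.0/19")` gives 32 entries running from `216.185.96` to `216.185.127`, each of which could be written out as a line such as `216.185.96<TAB>REJECT`.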

<--stuff deleted-->
--
Keith C. Perry, MS E.E.
Director of Networks & Applications
VCSN, Inc.
http://vcsn.com

____________________________________
This email account is being hosted by:
VCSN, Inc : http://vcsn.com

Nov 23 '05 #9

P: n/a
When grilled further on (Mon, 19 Apr 2004 21:19:05 -0700),
Joe Conway <ma**@joeconway.com> confessed:
Marc G. Fournier wrote:
Huh? I just use Spamassassin myself, with Razor/Pyzor/DCC and Bayes all
enabled ...


I use exactly the same setup. But recently I've noticed that the
spammers are getting smarter -- I think 20% of it is slipping by the
filters. I'm going to need something better.


Have you played with the "spamassassin --report" feature? It works fairly well if
you can integrate it into your e-mail client and report a bunch of
messages as spam. It trains the Bayes filter and reports to Razor (at
the least).

Sylpheed Claws has actions (you use "spamassassin --report %F" as the action),
and it'll batch the report on all selected messages.

I find that after 10-20 messages, it starts finding the ones that were
slipping through. Since February, I have 200 missed out of 4200.

Cheers,
Rob

--
22:28:27 up 3 days, 2:06, 3 users, load average: 3.24, 3.08, 3.45
Linux 2.6.5-01 #5 SMP Tue Apr 6 21:32:39 MDT 2004

Nov 23 '05 #10

P: n/a
On Tue, Apr 20, 2004 at 01:06:18AM -0400, Tom Lane wrote:
Will Trillich <wi**@serensoft.com> writes:
is there some way of getting a look at tom's or marc's filters? i could
sure use a bit of help there. lordy, we're close to drowning in the
stuff!


Tell me about it :-(

I currently use four levels of filtering:

1. DNSBL lists: blackholes.five-ten-sg.com, bl.spamcop.net, relays.ordb.org
(there are others out there, but these seem to have a good impedance
match to my personal spam load).

2. Private blacklist of IP ranges that have sent me too much spam.
sendmail has a pretty easy mechanism to support this, although it
only seems to support /8 /16 or /24 ranges which is a bit coarse.
(If you've gotten a "Go away spammer" bounce from me, you were caught
by this filter --- let me know and I'll tighten the ranges.)

3. I have noticed that bouncing any machine that sends "HELO
sss.pgh.pa.us" gets rid of a ton of spam and viruses. I don't know of
any real clean way to do this, but I have a sendmail.cf hack for it.

4. Very long list of procmail filters on header and body patterns.


It must be pretty difficult to maintain these header and body patterns
and the other lists. I had the same problem and resolved it with
"spamassassin": it can learn, and it's simpler than procmailrc
coding. Now only about 5% of all spam reaches my INBOX.

Karel
--
Karel Zak <za***@zf.jcu.cz>
http://home.zf.jcu.cz/~zakkr/

Nov 23 '05 #13

P: n/a
Karel Zak wrote:
It must be pretty difficult to maintain these header and body patterns
and the other lists. I had the same problem and resolved it with
"spamassassin": it can learn, and it's simpler than procmailrc
coding. Now only about 5% of all spam reaches my INBOX.


It's not that difficult here, but I'm using Postfix, which has built-in
pattern checking. Because my mail server also hosts a bunch of topical
internet mailing lists (mainly motorcycle and bass player stuff) and all
of their admin addresses were harvested by spammers long ago, I don't
just get one copy of spam. I usually get several, because each of those
admin addresses eventually aliases back to me.

I don't use SpamAssassin or Razor, but I manage to kill 95% of spam at
the SMTP stage, before the message is accepted for delivery. This works
better than a delivery-stage mail processor like procmail because it
bounces the spam back to the server actually sending it. It's easy to
see from the maillogs which IPs are regularly sending me this crap, so
they can be blackholed permanently. I think I've got most of CHINANET
in the bit bucket now <g>.

Nov 23 '05 #16

P: n/a
On Mon, 19 Apr 2004, Joe Conway wrote:
Marc G. Fournier wrote:
Huh? I just use Spamassassin myself, with Razor/Pyzor/DCC and Bayes all
enabled ...


I use exactly the same setup. But recently I've noticed that the
spammers are getting smarter -- I think 20% of it is slipping by the
filters. I'm going to need something better.


Do you force-learn the spam that gets through the cracks? I get about 20
or 30 messages that slip through, which I process with
sa-learn nightly ...
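
A nightly run like that can be a very small script. The sketch below assumes the missed spam has been moved into one folder beforehand; the folder path in the usage note is hypothetical and the helper names are made up. `--spam` and `--mbox` are standard sa-learn options.

```python
import subprocess
from typing import List


def sa_learn_command(path: str, mbox: bool = False) -> List[str]:
    """Build the sa-learn command line for a folder of missed spam.

    Pass mbox=True when the path is a single mbox file rather than
    a directory of individual messages.
    """
    cmd = ["sa-learn", "--spam"]
    if mbox:
        cmd.append("--mbox")
    cmd.append(path)
    return cmd


def learn_missed_spam(path: str, mbox: bool = False) -> int:
    """Run sa-learn and return its exit status.

    Raises FileNotFoundError if sa-learn is not installed.
    """
    result = subprocess.run(sa_learn_command(path, mbox), check=False)
    return result.returncode
```

Called from cron as, say, `learn_missed_spam("/home/user/Maildir/.missed-spam/cur")`, this feeds everything in that folder to the Bayes database once a night.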

----
Marc G. Fournier Hub.Org Networking Services (http://www.hub.org)
Email: sc*****@hub.org Yahoo!: yscrappy ICQ: 7615664

Nov 23 '05 #18

P: n/a
On Tue, Apr 20, 2004 at 05:35:51AM -0000 I heard the voice of
Jim Wilson, and lo! it spake thus:
Tom Lane said:

3. I have noticed that bouncing any machine that sends "HELO
sss.pgh.pa.us" gets rid of a ton of spam and viruses. I don't know of
any real clean way to do this, but I have a sendmail.cf hack for it.


#3 looks interesting though...


I've been blocking HELO as anything under my domain, as well as my IP
address (as well as any bare IP addresses) for a while, and it
certainly drops a fair bit. And I maintain a long list of HELO names,
AND IP ranges, AND sending hostnames, AND sender domains, plus all
the filtering I do after accepting the mail... Wacky. If we just
renamed 'spam' to 'justifiable homicide'...
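
The three HELO rules above (a name under your own domain, your own IP address, or any bare IP address) are simple enough to write down. This is only a sketch of the decision logic in Python, not a sendmail.cf or milter implementation; the function name is invented, and it assumes mail from your own hosts is exempted before the check is applied, since those may legitimately use your domain.

```python
import ipaddress
from typing import Optional


def helo_rejection_reason(helo: str, my_domain: str, my_ip: str) -> Optional[str]:
    """Return why a HELO argument looks forged, or None if it passes."""
    name = helo.strip().rstrip(".").lower()
    domain = my_domain.rstrip(".").lower()

    # An address literal is written in brackets, e.g. [192.0.2.10].
    literal = name[1:-1] if name.startswith("[") and name.endswith("]") else None
    candidate = literal if literal is not None else name
    try:
        addr = ipaddress.ip_address(candidate)
    except ValueError:
        addr = None

    if addr is not None:
        if addr == ipaddress.ip_address(my_ip):
            return "HELO claims this server's own IP address"
        if literal is None:
            return "HELO is a bare IP address"
        return None  # a bracketed literal for some other address

    if name == domain or name.endswith("." + domain):
        return "HELO claims a name under this server's domain"
    return None
```

For example, with `my_domain="sss.pgh.pa.us"`, a client announcing `HELO sss.pgh.pa.us` is flagged, while one announcing an unrelated hostname passes through to the later filters.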
--
Matthew Fuller (MF4839) | fu******@over-yonder.net
Systems/Network Administrator | http://www.over-yonder.net/~fullermd/

"The only reason I'm burning my candle at both ends, is because I
haven't figured out how to light the middle yet"

Nov 23 '05 #22

P: n/a
Marc G. Fournier wrote:
do you force learn those spam that get through the cracks? I get about 20
or 30 messages that slip through the cracks, which I process through with
sa-learn nightly ...


No, I haven't been doing that, but I guess I ought to start. Thanks for
the suggestion!

Joe

Nov 23 '05 #25

P: n/a
On Tue, 20 Apr 2004, Joe Conway wrote:
Marc G. Fournier wrote:
do you force learn those spam that get through the cracks? I get about 20
or 30 messages that slip through the cracks, which I process through with
sa-learn nightly ...


No, I haven't been doing that, but I guess I ought to start. Thanks for
the suggestion!


Also check to make sure that you don't have autolearn disabled ... you
would have had to do it manually, as it is enabled by default, but, for
instance, if you are a user on a system, the site-wide may be set to
disable autolearn, so you'd have to enable it yourself ...

I'm looking forward to 3.x coming out, as the Bayes stuff will be able to
run out of an SQL database instead of flat files ... so servers running
Cyrus IMAPd, where there are no physical user accounts, will be able to
start making use of Bayes as well ...

----
Marc G. Fournier Hub.Org Networking Services (http://www.hub.org)
Email: sc*****@hub.org Yahoo!: yscrappy ICQ: 7615664

Nov 23 '05 #28

P: n/a
Quoting "Matthew D. Fuller" <fu******@over-yonder.net>:
On Tue, Apr 20, 2004 at 05:35:51AM -0000 I heard the voice of
Jim Wilson, and lo! it spake thus:
Tom Lane said:

3. I have noticed that bouncing any machine that sends "HELO
sss.pgh.pa.us" gets rid of a ton of spam and viruses. I don't know of
any real clean way to do this, but I have a sendmail.cf hack for it.


#3 looks interesting though...


I've been blocking HELO as anything under my domain, as well as my IP
address (as well as any bare IP addresses) for a while, and it
certainly drops a fair bit. And I maintain a long list of HELO names,
AND IP ranges, AND sending hostnames, AND senders domains, plus all
the filtering I do after accepting the mail... Wacky. If we just
renamed 'spam' to 'justifiable homicide'...
--
Matthew Fuller (MF4839) | fu******@over-yonder.net
Systems/Network Administrator | http://www.over-yonder.net/~fullermd/

"The only reason I'm burning my candle at both ends, is because I
haven't figured out how to light the middle yet"



We could only wish for "justifiable homicide". Now there's a law I would
support! :)

Are you guys miltering to drop the messages with those HELO patterns? I'm
nailing 80%+ across all my clients and I may get 20 to 50 spams/day (down from
200+) which is acceptable but I was going to start using some netfilter hooks
(i.e. Linux firewall code) to inspect mail traffic and apply some more patterns.
If you guys are getting 95%+ via miltering then that's definitely the way to go.

--
Keith C. Perry, MS E.E.
Director of Networks & Applications
VCSN, Inc.
http://vcsn.com

____________________________________
This email account is being host by:
VCSN, Inc : http://vcsn.com

---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org

Nov 23 '05 #30


P: n/a
On Tue, Apr 20, 2004 at 10:17:05AM -0300, Marc G. Fournier wrote:
On Mon, 19 Apr 2004, Joe Conway wrote:
Marc G. Fournier wrote:
Huh? I just use Spamassassin myself, with Razor/Pyzor/DCC and Bayes all
enabled ...


I use exactly the same setup. But recently I've noticed that the
spammers are getting smarter -- I think 20% of it is slipping by the
filters. I'm going to need something better.


do you force learn those spam that get through the cracks? I get about 20
or 30 messages that slip through the cracks, which I process through with
sa-learn nightly ...


i have been doing that some -- but i still get about 200 false
negatives per day. takes too much time to run 'sa-learn' all the
time when it seems like spam #n is an awful lot like spam #n-1.

--
"Why did they hard code that value into the program?".
"My only guess would be to maximize suckage."
http://suso.suso.org/docs/apache_and.../part4-2.phtml

---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to ma*******@postgresql.org

Nov 23 '05 #32


P: n/a
On Tue, 20 Apr 2004, Will Trillich wrote:
On Tue, Apr 20, 2004 at 10:17:05AM -0300, Marc G. Fournier wrote:
On Mon, 19 Apr 2004, Joe Conway wrote:
Marc G. Fournier wrote:
Huh? I just use Spamassassin myself, with Razor/Pyzor/DCC and Bayes all
enabled ...

I use exactly the same setup. But recently I've noticed that the
spammers are getting smarter -- I think 20% of it is slipping by the
filters. I'm going to need something better.


do you force learn those spam that get through the cracks? I get about 20
or 30 messages that slip through the cracks, which I process through with
sa-learn nightly ...


i have been doing that some -- but i still get about 200 false
negatives per day. takes too much time to run 'sa-learn' all the
time when it seems like spam #n is an awful lot like spam #n-1.


I'm down to ~20 false positives right now ... usually spend my last half
hour in front of the tv at night sorting them out and filtering them
through bayes ...

My spam filters right now are picking up between 2000->3000 messages per
day which aren't getting into my main folders ...
----
Marc G. Fournier Hub.Org Networking Services (http://www.hub.org)
Email: sc*****@hub.org Yahoo!: yscrappy ICQ: 7615664

---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to ma*******@postgresql.org

Nov 23 '05 #34


P: n/a

Tom Lane <tg*@sss.pgh.pa.us> wrote:
[snip]
3. I have noticed that bouncing any machine that sends "HELO
sss.pgh.pa.us" gets rid of a ton of spam and viruses.
IOW: Anything that HELOs with your mail server's own hostname. If you
can do it: Changing that to anything that HELOs with your domain name
(that's not supposed to) and you'll catch still more. Add to that
anything HELOing with your mail server's IP address and you'll catch
more yet.
I don't know of
any real clean way to do this, but I have a sendmail.cf hack for it.

[snip]

Postfix, which is what I use, has built-in support for HELO checks.
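
To spell out the three checks described above, here is a small Python sketch of the decision logic. It is only an illustration, not how Postfix or any sendmail.cf hack does it. The function name, the `allowed_hosts` exception list and the example.org / 192.0.2.10 values are all invented for the example.

```python
def should_reject_helo(helo, my_hostname, my_domain, my_ip, allowed_hosts=()):
    """Return True if a HELO name claims to be us when it should not."""
    # Normalise case and a trailing dot before comparing.
    name = helo.strip().lower().rstrip(".")
    hostname = my_hostname.lower()
    domain = my_domain.lower()
    allowed = {host.lower() for host in allowed_hosts}

    # Hypothetical exception list: our own machines that may use the domain.
    if name in allowed:
        return False
    # Check 1: HELO with the mail server's own hostname.
    if name == hostname:
        return True
    # Check 2: HELO with our domain, or anything under it.
    if name == domain or name.endswith("." + domain):
        return True
    # Check 3: HELO with the mail server's own IP, with or without brackets.
    if name.strip("[]") == my_ip:
        return True
    return False
```

For instance, should_reject_helo("mail.example.org", "mail.example.org", "example.org", "192.0.2.10") comes back True, while a HELO of "mx.other.net" with the same settings comes back False.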

--
Jim Seymour | Spammers sue anti-spammers:
js******@LinxNet.com | http://www.LinxNet.com/misc/spam/slapp.php
http://jimsun.LinxNet.com | Please donate to the SpamCon Legal Fund:
| http://www.spamcon.org/legalfund/

---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings

Nov 23 '05 #36


P: n/a
On Tue, Apr 20, 2004 at 01:30:59PM -0300, Marc G. Fournier wrote:
Also check to make sure that you don't have autolearn disabled ... you
would have had to do it manually, as it is enabled by default, but, for
instance, if you are a user on a system, the site-wide may be set to
disable autolearn, so you'd have to enable it yourself ...

I'm looking forward to 3.x coming out, as the Bayes stuff will be able to
run out of an SQL database instead of flat files ... so servers running
Cyrus IMAPd, where there are no physical user accounts, will be able to
start making use of Bayes as well ...


You should look into MailScanner, at www.mailscanner.info. I use it as
the framework for running SA and anti-virus software, using Exim as my
mail server. There are no physical user accounts; all virtual stuff.
MailScanner lets SA, along with the Bayesian filter, work for all email
coming through.

Michael
--
Michael Darrin Chaney
md******@michaelchaney.com
http://www.michaelchaney.com/

---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

Nov 23 '05 #38


P: n/a
On Mon, Apr 19, 2004 at 09:19:05PM -0700, Joe Conway wrote:
Marc G. Fournier wrote:
Huh? I just use Spamassassin myself, with Razor/Pyzor/DCC and Bayes all
enabled ...


I use exactly the same setup. But recently I've noticed that the
spammers are getting smarter -- I think 20% of it is slipping by the
filters. I'm going to need something better.


No offense, but that means you're not doing it right. I use SA with
Bayes (and everything else), and I'm getting better than 98% with no
false positives. Yesterday I had 823 spams (you read that correctly)
with 9 that made it through. When I woke up this morning, I had 334
spams with 2 that made it through.
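
The arithmetic behind "better than 98%" can be checked with a couple of lines of Python. This assumes the 823 and 334 figures are totals that already include the few that made it through. The helper name is made up for the example.

```python
def catch_rate(total_spam, slipped_through):
    """Percentage of spam that the filter caught."""
    if total_spam <= 0:
        raise ValueError("total_spam must be positive")
    return 100.0 * (total_spam - slipped_through) / total_spam


yesterday = catch_rate(823, 9)     # 814 of 823 caught
this_morning = catch_rate(334, 2)  # 332 of 334 caught
print(round(yesterday, 1), round(this_morning, 1))
```

Under that reading the two days come out at roughly 98.9% and 99.4%.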

I constantly train my Bayesian filter by using an email address I set
up where I forward all false-negatives. So the few that get through
won't be doing that again. It simply runs them through sa-learn. If I
get some time, I'll post the code to my web site.

Spammers cannot outsmart a Bayesian filter. It's game-over. You don't
need to upgrade, you need to figure out how to make your current setup
work.

Make sure you have the latest SA and make sure that Bayesian filtering
is turned on and working, and make sure to train the filter. Reply to
me offlist if you need a group of 5000 or so spams to help train it.

Michael
--
Michael Darrin Chaney
md******@michaelchaney.com
http://www.michaelchaney.com/

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faqs/FAQ.html

Nov 23 '05 #40


P: n/a
On Wed, 21 Apr 2004, Michael Chaney wrote:
On Tue, Apr 20, 2004 at 01:30:59PM -0300, Marc G. Fournier wrote:
Also check to make sure that you don't have autolearn disabled ... you
would have had to do it manually, as it is enabled by default, but, for
instance, if you are a user on a system, the site-wide may be set to
disable autolearn, so you'd have to enable it yourself ...

I'm looking forward to 3.x coming out, as the Bayes stuff will be able to
run out of an SQL database instead of flat files ... so servers running
Cyrus IMAPd, where there are no physical user accounts, will be able to
start making use of Bayes as well ...


You should look into MailScanner, at www.mailscanner.info. I use it as
the framework for running SA and anti-virus software, using Exim as my
mail server. There are no physical user accounts; all virtual stuff.
MailScanner lets SA, along with the Bayesian filter, work for all email
coming through.


Does it allow for per user preferences? I haven't found a clean way to do
that yet, other than using the spamcheck.py lmtpproxy ...

----
Marc G. Fournier Hub.Org Networking Services (http://www.hub.org)
Email: sc*****@hub.org Yahoo!: yscrappy ICQ: 7615664

---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org

Nov 23 '05 #42


P: n/a
Michael Chaney wrote:
Make sure you have the latest SA and make sure that Bayesian filtering
is turned on and working, and make sure to train the filter. Reply to
me offlist if you need a group of 5000 or so spams to help train it.


I've got the latest SA and I'm using Bayesian filtering, autolearn,
razor2, dcc, and pyzor. I'm also using relays.ordb.org,
sbl.spamhaus.org, bl.spamcop.net, and blackholes.five-ten-sg.com
(although I just added that last one yesterday). I've verified that
autolearn is working. I have my threshold set downward, from the default
of 5.0, to 2.5.

I get a comparable amount of spam (~600 to 1000 per day) and my setup
*was* about 98% effective until a month or so ago. These days it is more
like 80%. I've noticed much of the spam getting through appears
specifically targeted at getting by SA -- no HTML, a paragraph of
nonsense (or sometimes out of some public domain book), and a one liner
trying to sell me a mortgage or something.

The one thing I had *not* been doing, but started to do as of last
night, is to use the false-negatives to explicitly train the Bayesian
filter. It was easy enough to set up. I created an hourly cron job as
follows:

/usr/bin/sa-learn --mbox --spam /path/to/false-neg.mbox

Now I just drop all false negatives into that mailbox, and clean them
out periodically. Hopefully that will make a significant improvement.
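
If the cron job ever needs to grow beyond a one-liner, the same call can be wrapped in a few lines of Python. This is only a sketch. It reuses the `--mbox` and `--spam` switches from the command above plus the matching `--ham` switch, and the function names and the default `/usr/bin/sa-learn` location are assumptions for illustration.

```python
import subprocess


def build_sa_learn_command(mbox_path, is_spam=True, sa_learn="/usr/bin/sa-learn"):
    """Build the sa-learn argument list for one mbox file."""
    kind = "--spam" if is_spam else "--ham"
    return [sa_learn, "--mbox", kind, mbox_path]


def learn_from_mbox(mbox_path, is_spam=True):
    """Run sa-learn on the mbox; raises CalledProcessError if it exits non-zero."""
    command = build_sa_learn_command(mbox_path, is_spam)
    return subprocess.run(command, check=True)
```

Calling learn_from_mbox("/path/to/false-neg.mbox") runs the same command as the cron entry. Passing is_spam=False would train the file as ham instead.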

Joe

---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend

Nov 23 '05 #44


P: n/a
On Wed, Apr 21, 2004 at 02:11:16PM -0300, Marc G. Fournier wrote:
You should look into MailScanner, at www.mailscanner.info. I use it as
the framework for running SA and anti-virus software, using Exim as my
mail server. There are no physical user accounts; all virtual stuff.
MailScanner lets SA, along with the Bayesian filter, work for all email
coming through.


Does it allow for per user preferences? I haven't found a clean way to do
that yet, other than using the spamcheck.py lmtpproxy ...


Yes, MailScanner allows per-user and per-domain preferences.

Michael
--
Michael Darrin Chaney
md******@michaelchaney.com
http://www.michaelchaney.com/

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to ma*******@postgresql.org so that your
message can get through to the mailing list cleanly

Nov 23 '05 #46


P: n/a
Joe Conway wrote:
I get a comparable amount of spam (~600 to 1000 per day) and my setup
*was* about 98% effective until a month or so ago. These days it is more
like 80%. I've noticed much of the spam getting through appears
specifically targeted at getting by SA -- no HTML, a paragraph of
nonsense (or sometimes out of some public domain book), and a one liner
trying to sell me a mortgage or something.

The one thing I had *not* been doing, but started to do as of last
night, is to use the false-negatives to explicitly train the Bayesian
filter. It was easy enough to set up. I created an hourly cron job as
follows:

/usr/bin/sa-learn --mbox --spam /path/to/false-neg.mbox

Now I just drop all false negatives into that mailbox, and clean them
out periodically. Hopefully that will make a significant improvement.


I can tell you it certainly will.

--
Bruce Momjian | http://candle.pha.pa.us
pg***@candle.pha.pa.us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073

---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

Nov 23 '05 #48


P: n/a
Bruce Momjian wrote:
Joe Conway wrote:
The one thing I had *not* been doing, but started to do as of last
night, is to use the false-negatives to explicitly train the Bayesian
filter. It was easy enough to set up. I created an hourly cron job as
follows:

/usr/bin/sa-learn --mbox --spam /path/to/false-neg.mbox

Now I just drop all false negatives into that mailbox, and clean them
out periodically. Hopefully that will make a significant improvement.


I can tell you it certainly will.


Doesn't sa-learn also require you to teach it Ham as well? My
problem has been that sa-learn appears to ignore white-listed emails
and therefore can't learn from 90% of my Ham. Meanwhile, I get spam
that slips through SA that my Mozilla client *correctly* identifies
as Junk. Once a week, I take that Junk email, along with all Ham and
run sa-learn with the appropriate --spam/--ham switch. But it
doesn't seem to be improving. I still get spam which SA fails to
identify but which, 95% of the time, Mozilla correctly identifies.

Mike Mascari

---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to ma*******@postgresql.org

Nov 23 '05 #50
