[Full-disclosure] throwing the book at spam



http://www.cyberdelix.net/tech/kaboom.htm


This page is to help you kill spammers, err, I mean spam. Here are the
blueprints to my silver bullet, which (when combined with my other
filters) kills 99.75% of my spam. The remaining 0.25% corresponds to
about 1 message per day (and it is the target of further work).

where this filter fits

This is the Filter of Last Resort, because it's very aggressive.
You'll see why below. For this reason, it should be used last, after
all the other filters. That way, it only ever deals with the dregs,
which makes it less dangerous.

As this is the Filter of Last Resort, it does NOT include filtering
for all kinds of spam. Rather, it is designed to kill the spams that
the other filters miss.

It is dangerous. Most filters mark the spam and let it be. This
filter kills it. No Deleted Items, no Recycle Bin, no undo, no 'are
you sure'. Bang bang, dead dead. All that is left is a logfile entry.

automatic whitelist maintenance

To minimise the chances of a legit mail being terminated, this filter
includes a "make whitelist" command. This command tells the filter to
collect all email addresses from inside a number of other files (my
address books), eliminate the duplicates and save the list to disk.
This list is then used by the "move whitelisted" command, which moves
any message containing whitelisted strings to a separate folder (the
"whitebox").
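
As a rough illustration of the "make whitelist" idea, here is a minimal
Python sketch. It is not the filter's actual code: the function name,
the address pattern and the one-address-per-line output format are all
assumptions made for the example.

```python
import re
from pathlib import Path

# Deliberately loose address pattern (an assumption, not the filter's rule).
ADDRESS_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def make_whitelist(addressbook_paths, whitelist_path):
    """Collect addresses from text address books, dedupe them, save to disk."""
    addresses = set()
    for path in addressbook_paths:
        # Address books are read as plain text; undecodable bytes are replaced.
        text = Path(path).read_text(encoding="utf-8", errors="replace")
        # Lower-casing makes the duplicate check case-insensitive.
        addresses.update(a.lower() for a in ADDRESS_RE.findall(text))
    ordered = sorted(addresses)
    Path(whitelist_path).write_text("\n".join(ordered) + "\n", encoding="utf-8")
    return ordered
```

The saved list could then be loaded by a "move whitelisted" step and
matched as plain substrings against each message.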

This filter also supports three other whitelists: good_senders,
good_recipients and good_subjects. Any mail containing
a whitelisted string in the correct location is automatically
"whiteboxed" (moved to the whitebox).
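
The "correct location" rule might look something like the Python sketch
below. The mapping of good_senders to From:, good_recipients to To: and
good_subjects to Subject: is an assumption, as is the case-insensitive
substring match; is_whiteboxed is a hypothetical name.

```python
from email import message_from_string


def is_whiteboxed(raw_message, good_senders, good_recipients, good_subjects):
    """Return True if a whitelisted string appears in its matching header."""
    msg = message_from_string(raw_message)
    checks = (
        ("From", good_senders),
        ("To", good_recipients),
        ("Subject", good_subjects),
    )
    for header, needles in checks:
        value = str(msg.get(header, "")).lower()
        # Empty whitelist entries are skipped; they would match everything.
        if any(n and n.lower() in value for n in needles):
            return True
    return False
```

A string only counts when it turns up in its own header, so an address
listed in good_senders does not whitebox a mail that merely has that
address in To:.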

how it works

This filter is actually built of 11 special-purpose filters. Any mail
matching one of these tests is deleted. The filters are as follows:

missing_addressee (missing 'To:' or 'for' field)
missing_sender (missing 'From:' field)
unlikely_chars (non-alphabetic subject or sender)
unlikely_dates (message date too old, or in future)
bounces (mail delivery failure, etc)
blacklisted (bad_senders/bad_recipients/bad_subjects)
gifs_attached (message has an attached GIF image)
X-RBL (message contains X-RBL-Warning: headerline)
X-DNS (message contains X-DNS-Warning: headerline)
X-SVF (message contains X-Sender-Verification-Failed: headerline)
analyse_received (Received: line invalid - see below)
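
Several of the simpler tests in this list can be sketched in Python as
follows. This is an illustration only, not the filter's code: the age
and future thresholds are invented, the 'for' check is a guess at what
the missing_addressee test looks for in Received: lines, and
failed_tests is a hypothetical name. The unlikely_chars, bounces,
blacklisted and analyse_received tests are left out.

```python
import re
from datetime import datetime, timedelta, timezone
from email import message_from_string
from email.utils import parsedate_to_datetime

# Test name -> header whose mere presence fails the test.
WARNING_HEADERS = {
    "X-RBL": "X-RBL-Warning",
    "X-DNS": "X-DNS-Warning",
    "X-SVF": "X-Sender-Verification-Failed",
}
MAX_AGE = timedelta(days=30)    # assumed threshold
MAX_FUTURE = timedelta(days=1)  # assumed threshold


def failed_tests(raw_message, now=None):
    """Return the names of the header tests that a message fails."""
    msg = message_from_string(raw_message)
    now = now or datetime.now(timezone.utc)
    failed = []

    if not msg.get("From"):
        failed.append("missing_sender")

    # Assumption: a "for <address>" clause in any Received: line counts.
    received = " ".join(str(r) for r in msg.get_all("Received", []))
    if not msg.get("To") and not re.search(r"\bfor\s+<?\S+@", received):
        failed.append("missing_addressee")

    try:
        sent = parsedate_to_datetime(str(msg.get("Date", "")))
        if sent.tzinfo is None:
            sent = sent.replace(tzinfo=timezone.utc)
        if sent < now - MAX_AGE or sent > now + MAX_FUTURE:
            failed.append("unlikely_dates")
    except (TypeError, ValueError):
        # A missing or unparseable Date: is treated as unlikely too.
        failed.append("unlikely_dates")

    for test_name, header in WARNING_HEADERS.items():
        if header in msg:
            failed.append(test_name)

    if any(part.get_content_type() == "image/gif" for part in msg.walk()):
        failed.append("gifs_attached")

    return failed
```

In a filter like the one described, a non-empty result would mean the
message is deleted and the failed test names go to the logfile.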

These tests are fairly self-explanatory, with the exception of the
analyse_received test. This test analyses the significant Received:
headerline inside each mail (there are usually several Received:
lines, but only one is relevant for our purpose). Any mail with an
invalid Received: line is deleted. A line counts as invalid if it
trips any of the following checks:

IP_missing
IP_obfuscation
IP_unreversible
by-line_not_present
sending_SMTP_server_unresolvable
sending_hostname_not_provided

If these tests all pass, the message is then tested for a mismatch
between the sender's hostname and the hostname of the sender recorded
by the receiver. Again, a fail results in the message being deleted.
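
A minimal Python sketch of this kind of Received: analysis is given
below. It assumes the common "from HELO (RDNS [IP]) by RECEIVER" layout,
which not every mail server uses, and it omits the IP_obfuscation check.
The order of the checks, the "hostname_mismatch" label, the exact-match
comparison and the function names are all assumptions made for
illustration.

```python
import re
import socket


def _resolves(hostname):
    """Default resolver check: True if the hostname has an address."""
    try:
        socket.gethostbyname(hostname)
        return True
    except (OSError, UnicodeError):
        return False


def analyse_received(line, resolves=None):
    """Return the name of the first failed check, or None if the line passes."""
    if resolves is None:
        resolves = _resolves

    # Dotted-quad address in square brackets, as recorded by the receiver.
    if not re.search(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]", line):
        return "IP_missing"
    if not re.search(r"\bby\s+\S+", line):
        return "by-line_not_present"

    # Hostname the sender announced for itself.
    helo = re.search(r"\bfrom\s+([^\s(\[]+)", line)
    if not helo:
        return "sending_hostname_not_provided"

    # Reverse-DNS name the receiver found for the connecting IP.
    rdns = re.search(r"\(([^\s\[\)]+)\s*\[", line)
    if not rdns or rdns.group(1).lower() == "unknown":
        return "IP_unreversible"

    if not resolves(helo.group(1)):
        return "sending_SMTP_server_unresolvable"

    # Final test: announced hostname versus the one the receiver recorded.
    if helo.group(1).lower() != rdns.group(1).lower():
        return "hostname_mismatch"
    return None
```

The resolves parameter lets the DNS lookup be swapped out, for example
with `lambda host: True` when trying the function on saved headers
offline.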

why it works

Spammers try to get their messages through by hiding, disguising or
armouring their spams. This filter spends most of its time looking
for evidence of armour. It assumes that an attempt at armouring means
the mail is spam.

Note that this approach is very unforgiving toward badly configured
but legitimate systems, and toward systems using non-standard data
formats. That is another reason to run this filter last. Mistakes can
be minimised by keeping the whitelists up-to-date, and by encouraging
everyone to run RFC-compliant nodes.

And no, I'm not worried about posting my blueprints. Spammers are
welcome to use less obfuscation - this will send them straight into
the jaws of standard spam filters, but life's a bitch, eh. Or they can
use more obfuscation: be my guest, 'cos I need some extra handles to
kill that last 0.25%.

caveats

These notes are posted ahead of any software release, so as to
maximise the damage they can cause.

This is developmental software. It works, but only in the development
environment. In particular, it supports Pegasus Mail ONLY.

Address books must be TEXT files, or they will not be processed by the
whitelister.

The whitebox must currently be processed manually.

greetz

Thanks have got to go to the twits out there sending me 1000+ spams a
day. Without your contribution, I would never have had the sample
size I needed.

---
Stuart Udall
stuart at@xxxxxxxxxxxxxx net - http://www.cyberdelix.net/

---
* Origin: lsi: revolution through evolution (192:168/0.2)

_______________________________________________
Full-Disclosure - We believe in it.
Charter: http://lists.grok.org.uk/full-disclosure-charter.html
Hosted and sponsored by Secunia - http://secunia.com/


