To demonstrate just how much of a problem it really is, I have created a system that counts spam as it is identified by my spam filtering program (SpamBayes), places the count into a database (RRDtool) and periodically updates charts showing the rate of incoming spam over a period of time (see bottom of page for more detail).
People have been asking me for the source code for this page. The truth is that not much source code is needed to get this going. There are probably a myriad of other ways to accomplish the same thing; this is just one of them.
I use procmail and SpamBayes to filter my mail. SpamBayes comes with a handy script called hammie that you use in your procmail rules. Hammie inserts a header (X-Spambayes-Classification) into every e-mail with its determination of the probability of the e-mail being spam.
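To make the header check concrete, here is a minimal sketch in Python of the same test the procmail rule further down performs. The function name `is_spam` and the sample messages are made up for illustration, and the exact text that follows the word "spam" in the header may differ between SpamBayes versions, so the sketch only looks at how the value starts.

```python
from email import message_from_string

# Header name taken from the procmail rules on this page.
SPAM_HEADER = "X-Spambayes-Classification"


def is_spam(raw_message):
    """Return True if the header value starts with 'spam'.

    Mirrors the procmail condition '^X-Spambayes-Classification: spam'.
    Hypothetical helper, not part of SpamBayes itself.
    """
    msg = message_from_string(raw_message)
    value = msg.get(SPAM_HEADER, "")
    return value.strip().lower().startswith("spam")


# Made-up sample message; the '; 1.00' suffix is an assumption.
sample = "X-Spambayes-Classification: spam; 1.00\nSubject: hello\n\nbody\n"
print(is_spam(sample))
```

A message with no such header at all is treated as not spam, which matches what procmail does when the condition fails to match.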
Before anything else, you will need to create the RRD file. (For more info on RRDtool, use Google.) I don't remember how I created mine, but it's something like:
rrdtool create spam.rrd --step 3600 \
    DS:count:ABSOLUTE:864000:0:100000 \
    RRA:AVERAGE:.5:1:87600 \
    RRA:MIN:.5:288:3650 \
    RRA:MAX:.5:288:3650
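To make sense of the numbers in that command, the sketch below does the arithmetic in Python. It assumes the usual RRDtool meaning of the fields: an RRA of the form CF:xff:steps:rows keeps `rows` entries, each covering `steps` primary steps, and an ABSOLUTE data source stores the count divided by the elapsed seconds, that is, a per-second rate. The helper names are made up for illustration.

```python
SECONDS_PER_DAY = 86400


def retention_days(step_seconds, steps_per_row, rows):
    """Days of history kept by one RRA (step * steps * rows)."""
    return step_seconds * steps_per_row * rows / SECONDS_PER_DAY


def absolute_rate(count, elapsed_seconds):
    """Per-second rate stored for an ABSOLUTE data source."""
    return count / elapsed_seconds


STEP = 3600  # --step 3600: one primary data point per hour

# RRA:AVERAGE:.5:1:87600 -> hourly points
print(retention_days(STEP, 1, 87600))    # 3650.0 days, about ten years

# RRA:MIN/MAX:.5:288:3650 -> each row spans 288 hours (12 days)
print(retention_days(STEP, 288, 1))      # 12.0 days per row
print(retention_days(STEP, 288, 3650))   # 43800.0 days in total

# Five spams counted in one hour, scaled the way the graph CDEFs do it
rate = absolute_rate(5, STEP)
print(round(rate * 3600, 6), round(rate * 86400, 6))
```

The last line is why the graph script multiplies by 3600 or 86400: the stored value is spams per second, and the multiplication turns it into spams per hour or per day.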
Here is the relevant part of my .procmailrc file. This calls the spamcount.py script whenever spam is encountered:
# Call spambayes hammie
:0fw
| /usr/local/bin/hammiefilter.py

# SPAM?
:0 c: /home/grisha/.procmail/spamcount.lck
* ^X-Spambayes-Classification: spam
| /home/grisha/.procmail/spamcount.py

:0:
* ^X-Spambayes-Classification: spam
$HOME/mail/spam

spamcount.py is a Python script that updates the RRD file. It uses the RRDtool interface for Python:
#!/usr/local/bin/python
import time
import sys
import RRDtool

try:
    rrd = RRDtool.RRDtool()
    # record one spam at the current time
    rrd.update(("/home/grisha/.procmail/spam.rrd", "%d:1" % int(time.time())))
finally:
    # consume all input
    sys.stdin.read()

Finally, you need a script to generate the graphs, which you'd call from cron at a regular interval. Mine looks like this:
#!/usr/local/bin/python
import time
import sys
import RRDtool

rrd = RRDtool.RRDtool()

# Everything since a fixed start timestamp, in spams per day
rrd.graph(("spam.gif", "-s", "1010000000",
           '--title=Spam Graph. Last Updated: %s' % time.ctime(),
           "DEF:count=/home/grisha/.procmail/spam.rrd:count:AVERAGE",
           "CDEF:hr=count,86400,*",
           'LINE2:hr#ff0000:Spams/Day'))

# Last 7 days (604800 seconds), in spams per hour
rrd.graph(("spamweek.gif", "-s", "-604800",
           '--title=Spam Graph. Last Updated: %s' % time.ctime(),
           "DEF:count=/home/grisha/.procmail/spam.rrd:count:AVERAGE",
           "CDEF:hr=count,3600,*",
           'LINE2:hr#ff0000:Spams/Hr'))

# Last 24 hours (86400 seconds), in spams per hour
rrd.graph(("spamday.gif", "-s", "-86400",
           '--title=Spam Graph. Last Updated: %s' % time.ctime(),
           "DEF:count=/home/grisha/.procmail/spam.rrd:count:AVERAGE",
           "CDEF:hr=count,3600,*",
           'LINE2:hr#ff0000:Spams/Hr'))

You also need a script to copy or upload the graphs to your website. This I leave as an exercise for the reader.
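For the simplest case, where the web server's document root lives on the same machine, that last step might look something like the sketch below. This is not the script I use; the function name `publish_graphs` and any directory you pass to it are hypothetical, and only the three file names come from the graph script above. Uploading to a remote host would need something else entirely (scp, rsync, FTP).

```python
import shutil
from pathlib import Path

# File names produced by the graph script on this page.
GRAPHS = ("spam.gif", "spamweek.gif", "spamday.gif")


def publish_graphs(src_dir, dest_dir, names=GRAPHS):
    """Copy whichever graph files exist from src_dir into dest_dir.

    Returns the list of file names that were actually copied.
    Hypothetical helper; both directories are supplied by the caller.
    """
    src = Path(src_dir)
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for name in names:
        source = src / name
        if source.is_file():
            shutil.copy2(source, dest / name)
            copied.append(name)
    return copied
```

You would call it from the same cron job right after the graphs are generated, for example `publish_graphs(".", "/path/to/your/webroot")`, with the second argument replaced by wherever your site actually serves files from.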