Spambots

I keep meaning to write short posts in the technical section of this blog at regular intervals, covering interesting things I’ve come across while running an email service. I always seem to end up distracted by something else, so I’ve decided to try posting a Friday summary and see whether forcing myself to do it once a week helps.

This week I’ve been dealing a bit with spambots. Currently we have 6 primary incoming MX servers and 1 secondary MX server. The primary MX record for all our domains (in1.smtp.messagingengine.com) resolves to the 6 IP addresses of these primary servers. Each server runs postfix as the incoming MTA, with a number of patches and external helper programs that determine which delivery addresses are valid and also handle greylisting, address enumeration detection, etc. On top of this, a separate process monitors the log file that postfix generates, looking for patterns of behavior that give away spambots. These patterns might not be obvious on any one connection, but over time they clearly differ from what legitimate email-sending hosts do.
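To make the idea of spotting behavior "over time" concrete, here is a minimal sketch of a log watcher. It is not our actual monitoring script: the pattern it looks for (many rejected recipients from one IP within an hour), the threshold values and the RejectMonitor name are all assumptions chosen for illustration. Only the general shape of the postfix reject log line is taken from real postfix output.

```python
import re
from collections import defaultdict, deque

# Matches the client IP in postfix smtpd reject lines such as
# "NOQUEUE: reject: RCPT from unknown[192.0.2.7]: 550 ..."
REJECT_RE = re.compile(r"reject: RCPT from \S*\[([0-9a-fA-F.:]+)\]")

WINDOW_SECONDS = 3600  # assumed observation window
MAX_REJECTS = 5        # assumed threshold, purely illustrative


class RejectMonitor:
    """Tracks rejected recipients per IP over a sliding time window."""

    def __init__(self, window=WINDOW_SECONDS, max_rejects=MAX_REJECTS):
        self.window = window
        self.max_rejects = max_rejects
        self.events = defaultdict(deque)  # ip -> timestamps of recent rejects

    def feed(self, timestamp, line):
        """Process one log line; return the IP if it is over the threshold."""
        match = REJECT_RE.search(line)
        if match is None:
            return None
        ip = match.group(1)
        times = self.events[ip]
        times.append(timestamp)
        # Forget rejects that have fallen out of the window.
        while times and times[0] <= timestamp - self.window:
            times.popleft()
        return ip if len(times) > self.max_rejects else None
```

A single rejected recipient tells you nothing, but the sixth one inside an hour makes feed() return the IP, which a caller could then pass on to a block list.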

During the week I noticed a new pattern of behavior that seemed to be specific to spambots, so I’ve now got the monitoring script detecting it and adding the detected machines to our “early” block list for 72 hours at a time. The early block list is an internal list of IP addresses; if a host on that list connects to us, we immediately return a 554 reject code and disconnect it. Over the last 2 days the early list has built up to over 600,000 separate IP addresses (up from ~200,000 with the previous pattern detectors), and we’ve noticed a significant reduction in the amount of storm bot spam getting through. (You might have noticed the storm bot spam previously as the “greeting card” spam, the “new online office/club” spam, or the recent “check this video of you on youtube” spam.)
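The early block list behaves roughly like a table of IP addresses with expiry times. The sketch below shows that idea under a few assumptions: the EarlyBlockList class, the wording of the 554 reply and the mx.example.com greeting are made up for illustration, and our real list is wired into postfix rather than being a Python object.

```python
import time

BLOCK_SECONDS = 72 * 3600  # entries expire after 72 hours
# Only the 554 code comes from the post; the reply text is an assumption.
REJECT_REPLY = "554 5.7.1 Service unavailable; client host blocked"


class EarlyBlockList:
    """In-memory block list whose entries expire after a fixed time."""

    def __init__(self, ttl=BLOCK_SECONDS, clock=time.time):
        self.ttl = ttl
        self.clock = clock  # injectable so expiry can be tested
        self.expires = {}   # ip -> expiry timestamp

    def add(self, ip):
        """Block `ip` (or renew its block) for `ttl` seconds from now."""
        self.expires[ip] = self.clock() + self.ttl

    def is_blocked(self, ip):
        expiry = self.expires.get(ip)
        if expiry is None:
            return False
        if self.clock() >= expiry:
            del self.expires[ip]  # lazily drop expired entries
            return False
        return True

    def greeting(self, ip):
        """Return the first SMTP reply to send when `ip` connects."""
        if self.is_blocked(ip):
            return REJECT_REPLY  # caller disconnects right after sending this
        return "220 mx.example.com ESMTP"  # hypothetical normal greeting
```

Because the reject happens at connect time, a blocked host never gets as far as sending a message, which is what keeps the check cheap.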

Note that this special early block list only helps with email delivered directly to our servers. Unfortunately, it doesn’t help if you use a forwarding service, because the IP address we see is that of the forwarding service, not the connecting spambot. If you use an @fastmail.fm address or any of our other domains, this protection is automatic. If you have your own domain, we highly recommend you host the email for your domain with us for the best spam protection possible. For more information, see these FAQ entries:

http://www.fastmail.fm/docs/faqparts/VirtualDomains.htm#VirtualVsForward

http://www.fastmail.fm/docs/faqparts/VirtualDomains.htm#VirtualSetup

Posted in Technical. Comments Off

Improved spam filtering with per-user bayes databases

FastMail now has individually trainable bayes databases for each user to improve spam filtering. This is currently only available for Full and Enhanced accounts with Normal, Aggressive or Custom filtering selected on the Options -> Spam/Virus Protection screen.

For personal bayes databases to be effective, you have to train them with at least 200 spam messages and 200 non-spam messages. You can train your personal bayes database by selecting some messages and using the Report spam and Report non-spam options from the Actions menu on the Mailbox screen.

If you haven’t yet trained at least 200 spam and 200 non-spam messages, we check your incoming messages against a global, continuously updated bayes database instead. This helps detect spam/non-spam, but isn’t as good as a personally trained one. To see whether email being delivered to you is currently using the global or personal bayes database, look at the headers of a message (on the view message screen, click the Show full headers link). One of the headers present should be called X-Spam-hits. Within that header will be the text BAYES_USED, and immediately after it either global or user. If it currently says global, you still need to train more spam/non-spam messages before the personal database can be used.
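If you would rather check this from a script than by eye, something like the following sketch should work. It assumes the word global or user is separated from BAYES_USED by whitespace (or a colon or equals sign), which is a guess at the exact layout. The bayes_database_used function name is made up for this example.

```python
import re
from email import message_from_string

# Assumed layout: "BAYES_USED" followed by a separator, then "global" or "user".
BAYES_USED_RE = re.compile(r"BAYES_USED[\s:=]+(global|user)\b")


def bayes_database_used(raw_message):
    """Return 'global', 'user', or None if the marker is not present."""
    msg = message_from_string(raw_message)
    header = msg.get("X-Spam-hits")  # header name lookup is case-insensitive
    if header is None:
        return None
    match = BAYES_USED_RE.search(header)
    return match.group(1) if match else None
```

Pass it the full source of a message, headers included. A result of None means the header or the BAYES_USED marker wasn’t found at all.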

For more information on the per-user bayes database system, please see this forum thread: http://www.emaildiscussions.com/showthread.php?t=49547

For IMAP users who don’t use the web interface much, there’s also an experimental feature in beta testing that lets you specify folders as containing spam/non-spam. Messages placed in those folders will automatically be learnt as appropriate. For more information on this feature, please see this forum thread: http://www.emaildiscussions.com/showthread.php?t=49936

Posted in News. Comments Off