An Anti-Spam Method with SMTP Session Abort
Nariyoshi YAMAI (1), Kiyohiko OKAYAMA (1), Takumi SEIKE (1), Keita KAWANO (1), Motonori NAKAMURA (2), Shin MARUYAMA (3)
(1) Okayama University, Japan
(2) National Institute of Informatics, Japan
(3) CO-CONV Corporation, Japan

2008/3/27 MIT Spam Conference
Contents
– Existing anti-spam methods
– Anti-spam method with SMTP session abort
– Implementation and evaluation of prototype system
– Conclusions

Existing anti-spam methods
Tempfailing (1)
Exploits the difference in MTA behavior after a temporary error
– Legitimate MTAs
    Retry sending the temporarily failed messages
– Spam-sending MTAs
    Prioritize throughput
    Give up resending the temporarily failed messages

Tempfailing (2)
[Figure: first delivery and second delivery. On the first delivery, the receiving MTA answers both the spam-sending MTA and the legitimate MTA with a temporary error and saves the triplet (Sender IP, SMTP From, SMTP To). Only the legitimate MTA retries; on the second delivery the same triplet (Sender IP, SMTP From, SMTP To) is presented and the message goes to the recipients.]

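The triplet check in the figure can be sketched as follows. This is a minimal illustration, not the code of any particular MTA: the class name `Tempfailer`, the reply codes 451/250, and the example addresses are assumptions made for the sketch, and a real implementation would also expire old triplets.

```python
from typing import Set, Tuple

Triplet = Tuple[str, str, str]  # (sender IP, SMTP From, SMTP To)


class Tempfailer:
    """Toy tempfailing check keyed on the classic triplet (hypothetical class)."""

    def __init__(self) -> None:
        self.seen: Set[Triplet] = set()

    def check(self, sender_ip: str, mail_from: str, rcpt_to: str) -> int:
        """Return an SMTP reply code: 451 on first sight, 250 on a retry."""
        triplet = (sender_ip, mail_from, rcpt_to)
        if triplet in self.seen:
            return 250  # same triplet seen before: treat as a legitimate retry
        self.seen.add(triplet)
        return 451  # unknown triplet: answer with a temporary error


mta = Tempfailer()
first = mta.check("192.0.2.10", "alice@example.org", "bob@example.jp")
retry = mta.check("192.0.2.10", "alice@example.org", "bob@example.jp")
other_ip = mta.check("192.0.2.11", "alice@example.org", "bob@example.jp")
print(first, retry, other_ip)  # 451 250 451
```

The third call shows the weakness of keying on the sender IP: the same message retried from another address is treated as a first attempt.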
Tempfailing (3)
Problems
– RFC 2821, Sending Strategy (excerpt)
    "The sender MUST delay retrying a particular destination after one attempt has failed. In general, the retry interval SHOULD be at least 30 minutes."
    Causes a large delay in legitimate mail delivery

Tempfailing (4)
Problems (cont.)
– Uses the following triplet to judge whether a message is a retransmission:
    Sender IP, SMTP From, SMTP To
    A retry from a different MTA (a different sender IP) is not recognized and is rejected again

Tempfailing (5)
Problems (cont.)
– Rejects before receiving the header/body
– Logs only the triplet (Sender IP, SMTP From, SMTP To)
    Difficult to recover from false positives

Distributed collaborative filter
[Figure: a spam-sending MTA sends to the MTA, which checks each message against a shared spam database. Labels: check; found (spam); not found (delivered to the recipients); register.]
Only messages that existing recipients have already read can be filtered out

Anti-spam method with SMTP session abort
Summary of known problems
– (Tempfailing) Large delay
– (Tempfailing) Retries from a different MTA
– (Tempfailing) Recovery from false positives
– (Distributed collaborative filter) Only messages read by recipients are registered in the DB

Features of the proposed method
Known problems
– (Tempfailing) Large delay
– (Tempfailing) Retries from a different MTA
– (Tempfailing) Recovery from false positives
– (Distributed collaborative filter) Only messages read by recipients are registered in the DB
Features
– Introducing two mail gateways (MGs)
– Immediate fallback to the secondary MG
– SMTP session abort function
– Preserving header/body on first attempt
– Retransmission judgment with Message-ID or checksum instead of IP
– Automatic registration of unresent/undeliverable messages
– Early registration of many spam mails

System layout and behavior (1)
[Figure: a sender MTA outside the organization; inside the organization, a primary mail gateway, a secondary mail gateway, a spam database, an inside MTA, and the recipients. The primary MG aborts the SMTP session with a TCP segment (RST) while preserving the header/body; the sender MTA retries to the secondary MG, which checks the triplet (MsgID/checksum, SMTP From, SMTP To).]
– SMTP session abort: after the SMTP session to the primary MG is aborted, a legitimate MTA usually sends the message to the secondary MG immediately, which reduces the delay of legitimate mail delivery
– Retransmission judgment is based on the header (MsgID) or the body (checksum)
– The header/body is preserved in case of a false positive

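The two-gateway flow can be sketched as below. This is an illustration under stated assumptions, not the prototype's code: `Message`, `GatewayPair`, and `judgment_key` are hypothetical names, SHA-256 stands in for the unspecified checksum, and the "defer" answer for a message the primary MG has never seen is an assumption (the slides do not say what the secondary MG does in that case).

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

Key = Tuple[str, str, str]  # (Message-ID or body checksum, SMTP From, SMTP To)


@dataclass
class Message:
    sender_ip: str
    mail_from: str
    rcpt_to: str
    body: str
    message_id: Optional[str] = None


def judgment_key(msg: Message) -> Key:
    """Key on Message-ID when present, otherwise on a checksum of the body."""
    ident = msg.message_id or hashlib.sha256(msg.body.encode("utf-8")).hexdigest()
    return (ident, msg.mail_from, msg.rcpt_to)


class GatewayPair:
    """Primary and secondary MG sharing one store of preserved messages."""

    def __init__(self) -> None:
        self.preserved: Dict[Key, Message] = {}

    def primary(self, msg: Message) -> str:
        # Preserve header/body, then abort the session (an RST in the real system).
        self.preserved[judgment_key(msg)] = msg
        return "abort"

    def secondary(self, msg: Message) -> str:
        # Accept a message whose key was already recorded by the primary MG.
        if judgment_key(msg) in self.preserved:
            return "accept"
        return "defer"  # assumption: behavior for unseen messages is not specified


mgs = GatewayPair()
first = Message("192.0.2.10", "a@example.org", "b@example.jp", "hello", "<1@example.org>")
retry = Message("192.0.2.99", "a@example.org", "b@example.jp", "hello", "<1@example.org>")
results = [mgs.primary(first), mgs.secondary(retry)]
stranger = mgs.secondary(Message("198.51.100.7", "x@example.net", "b@example.jp", "buy now"))
print(results, stranger)  # ['abort', 'accept'] defer
```

Because the key does not contain the sender IP, the retry from 192.0.2.99 is still recognized as a retransmission.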
System layout and behavior (2-1)
[Figure: a spam-sending MTA; inside the organization, a primary mail gateway, a secondary mail gateway, a spam database, an inside MTA, and the recipients. Labels: undeliverable RCPT TO; recipient check; unknown recipient; SMTP session abort; register header/body.]

System layout and behavior (2-2)
[Figure: a sender MTA; inside the organization, a primary mail gateway, a secondary mail gateway, a spam database, an inside MTA, and the recipients. Labels, first attempt: formerly deliverable RCPT TO; recipient check; unknown recipient; SMTP session abort; register header/body. Labels, retry: RCPT TO; recipient check; header/body; cancel.]
Automatic registration of unresent/undeliverable messages

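One possible reading of the two figures is sketched below: undeliverable messages are registered at once, deliverable ones are registered only if they are never resent, and a resend cancels the registration. The class `Registrar`, the in-memory `spam_db` set, SHA-256 as the checksum, and the one-hour `RESEND_TIMEOUT` are all assumptions for illustration; the slides give neither a timeout nor the registration interface.

```python
import hashlib
from typing import Dict, Set

RESEND_TIMEOUT = 3600.0  # seconds; hypothetical value, not given in the slides


def checksum(body: str) -> str:
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


class Registrar:
    """Tracks messages aborted at the primary MG (hypothetical helper)."""

    def __init__(self, valid_recipients: Set[str]) -> None:
        self.valid_recipients = valid_recipients
        self.pending: Dict[str, float] = {}  # body checksum -> time of abort
        self.spam_db: Set[str] = set()       # stand-in for the spam database

    def on_primary(self, rcpt_to: str, body: str, now: float) -> None:
        digest = checksum(body)
        if rcpt_to not in self.valid_recipients:
            self.spam_db.add(digest)         # undeliverable: register at once
        else:
            self.pending[digest] = now       # wait to see whether it is resent

    def on_resend(self, body: str) -> None:
        digest = checksum(body)
        self.pending.pop(digest, None)       # resent: no longer a candidate
        self.spam_db.discard(digest)         # cancel an earlier registration

    def sweep(self, now: float) -> None:
        for digest, seen_at in list(self.pending.items()):
            if now - seen_at >= RESEND_TIMEOUT:
                self.spam_db.add(digest)     # never resent: register as spam
                del self.pending[digest]


reg = Registrar({"bob@example.jp"})
reg.on_primary("nobody@example.jp", "spam A", now=0.0)
reg.on_primary("bob@example.jp", "spam B", now=0.0)
reg.on_primary("bob@example.jp", "ham C", now=0.0)
reg.on_resend("ham C")
reg.sweep(now=RESEND_TIMEOUT)
registered = sorted(b for b in ("spam A", "spam B", "ham C") if checksum(b) in reg.spam_db)
print(registered)  # ['spam A', 'spam B']
```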
User preference of abort timing (1)
Affects network traffic and delay
Possible options
– Accept
    No session abort
– Header
    Abort after end of header
    Low traffic/delay
– Body
    Abort after end of message
    Easy recovery from false positives

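The three options map naturally onto a small decision function. The sketch below is illustrative only: the enum `AbortTiming`, the function `should_abort`, and the stage names `end_of_header` and `end_of_message` are hypothetical, chosen to mirror the wording of the options above.

```python
from enum import Enum


class AbortTiming(Enum):
    ACCEPT = "accept"  # no session abort
    HEADER = "header"  # abort after end of header
    BODY = "body"      # abort after end of message


def should_abort(preference: AbortTiming, stage: str) -> bool:
    """Decide whether to abort at `stage` ('end_of_header' or 'end_of_message')."""
    if preference is AbortTiming.HEADER:
        return stage == "end_of_header"
    if preference is AbortTiming.BODY:
        return stage == "end_of_message"
    return False  # ACCEPT: never abort


for pref in AbortTiming:
    print(pref.name, should_abort(pref, "end_of_header"), should_abort(pref, "end_of_message"))
```

With Header, the body is never transferred on the first attempt, which is where the traffic saving comes from; with Body, the whole message is available for recovery.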
User preference of abort timing (2)
[Figure: a sender MTA sends a message with RCPT TO: A, RCPT TO: B, and RCPT TO: C to the primary mail gateway; the SMTP session is aborted at the end of the message and the header/body is preserved; the message is sent again with RCPT TO: A, B, and C. Other labels: secondary mail gateway; spam database; inside MTA; recipient A; accept (B, C).]

Implementation and evaluation of prototype system
Prototype system implementation
Platform
– FreeBSD with sendmail & DCC
SMTP session abort function
– An external program using “ipfw”
Retransmission judgment
– (Message-ID, SMTP From, SMTP To)

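The prototype aborts sessions with an external program built on ipfw, whose details are not given here. As a different, self-contained way to see what an abort looks like from the sender's side, the sketch below resets a TCP connection from within the process by setting `SO_LINGER` with a zero timeout before closing. This is a swapped-in technique for illustration, not the prototype's mechanism; `abort_connection` and `demo` are hypothetical names, and the two-int `struct` layout is the one used on Linux and the BSDs.

```python
import socket
import struct


def abort_connection(conn: socket.socket) -> None:
    """Close a TCP connection with an RST instead of the normal FIN exchange."""
    # l_onoff=1, l_linger=0: discard unsent data and send a reset on close.
    conn.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0))
    conn.close()


def demo() -> bool:
    """Return True if the client side observes a connection reset."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # loopback, ephemeral port
    server.listen(1)
    client = socket.create_connection(server.getsockname())
    client.settimeout(5)
    conn, _ = server.accept()
    abort_connection(conn)
    try:
        client.recv(1024)
        return False  # orderly close: no reset was seen
    except ConnectionResetError:
        return True
    except OSError:
        return False  # timeout or another socket error
    finally:
        client.close()
        server.close()


if __name__ == "__main__":
    print("client saw reset:", demo())
```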
First operation test (1)
Objectives
– Performance evaluation of blocking/filtering
Test domains
– Some sub-domains in okayama-u.ac.jp
– Obsolete for five years at the time of the test
– To be removed in one month
– Some legitimate mails were possibly sent to these domains
Test period
– Seven days, from Jan. 29 to Feb. 5, 2006

First operation test (2)
Result
Number of mails processed              54,719
Number of mails blocked                44,303
Number of mails received               10,416
Number of mails filtered out by DCC     2,180
– 81% (44,303/54,719) of the mails processed were blocked by SMTP session abort
– 21% (2,180/10,416) of the mails received were filtered out by DCC
NB: the counts include both legitimate mails and spam mails.

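The shares follow directly from the counts in the table; the short check below recomputes them. Variable names are chosen for illustration.

```python
processed = 54_719
blocked = 44_303
received = processed - blocked  # mails that passed the SMTP session abort stage
filtered_by_dcc = 2_180

blocked_share = blocked / processed
dcc_share = filtered_by_dcc / received
print(received, f"{blocked_share:.1%}", f"{dcc_share:.1%}")  # 10416 81.0% 20.9%
```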
Second operation test (1)
Objectives
– Comparison with conventional tempfailing in the processing of legitimate mails
Test domain
– New sub-domain dedicated to this test
– Only one IP address available
    The two MGs have the same IP address
    Common in small companies in Japan

Second operation test (2)
Result
Domain (service)                  MTA       Resend  Different MTA  Min. interval (sec)
cc.okayama-u.ac.jp (Univ.)        sendmail  YES     NO             0
nifty.com (ISP)                   sendmail  YES     NO             1
listbox.com (ML)                  postfix   YES     NO             1
yahoo.com (free mail)             ?         YES     NO             10
gmail.com (free mail)             ?         YES                    385
aol.com (free mail)               ?         YES     NO             6
hotmail.com (free mail)           SMTPSVC   YES     NO             6
yahoogroups.jp (free ML)          ?         YES     NO             1
freeml.com (free ML)              qmail     YES     NO             399
mag2.com (mail magazine)          qmail     YES     NO             3264
trashmail.net (anonymous mail)    postfix   YES     NO             6
– All messages, even those from gmail.com, were accepted without a whitelist
– Mail delivery from many domains was delayed only slightly
– Some domains using qmail still had large delays

Possible false positives
Messages without Message-ID
– Use the Date: field (mandatory), or
– Use the checksum of the body
MTAs without retransmission
– Lost headers/bodies can be recovered easily
– Find such MTAs and register them in a whitelist
MTAs changing the SMTP From address
– Use (Message-ID, SMTP To), without SMTP From, for retransmission judgment

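The fallbacks for the retransmission key can be sketched as one function. This is an illustration under assumptions: the function name `retransmission_key` and the `use_mail_from` switch are hypothetical, header names are assumed to be already normalized, SHA-256 stands in for the unspecified checksum, and chaining Date and then the body checksum is one possible ordering (the slide offers them as alternatives).

```python
import hashlib
from typing import Mapping, Tuple


def retransmission_key(headers: Mapping[str, str], body: str,
                       mail_from: str, rcpt_to: str,
                       use_mail_from: bool = True) -> Tuple[str, ...]:
    """Build the key used to recognize a retry of the same message."""
    # Prefer Message-ID; fall back to Date, then to a checksum of the body.
    ident = headers.get("Message-ID") or headers.get("Date")
    if not ident:
        ident = hashlib.sha256(body.encode("utf-8")).hexdigest()
    if use_mail_from:
        return (ident, mail_from, rcpt_to)
    return (ident, rcpt_to)  # for MTAs that change SMTP From between attempts


headers = {"Message-ID": "<1@example.org>"}
k1 = retransmission_key(headers, "hi", "bounce-1@example.org", "bob@example.jp", use_mail_from=False)
k2 = retransmission_key(headers, "hi", "bounce-2@example.org", "bob@example.jp", use_mail_from=False)
print(k1 == k2)  # True
```

Dropping SMTP From makes the key less specific, so it trades some precision for tolerance of senders that rewrite the envelope address on every attempt.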
Conclusions
Conclusions
Combination of three functions
– Tempfailing
– Distributed collaborative filter
– SMTP session abort
Reduces the drawbacks of the two existing methods
Future work
– Long-term evaluation of actual performance
– Combination with on-the-fly filters

Questions?
Please speak slowly and clearly