Published by Kristian Holmes. Modified over 9 years ago.
1
Fighting Spam: Enterprise Spam Filtering Using Open Source Tools
2
Introduction
- Newsflash: spam is a problem
- SRJC: 60-80% of mail received is spam!
- Commercial solutions exist, but are expensive
- Open source tools are a powerful alternative
3
Tonight’s Agenda
- SpamAssassin Overview
- Additional Spam Rules (S.A.R.E.)
- Integrating with Multiple Mail Servers
- Bayesian Filtering
4
SpamAssassin – How It Works
Uses the combined score from multiple types of checks to determine whether a given message is spam:
- Header tests
- Body phrase tests
- Bayesian filtering
- Automatic address whitelist/blacklist
- Manual address whitelist/blacklist
- Collaborative spam identification databases (DCC, Pyzor, Razor2)
- DNS blocklists ("RBLs")
- Character sets and locales
Even though any one of these tests might, by itself, misidentify a ham or spam message, their combined score is very difficult to fool.
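The combined-score idea can be sketched in a few lines of shell. This is only an illustration, not SpamAssassin's own code: the rule names and scores in the usage example are made up, and the function name combined_verdict is hypothetical. The 5.0 default mirrors SpamAssassin's default required_score.

```sh
#!/bin/sh
# Sum per-rule scores and compare the total with a threshold.
# Input on stdin: one "RULE_NAME score" pair per line.
# The default threshold of 5.0 matches SpamAssassin's default required_score.
combined_verdict() {
    threshold="${1:-5.0}"
    awk -v t="$threshold" '
        NF >= 2 { total += $2 }      # negative scores (e.g. whitelist hits) pull the total down
        END {
            verdict = (total >= t + 0) ? "spam" : "ham"
            printf "%.1f/%.1f %s\n", total, t, verdict
        }'
}

# Usage (rule names and scores are illustrative only):
#   printf '%s\n' 'RULE_A 1.2' 'RULE_B 2.5' 'RULE_C 3.5' 'RULE_D -1.0' | combined_verdict
```

No single rule in the usage example reaches the threshold on its own; only the sum does, which is why one misfiring test rarely changes the verdict.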
5
SpamAssassin - Advantages
- Wide spectrum of different tests
- Open source and free!
- Flexible: works with many platforms and servers
- Easy configuration
6
SpamAssassin Rules Emporium
- http://rulesemporium.com/
- Popular repository for third-party SpamAssassin rules
- “Actively” updated between SpamAssassin releases
7
SARE Usage Guidelines
- Just download rules into the SpamAssassin configuration directory (e.g., /etc/spamassassin)
- Restart the daemon if necessary
- Most popular rules have “levels” (e.g., 0 = conservative, 3 = aggressive)
- Choose the rules you use carefully!
8
Rules Du Jour
- http://www.exit0.us/index.php?pagename=RulesDuJour
- Automates downloading, installing, and updating the most popular SARE rules
9
Rules Du Jour
- Install the script in a directory on $PATH (e.g., /usr/local/sbin) and make it executable
- Create a blank configuration file at /etc/rulesdujour/config
- Add a TRUSTED_RULESETS line to your config file that contains the names of the rulesets you chose, e.g.:
  TRUSTED_RULESETS="SARE_ADULT SARE_OBFU0 SARE_OBFU1 SARE_URI0 SARE_URI1"
- Configure any local settings, for example:
  SA_DIR="/etc/mail/spamassassin"
  MAIL_ADDRESS="administrator@example.com"
  SA_RESTART="killall -HUP spamd"
- Run the script periodically (manually or via crontab)
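A minimal sketch of the configuration step, assuming the settings shown on the slide. The helper name write_rdj_config is hypothetical, and the crontab line in the comments assumes the script was installed as /usr/local/sbin/rules_du_jour; substitute whatever name and schedule you actually use.

```sh
#!/bin/sh
# Write a Rules Du Jour config file containing the settings from the slide.
# The target path is an argument so the sketch can be tried on a scratch file first.
write_rdj_config() {
    config_path="$1"
    mkdir -p "$(dirname "$config_path")" || return 1
    cat > "$config_path" <<'EOF'
TRUSTED_RULESETS="SARE_ADULT SARE_OBFU0 SARE_OBFU1 SARE_URI0 SARE_URI1"
SA_DIR="/etc/mail/spamassassin"
MAIL_ADDRESS="administrator@example.com"
SA_RESTART="killall -HUP spamd"
EOF
}

# Usage (as root):
#   write_rdj_config /etc/rulesdujour/config
#
# Assumed crontab entry (nightly at 02:30; script name and path are assumptions):
#   30 2 * * * /usr/local/sbin/rules_du_jour
```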
10
SpamAssassin Serving Multiple Servers
Problem:
- How do you keep multiple mail servers synchronized?
- Spam checking adds load to the mail server
11
SpamAssassin Serving Multiple Servers
Solution:
- Use a single machine to manage spam sitewide!
- Logs and configuration are unified on a single machine
12
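SA/multi-server – set up server
- Server must be running SpamAssassin as a daemon (spamd -d)
- Server must accept outside connections (e.g., spamd -A 127.0.0.1,192.168.1.10,192.168.1.11). -A lists the client addresses allowed to connect; spamd listens only on 127.0.0.1 by default, so also pass -i with the address to listen on
- Make sure the server can listen on port 783 (spamd’s default port)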
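The server-side flags can be put together as in the sketch below. It only prints the command so it can be reviewed before being run as root. The function name spamd_command and the listen address 192.168.1.5 are assumptions for illustration; the client addresses are the ones from the slide.

```sh
#!/bin/sh
# Build a spamd command line from a listen address and a list of client IPs.
#   -d  run as a daemon
#   -i  address to listen on (spamd defaults to 127.0.0.1 only)
#   -p  port (783 is spamd's default)
#   -A  comma-separated list of client addresses allowed to connect
spamd_command() {
    listen_ip="$1"; shift
    allowed="127.0.0.1"              # always allow local clients
    for ip in "$@"; do
        allowed="$allowed,$ip"
    done
    printf 'spamd -d -i %s -p 783 -A %s\n' "$listen_ip" "$allowed"
}

# Usage (192.168.1.5 is a hypothetical address of the spamd host):
#   spamd_command 192.168.1.5 192.168.1.10 192.168.1.11
```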
13
SA/multi-server – set up client
- Use the “spamc” command instead of “spamassassin”
- Use the -d switch to name the remote server: spamc -d 192.168.1.10, and so forth
- Test: spamc -d my.server.net < /path/to/sample/email
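For a quick check from a client, spamc's -c switch can be used: it prints "score/threshold" and exits with status 1 when the message is spam. The wrapper below is a sketch under that assumption; the function name check_message is hypothetical, and the server name is the placeholder from the slide.

```sh
#!/bin/sh
# Ask a remote spamd whether a message file is spam.
# spamc -c prints "score/threshold"; exit status 1 means spam, 0 means not spam
# (spamc also returns 0 with "0/0" if it could not process the message).
check_message() {
    server="$1"
    message="$2"
    result=$(spamc -c -d "$server" < "$message") && status=0 || status=$?
    case "$status" in
        0) echo "not-spam $result" ;;
        1) echo "spam $result" ;;
        *) echo "error: spamc exited with status $status" >&2 ;;
    esac
    return "$status"
}

# Usage:
#   check_message my.server.net /path/to/sample/email
```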
14
Bayesian Filtering - Introduction
- “Bayesian Filtering uses statistics from previously-classified messages to estimate the likelihood that a particular message is spam.”*
- “This likelihood estimate is converted to a (possibly negative) weight which is added to the ad hoc spamminess score.”*
* GORDON V. CORMACK and THOMAS R. LYNAM, University of Waterloo
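To make "estimate the likelihood" concrete, here is a toy calculation that combines per-token spam probabilities with the textbook naive Bayes rule, assuming equal priors. It is a simplified stand-in for illustration, not SpamAssassin's actual combining method; the function name combine_token_probs and the input values are hypothetical.

```sh
#!/bin/sh
# Combine per-token spam probabilities into a single estimate:
#   P = prod(p) / (prod(p) + prod(1 - p))
# Each argument is the fraction of previously seen messages containing that
# token that were spam. Values should lie strictly between 0 and 1.
combine_token_probs() {
    printf '%s\n' "$@" | awk '
        BEGIN { spam = 1; ham = 1 }
        NF    { spam *= $1; ham *= (1 - $1) }
        END   { printf "%.3f\n", spam / (spam + ham) }'
}

# Usage (illustrative token probabilities):
#   combine_token_probs 0.9 0.8
```

Two moderately spammy tokens reinforce each other, while tokens at 0.5 carry no information and leave the estimate unchanged.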
15
Bayes – Getting Started
- Enable Bayes in the config:
  use_bayes 1
- Set aside space for the Bayes DB (either file-based or SQL):
  bayes_path /var/local/spamassassin/bayes
  or
  bayes_store_module Mail::SpamAssassin::BayesStore::SQL
16
Bayes – Getting Started
- Feed Bayes “ham” and “spam”
- You MUST feed it samples of good and bad messages to start!
- Use at least 200 samples of each, and as many more as possible:
  sa-learn --spam --dir /path/to/directory/full/of/spam/msgs
  sa-learn --ham --dir /path/to/directory/full/of/ham/msgs
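The 200-sample minimum is easy to check before training. The sketch below assumes one message per file in each directory; the function names count_messages and train_bayes are hypothetical, and the sa-learn invocations are the ones from the slide.

```sh
#!/bin/sh
# Count message files under a directory (assumes one message per file).
count_messages() {
    echo $(( $(find "$1" -type f | wc -l) ))
}

# Train Bayes from a ham directory and a spam directory, refusing to start
# if either holds fewer than the minimum number of samples (default 200).
train_bayes() {
    ham_dir="$1"; spam_dir="$2"; minimum="${3:-200}"
    for dir in "$ham_dir" "$spam_dir"; do
        n=$(count_messages "$dir")
        if [ "$n" -lt "$minimum" ]; then
            echo "only $n messages in $dir, need at least $minimum" >&2
            return 1
        fi
    done
    sa-learn --ham --dir "$ham_dir" && sa-learn --spam --dir "$spam_dir"
}

# Usage:
#   train_bayes /path/to/directory/full/of/ham/msgs /path/to/directory/full/of/spam/msgs
```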
17
Bayes – Enhancing
- Enable automated learning:
  bayes_auto_learn 1
  bayes_auto_learn_threshold_nonspam 0.1
  bayes_auto_learn_threshold_spam 6.0
- “Teach” Bayes: create mailboxes for “ham” and “spam” and scan them periodically
- Note: “resend” email, don’t forward it! Forwarding replaces the original headers, so Bayes would learn from the wrong ones
- You can’t overtrain the Bayes database!
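How the two thresholds split messages into "learn as ham", "learn as spam", and "leave alone" can be sketched as follows. This is a simplification: SpamAssassin applies further conditions before it auto-learns, which are not modelled here. The function name autolearn_decision is hypothetical; the default thresholds are the values from the slide.

```sh
#!/bin/sh
# Simplified auto-learn decision based only on the message score.
# Defaults are the slide's values:
#   bayes_auto_learn_threshold_nonspam 0.1
#   bayes_auto_learn_threshold_spam    6.0
autolearn_decision() {
    score="$1"
    nonspam_threshold="${2:-0.1}"
    spam_threshold="${3:-6.0}"
    awk -v s="$score" -v lo="$nonspam_threshold" -v hi="$spam_threshold" 'BEGIN {
        if (s + 0 < lo + 0)       print "learn-as-ham"
        else if (s + 0 >= hi + 0) print "learn-as-spam"
        else                      print "skip"      # too ambiguous to learn from
    }'
}

# Usage:
#   autolearn_decision 7.5
#   autolearn_decision -2.3
```

The gap between the two thresholds keeps borderline messages out of the database, so only confidently classified mail is learned automatically.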
18
Bayes – Enhancing
- Give more “weight” to Bayesian results:
  score BAYES_00 -4
  score BAYES_05 -2
  score BAYES_95 6
  score BAYES_99 9
19
Conclusion
- World-class spam prevention is possible with freely available tools!
- SRJC stats:
  - Processes 30,000 – 60,000 messages per day with one dual-processor server
  - Most messages scanned in < 10 seconds (< 1 second without network tests)
  - < 0.007% false positives/negatives