Published by Meagan Williams. Modified over 9 years ago.
1
A Technical Approach to Minimizing Spam Mallory J. Paine
2
The Spam Epidemic By June 2003, unsolicited commercial email, or spam, accounted for nearly 55% of total email traffic on the Internet. Spam is on the rise! On 2/1/05, the New York Times reported that spam now accounts for “80% or more” of all email traffic. Why is spam a problem? Can’t users just ignore it?
3
Current Anti-Spam Technologies: Protection of Email Address; Keyword- and Rule-Based Filters; Verification Filters; Bayesian Analytical Filters.
4
Keyword- and Rule-Based Filters A filter of this kind consists of a series of rules that use Boolean logic and the textual contents of a message to determine its legitimacy. Good because it’s simple to implement. Bad because the analytical process is ‘dumb.’ Example: Suppose your filter contains a rule that blocks messages containing both the word ‘viagra’ and a URL. Messages containing a URL and a slight variation of ‘viagra’, like ‘vi@gra’ or ‘v i a g r a’, pass through the filter as legitimate messages.
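The weakness described on this slide can be sketched in a few lines of Python. This is only an illustration: the function name `is_blocked`, the URL pattern, and the sample messages are assumptions made for the example, not part of the original presentation.

```python
import re

# Assumed, deliberately simple URL pattern for the illustration.
URL_PATTERN = re.compile(r"https?://\S+", re.IGNORECASE)


def is_blocked(message: str) -> bool:
    """Hypothetical rule: block a message that contains BOTH
    the literal word 'viagra' AND a URL."""
    text = message.lower()
    has_keyword = "viagra" in text
    has_url = URL_PATTERN.search(text) is not None
    return has_keyword and has_url


if __name__ == "__main__":
    samples = [
        "Buy viagra at http://example.com",        # matches the rule
        "Buy vi@gra at http://example.com",        # obfuscated keyword
        "Buy v i a g r a at http://example.com",   # spaced-out keyword
    ]
    for sample in samples:
        print(is_blocked(sample), "|", sample)
```

Only the first sample matches the rule. The two obfuscated variants slip through because the rule compares literal text and has no notion of what the message means.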
5
Verification Filters Message classification is based on the sender of the email. If message is from a trusted sender, it’s marked as legitimate. If message is from a known spammer, message is immediately classified as spam. If message is from an unknown sender, then message is placed in quarantine until successful completion of a verification process. Failure to complete the verification process results in sender labeled as a spammer, message classified as spam.
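The classification flow on this slide might look roughly like the following Python sketch. The class and method names are hypothetical, and one detail is an assumption not stated on the slide: a sender who passes verification is added to the trusted list.

```python
from enum import Enum


class Verdict(Enum):
    LEGITIMATE = "legitimate"
    SPAM = "spam"
    QUARANTINED = "quarantined"


class VerificationFilter:
    """Hypothetical sender-based filter; names are illustrative only."""

    def __init__(self, trusted=(), spammers=()):
        self.trusted = set(trusted)
        self.spammers = set(spammers)
        self.quarantine = {}  # sender -> list of held messages

    def classify(self, sender, message):
        # The text of the message is never inspected, only the sender.
        if sender in self.trusted:
            return Verdict.LEGITIMATE
        if sender in self.spammers:
            return Verdict.SPAM
        # Unknown sender: hold the message until verification finishes.
        self.quarantine.setdefault(sender, []).append(message)
        return Verdict.QUARANTINED

    def resolve(self, sender, passed):
        """Release or reject all held messages once verification ends."""
        held = self.quarantine.pop(sender, [])
        if passed:
            # Assumption: a verified sender becomes trusted.
            self.trusted.add(sender)
            verdict = Verdict.LEGITIMATE
        else:
            # From the slide: failure labels the sender a spammer.
            self.spammers.add(sender)
            verdict = Verdict.SPAM
        return [(message, verdict) for message in held]


if __name__ == "__main__":
    f = VerificationFilter(trusted={"friend@example.com"})
    print(f.classify("friend@example.com", "lunch?"))
    print(f.classify("stranger@example.org", "hello"))
    print(f.resolve("stranger@example.org", passed=False))
    print(f.classify("stranger@example.org", "hello again"))
```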
6
Verification Filters: The Verification Process Many different implementations of the Verification Process. Ideally, verification process includes a task that is difficult or impossible for a computer to complete without human assistance. Two examples: Simple image verification, more complex image verification: the CAPTCHA.
7
Verification Filters: Advantages and Disadvantages Good because nearly 100% effective at catching spam. Bad because: Friends have to “jump through a hoop” to send email to someone who uses a verification filter. Verification filter ignores text contents of a message. Senders of messages that are obviously spam are still asked to complete verification. This means lots of unnecessary verification emails and wasted bandwidth. Spam sent with a forged sender address that matches a trusted sender is incorrectly classified as legitimate.
8
Bayesian Filters: The Best Real-World Anti-Spam Technique Calculates the probability that a given message is spam by examining the words (or phrases) contained in the message. The filter notes how many times each word in the message has occurred in the user’s legitimate messages versus how many times that same word has occurred in junk emails, and from those counts computes the probability that the entire message is junk.
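A minimal sketch of this idea in Python follows. The slide does not give a formula, so the specifics here are assumptions: a naive Bayes combination of per-word counts, add-one (Laplace) smoothing, whitespace tokenization, and a neutral 0.5 answer while either word database is still empty. The class and method names are illustrative.

```python
import math
from collections import Counter


class BayesianFilter:
    """Illustrative naive Bayes spam filter; not a production design."""

    def __init__(self):
        self.spam_words = Counter()  # word counts from junk email
        self.ham_words = Counter()   # word counts from legitimate email
        self.spam_msgs = 0
        self.ham_msgs = 0

    @staticmethod
    def tokenize(text):
        return text.lower().split()

    def train(self, text, is_spam):
        words = self.tokenize(text)
        if is_spam:
            self.spam_words.update(words)
            self.spam_msgs += 1
        else:
            self.ham_words.update(words)
            self.ham_msgs += 1

    def spam_probability(self, text):
        # With an empty database the filter has nothing to go on.
        if self.spam_msgs == 0 or self.ham_msgs == 0:
            return 0.5
        total_msgs = self.spam_msgs + self.ham_msgs
        vocab = len(set(self.spam_words) | set(self.ham_words))
        spam_total = sum(self.spam_words.values())
        ham_total = sum(self.ham_words.values())
        # Work in log space to avoid underflow on long messages.
        log_spam = math.log(self.spam_msgs / total_msgs)
        log_ham = math.log(self.ham_msgs / total_msgs)
        for word in self.tokenize(text):
            log_spam += math.log((self.spam_words[word] + 1) / (spam_total + vocab))
            log_ham += math.log((self.ham_words[word] + 1) / (ham_total + vocab))
        diff = log_ham - log_spam
        if diff > 700:   # guard against overflow in math.exp
            return 0.0
        if diff < -700:
            return 1.0
        return 1.0 / (1.0 + math.exp(diff))


if __name__ == "__main__":
    f = BayesianFilter()
    f.train("cheap pills buy now", is_spam=True)
    f.train("meeting agenda for monday", is_spam=False)
    print(f.spam_probability("buy cheap pills"))
    print(f.spam_probability("monday meeting"))
```

With only two training messages the scores are crude, which is the training problem in miniature: the estimates become useful only as the two word databases grow.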
9
Bayesian Filters: Advantages and Disadvantages Good because a good implementation will classify messages with >99% accuracy and close to zero false positives. Bad because the filter must be trained. The filter maintains databases of words from legit emails and of words from junk emails. Initially, these databases are empty. Until the databases contain a substantial amount of information about a user’s email, the filter will be unable to classify messages with any accuracy whatsoever.
10
Now what? Given that every anti-spam technique presented has significant disadvantages, it’s clear that none will effectively solve the spam epidemic. The underlying problem is that there are huge, gaping flaws in the current email system, which is based on the POP3 and SMTP protocols. These protocols are very “open” and they make it easy for spammers to send millions of messages without restriction. The email system must be redesigned entirely.
11
Email and the Future Two Variations of a Viable Overhaul of the Email System 1. Implement an email system where the sender of a message is charged a small monetary fee for every message sent ($0.001, for example). This fee is negligible to the average user, but translates into a cost-prohibitive fee of $1000 for every million messages sent. 2. Implement a similar system where the sender is instead charged a computational fee for each message sent. This computational fee requires perhaps 0.1 seconds of processing time to complete, which translates into ~28 hours of computing time for every million messages. Given that spammers rely on the near-zero cost and ease of sending email to send tens of millions of messages per day, either of these two solutions is more than adequate to solve the spam epidemic. Unlikely that either system will be implemented because the current email system is very, very deeply ingrained in all sorts of mainstream technology.
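The figures on this slide, and one possible form of a computational fee, can be sketched in Python. The presentation does not say how the computational fee would be charged; the hashcash-style proof of work below (find a nonce so that a SHA-256 digest falls under a target) is an assumption chosen for illustration, and the function names and the 12-bit difficulty are hypothetical.

```python
import hashlib
from itertools import count


def monetary_cost(messages, fee_per_message=0.001):
    """Total fee in dollars at the slide's example rate of $0.001."""
    return messages * fee_per_message


def compute_hours(messages, seconds_per_message=0.1):
    """Total CPU time in hours at the slide's 0.1 s per message."""
    return messages * seconds_per_message / 3600


def _meets_target(message: bytes, nonce: int, difficulty_bits: int) -> bool:
    digest = hashlib.sha256(message + str(nonce).encode()).digest()
    return int.from_bytes(digest, "big") < (1 << (256 - difficulty_bits))


def mint_stamp(message: bytes, difficulty_bits: int = 12) -> int:
    """Sender's side: search for a nonce (costly, by design)."""
    for nonce in count():
        if _meets_target(message, nonce, difficulty_bits):
            return nonce


def verify_stamp(message: bytes, nonce: int, difficulty_bits: int = 12) -> bool:
    """Receiver's side: a single hash, so checking is cheap."""
    return _meets_target(message, nonce, difficulty_bits)


if __name__ == "__main__":
    print(round(monetary_cost(1_000_000), 2), "dollars per million messages")
    print(round(compute_hours(1_000_000), 1), "hours per million messages")
    stamp = mint_stamp(b"to:bob@example.com|hello")
    print(verify_stamp(b"to:bob@example.com|hello", stamp))
```

The asymmetry is the point of the design: minting a stamp takes many hash attempts on average, while verifying it takes one. Raising `difficulty_bits` raises the sender's cost without changing the receiver's.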