Published by Sydney Pearson. Modified over 9 years ago.
1
A COMPARISON OF ANN, NAÏVE BAYES, AND DECISION TREE FOR THE PURPOSE OF SPAM FILTERING
KAASHYAPEE JHA
ECE/CS 539
2
NAÏVE BAYES CLASSIFIER
Bayes' theorem:
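For reference, Bayes' theorem can be written for the spam-filtering setting as below. The notation is assumed here rather than taken from the slide: S means "the message is spam", H means "the message is ham", and w_1, ..., w_n are the words of the message. The second line is the "naïve" independence assumption, and the third is the resulting decision rule.

```latex
P(S \mid w_1, \dots, w_n)
  = \frac{P(w_1, \dots, w_n \mid S)\, P(S)}{P(w_1, \dots, w_n)}

P(w_1, \dots, w_n \mid S) \approx \prod_{i=1}^{n} P(w_i \mid S)

\text{classify as spam if } \;
P(S) \prod_{i=1}^{n} P(w_i \mid S) \;>\; P(H) \prod_{i=1}^{n} P(w_i \mid H)
```

The denominator P(w_1, ..., w_n) is the same for both classes, so it can be dropped when the two posteriors are compared.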
3
PREPROCESSING
Stop list: ignore trivial words such as {or, and, but, a, an, the, is, in, for}.
Also ignore words that are very uncommon.
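The two preprocessing steps could be sketched roughly as follows. The stop list is the one from the slide; the tokenizer, the `MIN_COUNT` threshold for "very uncommon" words, and the sample documents are illustrative assumptions, not the presenter's implementation.

```python
import re
from collections import Counter

# Stop list copied from the slide.
STOP_WORDS = {"or", "and", "but", "a", "an", "the", "is", "in", "for"}
# Assumed threshold: a word seen fewer than MIN_COUNT times is "very uncommon".
MIN_COUNT = 2


def tokenize(text):
    """Lowercase the text and return its runs of letters as word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_vocabulary(documents, stop_words=STOP_WORDS, min_count=MIN_COUNT):
    """Return the words that survive stop-word and rare-word filtering."""
    counts = Counter()
    for doc in documents:
        counts.update(tokenize(doc))
    return {
        word
        for word, count in counts.items()
        if word not in stop_words and count >= min_count
    }


def preprocess(text, vocabulary):
    """Keep only the tokens of `text` that belong to the vocabulary."""
    return [token for token in tokenize(text) if token in vocabulary]


# Hypothetical documents, for illustration only.
docs = [
    "Win a free prize in the lottery",
    "The meeting is in the morning",
    "Claim the free prize now",
]
vocab = build_vocabulary(docs)
print(sorted(vocab))               # ['free', 'prize']
print(preprocess(docs[0], vocab))  # ['free', 'prize']
```

In this toy corpus only "free" and "prize" occur at least twice and are not on the stop list, so every other word is discarded.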
4
NAÏVE BAYES CLASSIFIER RESULTS
Trial   Training set (spam / ham)   Testing set (spam / ham)   False positive rate   Accuracy
#1      667 / 4355                  82 / 476                   3.66%                 98.9%
#2      197 / 1128                  68 / 451                   7.35%                 98.1%
#3      218 / 570                   57 / 232                   7.02%                 96.9%
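The two reported metrics can be computed from confusion-matrix counts along these lines. This sketch assumes the common definitions (a false positive is a ham message flagged as spam, and the false positive rate is FP / (FP + TN)); the slides do not state which definition was used, and the counts below are made up for illustration.

```python
def false_positive_rate(fp, tn):
    """Fraction of ham messages wrongly flagged as spam (assumed definition)."""
    return fp / (fp + tn)


def accuracy(tp, tn, fp, fn):
    """Fraction of all messages that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical counts, not taken from the trials above.
tp, fn = 45, 5    # spam caught / spam missed
fp, tn = 4, 196   # ham flagged as spam / ham passed through

print(f"False positive rate: {false_positive_rate(fp, tn):.2%}")  # 2.00%
print(f"Accuracy: {accuracy(tp, tn, fp, fn):.2%}")                # 96.40%
```

Because ham usually outnumbers spam, accuracy can stay high even when the false positive rate is noticeable, which is why both figures are worth reporting.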
5
SVM RESULTS
Trial   Training set (spam / ham)   Testing set (spam / ham)   False positive rate   Accuracy
#1      667 / 4355                  82 / 476                   2.43%                 99.6%
#2      197 / 1128                  68 / 451                   5.9%                  99.3%
#3      218 / 570                   57 / 232                   5.3%                  97.6%
6
WEAKNESS OF NAÏVE BAYES CLASSIFIER
Example: "hey man are you interested in sports? then email me at imcool@gmail.com"
Spammers can evade the filter by avoiding words that are strongly associated with spam.
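A toy word-count naïve Bayes classifier makes this weakness concrete. The sketch below uses Laplace smoothing and skips words never seen in training. Both choices, and the tiny training set, are assumptions made for illustration and do not reproduce the classifier from the slides.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its runs of letters as word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train(labeled_texts):
    """Count words per class. `labeled_texts` is a list of (text, label) pairs."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    doc_counts = Counter()
    for text, label in labeled_texts:
        word_counts[label].update(tokenize(text))
        doc_counts[label] += 1
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    return word_counts, doc_counts, vocab


def log_score(tokens, label, model):
    """Log prior plus the sum of Laplace-smoothed log word likelihoods."""
    word_counts, doc_counts, vocab = model
    total_words = sum(word_counts[label].values())
    score = math.log(doc_counts[label] / sum(doc_counts.values()))
    for token in tokens:
        if token in vocab:  # words unseen in training carry no evidence
            count = word_counts[label][token]
            score += math.log((count + 1) / (total_words + len(vocab)))
    return score


def classify(text, model):
    """Return the label with the higher score."""
    tokens = tokenize(text)
    return max(("spam", "ham"), key=lambda label: log_score(tokens, label, model))


# Hypothetical training data, for illustration only.
model = train([
    ("free prize click now", "spam"),
    ("win free money now", "spam"),
    ("hey man lunch today", "ham"),
    ("are you coming to practice", "ham"),
    ("email me the sports schedule", "ham"),
])

evasive = "hey man are you interested in sports? then email me at imcool@gmail.com"
print(classify("free money now", model))  # spam
print(classify(evasive, model))           # ham
```

The evasive message contains only words the model associates with ham, or has never seen, so it is labeled ham regardless of the sender's intent.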
7
WORK AHEAD
Finish implementing and testing the Decision Tree.
Do more preprocessing of the data.
Perform more trials with different ratios of training set to testing set.
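Trials with different training/testing ratios could be set up with a small helper such as the one below. The function name `split_dataset`, the fixed seed, and the placeholder messages are hypothetical. This is one possible sketch, not the planned implementation.

```python
import random


def split_dataset(items, train_ratio, seed=0):
    """Shuffle a copy of `items` and split it into (training, testing) lists."""
    if not 0.0 < train_ratio < 1.0:
        raise ValueError("train_ratio must be strictly between 0 and 1")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded, so each split is repeatable
    cut = round(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


# Placeholder messages, for illustration only.
emails = [f"message-{i}" for i in range(100)]
for ratio in (0.5, 0.7, 0.9):
    train_set, test_set = split_dataset(emails, ratio)
    print(ratio, len(train_set), len(test_set))
```

Keeping the seed fixed while varying only the ratio makes it easier to attribute a change in accuracy to the amount of training data.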