Spam Image Identification Using an Artificial Neural Network

1 Spam Image Identification Using an Artificial Neural Network
2008 MIT Spam Conference
Jason R. Bowling, Priscilla Hope and Kathy J. Liszka
The University of Akron

8 We know it’s bad…
2005: roughly 1% of all emails
mid 2006: rose to 21%
Source: J. Swartz, “Picture this: A sneakier kind of spam,” USA Today, Jul. 23, 2006.

9 The University of Akron
December 2007
28,000,000 messages
24,000,000 identified as spam and dropped

10 Inspiration

11 FANN
Fast Artificial Neural Network Library
open source
adaptive, learns by example (given good input)
layers: input, hidden, output

12 Image Preparation
open source
converts from virtually any format to another
tradeoffs

13 image2fann.cpp
input images → training data
150 × 150 pixel, 8-bit grayscale JPEG images

14 Training file fields: number of input nodes, number of output nodes, number of images (input sets)
Output values: 1 = spam, -1 = ham
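
The fields on this slide correspond to the plain-text training file that FANN reads: a header line with the number of training pairs, the number of inputs and the number of outputs, followed by alternating lines of input values and expected output. The sketch below writes that layout in Python. It is not the authors' image2fann.cpp, which is not shown in the slides; the function name and the three-value sample rows are hypothetical.

```python
def format_fann_training_data(samples):
    """Build the text of a FANN-style training file.

    samples: list of (inputs, label) pairs, where label is 1 (spam) or -1 (ham).
    """
    if not samples:
        raise ValueError("need at least one sample")
    num_inputs = len(samples[0][0])
    # Header: number of training pairs, inputs per pair, outputs per pair.
    lines = [f"{len(samples)} {num_inputs} 1"]
    for inputs, label in samples:
        if len(inputs) != num_inputs:
            raise ValueError("all samples must have the same number of inputs")
        lines.append(" ".join(f"{value:g}" for value in inputs))  # input line
        lines.append(str(label))                                  # output line
    return "\n".join(lines) + "\n"


# Two tiny made-up "images" of three pixels each, one spam and one ham.
example = format_fann_training_data([
    ([0.0, 0.125, 0.25], 1),
    ([0.25, 0.0, 0.05], -1),
])
```

For the network in the talk each input line would carry 22,500 values, one per pixel.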

15 22,500 input nodes (one per pixel of a 150 × 150 image), two layers of hidden nodes, 1 output node
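
The input count follows directly from the image size, and the same arithmetic gives a feel for how many weights a fully connected network of this shape has to train. This is a rough sketch: the slides do not say how the hidden nodes are split across the two layers, so the 50/50 split below is an assumption, as is the use of one bias weight per non-input neuron.

```python
WIDTH, HEIGHT = 150, 150
num_inputs = WIDTH * HEIGHT  # one input node per grayscale pixel


def count_connections(layer_sizes):
    """Count weights in a fully connected feed-forward network.

    Each neuron in a layer is connected to every neuron in the previous
    layer plus one bias input (an assumption of this sketch).
    """
    total = 0
    for fan_in, fan_out in zip(layer_sizes, layer_sizes[1:]):
        total += (fan_in + 1) * fan_out
    return total


# Hypothetical layout: 22,500 inputs, two hidden layers of 50, 1 output.
hypothetical_layers = [num_inputs, 50, 50, 1]
total_weights = count_connections(hypothetical_layers)
```

Almost all of the weights sit between the input layer and the first hidden layer, which is why image size dominates training cost.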

16 Training the Network
A fully connected backpropagation neural network. Supervised learning paradigm.

17 Activation Function
Takes the inputs to a node, applies a weight to each input, and determines the output value of the node.
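
A single node can be sketched as a weighted sum passed through an activation function. The slides do not name the activation used, so tanh is an assumption here, chosen because its range of -1 to 1 matches the ham and spam targets. The function name and sample numbers are illustrative only.

```python
import math


def neuron_output(inputs, weights, bias, activation=math.tanh):
    """Weighted sum of the inputs plus a bias, passed through an activation."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    return activation(weighted_sum)


# 1.0 * 0.5 + 2.0 * -0.25 + 0.0 sums to zero, so tanh returns 0.0.
balanced = neuron_output([1.0, 2.0], [0.5, -0.25], 0.0)
```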

18 Steepness
[Figure: activation function curves for steepness 1.0, 0.5 and 0.0]
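
Steepness scales the input of the activation function, which controls how quickly the output moves from one extreme to the other. The sketch below assumes a tanh-style symmetric sigmoid, y = tanh(steepness × x); the slides do not state which activation function the authors selected.

```python
import math


def symmetric_sigmoid(x, steepness):
    """Symmetric sigmoid with output in (-1, 1); steepness scales the input."""
    return math.tanh(steepness * x)


# Same input, three steepness values as listed on the slide.
values = {s: symmetric_sigmoid(2.0, s) for s in (1.0, 0.5, 0.0)}
```

With steepness 0.0 the function is flat at zero for every input, so the node carries no information; larger values push the same input closer to ±1.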

19 Widrow and Nguyen’s algorithm
An even distribution of weights across each input node’s active region. Used at initialization.
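
As a rough illustration, the textbook form of Nguyen-Widrow initialization draws small random weights, then rescales each hidden node's weight vector to a common length β = 0.7 · H^(1/n), where H is the number of hidden nodes and n the number of inputs. This sketch follows that textbook form and is not FANN's own implementation, which may differ in detail; the function name is hypothetical.

```python
import math
import random


def nguyen_widrow_init(num_inputs, num_hidden, rng=None):
    """Textbook Nguyen-Widrow initialization for one hidden layer.

    Returns (weights, biases, beta); weights[j] is the weight vector
    feeding hidden node j.
    """
    rng = rng or random.Random()
    beta = 0.7 * num_hidden ** (1.0 / num_inputs)
    weights, biases = [], []
    for _ in range(num_hidden):
        raw = [rng.uniform(-0.5, 0.5) for _ in range(num_inputs)]
        norm = math.sqrt(sum(v * v for v in raw)) or 1.0  # guard against zero
        # Rescale so every hidden node's weight vector has length beta.
        weights.append([beta * v / norm for v in raw])
        # Spread the biases so the active regions cover the input range.
        biases.append(rng.uniform(-beta, beta))
    return weights, biases, beta


weights, biases, beta = nguyen_widrow_init(4, 3, random.Random(0))
```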

20 Epoch
One cycle where the weights are adjusted to match the output in the training file. Too many epochs can cause the ANN to fail.
[Image callouts: "I’m spam!" and "I’m ham!"]

21 Learning Rate Train to a desired error.
Step down the training rate at preset intervals to avoid oscillation.
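
Stepping the rate down at preset intervals can be sketched as a simple schedule. The starting rate, interval and factor below are illustrative assumptions; the slides do not give the values the authors used.

```python
def stepped_learning_rate(initial_rate, epoch, step_every, factor=0.5):
    """Learning rate after `epoch` epochs, reduced every `step_every` epochs."""
    if step_every <= 0:
        raise ValueError("step_every must be positive")
    return initial_rate * factor ** (epoch // step_every)


# Hypothetical schedule: start at 0.7 and halve the rate every 50 epochs.
schedule = [stepped_learning_rate(0.7, epoch, 50) for epoch in (0, 49, 50, 120)]
```

A large early rate moves the weights quickly; the smaller later rates keep the error from bouncing around a minimum.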

22 Training
22604 nodes in network
Max epochs 200. Desired error: 0.4
[Training log: the epoch counts, error values and learning rates did not survive in this transcript. Bit fail counts reported, in order: 56, 56, 56, 65, 48.]
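
The two quantities in the training log can be sketched as follows: the error is a mean squared difference between network outputs and targets, and "bit fail" counts the outputs that miss their target by more than a limit. The limit of 0.35 is assumed here as FANN's documented default; FANN's own bookkeeping may differ in detail, and the sample outputs are made up.

```python
def mse_and_bit_fail(outputs, targets, bit_fail_limit=0.35):
    """Return (mean squared error, number of outputs off by more than the limit)."""
    if not outputs or len(outputs) != len(targets):
        raise ValueError("outputs and targets must be non-empty and equal length")
    errors = [o - t for o, t in zip(outputs, targets)]
    mse = sum(e * e for e in errors) / len(errors)
    bit_fail = sum(1 for e in errors if abs(e) > bit_fail_limit)
    return mse, bit_fail


# Made-up outputs for one spam image (target 1) and two ham images (target -1).
mse, bit_fail = mse_and_bit_fail([0.9, -0.2, -0.8], [1, -1, -1])
```

Training stops when the error falls below the desired error or the epoch limit is reached, whichever comes first.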

23 Pipeline: input images (ham, spam) → image2fann.cpp → training data → FANN (train.c, test.c)

24 572 Trained Images 75 hidden nodes

25 572 Trained Images 50 hidden nodes

26 Corpus

28 Scaling
Grayscale intensity is scaled to a number < 1 (divide by 1000), so the training data is limited to roughly 0 – 0.25.
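
The scaling step is a single division per pixel. A minimal sketch, assuming 8-bit intensities from 0 to 255, which is what puts the largest scaled value at 0.255; the function name is hypothetical.

```python
def scale_pixels(pixels, divisor=1000.0):
    """Scale 8-bit grayscale intensities (0-255) to small values below 1."""
    scaled = []
    for p in pixels:
        if not 0 <= p <= 255:
            raise ValueError(f"not an 8-bit intensity: {p}")
        scaled.append(p / divisor)
    return scaled


# Black, mid-gray and white pixels.
scaled = scale_pixels([0, 128, 255])
```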

29 Current Work
complete corpus
multipart images
separate ANNs
hidden nodes
color
image size

34 Priscilla Hope

35 Thank you!

