Published by Roland Morrison. Modified over 9 years ago.
1
Discovering Outlier Filtering Rules from Unlabeled Data. Authors: Kenji Yamanishi & Jun-ichi Takeuchi. Advisor: Dr. Hsu. Graduate: Chia-Hsien Wu.
2
Outline: Motivation; Objective; Introduction; Main Framework; Outlier Detector – SmartSifter; Rule Generator – DL-ESC/DL-SC; Experimentation – Network intrusion detection; Experimental Results; Conclusion; Opinion
3
Motivation: SmartSifter's accuracy is a problem. SmartSifter cannot find the general pattern shared by the outliers it identifies.
4
Objective: Improving the accuracy of SmartSifter. Discovering a new pattern that outliers in a specific group may have in common.
5
Introduction: Developing SmartSifter, an on-line outlier detection algorithm. Improving the power of SmartSifter by combining it with a supervised learning method.
6
Main Framework [Figure: framework diagram; recoverable labels: "Classifier L", "A New Rule"]
7
Outlier Detector - SmartSifter (SS): Uses a probabilistic (Gaussian mixture) model, p(x, y) = p(x) p(y|x). Employs an on-line discounting learning algorithm (SDLE/SDEM) to update the model. Gives a score to each datum.
8
Outlier Detector - SmartSifter (SS) (cont.): SDLE algorithm: an on-line discounting variant of the Laplace-law-based estimation algorithm. SDEM algorithm: an on-line discounting variant of the incremental EM (Expectation Maximization) algorithm.
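As a rough illustration of what "on-line discounting" can mean for a Laplace-law estimator, here is a minimal sketch for a single categorical variable. The slides do not give the formulas, so the update form (decay old counts by 1 - r, then credit the new observation), the class name `DiscountedLaplace`, and the parameters `r` and `beta` are all assumptions made for illustration.

```python
class DiscountedLaplace:
    """Discounted, Laplace-smoothed frequency estimate over fixed categories.

    Illustrative sketch only; not the authors' SDLE implementation.
    """

    def __init__(self, categories, r=0.1, beta=0.5):
        self.r = r          # discounting rate (assumed): larger forgets faster
        self.beta = beta    # Laplace smoothing constant (assumed)
        self.counts = {c: 0.0 for c in categories}
        self.total = 0.0    # discounted number of observations so far

    def update(self, x):
        # Decay every old count, then credit the observed category.
        for c in self.counts:
            self.counts[c] *= 1.0 - self.r
        self.counts[x] += 1.0
        self.total = (1.0 - self.r) * self.total + 1.0

    def prob(self, x):
        # Smoothed estimate; the values over all categories sum to 1.
        k = len(self.counts)
        return (self.counts[x] + self.beta) / (self.total + k * self.beta)
```

Because old counts shrink at every step, a category that stops appearing gradually loses probability, which is what lets the model follow a changing data source.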
9
Outlier Detector - SmartSifter (SS) (cont.): Outputs a dataset sorted by score. A high score indicates a high possibility that the datum is an outlier.
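The sorting step can be sketched as follows. This assumes the scores have already been computed by some detector; the function names `rank_by_score` and `top_fraction`, and the idea of keeping a fixed fraction as outlier candidates, are hypothetical choices for illustration rather than something stated in the slides.

```python
def rank_by_score(data, scores):
    """Return (datum, score) pairs sorted from highest to lowest score."""
    if len(data) != len(scores):
        raise ValueError("data and scores must have the same length")
    return sorted(zip(data, scores), key=lambda pair: pair[1], reverse=True)


def top_fraction(ranked, fraction):
    """Keep the highest-scored share of a ranked list (at least one item)."""
    n = max(1, int(len(ranked) * fraction))
    return ranked[:n]
```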
10
Rule Generator – DL-ESC/DL-SC: Uses a stochastic decision list. Employs the principle of minimizing extended stochastic complexity (DL-ESC) or stochastic complexity (DL-SC).
11
Rule Generator – DL-ESC/DL-SC (cont.)
if ξ makes t_1 true, then μ = v_1 with probability p_1
else if ξ makes t_2 true, then μ = v_2 with probability p_2
...
else μ = v_s with probability p_s
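A stochastic decision list of this shape can be evaluated with a first-match loop. The sketch below assumes each rule is a `(test, value, probability)` triple and that the final `else` branch is passed as a `(value, probability)` pair; the name `first_match` and this representation are illustrative assumptions, and the sketch says nothing about how DL-ESC learns the rules.

```python
def first_match(rules, default, xi):
    """Return (value, probability) from the first rule whose test holds for xi.

    rules:   list of (test, value, probability), where test is a predicate on xi
    default: (value, probability) for the final 'else' branch
    """
    for test, value, prob in rules:
        if test(xi):
            return value, prob
    return default


# Made-up rules for illustration only; they are not results from the slides.
example_rules = [
    (lambda xi: xi["service"] == "ftp", "outlier", 0.9),
    (lambda xi: xi["duration"] > 100, "outlier", 0.7),
]
example_default = ("normal", 0.95)
```

Order matters: a datum that satisfies both tests is handled by the first rule only, which is what distinguishes a decision list from an unordered rule set.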
12
Experimentation - Network intrusion detection: The purpose of the experiment is to detect intrusions without making use of the labels concerning intrusions.
13
Experimentation – Dataset (cont.): Uses the KDD Cup 1999 dataset, prepared for network intrusion detection. Uses 13 attributes for DL-ESC. Uses four attributes for SmartSifter (service, duration, src_bytes, dst_bytes); only "service" is categorical. Applies the transform y = log(x + 0.1), where the logarithm is base e. Generates five datasets: S0, S1, S2, S3, S4.
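The transform y = log(x + 0.1) is small enough to show directly. In this sketch the function name `log_transform` and the rejection of negative inputs are assumptions; the slide only gives the formula and the base.

```python
import math


def log_transform(x, offset=0.1):
    """Return log(x + offset) using the natural logarithm (base e)."""
    if x < 0:
        # Assumption: the transformed attributes are non-negative quantities.
        raise ValueError("expected a non-negative value")
    return math.log(x + offset)
```

The 0.1 offset keeps the logarithm defined when x is 0.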
14
Experimentation – Dataset (cont.)
15
Experimentation – Illustration by an Example (cont.) [Figure panels: "Update Rule – S1", "First Rule – S1", "Update Rule – S2"]
16
Experimental Results: SS: SmartSifter. R&S: Rule and SmartSifter (this framework). S0 is used as the training set to construct a filtering rule; each of S1, S2, S3, and S4 is used for testing.
17
Experimental Results (cont.)
19
Conclusion: This new framework has two features: improving the power of SmartSifter, and helping the user discover a general pattern.
20
Opinion: The framework makes the detection process more effective and more understandable. It can be applied to other fields.