Discovering Outlier Filtering Rules from Unlabeled Data. Authors: Kenji Yamanishi & Jun-ichi Takeuchi. Advisor: Dr. Hsu. Graduate: Chia-Hsien Wu.

1 Discovering Outlier Filtering Rules from Unlabeled Data. Authors: Kenji Yamanishi & Jun-ichi Takeuchi. Advisor: Dr. Hsu. Graduate: Chia-Hsien Wu

2 Outline: Motivation; Objective; Introduction; Main Framework; Outlier Detector - SmartSifter; Rule Generator – DL-ESC/DL-SC; Experimentation – Network intrusion detection; Experimental Results; Conclusion; Opinion

3 Motivation: SmartSifter's accuracy is a problem. SmartSifter cannot find a general pattern in the outliers it identifies.

4 Objective: Improve the accuracy of SmartSifter. Discover new patterns that outliers in a specific group may have in common.

5 Introduction: SmartSifter was developed as an on-line outlier detection algorithm. Its power is improved by combining it with a supervised learning method.

6 Main Framework [Figure: framework diagram; surviving labels: Classifier L, A New Rule]

7 Outlier Detector - SmartSifter (SS): Uses a probabilistic (Gaussian mixture) model, p(x, y) = p(x) p(y|x). Employs on-line discounting learning algorithms (SDLE and SDEM) to update the model. Gives a score to each datum.

8 Outlier Detector - SmartSifter (SS) (cont.) SDLE algorithm: an on-line discounting variant of the Laplace-law-based estimation algorithm. SDEM algorithm: an on-line discounting variant of the incremental EM (Expectation Maximization) algorithm.
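The idea of an on-line discounting Laplace estimate can be sketched for a single categorical attribute as follows. This is a minimal illustration, not the presenters' implementation: the class name, the discounting rate `r`, and the smoothing constant `beta` are assumptions chosen for the example. Old counts are multiplied by (1 - r) at every step, so recent data weigh more than old data.

```python
class DiscountingLaplaceEstimator:
    """Sketch of an on-line discounted Laplace estimate over a finite category set."""

    def __init__(self, categories, r=0.05, beta=0.5):
        self.r = r          # discounting rate (assumed value)
        self.beta = beta    # Laplace smoothing constant (assumed value)
        self.counts = {c: 0.0 for c in categories}
        self.total = 0.0    # discounted number of observations seen so far

    def update(self, x):
        # Decay every count, then add one observation for the seen category.
        for c in self.counts:
            self.counts[c] *= (1.0 - self.r)
        self.counts[x] += 1.0
        self.total = (1.0 - self.r) * self.total + 1.0

    def prob(self, x):
        # Smoothed estimate; the values over all categories sum to 1.
        k = len(self.counts)
        return (self.counts[x] + self.beta) / (self.total + k * self.beta)


est = DiscountingLaplaceEstimator(["http", "smtp", "ftp"])
for _ in range(50):
    est.update("http")
est.update("ftp")
print({c: round(est.prob(c), 3) for c in ("http", "smtp", "ftp")})
```

With a larger `r` the estimate forgets old observations faster, which is the property that makes the estimator suitable for on-line use.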

9 Outlier Detector - SmartSifter (SS) (cont.) Outputs the dataset sorted by score. A high score indicates a high possibility that the datum is an outlier.
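The score-then-sort step can be illustrated with a deliberately simplified stand-in. The sketch below replaces the Gaussian mixture with a single on-line discounted Gaussian over one continuous value and uses the negative log-likelihood under the model before the update as the score. Both choices, and the function name and the rate `r`, are assumptions made for illustration, not the SmartSifter algorithm itself.

```python
import math


def score_and_sort(values, r=0.05):
    """Score each value with a discounted single-Gaussian model, then sort by score."""
    mean, var = values[0], 1.0  # assumed initial model
    scored = []
    for x in values:
        # Score the datum under the model learned so far (before the update).
        nll = 0.5 * math.log(2.0 * math.pi * var) + (x - mean) ** 2 / (2.0 * var)
        scored.append((nll, x))
        # Discounted update: recent data weigh more than old data.
        mean = (1.0 - r) * mean + r * x
        var = max((1.0 - r) * var + r * (x - mean) ** 2, 1e-6)
    # Highest score first: the most outlier-like data come at the top.
    return sorted(scored, reverse=True)


ranked = score_and_sort([1.0, 1.1, 0.9, 1.0, 9.0, 1.05, 0.95])
print(ranked[0])
```

In this toy input the value 9.0 is far from everything seen before it, so it receives the largest score and is ranked first.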

10 Rule Generator – DL-ESC/DL-SC: Uses a stochastic decision list. Employs the principle of minimizing extended stochastic complexity (ESC) or stochastic complexity (SC).

11 Rule Generator – DL-ESC/DL-SC (cont.)
if ξ makes t_1 true, then μ = v_1 with probability p_1
else if ξ makes t_2 true, then μ = v_2 with probability p_2
...
else μ = v_s with probability p_s
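A stochastic decision list of this form can be evaluated with a few lines of code. The sketch below only shows how such a list is applied to a record ξ; it does not show how DL-ESC or DL-SC learns the list. The rules, thresholds, labels, and probabilities are hypothetical values invented for the example.

```python
def classify(decision_list, default, record):
    """Return (label, probability) from the first rule whose test is true."""
    for test, label, prob in decision_list:
        if test(record):
            return label, prob
    # No test fired: fall through to the final "else" rule.
    return default


# Hypothetical rules (t_i, v_i, p_i); none of these values come from the slides.
RULES = [
    (lambda r: r["service"] == "ftp" and r["src_bytes"] > 10000, "outlier", 0.9),
    (lambda r: r["duration"] > 1000, "outlier", 0.7),
]
DEFAULT = ("normal", 0.95)

record = {"service": "ftp", "src_bytes": 50000, "duration": 3}
print(classify(RULES, DEFAULT, record))
```

Because the rules are tried in order, an earlier rule takes precedence over a later one even when both tests are true.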

12 Experimentation - Network intrusion detection: The purpose of the experiment is to detect intrusions without making use of the labels concerning intrusions.

13 Experimentation – Dataset (cont.) Uses the KDD Cup 1999 dataset prepared for network intrusion detection. DL-ESC uses 13 attributes. SmartSifter uses four attributes (service, duration, src_bytes, dst_bytes); only “service” is categorical. Transform: y = log(x + 0.1), where the logarithm is natural (base e). Five datasets are generated: S0, S1, S2, S3, S4.
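The transform y = log(x + 0.1) can be written directly. The sketch below assumes it is applied to the three continuous attributes (duration, src_bytes, dst_bytes), which the slide does not state explicitly; the helper names and the sample record are hypothetical.

```python
import math

CONTINUOUS = ("duration", "src_bytes", "dst_bytes")  # assumed targets of the transform


def log_transform(x, offset=0.1):
    """Natural-log transform; the offset keeps x = 0 finite."""
    return math.log(x + offset)


def preprocess(record):
    # Copy the record, transforming continuous fields and leaving "service" as is.
    return {k: log_transform(v) if k in CONTINUOUS else v for k, v in record.items()}


sample = {"service": "http", "duration": 0, "src_bytes": 215, "dst_bytes": 45076}
print(preprocess(sample))
```

The offset matters because attributes such as duration are often exactly zero, where a plain logarithm is undefined.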

14 Experimentation – Dataset (cont.)

15 Experimentation – Illustration by an Example (cont.) [Figure panels: Update Rule – S1; First Rule – S1; Update Rule – S2]

16 Experimental Results: SS: SmartSifter. R&S: Rule and SmartSifter (this framework). S0 is used as a training set to construct a filtering rule; each of S1, S2, S3, and S4 is used for testing.

17 Experimental Results (cont.)

18

19 Conclusion: This new framework has two features: it improves the power of SmartSifter, and it helps the user discover a general pattern.

20 Opinion: The framework makes the detection process more effective and more understandable. It can be applied to other fields.

