Mining Top-n Local Outliers in Large Databases. Authors: Wen Jin, Anthony K. H. Tung, Jiawei Han. Advisor: Dr. Hsu. Graduate: Chia-Hsien Wu.

1 Mining Top-n Local Outliers in Large Databases Authors: Wen Jin, Anthony K. H. Tung, Jiawei Han Advisor: Dr. Hsu Graduate: Chia-Hsien Wu

2 Outline
- Motivation
- Objective
- Introduction
- Definition of Local Outlier
- Algorithm for Finding Top-n Local Outliers
- Micro-Cluster-Based Algorithm
- Experimentation
- Conclusion
- Opinion

3 Motivation
Using the LOF notion for outlier detection has some problems:
- Computing the LOF value of every object results in unnecessary computation
- Handling overlapping data

4 Objective
- To find the most outstanding local outliers with good performance
- We use a meaningful cut-plane solution to make our algorithms useful

5 Introduction
- There are five general categories of outlier detection, e.g., density-based
- We use a novel method for mining top-n local outliers that avoids computing LOF for most objects
- We use the notions of “micro-clusters” and bounds

6 Definition of Local Outlier DEFINITION 1. (k-distance of p)

7 Definition of Local Outlier (cont.) DEFINITION 2. (k-distance neighborhood of p)

8 Definition of Local Outlier (cont.) DEFINITION 3. (reachability distance of p w.r.t object o)
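
The formulas on these slides did not survive in the transcript. For reference, the standard statements of Definitions 1 to 3 from the LOF literature are sketched below. The symbols $d$ (distance function), $D$ (data set), and $N_k$ are notation assumed here, not taken from the slides.

```latex
% Definition 1 (k-distance of p), standard form:
% k-distance(p) = d(p, o) for an object o in D such that
%   (i)  at least k objects o' in D \ {p} satisfy d(p, o') <= d(p, o), and
%   (ii) at most k-1 objects o' in D \ {p} satisfy d(p, o') <  d(p, o).

% Definition 2 (k-distance neighborhood of p):
\[
N_k(p) = \{\, q \in D \setminus \{p\} : d(p, q) \le k\text{-distance}(p) \,\}
\]

% Definition 3 (reachability distance of p w.r.t. object o):
\[
\text{reach-dist}_k(p, o) = \max\{\, k\text{-distance}(o),\ d(p, o) \,\}
\]
```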

9 Definition of Local Outlier (cont.)

10 DEFINITION 4. (local reachability density of p) Here MinPts is equal to k. DEFINITION 5. (local outlier factor of p)
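
To make Definitions 1 to 5 concrete, here is a minimal Python sketch of the naive approach that computes LOF for every object and then takes the top n. It follows the standard LOF definitions rather than the slide images, which are missing from the transcript. The function names are hypothetical, Euclidean distance is assumed, and the data set is assumed not to contain more than k duplicates of any point (otherwise a density becomes infinite).

```python
import math

def k_distance_and_neighbors(points, i, k):
    """Return (k-distance of point i, indices in its k-distance neighborhood)."""
    dists = sorted(
        (math.dist(points[i], q), j) for j, q in enumerate(points) if j != i
    )
    k_dist = dists[k - 1][0]
    # Ties at the k-distance are all included, so the neighborhood may exceed k.
    neighbors = [j for d, j in dists if d <= k_dist]
    return k_dist, neighbors

def lof_scores(points, k):
    """Naive LOF: one score per point, computed for every object."""
    n = len(points)
    info = [k_distance_and_neighbors(points, i, k) for i in range(n)]

    def reach_dist(p, o):
        # Reachability distance of p w.r.t. o: max(k-distance(o), d(p, o)).
        return max(info[o][0], math.dist(points[p], points[o]))

    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for p in range(n):
        neigh = info[p][1]
        lrd.append(len(neigh) / sum(reach_dist(p, o) for o in neigh))

    # LOF: mean ratio of the neighbors' densities to the point's own density.
    return [
        sum(lrd[o] for o in info[p][1]) / (len(info[p][1]) * lrd[p])
        for p in range(n)
    ]

def top_n_outliers(points, k, n):
    """Indices of the n points with the largest LOF values."""
    scores = lof_scores(points, k)
    return sorted(range(len(points)), key=lambda i: scores[i], reverse=True)[:n]

# Four corners of a unit square plus one distant point.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
print(top_n_outliers(points, k=2, n=1))
```

Every object's LOF is computed here even though only the top n are reported, which is the unnecessary computation the Motivation slide refers to.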

11 Algorithm for Finding Top-n Local Outliers

12 Algorithm for Finding Top-n Local Outliers (cont.)

13 THEOREM 3.2

14 Algorithm for Finding Top-n Local Outliers (cont.)

15

16 DEFINITION 7.

17 Algorithm for Finding Top-n Local Outliers (cont.) DEFINITION 8.

18 Algorithm for Finding Top-n Local Outliers (cont.) DEFINITION 9.

19 Algorithm for Finding Top-n Local Outliers (cont.)

20 COROLLARY 3.2.

21 Algorithm for Finding Top-n Local Outliers (cont.) DEFINITION 10. (INTERNAL REACHABILITY BOUND OF A MICRO-CLUSTER)

22 Algorithm for Finding Top-n Local Outliers (cont.) DEFINITION 11. (EXTERNAL REACHABILITY BOUND OF TWO MICRO-CLUSTERS)

23 Micro-Cluster-Based Algorithm
- Preprocessing
- Computing LOF bounds for micro-clusters
- Ranking top-n local outliers

24 Micro-Cluster-Based Algorithm (cont.) Preprocessing
- Load data into CF tree
- Fix CF Node and generate micro-clusters
- Insert micro-clusters into X-tree
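
As an illustration of the summary a CF tree stores per entry, here is a minimal sketch of a BIRCH-style clustering feature (count, linear sum, square sum), from which a micro-cluster's center and radius can be derived. The class and method names are hypothetical. The radius below is the root-mean-square distance to the centroid, which is BIRCH's convention; the micro-clusters in the presented method may define the radius differently.

```python
import math
from dataclasses import dataclass, field

@dataclass
class ClusteringFeature:
    """BIRCH-style summary of a set of points: (n, linear sum, square sum)."""
    n: int = 0
    linear_sum: list = field(default_factory=list)
    square_sum: float = 0.0

    def add(self, point):
        """Absorb one point into the summary without storing the point."""
        if not self.linear_sum:
            self.linear_sum = [0.0] * len(point)
        self.n += 1
        self.linear_sum = [s + x for s, x in zip(self.linear_sum, point)]
        self.square_sum += sum(x * x for x in point)

    def merge(self, other):
        """CF additivity: the summary of a union is the component-wise sum.

        Assumes both summaries are non-empty and have the same dimension.
        """
        return ClusteringFeature(
            self.n + other.n,
            [a + b for a, b in zip(self.linear_sum, other.linear_sum)],
            self.square_sum + other.square_sum,
        )

    def centroid(self):
        return [s / self.n for s in self.linear_sum]

    def radius(self):
        """Root-mean-square distance of the summarized points to the centroid."""
        c = self.centroid()
        variance = self.square_sum / self.n - sum(x * x for x in c)
        return math.sqrt(max(variance, 0.0))  # clamp tiny negative rounding error

cf = ClusteringFeature()
for p in [(0.0, 0.0), (2.0, 0.0)]:
    cf.add(p)
print(cf.n, cf.centroid(), cf.radius())
```

Because the summary is additive, points can be inserted in a single pass over the data, and the center and radius are available at any time without revisiting the raw points.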

25 Micro-Cluster-Based Algorithm (cont.) Computing LOF Bound for Micro-clusters
- Algorithm 1

26 Micro-Cluster-Based Algorithm (cont.) Algorithm 2.

27 Micro-Cluster-Based Algorithm (cont.) Rank Top-n local outliers
- Algorithm 3
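
Algorithm 3 itself is not reproduced in the transcript. The sketch below only illustrates the general idea of ranking with bounds, assuming each micro-cluster already carries a lower and an upper LOF bound for its objects. The tuple layout `(name, size, lof_lower, lof_upper)` and the function name are hypothetical, and this is not claimed to be the authors' algorithm.

```python
def prune_micro_clusters(clusters, n):
    """Keep only micro-clusters that could still contain a top-n outlier.

    clusters: list of (name, size, lof_lower, lof_upper) tuples (assumed layout).
    """
    # Walk clusters from the highest lower bound down until they cover n objects.
    ranked = sorted(clusters, key=lambda c: c[2], reverse=True)
    covered = 0
    threshold = float("-inf")  # stays -inf if there are fewer than n objects
    for _, size, lower, _ in ranked:
        covered += size
        if covered >= n:
            # At least n objects are guaranteed to have LOF >= this lower bound.
            threshold = lower
            break
    # A cluster whose upper bound is below the threshold cannot hold a top-n
    # outlier, so exact LOF values are only needed for the remaining clusters.
    return [c[0] for c in clusters if c[3] >= threshold]

clusters = [
    ("A", 5, 1.0, 1.2),
    ("B", 2, 3.0, 4.0),
    ("C", 3, 1.1, 3.5),
    ("D", 4, 0.9, 1.05),
]
print(prune_micro_clusters(clusters, n=3))
```

The exact LOF computation is then restricted to objects in the surviving micro-clusters, which is how bounds avoid computing LOF for most objects.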

28 Experimentation

29 Experimentation (cont.)

30

31 Conclusion
- We have proposed a novel and efficient method for mining top-n local outliers
- To find strong and nested local outliers at multiple levels of granularity

32 Opinion How to combine this notion with other outlier-detection methods may be a topic for further research

