ICDM'07. Depth-Based Novelty Detection. Yixin Chen, Dept. of Computer and Information Science, University of Mississippi. Joint work with Henry Bart, Xin Dang, and Hanxiang Peng.
Outline: Novelty detection; Motivations; Kernelized spatial depth (KSD); Bounds on the false alarm probability; Empirical studies; Discussions
Outlier Detection: missing label problem; one-class learning
A Simple Outlier Detector: 1-d example; sensitivity; threshold; structure of the data (figure: points X on a line with the mean and median marked, and a query point "?")
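One plausible reading of the 1-d example is that a detector centered on the mean is far more sensitive to an extreme value than one centered on the median. The sketch below illustrates that with made-up numbers; the data values and variable names are assumptions for illustration only.

```python
import statistics

# Hypothetical 1-d data: five ordinary points, then the same data plus one extreme value.
clean = [1.0, 2.0, 3.0, 4.0, 5.0]
contaminated = clean + [1000.0]

# How far does each location estimate move when the extreme value is added?
mean_shift = abs(statistics.mean(contaminated) - statistics.mean(clean))
median_shift = abs(statistics.median(contaminated) - statistics.median(clean))

print(f"mean shift:   {mean_shift:.2f}")
print(f"median shift: {median_shift:.2f}")
```

The mean moves by more than 100 units while the median moves by half a unit, which is why a distance-from-center rule behaves very differently depending on the center used.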
Median: the sign function and the median defined from it
Spatial Median: the spatial sign function and the spatial median defined from it
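For reference, the standard definitions behind the median and the spatial median can be written as follows. The symbols s, S, m, and X are notation assumed here, not taken from the slides, and the zero-expectation characterization is stated for distributions where such a point exists.

```latex
% Univariate sign function and median
s(x) = \begin{cases} 1, & x > 0 \\ 0, & x = 0 \\ -1, & x < 0 \end{cases}
\qquad
\mathbb{E}\, s(m - X) = 0 \quad \text{(median } m\text{)}

% Spatial sign function and spatial median in R^d
S(x) = \begin{cases} x / \|x\|, & x \neq 0 \\ 0, & x = 0 \end{cases}
\qquad
\mathbb{E}\, S(m - X) = 0 \quad \text{(spatial median } m\text{)}
```

In one dimension S reduces to s, so the spatial median generalizes the ordinary median.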
Spatial Depth, with its sample version: based on the expectation of the unit vector starting from x
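A minimal sketch of the sample spatial depth, assuming the commonly used form D(x, X) = 1 - ||sum over y in X of S(y - x)|| / (|X ∪ {x}| - 1), where S is the spatial sign function. The function names and the handling of sample points equal to x are assumptions, not details confirmed by the slides.

```python
import math

def spatial_sign(v):
    """Unit vector in the direction of v, or the zero vector if v is zero."""
    norm = math.sqrt(sum(c * c for c in v))
    if norm == 0.0:
        return [0.0] * len(v)
    return [c / norm for c in v]

def spatial_depth(x, sample):
    """Sample spatial depth of point x with respect to a list of points."""
    total = [0.0] * len(x)
    for y in sample:
        # Unit vector starting from x and pointing toward y.
        s = spatial_sign([yi - xi for yi, xi in zip(y, x)])
        total = [t + si for t, si in zip(total, s)]
    # |X ∪ {x}| - 1: number of sample points different from x (assumes no duplicates).
    n_other = sum(1 for y in sample if tuple(y) != tuple(x))
    if n_other == 0:
        return 1.0
    return 1.0 - math.sqrt(sum(t * t for t in total)) / n_other

# Hypothetical data: four points arranged symmetrically around the origin.
corners = [(1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0)]
print(spatial_depth((0.0, 0.0), corners))    # central point: unit vectors cancel
print(spatial_depth((100.0, 0.0), corners))  # remote point: unit vectors align
```

A point surrounded by the data has unit vectors that cancel, giving depth near 1; a remote point sees all unit vectors pointing the same way, giving depth near 0.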
Spatial Depth and Outlier Detection (figure: a point labeled as an outlier)
Example: Half-Moon Data. False alarm rate (FAR) = 70%
Example: Ring Data. FAR = 100%
Kernelized Spatial Depth (KSD): as σ→∞, KSD converges to the spatial depth (SD); as σ→0, KSD → 0.293
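The two limits can be checked numerically with a small sketch. It assumes a Gaussian kernel κ(x, y) = exp(-||x - y||² / σ²) and the usual kernel-trick form of the depth, in which inner products of feature-space differences are written in terms of κ; both the kernel choice and the function names are assumptions, not details confirmed by the slides. Under this form, n distinct sample points give a small-σ value of 1 - sqrt((n + 1) / (2n)), which tends to 1 - 1/√2 ≈ 0.293 as n grows.

```python
import math

def gaussian_kernel(x, y, sigma):
    """Assumed kernel: exp(-||x - y||^2 / sigma^2)."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq / sigma ** 2)

def kernelized_spatial_depth(x, sample, sigma):
    """Spatial depth of x computed in the kernel-induced feature space."""
    kxx = gaussian_kernel(x, x, sigma)
    kxy = [gaussian_kernel(x, y, sigma) for y in sample]
    # Feature-space distance between x and each sample point.
    dist = [
        math.sqrt(max(kxx + gaussian_kernel(y, y, sigma) - 2.0 * k, 0.0))
        for y, k in zip(sample, kxy)
    ]
    total = 0.0
    for i, y in enumerate(sample):
        for j, z in enumerate(sample):
            if dist[i] == 0.0 or dist[j] == 0.0:
                continue  # points coinciding with x contribute a zero vector
            # Inner product of (phi(x) - phi(y)) and (phi(x) - phi(z)) via the kernel.
            num = kxx + gaussian_kernel(y, z, sigma) - kxy[i] - kxy[j]
            total += num / (dist[i] * dist[j])
    n_other = sum(1 for d in dist if d > 0.0)
    if n_other == 0:
        return 1.0
    return 1.0 - math.sqrt(max(total, 0.0)) / n_other

# Hypothetical data: four points and a query point outside them.
corners = [(1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0)]
query = (2.0, 0.0)
print(kernelized_spatial_depth(query, corners, sigma=1e3))   # close to the spatial depth
print(kernelized_spatial_depth(query, corners, sigma=1e-3))  # 1 - sqrt(5/8) for n = 4
```

The double loop costs O(n²) kernel evaluations per query point, so this form is only meant for small illustrative samples.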
Example: Half-Moon Data
Example: Ring Data
KSD Outlier Detector: separates outliers from normal observations; b is the margin. How should we decide the threshold t?
Threshold Selection: choose the largest threshold such that the upper bound on the false alarm probability (FAP) is at most the desired level
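The selection rule can be sketched as follows. The bound used here is a stand-in, not the bound from the talk: it applies the two-sided Dvoretzky-Kiefer-Wolfowitz inequality to depth values of held-out normal observations, which holds uniformly over all thresholds with probability at least 1 - delta. The rule "flag as outlier when depth ≤ t", the held-out set, and all names are assumptions for illustration.

```python
import bisect
import math

def dkw_epsilon(n, delta):
    """Uniform deviation of an empirical CDF from the true CDF (two-sided DKW)."""
    return math.sqrt(math.log(2.0 / delta) / (2.0 * n))

def select_threshold(heldout_depths, desired_fap, delta=0.05):
    """Largest candidate t with (empirical false alarm rate + epsilon) <= desired_fap.

    Candidates are the held-out depth values themselves. Returns None when even
    the smallest candidate violates the bound.
    """
    depths = sorted(heldout_depths)
    n = len(depths)
    eps = dkw_epsilon(n, delta)
    best = None
    for t in depths:
        # Fraction of normal observations that would be flagged (depth <= t).
        rate = bisect.bisect_right(depths, t) / n
        if rate + eps <= desired_fap:
            best = t
        else:
            break  # the rate is non-decreasing in t, so larger t cannot qualify
    return best

# Hypothetical held-out depth values for 1000 normal observations.
heldout = [i / 1000 for i in range(1, 1001)]
print(select_threshold(heldout, desired_fap=0.10))
```

With few held-out points the deviation term dominates and no threshold may qualify, in which case the function returns None instead of guessing.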
Bounds on the False Alarm Probability: a training set bound; a test set bound
Empirical Study 1: 10 species under the order Cypriniformes; 989 specimens from the Tulane University Museum of Natural History
Empirical Study 1: Masking Effect
Empirical Study 2
Discussions: KSD outlier detection and density-based approaches
Acknowledgment: Kory P. Northrop, Tulane University; Huimin Chen, University of New Orleans; University of Mississippi; National Science Foundation