1 Negative selection algorithms: from the thymus to V-detector Dissertation defense Zhou Ji Major professor: Prof. Dasgupta Advisory committee: Dr. Lin, Dr. McCauley, Dr. Phan
2 Outline Background of the research area V-detector: a new algorithm Experiments Discussion on applicability and others Conclusions
3 Background What are negative selection algorithms?
4 Related research areas Artificial Intelligence … Biology-inspired methods Neural network Evolutionary computation Artificial immune system (AIS) Negative selection algorithms Immune network Clonal selection Other models …
5 Biological metaphor: negative selection in the thymus How T cells mature in the thymus
7 Biological metaphor: negative selection in the thymus How T cells mature in the thymus The immature T cells have diversified receptors. Those that recognize self are eliminated. The rest can become mature T cells.
8 Basic idea of negative selection algorithms: The problem to solve: anomaly detection or one-class classification
9 Basic idea of negative selection algorithms: Possible detectors are generated randomly.
10 Basic idea of negative selection algorithms: Those that cover self region are eliminated.
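The generate-and-eliminate idea on the three slides above can be sketched as follows. This is a minimal illustration, not the algorithm from the dissertation: it assumes real-valued points in the unit square, constant-radius detectors, and Euclidean matching. The function names `generate_detectors` and `is_nonself` and the parameters `radius` and `max_tries` are hypothetical.

```python
import math
import random


def generate_detectors(self_samples, n_detectors, radius, dim=2, seed=0,
                       max_tries=100_000):
    """Random generation + censoring in the unit hypercube [0, 1]^dim.

    Assumption: a detector "covers self" when a self sample lies within
    `radius` of the detector center.
    """
    rng = random.Random(seed)
    detectors = []
    tries = 0
    while len(detectors) < n_detectors and tries < max_tries:
        tries += 1
        candidate = tuple(rng.random() for _ in range(dim))
        # Censoring step: keep the candidate only if it matches no self sample.
        if all(math.dist(candidate, s) > radius for s in self_samples):
            detectors.append(candidate)
    return detectors


def is_nonself(point, detectors, radius):
    """Flag a point as anomalous if any surviving detector matches it."""
    return any(math.dist(point, d) <= radius for d in detectors)
```

Only self (normal) samples are needed for training, which is what makes the scheme a one-class classifier.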
11 Components that make up a negative selection algorithm Data and detector representation Binary (or string) representation Real-valued representation; detectors as hypersphere, or hyper-rectangle Hybrid representation Generate/elimination mechanism Random generation + censoring Genetic algorithm Greedy algorithm or other deterministic algorithm Matching rule Rcb (r contiguous bits) for binary representation Euclidean distance-based for real-valued representation
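As an illustration of the rcb matching rule listed above, the sketch below treats two equal-length bit strings as matching when they agree in at least r contiguous positions. The function name `rcb_match` is hypothetical, and representing bit strings as Python `str` values is an assumption of this example.

```python
def rcb_match(detector: str, sample: str, r: int) -> bool:
    """Return True if the strings agree in at least r contiguous positions."""
    if len(detector) != len(sample):
        raise ValueError("detector and sample must have equal length")
    run = 0  # length of the current run of agreeing positions
    for a, b in zip(detector, sample):
        run = run + 1 if a == b else 0
        if run >= r:
            return True
    return False
```

For example, `rcb_match("10110", "00111", 3)` is true because positions 2 to 4 agree, while two strings that differ in every position never match for any r of 1 or more.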
12 Major issues in negative selection algorithms Number of detectors Affecting the efficiency of generation and detection Detector coverage Affecting the accuracy of detection Algorithm of generating detectors Linked to efficiency and quality of detector set
13 V-detector A new algorithm
14 V-detector: new development in NSA Important features of V-detector: 1. Variable-sized detectors 2. Estimation of detector coverage 3. Boundary-aware interpretation of self samples 4. A generic algorithm
15 Detectors can be just a point Detectors in their basic form (constant size) Feature 1: variable size
16 Detectors with their individual radii Detectors with maximized coverage Feature 1: variable size
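One way to read "individual radii" with "maximized coverage" is to grow each detector until it touches the self region. The sketch below assumes Euclidean distance and a fixed self radius around every self sample; `variable_radius_detector`, `matches`, and `self_radius` are hypothetical names, and the radius rule is an assumption of this example rather than a statement of the slide.

```python
import math


def variable_radius_detector(candidate, self_samples, self_radius):
    """Return (center, radius) for a candidate point, or None if it lies in self.

    Assumed rule: radius = distance to the nearest self sample - self_radius,
    so the detector is as large as possible without covering the self region.
    """
    nearest = min(math.dist(candidate, s) for s in self_samples)
    radius = nearest - self_radius
    if radius <= 0:
        return None  # candidate falls inside the self region: eliminated
    return candidate, radius


def matches(point, detector):
    """A point is matched when it lies inside the detector's hypersphere."""
    center, radius = detector
    return math.dist(point, center) <= radius
```

Detectors far from the self region become large and cover a lot of nonself space, while detectors near the boundary stay small, which is why fewer detectors are needed than with a constant radius.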
17 How many detectors to generate: approach in earlier works V-detector’s approach Feature 2: coverage estimate
18 How to estimate the coverage: Feature 2: coverage estimate A random point may be: in the self region; in the nonself region, but already covered; in the nonself region, not covered yet. The more consecutive “already covered” points, the more coverage is achieved. 1. An intuitive estimate; 2. Hypothesis testing (Is the target coverage achieved?)
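The intuitive estimate can be sketched as a stopping rule: keep sampling random points, and stop once enough consecutive nonself points turn out to be already covered. This is a minimal sketch under stated assumptions (unit hypercube, Euclidean distance, variable-radius detectors). The name `generate_until_covered` and the parameters `required_run` and `max_tries` are hypothetical, and how `required_run` maps to a target coverage is left open here. The hypothesis-testing variant is not shown.

```python
import math
import random


def generate_until_covered(self_samples, self_radius, required_run=10,
                           dim=2, seed=0, max_tries=100_000):
    """Generate (center, radius) detectors until `required_run` consecutive
    random nonself points are already covered by existing detectors."""
    rng = random.Random(seed)
    detectors = []
    consecutive_covered = 0
    tries = 0
    while consecutive_covered < required_run and tries < max_tries:
        tries += 1
        x = tuple(rng.random() for _ in range(dim))
        nearest_self = min(math.dist(x, s) for s in self_samples)
        if nearest_self <= self_radius:
            continue  # case 1: in the self region, carries no coverage evidence
        if any(math.dist(x, c) <= r for c, r in detectors):
            consecutive_covered += 1  # case 2: nonself, already covered
        else:
            # case 3: nonself, not covered yet; it becomes a new detector
            detectors.append((x, nearest_self - self_radius))
            consecutive_covered = 0
    return detectors
```

With this rule the number of detectors is an outcome of the run rather than a preset parameter.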
19 What does one self sample point mean? Point-wise interpretation of self samples Feature 3: boundary-aware Smaller matching threshold Large matching threshold
20 “The whole is more than the sum of its parts.” Self sample could be near the boundary. The neighboring points provide the hint. Feature 3: boundary-aware
21 How to be boundary-aware by using detectors: Feature 3: boundary-aware Point-wise interpretation Boundary-aware interpretation Large threshold Small threshold
22 V-detector as a generic algorithm Components that can be plugged in: Data representation Distance measure Matching rule The other three features are available for different customized variations. Feature 4: a generic algorithm
23 Example: generalized Euclidean distance Minkowski distance of order m (m-norm distance or L-m distance) Feature 4: a generic algorithm
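The Minkowski distance of order m between points x and y is the m-th root of the sum of |x_i - y_i|^m over all coordinates. A small sketch, with the hypothetical function name `minkowski_distance`:

```python
def minkowski_distance(x, y, m):
    """Minkowski (L-m) distance; m = 1 is Manhattan, m = 2 is Euclidean."""
    if m < 1:
        raise ValueError("order m must be at least 1")
    if len(x) != len(y):
        raise ValueError("points must have the same dimension")
    return sum(abs(a - b) ** m for a, b in zip(x, y)) ** (1.0 / m)
```

Plugging a different order m into the matching rule changes the region a detector of a given radius covers: in 2-D, m = 1 gives a diamond, m = 2 a circle, and large m approaches a square.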
24 Different distance measures result in different detector shapes Feature 4: a generic algorithm
25 V-detector's advantages Efficiency: fewer detectors, faster generation Coverage confidence (reliability) Applicable to more applications
26 Experiments V-detector in action
27 Extensive experiments Synthetic 2-D data Real world data Famous iris data Air pollution Biomedical data Gene expression Indian Telugu Ball bearing measurement KDD cup data Dental image
28 2-D synthetic data Training points (1000) Test data (1000 points) and the ‘real shape’ we try to learn
29 Starting with training points, … 1000 training points 100 training points
30 Actual detectors generated Detector set based on 1000 training points Detector set based on 100 training points
31 Ball bearing’s structure and damage Damaged cage Raw data: measure of acceleration (time series)
32 Image-based dental diagnosis Normal occlusion Malocclusion
33 Comparison with other negative selection algorithms: iris data Training Data | Algorithm | Detection Rate (Mean, SD) | False Alarm Rate (Mean, SD) | Number of Detectors (Mean, SD) Setosa 100% MILA * 0 NSA V-detector Setosa 50% MILA * 0 NSA V-detector Versicolor 100% MILA * 0 NSA V-detector Versicolor 50% MILA * 0 NSA V-detector Virginica 100% MILA * 0 NSA V-detector Virginica 50% MILA * 0 NSA V-detector
34 Strength of hypothesis testing ‘intersection’ shape pentagram
35 Effect of the self region’s shape stripe cross triangle ring intersection pentagram
36 Effect of the self region’s shape
37 Difference of boundary-aware interpretation (ball bearing data)
38 Comparison with SVM On disconnected 2-D self region On reduced representation of dental images
39 Discussion Whether and when are negative selection algorithms appropriate?
40 NSA’s applicability Applicable scenario: Large amount of self (normal) samples Rare or no abnormal samples Another possible usage: “negative database” When it is not appropriate: for example, when the number of self samples is small.
41 Comparison with other methods Other negative selection algorithms SVM (support vector machine) One-class SVM is comparable. The kernel function is very important for SVM.
42 Conclusions Review of negative selection algorithms V-detector: a new development High efficiency Generic algorithm Real world application Prospect of NSA and AIS in general
43 My publications for this dissertation Dasgupta, Ji, Gonzalez, Artificial immune system (AIS) research in the last five years, IEEE CEC 2003; Ji, Dasgupta, Augmented negative selection algorithm with variable-coverage detectors, IEEE CEC 2004; Ji, Dasgupta, Real-valued negative selection algorithm with variable-sized detectors, GECCO 2004; Ji, Dasgupta, Estimating the detector coverage in a negative selection algorithm, GECCO 2005; Ji, A boundary-aware negative selection algorithm, ASC 2005; Ji, Dasgupta, Applicability Issues of the real-valued negative selection algorithms, GECCO 2006; Ji, Dasgupta, Analysis of Dental Images using Artificial Immune Systems, IEEE CEC 2006; Ji, Dasgupta, Revisiting negative selection algorithms, revised submission to the Evolutionary Computation Journal
44 Questions and comments? Thanks to everybody!
45 Organs in immune system