V-detector: a real-valued negative selection algorithm. Zhou Ji, St. Jude Children's Research Hospital.

Presentation transcript:

V-detector: a real-valued negative selection algorithm
Zhou Ji, St. Jude Children's Research Hospital

What is negative selection?
Biological background: T cells, thymus
Major steps:
1. Generate candidates randomly
2. Eliminate those that recognize self samples
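The two major steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: samples live in the unit hypercube, "recognize" means lying within a Euclidean distance `threshold` of a self sample, and the names `generate_detectors`, `threshold`, and `max_tries` are hypothetical, not taken from the slides.

```python
import math
import random

def generate_detectors(self_samples, n_detectors, threshold, dim, rng,
                       max_tries=100_000):
    """Keep random candidates that do not recognize any self sample."""
    detectors = []
    tries = 0
    while len(detectors) < n_detectors and tries < max_tries:
        tries += 1
        # Step 1: generate a candidate randomly (unit hypercube assumed).
        candidate = [rng.random() for _ in range(dim)]
        # Step 2: eliminate the candidate if it recognizes a self sample.
        if all(math.dist(candidate, s) > threshold for s in self_samples):
            detectors.append(candidate)
    return detectors

rng = random.Random(0)
self_samples = [[0.5, 0.5], [0.52, 0.48], [0.47, 0.53]]  # made-up self data
detectors = generate_detectors(self_samples, n_detectors=20,
                               threshold=0.1, dim=2, rng=rng)
print(len(detectors), "detectors generated")
```

Every surviving candidate is farther than `threshold` from all self samples, which is what makes it usable later as a detector of non-self data.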

Main steps: generation and detection

What is a matching rule? It defines when a sample and a detector are considered to match. The matching rule plays an important role in a negative selection algorithm. It largely depends on the data representation.

In real-valued representation, a detector can be visualized as a hypersphere. Candidate 1: thrown away; candidate 2: made a detector. Match or not match?
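A minimal sketch of a hypersphere matching rule, assuming Euclidean distance and detectors stored as `(center, radius)` pairs. The function names `matches` and `is_nonself` are hypothetical and chosen for illustration only.

```python
import math

def matches(sample, center, radius):
    """A sample matches a detector when it lies inside the hypersphere."""
    return math.dist(sample, center) <= radius

def is_nonself(sample, detectors):
    """A sample is flagged as non-self if any detector matches it."""
    return any(matches(sample, c, r) for c, r in detectors)

# Two made-up detectors in a 2D unit square.
detectors = [((0.2, 0.2), 0.15), ((0.8, 0.7), 0.1)]
inside = is_nonself((0.25, 0.25), detectors)   # within the first detector
outside = is_nonself((0.5, 0.5), detectors)    # matched by no detector
print(inside, outside)
```

Other data representations would need a different rule; only the body of `matches` changes in that case.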

Main idea of V-detector
By allowing the detectors to have some variable properties, V-detector enhances the negative selection algorithm in several respects:
- It takes fewer large detectors to cover the non-self region, saving time and space.
- Small detectors cover holes better.
- Coverage is estimated while the detector set is generated.
- The shapes of detectors, or even the types of matching rules, can be extended to be variable too.

Main concept of Negative Selection and V-detector
Panels: constant-sized detectors; variable-sized detectors

Outline of the algorithm (generation of variable-sized detector set)
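The slide's flowchart did not survive in the transcript, so the following is only a sketch of how variable-sized detector generation might look, not the author's exact procedure. Assumptions: data lie in the unit hypercube; a detector's radius is its distance to the nearest self sample minus the self radius; generation stops after `1 / (1 - target_coverage)` consecutive candidates fall inside existing detectors. All names, plus the safety limits `max_detectors` and `max_tries`, are hypothetical.

```python
import math
import random

def generate_v_detectors(self_samples, self_radius, target_coverage, dim, rng,
                         max_detectors=1000, max_tries=100_000):
    """Sketch of variable-sized detector generation in the unit hypercube."""
    detectors = []  # each detector is a (center, radius) pair
    # Assumed stopping rule: this many consecutive covered candidates.
    needed = max(1, round(1.0 / (1.0 - target_coverage)))
    covered_run = 0
    for _ in range(max_tries):
        if len(detectors) >= max_detectors or covered_run >= needed:
            break
        x = [rng.random() for _ in range(dim)]
        nearest_self = min(math.dist(x, s) for s in self_samples)
        if nearest_self <= self_radius:
            continue  # candidate lies in the self region: throw it away
        if any(math.dist(x, c) <= r for c, r in detectors):
            covered_run += 1  # already covered: evidence of good coverage
            continue
        covered_run = 0
        # The radius grows until the detector touches the self region.
        detectors.append((x, nearest_self - self_radius))
    return detectors

rng = random.Random(1)
# Made-up self samples scattered on a small circle.
self_samples = [[0.5 + 0.1 * math.cos(a), 0.5 + 0.1 * math.sin(a)]
                for a in range(12)]
detectors = generate_v_detectors(self_samples, self_radius=0.05,
                                 target_coverage=0.99, dim=2, rng=rng)
print(len(detectors), "detectors")
```

The two control parameters in this sketch are the self radius and the target coverage; the number of detectors is an outcome rather than an input.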

Detector Set Generation Algorithm Constant-sized detectors Variable-sized detectors

Screenshots of the software Message view Visualization of data points and detectors

Experiments and Results
- Synthetic data: 2D. Training data are randomly chosen from the normal region.
- Fisher's Iris data: one of the three types is considered normal.
- Biomedical data: abnormal data are the medical measures of disease-carrier patients.
- Air pollution data: abnormal data are made by artificially altering the normal air measurements.
- Ball bearings: time series measurements with preprocessing, 30D and 5D.

Synthetic data - Cross-shaped self space Shape of self region and example detector coverage (a) Actual self space (b) self radius = 0.05 (c) self radius = 0.1

Synthetic data - Cross-shaped self space: results
Figures: detection rate and false alarm rate; number of detectors

Error rates
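For reference, the detection rate and false alarm rate plotted in these results can be computed as below. The sketch assumes the usual definitions (detection rate = detected anomalies over all anomalies, false alarm rate = normal samples flagged over all normal samples); the function name and the toy labels are illustrative only.

```python
def detection_and_false_alarm(is_anomaly, flagged):
    """Return (detection rate, false alarm rate) from boolean label lists."""
    tp = sum(1 for a, f in zip(is_anomaly, flagged) if a and f)
    fn = sum(1 for a, f in zip(is_anomaly, flagged) if a and not f)
    fp = sum(1 for a, f in zip(is_anomaly, flagged) if not a and f)
    tn = sum(1 for a, f in zip(is_anomaly, flagged) if not a and not f)
    detection_rate = tp / (tp + fn) if tp + fn else 0.0
    false_alarm_rate = fp / (fp + tn) if fp + tn else 0.0
    return detection_rate, false_alarm_rate

# Toy example: 3 anomalies (2 detected) and 4 normal samples (1 flagged).
truth = [True, True, True, False, False, False, False]
flags = [True, True, False, True, False, False, False]
dr, fa = detection_and_false_alarm(truth, flags)
print(dr, fa)
```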

Synthetic data - Ring-shaped self space Shape of self region and example detector coverage (a) Actual self space (b) self radius = 0.05 (c) self radius = 0.1

Synthetic data - Ring-shaped self space: results
Figures: detection rate and false alarm rate; number of detectors

Iris data
Comparison with other methods: performance
Columns: detection rate, false alarm rate
Setosa 100%: MILA; NSA (single level) 100, 0; V-detector
Setosa 50%: MILA; NSA (single level); V-detector
Versicolor 100%: MILA; NSA (single level); V-detector
Versicolor 50%: MILA; NSA (single level); V-detector
Virginica 100%: MILA; NSA (single level); V-detector
Virginica 50%: MILA; NSA (single level); V-detector

Iris data
Comparison with other methods: number of detectors
Columns: mean, max, min, SD
Rows: Setosa 100%, Setosa 50%, Versicolor 100%, Versicolor 50%, Virginica 100%, Virginica 50%

Iris data
Virginica as normal, 50% of points used to train
Figures: detection rate and false alarm rate; number of detectors

Biomedical data
- Blood measures for a group of 209 patients
- Each patient has four different types of measurement
- 75 patients are carriers of a rare genetic disorder; the others are normal

Biomedical data: results comparison
Columns: training data; algorithm; detection rate (mean, SD); false alarm rate (mean, SD); number of detectors (mean, SD)
100% training: MILA * 0; NSA; r=; r=
% training: MILA * 0; NSA; r=; r=
% training: MILA * 0; NSA; r=; r=

Biomedical data
Figures: detection rate and false alarm rate; number of detectors

Air pollution data
There are 60 original records in total. Each consists of 16 different measurements concerning air pollution. All the real data are considered normal. More data are made artificially:
1. Decide the normal range of each of the 16 measurements
2. Randomly choose a real record
3. Change three randomly chosen measurements within a larger-than-normal range
4. If some of the changed measurements are out of range, the record is considered abnormal; otherwise it is considered normal
In total, 1000 records, including the original 60, are used as test data. The original 60 are used as training data.
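The four-step procedure above can be sketched as follows. Several details are assumptions, because the slide does not give them: the normal range of a measurement is taken as the min and max seen in the real records, the "larger than normal" range extends that interval by a fraction `widen` on each side, and the 60 "original" records here are random stand-ins rather than the real air pollution data. All names are hypothetical.

```python
import random

def normal_ranges(records):
    """Step 1: normal range per measurement (observed min and max assumed)."""
    return [(min(col), max(col)) for col in zip(*records)]

def make_record(records, ranges, rng, n_changed=3, widen=0.5):
    """Steps 2-4: alter a real record and label the result."""
    record = list(rng.choice(records))           # step 2: pick a real record
    abnormal = False
    for i in rng.sample(range(len(record)), n_changed):   # step 3
        lo, hi = ranges[i]
        margin = widen * (hi - lo)               # assumed widening factor
        record[i] = rng.uniform(lo - margin, hi + margin)
        if not lo <= record[i] <= hi:            # step 4: out of range
            abnormal = True
    return record, abnormal

rng = random.Random(0)
# Stand-in for the 60 original records of 16 measurements each.
originals = [[rng.gauss(50, 10) for _ in range(16)] for _ in range(60)]
ranges = normal_ranges(originals)

# Test data: the 60 originals (normal) plus artificial records, 1000 in total.
test_data = [(r, False) for r in originals]
while len(test_data) < 1000:
    test_data.append(make_record(originals, ranges, rng))
print(sum(1 for _, abnormal in test_data if abnormal), "abnormal records")
```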

Air pollution data
Figures: detection rate and false alarm rate; number of detectors

Ball bearing data
Raw data: time series of acceleration measurements
Preprocessing (from time domain to representation space for detection):
1. FFT (Fast Fourier Transform) with Hanning windowing: window size
2. Statistical moments: up to 5th order
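Both preprocessing options can be illustrated with the standard library alone. This is a sketch under assumptions, not the pipeline used in the experiments: the window size of 32 is a placeholder (the value is missing from the transcript), a naive DFT stands in for a real FFT routine, and the moments are taken as the mean followed by central moments of order 2 to 5. Function names are hypothetical.

```python
import cmath
import math

def hann_window(n):
    """Symmetric Hann (Hanning) window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]

def spectrum_features(segment):
    """Magnitudes of the first n/2 DFT coefficients of a windowed segment."""
    n = len(segment)
    x = [v * w for v, w in zip(segment, hann_window(n))]
    return [abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)))
            for k in range(n // 2)]

def moment_features(segment, max_order=5):
    """Mean plus central moments of order 2..max_order (assumed definition)."""
    n = len(segment)
    mean = sum(segment) / n
    return [mean] + [sum((v - mean) ** k for v in segment) / n
                     for k in range(2, max_order + 1)]

# A made-up segment: a sinusoid completing 4 cycles in a 32-point window.
segment = [math.sin(2 * math.pi * 4 * t / 32) for t in range(32)]
spectrum = spectrum_features(segment)
peak_bin = max(range(len(spectrum)), key=spectrum.__getitem__)
moments = moment_features([1.0, 2.0, 3.0, 4.0, 5.0])
print(peak_bin, moments)
```

With moments up to 5th order, each segment becomes a 5-dimensional point, which matches the 5D representation mentioned for the ball bearing experiments.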

Example of data (raw data of new bearings) --- first 1000 points

Example of data (FFT of new bearings) --- first 3 coefficients of the first 100 points

Example of data (statistical moments of new bearings) --- moments up to 3rd order of the first 100 points

Ball bearings structure and damage Damaged cage

Ball bearing data: results

Preprocessed with FFT
Ball bearing condition | Total number of data points | Number of detected anomalies | Percentage detected
New bearing (normal) | 2739 | 0 | 0%
Outer race completely broken | | | %
Broken cage with one loose element | | | %
Damaged cage, four loose elements | | | %
No evident damage; badly worn | | | %

Preprocessed with statistical moments
Ball bearing condition | Total number of data points | Number of detected anomalies | Percentage detected
New bearing (normal) | 2651 | 0 | 0%
Outer race completely broken | | | %
Broken cage with one loose element | | | %
Damaged cage, four loose elements | 2892 | 0 | 0%
No evident damage; badly worn | 2892 | 0 | 0%

Ball bearing data: performance summary

New development of this work
A new algorithm to generate variable-sized detectors.
Purpose: reduce the possible false negatives at the boundary of the self region.
Why the issue exists: some self samples may be very close to the boundary.
Main idea: differentiate between internal self samples and boundary self samples.
Solution: combine the advantages of the previously described algorithms for generating variable-sized and constant-sized detectors.

How much one sample tells

Samples may be on boundary

In terms of detectors

Comparing three methods Constant-sized detectors V-detector New algorithm Self radius = 0.05

Comparing three methods Constant-sized detectors V-detector New algorithm Self radius = 0.1

Work ongoing
Estimate of coverage using formal statistics. A point estimate is the simplest method.
Two types of statistical inference:
1. Confidence interval
2. Hypothesis testing

Point estimate of proportion
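As an illustration of a point estimate of a proportion applied to detector coverage: sample random non-self points and report the fraction that some detector covers. The sketch also adds a normal-approximation confidence interval, which is a textbook formula and not necessarily the one used in this work. The names, the `is_self` callback, and the unit-hypercube sampling are all assumptions.

```python
import math
import random

def estimate_coverage(detectors, is_self, n_points, dim, rng,
                      z=1.96, max_tries=1_000_000):
    """Point estimate of coverage with a normal-approximation interval."""
    covered = total = 0
    for _ in range(max_tries):
        if total >= n_points:
            break
        x = [rng.random() for _ in range(dim)]
        if is_self(x):
            continue  # only non-self points count toward coverage
        total += 1
        if any(math.dist(x, c) <= r for c, r in detectors):
            covered += 1
    if total == 0:
        return 0.0, 0.0, 0.0  # no non-self point found; nothing to estimate
    p_hat = covered / total   # point estimate of the proportion
    half = z * math.sqrt(p_hat * (1 - p_hat) / total)
    return p_hat, max(0.0, p_hat - half), min(1.0, p_hat + half)

rng = random.Random(0)
# One made-up detector; its true share of the unit square is pi * 0.25**2.
detectors = [((0.5, 0.5), 0.25)]
p_hat, low, high = estimate_coverage(detectors, lambda x: False,
                                     n_points=2000, dim=2, rng=rng)
print(round(p_hat, 3), round(low, 3), round(high, 3))
```

The interval narrows roughly with the square root of the number of sampled points, so the sample size trades run time against certainty about the coverage.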

Summary
1. V-detector uses fewer detectors to obtain similar coverage.
2. Smaller detectors are more acceptable if the total number of detectors is largely controlled.
3. Coverage estimate is superior to a fixed number of detectors.
4. V-detector can deal with high-dimensional data, including time series, better.
5. Self radius and estimated coverage are the two control parameters in V-detector.
6. Variable size, variable shape, variable matching rules, or other variable properties of detectors provide an encouraging opportunity to enhance the negative selection mechanism.