ECE 471/571 – Lecture 18 Classifier Fusion 04/12/17.

Recap
Pattern Classification
- Statistical Approach
  - Supervised
    - Basic concepts: Bayesian decision rule (MPP, LR, discriminant functions)
    - Parametric learning (ML, BL)
    - Non-parametric learning (kNN)
    - LDF (Perceptron)
    - NN (BP)
  - Unsupervised
    - Basic concepts: distance
    - Agglomerative method
    - k-means
    - Winner-take-all
    - Kohonen maps
    - Mean-shift
- Syntactic Approach
- Dimensionality reduction: FLD, K-L transform (PCA)
- Performance evaluation: ROC curve; TP, TN, FN, FP
- Stochastic methods: local optimization (GD), global optimization (SA, GA)
- Classifier fusion: voting, NB, BKS

Motivation
Three heads are better than one: combine classifiers to achieve higher accuracy.
The idea goes by many names:
- Combination of multiple classifiers
- Classifier fusion
- Mixture of experts
- Committees of neural networks
- Consensus aggregation
- ...
Reference: L. I. Kuncheva, J. C. Bezdek, R. P. W. Duin, "Decision templates for multiple classifier fusion: an experimental comparison," Pattern Recognition, 34:299-314, 2001.

Popular Approaches
- Majority voting
- Naive Bayes combination (NB)
- Behavior-knowledge space (BKS)
- Interval-based integration

Consensus Patterns
- Unanimity (100% agreement)
- Simple majority (50% + 1)
- Plurality (most votes)
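
A minimal sketch of the three consensus rules. The class labels and the `consensus` helper are illustrative, not from the lecture:

```python
from collections import Counter

def consensus(votes, rule="plurality"):
    """Fuse individual classifier votes under one of the three consensus
    patterns; return None if the rule's threshold is not met."""
    counts = Counter(votes)
    label, top = counts.most_common(1)[0]
    n = len(votes)
    if rule == "unanimity":          # 100% of the voters agree
        return label if top == n else None
    if rule == "majority":           # more than half of the voters (50% + 1)
        return label if top > n / 2 else None
    return label                     # plurality: whichever class has the most votes

votes = ["AAV", "AAV", "AAV", "DW", "HMV"]
print(consensus(votes, "unanimity"))   # None  (not all voters agree)
print(consensus(votes, "majority"))    # AAV   (3 of 5 votes > 50%)
print(consensus(votes, "plurality"))   # AAV
```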

Example of Majority Voting - Temporal Fusion
Fuse all the 1-sec sub-interval local processing results that correspond to the same event (an event usually lasts about 10 sec).
Majority voting: if the event spans n sub-intervals with local outputs s_1, ..., s_n, the fused label is the class c that occurs most often among them, i.e., c* = argmax_c (number of sub-intervals whose local output is c), where c ranges over the possible local processing results.

NB (the independence assumption)
Example: the true class is DW, but the classifier labels it HMV.
Confusion matrices of the two classifiers (rows: true class k; columns: label s assigned by classifier Ci):

Classifier C1    AAV   DW    HMV
AAV              894   329   143
DW                99   411   274
HMV               98    42   713

Classifier C2    AAV   DW    HMV
AAV             1304   156    77
DW               114   437    83
HMV               13   107   450

From each matrix we can estimate the probability that the true class is k given that Ci assigns label s; the NB combination multiplies these per-classifier probabilities together.

NB – Derivation
Assume L classifiers, i = 1, ..., L, and c classes, k = 1, ..., c.
s_i: the class label assigned by the i-th classifier; s = {s_1, ..., s_L}.
Assume the classifiers are mutually independent given the true class. Then
P(k | s) is proportional to P(k) P(s_1 | k) P(s_2 | k) ... P(s_L | k),
where each P(s_i | k) is estimated from classifier Ci's confusion matrix.
This Bayes combination is also called naive Bayes, simple Bayes, or idiot's Bayes.
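
A sketch of this combination rule applied to the two confusion matrices above. The class priors are taken from the row sums of C1's matrix, which is an assumption made here for illustration:

```python
import numpy as np

# Confusion matrices from the NB slide (rows: true class k, columns: assigned label s),
# class order: AAV, DW, HMV.
CM = [
    np.array([[894, 329, 143],
              [ 99, 411, 274],
              [ 98,  42, 713]], dtype=float),   # classifier C1
    np.array([[1304, 156,  77],
              [ 114, 437,  83],
              [  13, 107, 450]], dtype=float),  # classifier C2
]

def naive_bayes_combination(assigned_labels, confusion_matrices):
    """Naive Bayes fusion: support(k) proportional to P(k) * prod_i P(s_i | k),
    with P(s_i | k) estimated from row k of classifier i's confusion matrix."""
    priors = confusion_matrices[0].sum(axis=1)
    priors = priors / priors.sum()                   # P(k), estimated from class counts
    support = priors.copy()
    for cm, s in zip(confusion_matrices, assigned_labels):
        cond = cm / cm.sum(axis=1, keepdims=True)    # P(s | k) for each true class k
        support *= cond[:, s]
    return support / support.sum()

# Both classifiers assign label HMV (index 2): normalized support for AAV, DW, HMV
print(naive_bayes_combination([2, 2], CM))
```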

BKS
Behavior-Knowledge Space algorithm (Huang & Suen). Used when majority voting won't work, e.g., with only 2 classifiers any disagreement is a tie.
Setup: 2 classifiers, 3 classes, 100 samples in the training set.
There are 9 possible combinations of classifier outputs (BKS cells). For each cell, count how many training samples from each class fall into it; the fused result for that cell is the class with the largest count.

(C1, C2)   samples per class   fused result
1, 1       10 / 3 / 3          1
1, 2        3 / 0 / 6          3
1, 3        5 / 4 / 5          1 or 3 (tie)
...
3, 3        0 / 0 / 6          3
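
A compact sketch of the BKS lookup table. The helper names are mine, and the toy counts mirror the (1, 1) cell of the table above:

```python
from collections import defaultdict, Counter

def build_bks_table(train_labels, train_outputs):
    """Behavior-Knowledge Space: each unique tuple of classifier decisions
    (s1, ..., sL) is a cell; the cell records how many training samples of
    each true class landed in it."""
    table = defaultdict(Counter)
    for y, s in zip(train_labels, train_outputs):
        table[tuple(s)][y] += 1
    return table

def bks_fuse(table, outputs):
    """Fused decision = the class with the largest count in the matching cell."""
    cell = table.get(tuple(outputs))
    if not cell:
        return None            # empty cell: fall back to some other rule
    return cell.most_common(1)[0][0]

# Toy data reproducing the (1, 1) row of the table: counts 10/3/3 -> fused result 1
labels  = [1] * 10 + [2] * 3 + [3] * 3
outputs = [(1, 1)] * 16
table = build_bks_table(labels, outputs)
print(bks_fuse(table, (1, 1)))   # -> 1
```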

Value-based vs. Interval-based Fusion
Interval-based fusion can provide fault tolerance.
Interval integration – overlap function: assume each sensor in a cluster measures the same parameter. The integration algorithm constructs a simple function (the overlap function) from the interval outputs of the sensors in the cluster, and the function can be resolved at different resolutions as required.
Crest: the highest, widest peak of the overlap function.
[Figure: overlap function O(x) built from seven sensor intervals w1 s1, ..., w7 s7; the vertical axis shows O(x) levels 1-3.]
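
A sketch of the overlap function and its crest, assuming each sensor reports an interval (lo, hi). The grid step and the example intervals are illustrative only:

```python
import numpy as np

def overlap_function(intervals, xs):
    """O(x): for each x, count how many sensor intervals [lo, hi] contain x."""
    return np.array([sum(lo <= x <= hi for lo, hi in intervals) for x in xs])

def crest(intervals, step=0.01):
    """Highest, widest peak of O(x), found on a fine grid.
    Returns (height, x_start, x_end) of the widest run at the maximum height."""
    lo = min(a for a, _ in intervals)
    hi = max(b for _, b in intervals)
    xs = np.arange(lo, hi + step, step)
    o = overlap_function(intervals, xs)
    top = o.max()
    best_len, best_start, run_start = 0, 0, None
    for i, v in enumerate(o):
        if v == top and run_start is None:
            run_start = i                      # a run at the maximum height begins
        if run_start is not None and (v != top or i == len(o) - 1):
            end = i if v != top else i + 1     # run ends here (or at the last sample)
            if end - run_start > best_len:
                best_len, best_start = end - run_start, run_start
            run_start = None
    return top, xs[best_start], xs[best_start + best_len - 1]

# Seven illustrative sensor intervals; the crest is where the most intervals agree.
sensors = [(0.9, 1.3), (1.0, 1.4), (1.1, 1.5), (1.0, 1.2),
           (2.0, 2.3), (1.05, 1.35), (0.5, 3.0)]
print(crest(sensors))
```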

A Variant of kNN
Generation of local confidence ranges: for example, at each node i, run kNN for each k in {5, ..., 15} and record the vote fraction for each class.

              Class 1   Class 2   ...   Class n
k = 5         3/5       2/5       ...   0
k = 6         2/6       3/6       ...   1/6
...           ...       ...       ...   ...
k = 15        10/15     4/15      ...   1/15

Confidence range = smallest and largest value in each column: Class 1: {2/6, 10/15}; Class 2: {4/15, 3/6}; ...; Class n: {0, 1/6}.
Apply the integration algorithm to the confidence ranges generated at each node to construct an overlap function.
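
A sketch of the confidence-range generation at one node, assuming Euclidean distance and vote fractions as confidence values (function and variable names are mine):

```python
import numpy as np
from collections import Counter

def knn_confidence_ranges(train_X, train_y, x, classes, k_values=range(5, 16)):
    """Run kNN on the same query for every k in k_values; for each class record
    the vote fraction, then report the (smallest, largest) fraction per class,
    i.e. the min and max of each column of the table above."""
    train_X = np.asarray(train_X, dtype=float)
    order = np.argsort(np.linalg.norm(train_X - np.asarray(x, dtype=float), axis=1))
    fractions = {c: [] for c in classes}
    for k in k_values:
        votes = Counter(train_y[i] for i in order[:k])
        for c in classes:
            fractions[c].append(votes.get(c, 0) / k)
    return {c: (min(v), max(v)) for c, v in fractions.items()}

# Tiny illustrative usage with two classes in 2-D
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (20, 2)), rng.normal(3, 1, (20, 2))])
y = [1] * 20 + [2] * 20
print(knn_confidence_ranges(X, y, x=[0.5, 0.5], classes=[1, 2]))
```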

Example of Interval-based Fusion
[Table (columns: stop 1, stop 2, stop 3, stop 4, c, acc):
class 1: 1, 0.2, 0.5, 0.125, 0.75
class 2: 2.3, 0.575, 4.55, 0.35, 0.6, 0.1
class 3: 0.7, 0.175, 0.25, 3.3, 0.55, 3.45]

Confusion Matrices of Classification on Military Targets
Targets: AAV, DW, HMV.
Classification accuracies: acoustic (75.47%, 81.78%), seismic (85.37%, 89.44%), multi-modality fusion (84.34%), multi-sensor fusion (96.44%).
[Per-modality confusion matrices shown on the original slide.]

Civilian Target Recognition
Targets: Ford 250, Harley motorcycle, Ford 350, Suzuki Vitara.

Confusion Matrices
[Confusion matrices for the acoustic, seismic, and multi-modal classifiers shown on the original slide.]

Reference For details regarding majority voting and Naïve Bayes, see http://www.cs.rit.edu/~nan2563/combining_classifiers_notes.pdf