ADAPTIVE HIERARCHICAL CLASSIFICATION WITH LIMITED TRAINING DATA. Dissertation Defense of Joseph Troy Morgan. Committee: Dr Melba Crawford, Dr J. Wesley Barnes, Dr Joydeep Ghosh, Dr John Hasenbein, Dr Elmira Popova
Overview. Introduction; Motivation for research; Limited quantity of training data; Limited quality of training data; Output space precision; Research contributions
Introduction: classification. Assigning labels (L_i) to data (x); Typically use “supervised” methods when possible; “Label specific” probability distributions; Predefined vs. unknown labels; Input space into feature space
Binary hierarchical classifier (BHC) framework. C terminal nodes and (C-1) internal nodes; Feature selection specific to each partition; More natural and easier discriminations first. [Figure: BHC tree diagram; legend: internal node, leaf node]
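A minimal sketch of the node counts stated above, assuming a placeholder split rule. Halving the class list is not how a BHC chooses its partitions; it only illustrates that any binary tree with C leaves has C-1 internal nodes. The names `build_hierarchy` and `count_nodes` are hypothetical.

```python
def build_hierarchy(classes):
    """Recursively split a list of class labels into a binary tree.

    Halving the list is a placeholder split rule (an assumption); it only
    illustrates the tree shape, not how a BHC picks its partitions.
    """
    if len(classes) == 1:
        return classes[0]                      # leaf (terminal) node
    mid = len(classes) // 2
    return (build_hierarchy(classes[:mid]),    # internal node = pair of subtrees
            build_hierarchy(classes[mid:]))


def count_nodes(tree):
    """Return (leaf_count, internal_count) for a tree built above."""
    if not isinstance(tree, tuple):
        return 1, 0
    left_leaves, left_internal = count_nodes(tree[0])
    right_leaves, right_internal = count_nodes(tree[1])
    return left_leaves + right_leaves, left_internal + right_internal + 1


tree = build_hierarchy(["C1", "C2", "C3", "C4", "C5"])
leaves, internal = count_nodes(tree)
print(leaves, internal)
```

Each internal node is one two-way discrimination, so a C-class problem is decomposed into C-1 binary problems, each of which can use its own feature space.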
Introduction: training/testing. Bayesian approach parameter estimation; Labeled data used for training and testing; Sample selection is important; Reported results may be “misleading”; Real-world selection problems
Introduction: hyperspectral data
Motivation for proposed research. Robustness related to the training data; Classification dependency on an adequate quantity of training data; Dealing with training data that is of poor quality
Limited quantity of training data. Covariance matrix: parameters; Literature: obs 4 x dimensionality; Previous work; Parameter stabilization techniques; Improving ratio of training data to dimensionality; Sub-sampling and combining schemes
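A short sketch of why the covariance matrix drives the data requirement: a symmetric d x d covariance matrix has d(d+1)/2 free parameters, so the count grows quadratically with the number of bands. The function name and the band counts below are hypothetical and are not taken from the data sets in this talk.

```python
def covariance_parameter_count(d):
    """Free parameters in a symmetric d x d covariance matrix."""
    return d * (d + 1) // 2


# Hypothetical dimensionalities, chosen only to show the quadratic growth.
for d in (10, 50, 200):
    print(d, covariance_parameter_count(d))
```

Reducing d at a split therefore lowers the number of parameters to estimate much faster than it lowers the dimensionality itself.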
Land cover labels: Bolivar Peninsula
Adaptive Best-Basis BHC. Feature space is dependent upon quantity of data at each split in the BHC; Set d = |X|/Threshold at each split; Reduce dimensionality by merging highly correlated bands; Use “ancestors” in the hierarchy for help in generating the best-basis
Adaptive BB-BHC algorithm. Threshold: [equation]; Correlation measure: [equation]
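A minimal sketch of the dimensionality-reduction step, under stated assumptions. The threshold value and the correlation measure are not given in the text, so the `threshold` argument, the use of Pearson correlation, the restriction to adjacent bands, and merging by (weighted) averaging are all assumptions. The use of “ancestors” in the hierarchy is not modeled. `merge_correlated_bands` is a hypothetical name.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def merge_correlated_bands(samples, threshold):
    """Greedily merge adjacent bands until d <= |X| / threshold.

    samples: list of rows, one row of band values per training pixel.
    Returns a list of groups of original band indices.
    """
    n_samples = len(samples)
    target_d = max(1, n_samples // threshold)      # d = |X| / Threshold
    n_bands = len(samples[0])
    groups = [[j] for j in range(n_bands)]
    columns = [[row[j] for row in samples] for j in range(n_bands)]
    while len(groups) > target_d:
        # Pick the adjacent pair of (merged) bands with the highest correlation.
        best = max(range(len(groups) - 1),
                   key=lambda i: pearson(columns[i], columns[i + 1]))
        w_left, w_right = len(groups[best]), len(groups[best + 1])
        # Merged band = mean of all original bands in the two groups.
        merged = [(w_left * x + w_right * y) / (w_left + w_right)
                  for x, y in zip(columns[best], columns[best + 1])]
        columns[best:best + 2] = [merged]
        groups[best:best + 2] = [groups[best] + groups[best + 1]]
    return groups


samples = [
    [1.0, 1.1, 5.0, 5.2],
    [2.0, 2.1, 3.0, 3.1],
    [3.0, 2.9, 4.0, 4.1],
    [4.0, 4.2, 1.0, 1.0],
]
print(merge_correlated_bands(samples, threshold=2))
```

With four training pixels and a threshold of 2, the target dimensionality is 2, so the four bands collapse into two groups. Fewer pixels at a split would force more merging, which is the adaptive behavior described above.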
Bolivar Peninsula. Acquired Fall ’99; Pixel spatial resolution of 5 m; Shoreline changes and sedimentary process
Adaptive BB-BHC: Bolivar. Adaptive BB-BHC retains high level of accuracy; Applicable even at 75% rate; Generally less variability due to training data sample
Bolivar Peninsula TD-BHC classified images. [Figure: Sampling Percentage, BB vs Pseudo]
Kennedy Space Center. Acquired Spring ’96; Pixel resolution of 18 m; Merritt Island National Wildlife Refuge; > 1,000 plant and 500 animal species
Adaptive BB-BHC: Canaveral. Adaptive BB-BHC performs better except at 1.5%; Reduction is too severe
KSC Images. [Figure: Sampling Percentage, BB vs Pseudo; Ex: TD-BHC]
Limited quality of training data. Training data not representative of the entire population; Detrimental impact; Has not been demonstrated; Potential solutions unexplored
“Misleading” accuracies. Training and testing look great; Transferred classifier performs poorly
Limited data problem. Results from combined data very good. [Figure panels: 1 on 2; 2 on 2; 1,2 on 2]
Where is the Problem? Distribution has shifted and variance changed
Parameter Updating Methodology. Example: 5 classes identified from “old” area; “Reuse” knowledge acquired from previous data; Assumptions: Applicable/extendable class structure from old area; Projections will still work well at separating the meta-classes in the new area. [Figure: BHC tree with splits S1 to S4 and leaf classes C1 to C5]
Parameter Updating Methodology. Compare relative magnitude of the means of the clusters to the previous means to identify the “hidden” cluster meta-class labels; Old projection from S1 will be used to separate the unlabeled [C1,C3] from [C2,C4,C5]; The meta-class distributions will be updated based upon the pseudo-labeled cluster distributions; Those pixels identified as [C1,C3] will be separated based upon the old projection at S2. [Figure: BHC tree with splits S1 to S4 and leaf classes C1 to C5]
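A minimal sketch of the pseudo-labeling step at a single split, under one possible reading of “compare relative magnitude of the means”: clusters found in the new area are matched to the old meta-classes by the rank order of their means along the old projection. That reading, the function name `pseudo_label_clusters`, and the numbers are all assumptions.

```python
def pseudo_label_clusters(old_means, cluster_means):
    """Match new-area clusters to old meta-class labels by rank order.

    old_means: dict mapping meta-class label -> projected (1-D) mean
               estimated in the old area.
    cluster_means: list of projected means of the unlabeled clusters
                   found in the new area (same length as old_means).
    Returns a dict mapping cluster index -> meta-class label.
    """
    labels_by_rank = sorted(old_means, key=old_means.get)
    clusters_by_rank = sorted(range(len(cluster_means)),
                              key=cluster_means.__getitem__)
    return dict(zip(clusters_by_rank, labels_by_rank))


# Hypothetical projected means at split S1.
old_means = {"[C1,C3]": -2.0, "[C2,C4,C5]": 3.0}
cluster_means = [4.1, -0.5]          # the distribution has shifted
assignment = pseudo_label_clusters(old_means, cluster_means)
print(assignment)
```

The pseudo-labeled clusters would then supply updated means and variances for the two meta-classes before descending to the next split (S2 for the pixels labeled [C1,C3]).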
Parameter Updating Results. Improved accuracies for Bolivar Peninsula; Mixed results for KSC
Confusion Matrix vs Precision Tree
Output Space Precision. Precision may not be supportable for transferal; Oak Hammock, Slash Pine, etc. vs Trees; Need to find an applicable “level” of classes; Goal: provide tools for researcher; Use the BHC hierarchy; Distance measure; Ability to use multiple trees and sub-sampling due to the performance of the Adaptive BB-BHC; “Purity” of label: classifier agreement
Separation of the Distributions. Compare old separation vs new separation; Common distance measure: Bhattacharyya; Best-basis approach for limited data quantity
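A minimal sketch of the distance computation, assuming each class (or meta-class) is summarized by a univariate Gaussian along a single projection. The Gaussian assumption, the function name `bhattacharyya_1d`, and the numbers are illustrative and are not taken from the results.

```python
from math import log


def bhattacharyya_1d(mean1, var1, mean2, var2):
    """Bhattacharyya distance between two univariate Gaussians."""
    mean_term = 0.25 * (mean1 - mean2) ** 2 / (var1 + var2)
    var_term = 0.5 * log((var1 + var2) / (2.0 * (var1 * var2) ** 0.5))
    return mean_term + var_term


# Hypothetical statistics for the same pair of classes in two areas.
old_separation = bhattacharyya_1d(0.0, 1.0, 2.0, 1.0)
new_separation = bhattacharyya_1d(0.0, 2.0, 1.0, 2.0)
print(old_separation, new_separation)
```

A new-area separation that is much smaller than the old one suggests that the two classes may no longer be supportable as separate labels after transfer.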
Multiple BHC. TD and BU hierarchies; Adaptive BB-BHC allows for sub-sampling; Common “Master Tree”; Proximity of classes in each BHC is used for distance matrix; Greedily merge classes; Voting method to combine votes of the multiple BHCs
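A minimal sketch of combining the outputs of several BHCs for one pixel, assuming a simple plurality vote. The voting rule actually used is not specified in the text. The returned agreement fraction is one possible way to express the “purity” of a label, and `combine_votes` is a hypothetical name.

```python
from collections import Counter


def combine_votes(votes):
    """Return (winning_label, agreement_fraction) for one pixel.

    votes: list of labels, one per classifier. Ties go to the label
    that appears first in the list.
    """
    counts = Counter(votes)
    label, n_votes = counts.most_common(1)[0]
    return label, n_votes / len(votes)


label, agreement = combine_votes(["Slash Pine", "Slash Pine", "Oak Hammock"])
print(label, agreement)
```

Pixels with low agreement are candidates for a coarser label, for example after the disagreeing classes have been merged.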
Output Space Precision Results. Improved accuracies for Bolivar over each individual transferred classifier
Output Space Precision Tools. [Figure]
Research contributions. Adaptive Best-Basis BHC; Information Recycling; Output space scalability; Compare “trees” from different samples; Master-basis construction; Tool for classifier transferal
Future research topics. Feature selection; Unsupervised clustering necessary; Focus on class homogeneity rather than validation; Investigate techniques developed in signal processing community; Build “library” of spectral signatures
Bottom-Up (BU) BHC. [Figure]
Fisher’s linear discriminant. Maximize ratio of between (B) and within (W) class covariance
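The criterion above can be written out as follows. This is the standard two-class form, given here as a sketch; the symbols w, \mu_k and \Sigma_k are introduced for illustration, and the unweighted definition of W is one common convention, not necessarily the one used in this work.

```latex
J(w) = \frac{w^{T} B\, w}{w^{T} W\, w},
\qquad
B = (\mu_1 - \mu_2)(\mu_1 - \mu_2)^{T},
\qquad
W = \Sigma_1 + \Sigma_2,
\qquad
w^{*} \propto W^{-1}(\mu_1 - \mu_2)
```

Projecting onto w* reduces each two-way split to a one-dimensional problem, but W must be invertible, which ties this step to the limited-quantity issue discussed earlier.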
Output Space Precision Tools