Evolutionary Feature Extraction for SAR Air-to-Ground Moving Target Recognition – a Statistical Approach
Evolving Hardware
Dr. Janusz Starzyk, Ohio University

Neural Network Data Classification
- Concept of a "Logic Brain"
- Random learning data generation
- Multiple-space classification of data
- Feature function extraction
- Dynamic selectivity strategy
- Training procedure for data identification
- FPGA implementation for a fast training process

Neural Network Data Classification (Abdulqadir Alaqeeli and Jing Pang)
- Concept of a "Logic Brain"
  - Threshold setup converts the analog world to the digital world
  - A "Logic Brain" is possible based on an artificial neural network
- Random learning data generation
  - Gaussian-distributed, multi-dimensional random data generation
  - Half of the data sets are prepared for the learning procedure
  - The other half is used later for the testing procedure

Neural Network Data Classification
- Multiple-space classification of data
  - Each space can be represented by a minimal set of base vectors
- Feature function extraction and dynamic selection strategy
  - Conditional entropy extracts the information contained in each subspace
  - Different combinations of base vectors compose redundant sets of new subspaces (expansion strategy)
  - Minimum-function selection prunes them (shrinking strategy); see the sketch after this list
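The expansion/shrinking strategy is only named on the slide; the sketch below is one plausible reading of it, assuming a histogram-based conditional-entropy estimator and pairwise combinations of base vectors (both are assumptions, not the authors' implementation).

```python
import numpy as np
from itertools import combinations

def conditional_entropy(features, labels, bins=8):
    """Estimate H(class | features) from a simple histogram (illustrative estimator)."""
    # Discretize each selected feature into a small number of bins.
    digitized = np.stack(
        [np.digitize(f, np.histogram_bin_edges(f, bins)[1:-1]) for f in features.T], axis=1
    )
    _, cell_id = np.unique(digitized, axis=0, return_inverse=True)
    h = 0.0
    for c in np.unique(cell_id):
        mask = cell_id == c
        p_cell = mask.mean()
        _, counts = np.unique(labels[mask], return_counts=True)
        p_class = counts / counts.sum()
        h -= p_cell * np.sum(p_class * np.log2(p_class))
    return h

def expand_then_shrink(X, y, base_size=2, keep=3):
    """Expansion: form candidate subspaces from combinations of base vectors.
    Shrinking: keep the few subspaces with minimum conditional entropy."""
    candidates = list(combinations(range(X.shape[1]), base_size))      # expansion
    scored = [(conditional_entropy(X[:, list(c)], y), c) for c in candidates]
    scored.sort(key=lambda t: t[0])             # lower H(class | x) means more information
    return scored[:keep]                        # shrinking
```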

Neural Network Data Classification
- FPGA implementation for a fast training process
  - Learning results are saved on board
  - Testing data sets are generated on board and sent through the artificial neural network implemented on board to measure the successful classification rate
  - The results are displayed on board
- Promising applications
  - Especially useful for feature extraction from large data sets
  - Catastrophic circuit fault detection

Information Index: Background
- A priori class probabilities are known
- Entropy measure based on conditional probabilities
(Figure: scatter of class A and class B samples around a test point x)

Information Index: Background
- P1 and P2 are the a priori class probabilities
- P1w and P2w are the conditional probabilities of correct classification for each class
- P12w and P21w are the conditional probabilities of misclassification given a test signal
- P1w, P2w, P12w and P21w are calculated using Bayesian estimates of the class-conditional probability density functions; one consistent formulation is sketched below
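The slide itself does not spell these quantities out; under a standard two-class Bayes decision rule they would take the following form (an assumed reconstruction, with p(x|w_i) the class-conditional densities), together with one entropy-based information index consistent with the deck (also an assumption):

```latex
% Decision regions under the Bayes rule:
R_1 = \{\, x : P_1\, p(x \mid w_1) \ge P_2\, p(x \mid w_2) \,\}, \qquad R_2 = \overline{R_1}

% Conditional probabilities of correct classification and misclassification:
P_{1w} = \int_{R_1} p(x \mid w_1)\, dx, \qquad P_{12w} = \int_{R_2} p(x \mid w_1)\, dx,
\qquad
P_{2w} = \int_{R_2} p(x \mid w_2)\, dx, \qquad P_{21w} = \int_{R_1} p(x \mid w_2)\, dx

% Entropy-based information index (1 = perfect separation, 0 = uninformative):
I = 1 - \frac{\mathbb{E}_x\{ H(w \mid x) \}}{H(w)}
```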

Information Index: Background
(Figure: probability density functions underlying P1w, P2w, P12w and P21w)

Direct Integration
- For N dimensions, m^N grid points are needed to estimate the integrals
- A uniform grid uses equal cells (ΔS_i = ΔS_k); a nonuniform grid adapts the cell size to the pdf (ΔS_i < ΔS_k)
(Figure: uniform vs. nonuniform integration grids; a numerical sketch follows below)
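As a concrete illustration of the m^N cost, the sketch below estimates P_1w for two classes by summing over a uniform grid restricted to the Bayes decision region; the Gaussian densities and equal priors are made up for the example.

```python
import numpy as np
from scipy.stats import multivariate_normal

N, m = 3, 40                                       # dimensions, grid points per axis -> m**N evaluations
P1, P2 = 0.5, 0.5                                  # assumed priors
pdf1 = multivariate_normal(mean=np.zeros(N))       # illustrative class-1 density
pdf2 = multivariate_normal(mean=1.5 * np.ones(N))  # illustrative class-2 density

axes = [np.linspace(-5.0, 6.5, m)] * N
grid = np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1).reshape(-1, N)
cell = (axes[0][1] - axes[0][0]) ** N              # volume of one uniform grid cell

p1, p2 = pdf1.pdf(grid), pdf2.pdf(grid)
in_R1 = P1 * p1 >= P2 * p2                         # Bayes decision region for class 1
P_1w = np.sum(p1[in_R1]) * cell                    # P(correct | class 1)
P_12w = np.sum(p1[~in_R1]) * cell                  # P(misclassified | class 1)
print(f"{m**N} grid points: P_1w ~ {P_1w:.3f}, P_12w ~ {P_12w:.3f}")
```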

Monte Carlo Integration
(Figure: points x_i generated with pdf_1 and weighted by W(x_i) against pdf_2)

Information Index: Probability Density Functions
(Figure: probability density function of P2w)

Information Index: Weighted pdfs
(Figure: weighted probability density function of P2w)

Information Index: Monte Carlo Integration
To integrate the probability density function:
- generate random points x_i with pdf_1
- weight the generated points according to a weighting function W(x_i)
- estimate the conditional probability P_1w from the weighted sample average
A sketch of this estimator follows below.
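The weighting and estimation formulas were shown only as images on the original slide; the sketch below gives one standard Monte Carlo reading, in which samples drawn from pdf_1 are weighted by an indicator of the class-1 Bayes region (an assumption, not necessarily the authors' exact estimator).

```python
import numpy as np
from scipy.stats import multivariate_normal

N, M = 3, 20_000                                   # dimensions, Monte Carlo samples
P1, P2 = 0.5, 0.5                                  # assumed priors
pdf1 = multivariate_normal(mean=np.zeros(N))       # illustrative class densities
pdf2 = multivariate_normal(mean=1.5 * np.ones(N))

x = pdf1.rvs(size=M, random_state=0)               # 1. generate random points x_i with pdf_1
w = (P1 * pdf1.pdf(x) >= P2 * pdf2.pdf(x)).astype(float)  # 2. weight: 1 inside the class-1 Bayes region
P_1w = w.mean()                                    # 3. estimate P_1w as the weighted sample average
print(f"Monte Carlo with {M} samples: P_1w ~ {P_1w:.3f}, P_12w ~ {1 - P_1w:.3f}")
```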

Information Index and Probability of Misclassification

Standard Deviation of Information in MC Simulation

Normalized Standard Deviation of Information

Information Index: Status
- MIIFS was generalized to continuous distributions
- An N-dimensional information index was developed
- Efficient N-dimensional integration was used
- Information error analysis was performed
- The information index can be used with non-Gaussian distributions
- For small training sets and a low information index, the information error is larger than the information itself

Optimum Transformation: Background
- Principal Component Analysis (PCA) based on the Mahalanobis distance suffers from scaling problems
- PCA assumes Gaussian distributions and estimates covariance matrices and mean values
- PCA is sensitive to outliers
- Wavelets provide a compact data representation and improve recognition
- The improvement shows no statistically significant difference in recognition across different wavelets
- Hence the need for a specialized transformation

Optimum Transformation: Haar Wavelet Example

Optimum Transformation: Haar Wavelet
- Repeat the average-and-difference step log2(n) times (a sketch of this recursion follows below)
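A minimal sketch of that recursion, assuming the plain (unnormalized) average/difference form of the Haar transform; the deck's exact scaling convention may differ.

```python
import numpy as np

def haar_transform(signal):
    """Repeat the average-and-difference step log2(n) times on the running averages."""
    a = np.asarray(signal, dtype=float)
    details = []
    while len(a) > 1:                        # log2(n) levels for a length-n signal
        avg = (a[0::2] + a[1::2]) / 2.0      # pairwise averages
        diff = a[0::2] - a[1::2]             # pairwise differences (detail coefficients)
        details.append(diff)
        a = avg                              # recurse on the averages only
    return np.concatenate([a] + details[::-1])

print(haar_transform([4, 6, 10, 12, 8, 6, 5, 5]))   # -> [7. 2. -6. 2. -2. -2. 2. 0.]
```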

Optimum Transformation: Haar Wavelet
- Waveform interpretation

Optimum Transformation: Haar Wavelet
- Matrix interpretation: b = W*a, where W is the Haar wavelet matrix (shown as a figure on the original slide)

Optimum Transformation: Haar Wavelet
- Matrix interpretation for a class of signals: B = W*A, where A is the (n x m) input signal matrix
- Selection of the n best coefficients is performed using the information index: B_s1 = S_1*W*A, where S_1 is an (n x n*log2(n)) selection matrix
- A sketch of this coefficient selection follows below
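A minimal sketch of the selection step, assuming a redundant average/difference Haar dictionary for W and a simple Fisher-like separability score as a stand-in for the information index (both are assumptions made for the example):

```python
import numpy as np

def haar_rows(n):
    """Redundant Haar dictionary: pairwise average and difference rows at every scale
    (a stand-in for the deck's (n*log2(n) x n) operator W)."""
    rows, width = [], 2
    while width <= n:
        half = width // 2
        for start in range(0, n, width):
            avg, diff = np.zeros(n), np.zeros(n)
            avg[start:start + width] = 1.0 / width
            diff[start:start + half] = 1.0 / half
            diff[start + half:start + width] = -1.0 / half
            rows += [avg, diff]
        width *= 2
    return np.array(rows)

def select_features(A_c1, A_c2, n_keep):
    """Score each wavelet coefficient and build the selection matrix S1 keeping the best n_keep.
    A_c1 and A_c2 hold one class of signals each, one signal per column."""
    W = haar_rows(A_c1.shape[0])
    C1, C2 = W @ A_c1, W @ A_c2                       # coefficients per class
    score = (C1.mean(1) - C2.mean(1)) ** 2 / (C1.var(1) + C2.var(1) + 1e-12)
    best = np.argsort(score)[::-1][:n_keep]
    S1 = np.zeros((n_keep, W.shape[0]))
    S1[np.arange(n_keep), best] = 1.0                 # S1 picks the highest-scoring rows of W
    return S1, W
```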

Optimum Transformation: Evolutionary Iterations
- Iterating on the selected result: B_s2 = S_2*W*B_s1, where S_2 is a selection matrix, i.e. B_s2 = S_2*W*S_1*W*A
- After k iterations: B_sk = S_k*W* ... *S_2*W*S_1*W*A
- So the optimized transformation matrix T = S_k*W* ... *S_2*W*S_1*W can be obtained from the Haar wavelet
- A sketch of this iteration loop follows below
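Reusing haar_rows/select_features from the previous sketch, the evolutionary loop can be written as follows (the fixed iteration count and the requirement that n_keep be a power of two are assumptions of this toy version):

```python
import numpy as np

def evolve_transformation(A_c1, A_c2, n_keep, k_iters=3):
    """Compose T = S_k*W* ... *S_1*W by repeatedly wavelet-transforming the currently
    selected features and re-selecting with the separability score."""
    T = np.eye(A_c1.shape[0])                    # running transformation, starts as identity
    B1, B2 = A_c1, A_c2
    for _ in range(k_iters):
        S, W = select_features(B1, B2, n_keep)   # n_keep must be a power of two here
        T = S @ W @ T                            # T <- S_i * W * T
        B1, B2 = T @ A_c1, T @ A_c2              # evolved features for the next iteration
    return T                                     # rows of T are the evolved feature waveforms
```

The rows of T are what a later slide plots as the "waveform interpretation of T rows".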

Optimum Transformation: Evolutionary Iterations
- Learning with the evolved features

Optimum Transformation: Evolutionary Iterations
- Waveform interpretation of T rows

Optimum Transformation: Evolutionary Iterations
(Figures: class mean values with the evolved transformation, and original signals with the evolved transformation; axes are bin index vs. signal value)

Two-Class Training
- Training on HRR signals: 17° depression-angle profiles of BMP2 and BTR60

Wavelet-Based Reconfigurable FPGA for Classification
(Block diagram: a window of m 8-bit samples (sample #1 ... sample #m) feeds the Haar wavelet transform; k selected coefficients (k ≤ m) drive the neural network, and the input signal is recognized)

Block Diagram of the Parallel Architecture
(Figure: parallel network of (0+1)/2 average and difference stages)

Simplified Block Diagram of the Serial Architecture
(Figure: serial pipeline of registered average (A) and registered difference (D) units, with registers (R) implemented in CLBs and IOBs; adjacent samples are combined as (0+1)/2 and (0-1), then (2+3)/2 and (2-3), and so on, in two passes)

RAM-Based Wavelet
(Figure: 16x8 RAM blocks feeding a processing element (PE), with write/read address (WA/RA) generation, control logic, Start/Done handshake and a Data In port)

The Processing Element
(Figure: processing-element datapath; a rough behavioral model follows below)
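The datapath was shown only graphically; as a rough behavioral model (assumed from the serial-architecture legend, not taken from the actual schematic), each step computes a registered average and difference of two 8-bit samples:

```python
def processing_element(a: int, b: int) -> tuple[int, int]:
    """Assumed behavioral model of one average/difference step on 8-bit samples."""
    assert 0 <= a < 256 and 0 <= b < 256
    avg = (a + b) >> 1          # registered average, stays within 8 bits
    diff = a - b                # registered difference, needs one extra sign bit
    return avg, diff

# One pairwise pass over a sample window, as the serial architecture would perform it:
window = [12, 20, 7, 7, 200, 100, 33, 35]
print([processing_element(window[i], window[i + 1]) for i in range(0, len(window), 2)])
```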

Results: One Iteration of the Haar Wavelet
- For 8 samples:
  - Parallel architecture: 120 CLBs, 128 IOBs, 58 ns
  - Serial architecture: 98 CLBs*, 72 IOBs, 148 ns*
  - The parallel architecture wins for larger numbers of samples
- For 16 samples:
  - Parallel architecture: 320 CLBs, 256 IOBs, 233 ns
  - RAM-based architecture: 136 CLBs, 16 IOBs, ~1 µs
  - The RAM-based architecture wins, since 1 µs is still fast enough
* These values grow very quickly as the number of samples increases, and the delay becomes much higher.

Reconfigurable Haar-Wavelet-Based Architecture
(Figure: reconfigurable datapath built around the processing element)

Test Results
- Testing on HRR signals: 15° depression-angle profiles of BMP2 and BTR60
- With 15 features selected, correct classification is 69.3% for BMP2 data and 82.6% for BTR60
- The comparable results in the SHARP confusion matrix are 56.7% for BMP2 data and 67% for BTR60 (per-class rates are computed as sketched below)
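For reference, the per-class correct-classification rates quoted above are the diagonal of the row-normalized confusion matrix; a minimal sketch with made-up labels and predictions:

```python
import numpy as np

def per_class_rates(y_true, y_pred, classes):
    """Correct-classification rate per class (diagonal of the row-normalized confusion matrix)."""
    return {c: float(np.mean(y_pred[y_true == c] == c)) for c in classes}

rng = np.random.default_rng(1)
y_true = np.array(["BMP2"] * 100 + ["BTR60"] * 100)        # placeholder ground truth
y_pred = y_true.copy()
flip = rng.random(y_true.size) < 0.25                      # placeholder classification errors
y_pred[flip] = np.where(y_true[flip] == "BMP2", "BTR60", "BMP2")
print(per_class_rates(y_true, y_pred, ["BMP2", "BTR60"]))
```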

Problem Issues
- BTR60 signals at 17° and 15° depression angles do not have compatible statistical distributions

Problem Issues
- BMP2 and BTR60 signal distributions are not Gaussian

Work Completed
- Information index and its properties
- Multidimensional MC integration
- Information as a measure of learning quality
- Information error
- Wavelets and their effect on pattern recognition
- The Haar wavelet as a linear matrix operator
- Evolution of the Haar wavelet
- Statistical support for classification

Recommendations and Future Work
- Training data must represent a statistical sample of all signals, not a hand-picked subset
- Probability density functions will be approximated using a parametric or NN approach
- The information measure will be extended to k-class problems
- Training and testing will be performed on 12-class data
- Dynamic clustering will prepare the decision tree structure
- A hybrid, evolutionary classifier will be developed