ECE 8527 Homework Final: Common Evaluations By Andrew Powell.


Outline
- Introduction: Objectives, Tools, Obstacles
- Algorithms (Training / Classification / Performance): Hidden Markov Models (HMM), K-Nearest Neighbors (KNN), Neural Networks (NN)
- Conclusion: Final Thoughts, References


Introduction: Objectives
- Select three nontrivial machine-learning algorithms supported in MATLAB. The selected algorithms are:
  - Hidden Markov Models (HMM)
  - K-Nearest Neighbors (KNN)
  - Neural Networks (NN)
- Determine the performance of the selected algorithms as a function of their key parameters through a series of simulations:
  - Recognition error rate
  - Discriminant values
  - Timing analysis
- Present the results.

Introduction: Tools
- MATLAB: used for all programming and simulation. This is important to note, since running the same simulations in other languages or environments would most likely produce different results.
- Kevin Murphy's Hidden Markov Model Toolbox: used for training HMMs with Gaussian-mixture emissions and for classification. The HMM training supported in MATLAB's Statistics Toolbox is harder to use here, since it has no built-in support for continuous observations.
- Statistics Toolbox: used for KNN classification.
- Neural Network Toolbox: used for training a standard feed-forward NN and for classification.

Introduction: Obstacles
- Timing: measuring how long it takes to train the models and classify the data vectors is straightforward, but comparing the timing results of the algorithms against one another is problematic, since many factors affect how fast an algorithm runs. The focus is therefore on how the timing results change as the parameters are varied (see the timing sketch below).
- Discriminant values: part of the project requires showing that the recognition error rates correlate with the discriminant values generated when classifying the data vectors. Because of how MATLAB's proprietary toolboxes are implemented (e.g. the Statistics Toolbox), it is difficult to obtain these values from the functions that perform the classification.
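
A minimal timing sketch, assuming the standard tic/toc pair is wrapped around the training and classification steps; trainModel and classifyData are illustrative placeholder names, not functions from the project or from any toolbox:

    tTrain = tic;
    model = trainModel(trainData);             % hypothetical training call
    trainingTime = toc(tTrain);                % elapsed training time in seconds

    tClassify = tic;
    labels = classifyData(model, testData);    % hypothetical classification call
    classificationTime = toc(tClassify);       % elapsed classification time in seconds

Repeating this pair for each parameter setting gives the timing curves reported later, without attempting to compare absolute times across algorithms.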


Algorithms: Hidden Markov Models
Introduction
- An HMM is created for each class. Each model consists of the transition probabilities between hidden states and the Gaussian mixtures emitted from each hidden state.
- Each model is trained with the Baum-Welch (i.e. Expectation-Maximization) algorithm. Prior to training, the model parameters are randomized, but the matrices are kept stochastic.
- The sequential data vectors of each class are treated as successive time steps, and a class's entire set of data vectors is interpreted as a single observation sequence.
- Classification simply chooses the class whose model produces the largest log-likelihood (see the MATLAB sketch below).
Parameters
- Iterations of training: 1 to 4, inclusive
- Number of mixtures: 1 to 4, inclusive
- Number of hidden states: 1 to 10, inclusive
Performance metrics
- Recognition error rate
- Average difference between the largest and second-largest log-posterior
- Training time
- Classification time
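
A minimal sketch of the per-class training and classification described above, assuming Kevin Murphy's HMM toolbox functions (normalise, mk_stochastic, mixgauss_init, mhmm_em, mhmm_logprob); trainData, testSeq, and numClasses are illustrative variable names, and Q, M, and the iteration count are example values from the tested ranges:

    Q = 4;  M = 2;                              % hidden states and mixtures per state
    O = size(trainData{1}, 1);                  % feature dimension
    for c = 1:numClasses
        data = trainData{c};                    % O x T observation sequence for class c
        prior0    = normalise(rand(Q, 1));      % random but valid initial-state probabilities
        transmat0 = mk_stochastic(rand(Q, Q));  % random row-stochastic transition matrix
        [mu0, Sigma0] = mixgauss_init(Q*M, data, 'diag');
        mu0    = reshape(mu0, [O Q M]);
        Sigma0 = reshape(Sigma0, [O O Q M]);
        mixmat0 = mk_stochastic(rand(Q, M));
        [LL, prior{c}, transmat{c}, mu{c}, Sigma{c}, mixmat{c}] = ...
            mhmm_em(data, prior0, transmat0, mu0, Sigma0, mixmat0, 'max_iter', 4);
    end

    % Classification: pick the class whose model yields the largest log-likelihood.
    for c = 1:numClasses
        loglik(c) = mhmm_logprob(testSeq, prior{c}, transmat{c}, mu{c}, Sigma{c}, mixmat{c});
    end
    [~, predictedClass] = max(loglik);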

Algorithms: Hidden Markov Models

Key Observations
- The recognition error rate is relatively low for fewer training iterations.
- The average difference between the two largest discriminant values for each assignment correlates with the recognition error rate, except when the number of hidden states is 1 (see the margin sketch below).
- Increasing the number of training iterations appears to cause the recognition error rate to increase.
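
A short sketch of the quantity behind the second observation, assuming the per-class log-likelihoods for the test set are collected in a matrix LL with one row per test sequence and one column per class (the names LL and avgMargin are illustrative):

    sortedLL  = sort(LL, 2, 'descend');                  % sort each row's log-likelihoods
    avgMargin = mean(sortedLL(:, 1) - sortedLL(:, 2));   % mean gap between the two best classes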

Algorithms: K-Nearest Neighbors
Introduction
- All computation is done during classification; there is no training beyond storing the data.
- Classification of a data vector simply finds its closest neighboring data vectors and selects the class that occurs most often within that set of neighbors (see the MATLAB sketch below).
Parameters
- Number of neighbors: odd numbers between 1 and 201, inclusive
Performance metrics
- Recognition error rate
- Classification time
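
A minimal KNN sketch, assuming the Statistics Toolbox functions fitcknn and predict (the older knnclassify would serve the same purpose); trainX, trainY, testX, and testY are illustrative variable names, and k is one example value from the tested range:

    k = 11;                                             % example neighbor count
    mdl = fitcknn(trainX, trainY, 'NumNeighbors', k);   % "training" only stores the data
    predY = predict(mdl, testX);                        % majority vote among the k nearest neighbors
    errRate = mean(predY ~= testY);                     % recognition error rate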

Algorithms: K-Nearest Neighbors

Key Observations
- The recognition error rate actually grows as the number of neighbors (i.e. k) increases.

Algorithms: Neural Networks
Introduction
- The single output of the feed-forward NN is converted from a continuous value to one of the possible discrete values (i.e. the possible classes) using a nearest-neighbor rule (i.e. KNN with K = 1).
- The training algorithm for the weights and biases is MATLAB's default, Levenberg-Marquardt, which MATLAB recommends for its speed, though it requires more memory.
- The Neural Network Toolbox allows many parameters to be manipulated (such as parallelization over GPUs and cross-validation); to keep the simulations relatively simple, only two parameters are varied (see the MATLAB sketch below).
Parameters
- Number of layers: 1, 5, 10, and 15
- Number of neurons per layer: 1, 5, 10, and 15
Performance metrics
- Recognition error rate
- Average distance from the selected class
- Training time
- Classification time
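
A minimal feed-forward sketch, assuming the Neural Network Toolbox functions feedforwardnet and train with the default Levenberg-Marquardt algorithm (trainlm); the layer sizes are example values from the tested ranges, and trainX, trainY, and testX are illustrative names with one sample per row:

    numLayers = 5;  neuronsPerLayer = 10;                         % example parameter values
    net = feedforwardnet(repmat(neuronsPerLayer, 1, numLayers), 'trainlm');
    net = train(net, trainX', trainY');                           % toolbox convention: columns are samples
    rawOut = net(testX');                                         % single continuous output per test vector

    % Map each continuous output to the nearest class label (1-NN over the label values).
    classLabels = unique(trainY);
    [~, idx] = min(abs(bsxfun(@minus, rawOut', classLabels')), [], 2);
    predY = classLabels(idx);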

Algorithms: Neural Networks

Key Observations
- For classification, extra neurons per layer do not appear to have much of an effect on the classification time, while extra layers appear to increase the classification time linearly.
- For training, extra layers and extra neurons per layer both greatly increase the training time.


Conclusion: Final Thoughts
- Simulations for the HMM, KNN, and NN algorithms were implemented.
- The algorithms were tested for their performance on:
  - Recognition error rate
  - Discriminant values
  - Timing analysis

Conclusion: References
- Kevin Murphy's HMM implementation
- MATLAB documentation for the Neural Network Toolbox and Statistics Toolbox