COMMON EVALUATION FINAL PROJECT
Vira Oleksyuk
ECE 8110: Introduction to Machine Learning and Pattern Recognition

Data sets
- Two speech data sets
- Each has a training and a test set
- Set 1: 10 dimensions; 11 classes; 528/379/83 training/development/evaluation
- Set 2: 39 dimensions; 5 classes; 925/350/225 training/development/evaluation
- 5 sets of vectors for each class

Methods
- K-Means Clustering (K-Means)
- K-Nearest Neighbor (KNN)
- Gaussian Mixture Model (GMM)

K-Means Clustering
- A method of vector quantization, originally from signal processing, that is popular for cluster analysis in data mining.
- K-means clustering aims to partition n observations into k clusters so that each observation belongs to the cluster with the nearest mean, which serves as the cluster's prototype. This partitions the data space into Voronoi cells.
- K-Means minimizes the within-cluster sum of squares [5], written out below.
- The exact problem is computationally difficult, but efficient heuristic algorithms converge quickly to a local optimum.
- K-Means tends to find clusters of comparable spatial extent, while the expectation-maximization approach allows clusters to have different shapes.
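Written out, the objective the slide refers to is the standard within-cluster sum-of-squares criterion (a textbook formulation, not copied from the slides):

```latex
% k-means objective: choose clusters S_1,...,S_k (with means \mu_i)
% that minimize the within-cluster sum of squares
\underset{S_1,\dots,S_k}{\arg\min}\; \sum_{i=1}^{k} \sum_{\mathbf{x} \in S_i} \lVert \mathbf{x} - \boldsymbol{\mu}_i \rVert^{2}
```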

K-Means Clustering
- Euclidean distance is used as the metric, and variance is used as a measure of cluster scatter.
- The number of clusters k is a required input parameter, and the algorithm may converge only to a local minimum.
- A key limitation of k-means is its cluster model: it assumes roughly spherical clusters that are separable so that the mean converges toward the cluster center.
- Clusters are expected to be of similar size, so that assignment to the nearest cluster center is the correct assignment. Works well for compact clusters.
- Sensitive to outliers.
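As a concrete illustration of the procedure described above, here is a minimal sketch in Python with scikit-learn. The synthetic blobs are only a stand-in for the project feature vectors (the transcript does not show the actual implementation), and k is set to the number of classes in Set 1 purely for illustration.

```python
# Minimal k-means sketch; synthetic data stands in for the 10-dim, 11-class Set 1.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=528, n_features=10, centers=11, random_state=0)

km = KMeans(n_clusters=11, n_init=10, random_state=0)  # k must be chosen up front
labels = km.fit_predict(X)                             # nearest-mean assignment per vector
print("within-cluster sum of squares:", km.inertia_)
print("centroid array shape:", km.cluster_centers_.shape)
```

Here `km.inertia_` is exactly the within-cluster sum of squares being minimized; rerunning with a different `random_state` can land in a different local minimum, which is the convergence caveat noted above.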

- Parameters: Euclidean distance; k selected randomly
- Results: not much change in error from changes in parameters
- [Table: Misclassification Error (%) for Trials 1-5 and Average Error, Set 1 vs. Set 2]

K-Nearest Neighbor
- A non-parametric method used for classification and regression.
- The input consists of the k closest training examples in the feature space.
- The output is a class membership: an object is classified by a majority vote of its neighbors.
- KNN is a type of instance-based learning, or lazy learning, where the function is only approximated locally and all computation is deferred until classification.
- One of the simplest of all machine learning algorithms.
- Sensitive to the local structure of the data.

K-Nearest Neighbor
- The high degree of local sensitivity makes 1-NN highly susceptible to noise in the training data. A higher value of k yields a smoother, less locally sensitive decision function.
- The drawback of increasing k is that, as k approaches n (the size of the training set), the classifier tends toward predicting whichever class is most frequently represented in the training data [6].
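A minimal sketch of the k sweep described above, again on synthetic stand-in data (the project's actual splits and k values are not given in the transcript); only the trend, from noise-sensitive at small k toward majority-class behavior at large k, is the point.

```python
# k-NN sweep: misclassification error on a held-out split for several k values.
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = make_blobs(n_samples=900, n_features=10, centers=11, random_state=0)
X_tr, X_dev, y_tr, y_dev = train_test_split(X, y, test_size=0.3, random_state=0)

for k in (1, 3, 5, 11, 21):
    clf = KNeighborsClassifier(n_neighbors=k, metric="euclidean")
    clf.fit(X_tr, y_tr)
    err = 100.0 * np.mean(clf.predict(X_dev) != y_dev)
    print(f"k={k:2d}  misclassification error = {err:.1f}%")
```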

K-Nearest Neighbor
- Results: Set 1
- Results: Set 2

Gaussian Mixture Model
- A parametric probability density function represented as a weighted sum of Gaussian component densities.
- Commonly used as a parametric model of the probability distribution of continuous measurements or features in biometric systems (e.g., speech recognition).
- Parameters are estimated from training data using the iterative Expectation-Maximization (EM) algorithm, or by Maximum A Posteriori (MAP) estimation from a well-trained prior model.
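In symbols, the weighted sum of component densities is (standard notation, e.g. as in Reynolds [6], not copied from the slide):

```latex
% GMM density with M components, mixture weights w_i, means \mu_i, covariances \Sigma_i
p(\mathbf{x}\mid\lambda) \;=\; \sum_{i=1}^{M} w_i \,\mathcal{N}(\mathbf{x}\mid \boldsymbol{\mu}_i, \boldsymbol{\Sigma}_i),
\qquad \sum_{i=1}^{M} w_i = 1, \qquad \lambda = \{\,w_i, \boldsymbol{\mu}_i, \boldsymbol{\Sigma}_i\,\}
```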

Gaussian Mixture Model
- Not really a model but a probability distribution.
- Unsupervised.
- A convex combination of Gaussian PDFs, each with its own mean and covariance.
- Good for clustering.
- Capable of representing a large class of sample distributions.
- Able to form smooth approximations to arbitrarily shaped densities [6].
- Well suited to modeling human speech.
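One plausible way to use GMMs for this common evaluation is to fit one mixture per class with EM and classify by the highest class-conditional likelihood. The sketch below shows that pattern with scikit-learn on synthetic stand-in data; the number of components and the covariance type are illustrative assumptions, since the slides do not state what was actually used.

```python
# One GMM per class, fitted with EM; classify by the most likely class model.
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.mixture import GaussianMixture

X, y = make_blobs(n_samples=900, n_features=10, centers=5, random_state=0)

models = {}
for c in np.unique(y):
    gmm = GaussianMixture(n_components=4, covariance_type="diag",
                          max_iter=200, random_state=0)
    models[c] = gmm.fit(X[y == c])   # EM estimates weights, means, covariances

# Log-likelihood of every vector under every class model, then pick the argmax.
classes = np.array(sorted(models))
log_lik = np.column_stack([models[c].score_samples(X) for c in classes])
pred = classes[log_lik.argmax(axis=1)]
print("training-set error: %.1f%%" % (100 * np.mean(pred != y)))
```

The per-class EM fits are also where the "long computations" noted on the next slide come from: each iteration re-estimates weights, means, and covariances for every component.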

Gaussian Mixture Model
- Results
- Long computation times

Discussion
- Current performance:
- [Table: Probability of error for each method (K-Means, KNN, GMM) on Set 1 and Set 2]

Discussion
- What can be done:
- Normalization of the data sets and removal of outliers (see the preprocessing sketch after this list)
- Improving the clustering techniques
- Combining methods for better performance
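A minimal sketch of the first two proposed improvements: z-score normalization of each feature followed by a simple 3-sigma outlier filter. The threshold and the synthetic data are illustrative assumptions, not values from the project.

```python
# Z-score normalization per dimension, then drop vectors with any extreme feature.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(528, 10))              # stand-in for a training partition

mu, sigma = X.mean(axis=0), X.std(axis=0)
X_norm = (X - mu) / sigma                   # zero mean, unit variance per dimension

keep = (np.abs(X_norm) < 3.0).all(axis=1)   # 3-sigma rule, an illustrative cutoff
X_clean = X_norm[keep]
print("kept", X_clean.shape[0], "of", X.shape[0], "vectors")
```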

References
[1] R. O. Duda, P. E. Hart, and D. G. Stork, Pattern Classification, 2nd ed. New York: Wiley, 2001.
[2] C. M. Bishop, Pattern Recognition and Machine Learning. New York: Springer, 2006.
[3]
[4]
[5]
[6] D. Reynolds, "Gaussian Mixture Models," http://llwebprod2.ll.mit.edu/mission/cybersec/publications/publication-files/full_papers/0802_Reynolds_Biometrics-GMM.pdf

Thank you!