A Toolkit for Remote Sensing Enviroinformatics Clustering
Fazlul Shahriar, George Bonev
Advisors: Michael Grossberg, Irina Gladkova, Srikanth Gottipati


Clustering Module:

k-means - A top-down clustering algorithm that attempts to find k representative centers for the data. The initial means are selected from the training data itself. (A minimal NumPy sketch appears after the Utility Functions list below.)

Fuzzy k-means - A top-down clustering algorithm that attempts to find k representative centers for the data. The initial means are selected from the training data itself. It uses a slightly different gradient search than the standard k-means algorithm, but generally yields the same final solution.

Expectation-Maximization - Estimates the means and covariances of the components of a Gaussian mixture model.

Competitive learning - Competitive learning clustering, where the nearest cluster center is updated according to the position of a randomly selected training pattern.

Leader-follower - Basic leader-follower clustering, which is similar to competitive learning but additionally generates a new cluster center whenever a new input pattern differs by more than a threshold distance θ from the existing clusters.

ADDC (agglomerative clustering) - An on-line (single-pass) clustering algorithm that accepts one sample at each step, updates the cluster centers, and generates new centers as needed. The algorithm is efficient in that it produces the cluster centers with a single pass over the data.

DSLVQ (distinction-sensitive learning vector quantization) - Performs learning vector quantization (i.e., represents a data set by a small number of cluster centers) using a distinction or classification criterion rather than the traditional sum-of-squared-error criterion.

Minimum spanning tree (undirected) - Builds a minimum spanning tree for a data set based on nearest neighbors.

Connected components - Finds the connected components of a data set based on nearest neighbors and returns a list of the connected components of the given graph.

Graph cut - Clustering is achieved by cutting edges of the graph to form a good set of connected components, such that the weights of within-component edges are small compared to those of across-component edges.

Spectral clustering - Uses the spectrum of the similarity matrix of the data to perform dimensionality reduction, so that clustering can be done in fewer dimensions.

HDR (hierarchical dimensionality reduction) - Clusters similar features so as to reduce the dimensionality of the data.

SOHC (stepwise optimal hierarchical clustering) - Bottom-up clustering. The algorithm starts by treating each training point as its own cluster and iteratively merges the two clusters whose merger changes a clustering criterion the least, until the desired number of clusters, k, is reached.

Utility Functions:

Make graph (similarity matrix) - Given a set of data points A, the similarity matrix is a matrix S in which each element represents a measure of the similarity between points i and j in A.

Normalization (standard deviation based) - Normalizes a group of observations on a per-feature basis by dividing each feature by its standard deviation across all observations.

UniqueRand - Generates a unique set of random points drawn from N(0, 1).

Training - A nearest-neighbor classifier is used to classify a test data set with the clustering obtained from the training data set.

UniqueVector - Computes the unique set of feature vectors from a given set of feature vectors.
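To make the clustering module concrete, here is a minimal sketch of the standard k-means loop described above, with the initial means drawn from the training data itself. It is an illustration only, not the toolkit's actual implementation; the array X, the cluster count k, and the function name kmeans are assumptions made for the example.

import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Minimal k-means sketch: returns (centers, labels)."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # The initial means are selected from the training data itself.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign each point to its nearest center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each center as the mean of its assigned points.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels

Fuzzy k-means differs mainly in replacing the hard assignment step with soft membership weights but, as noted above, generally reaches the same final solution.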
Whitening transform - Performed on a d-dimensional data set; it first subtracts the sample mean from each point and then multiplies the data set by the inverse of the square root of the covariance matrix. (A NumPy sketch appears below, after the Implementation notes.)

Validation Module:

Cross validation - A statistical method for validating a predictive model. Subsets of the data are held out to be used as validation sets; a model is fit to the remaining data (the training set) and used to predict for the validation set. Averaging the quality of the predictions across the validation sets yields an overall measure of prediction accuracy.

Bootstrap - A statistical method for estimating the sampling distribution of an estimator by sampling with replacement from the original sample, most often with the purpose of deriving robust estimates of standard errors and confidence intervals of a population parameter.

Jackknife - Estimates the bias and standard error of a statistic computed from a random sample of observations. The basic idea behind the jackknife estimator is to systematically recompute the statistic, leaving out one observation at a time from the sample set.

BIC - In parametric methods there may be several candidate models, each with a different number of parameters, to represent a data set. The Bayesian information criterion is a useful statistical criterion for model selection among parametric methods.

AIC - A tool for nonparametric model selection. Given a data set, several competing models may be ranked according to their AIC, with the one having the lowest AIC being the best.

Figure: Clustering obtained using the K-means algorithm with 5 clusters specified and a random initial starting point.
Figure: Clustering obtained with a whitening preprocessing step followed by the K-means algorithm.
Figure: Clustering obtained with a whitening preprocessing step followed by the EM algorithm.
Figure: Clustering obtained using the EM algorithm with 5 clusters specified and a random initial starting point. The algorithm usually gets stuck in a local minimum.
Figure: Modes obtained during the mean shift algorithm. Red dots represent the local peaks of the density estimate of the data.
Figure: Clustering obtained using a combination of the mean shift and connected components algorithms.

Issues: Remotely sensed data is typically vast, and its size requires advanced tools to explore it semi-automatically. Clustering is one such tool.

Implementation: Many clustering algorithms have been proposed in the literature, but they are dispersed across multiple libraries in different languages, which makes it difficult to test them on the application at hand. Our goal is to create a single, platform-independent library so that users can test these algorithms on remote sensing data. To accomplish this, we chose the Python programming language, which provides a MATLAB-like interface and at the same time lends itself to dealing with large databases. Furthermore, Python allows easy integration with C/C++/R libraries.

Figure: Physics-based cluster labeling: 1.
Figure: Physics-based cluster labeling: 2.
Figure: Unsupervised nonparametric classification. Colors are the same as used in the scatter plot above.
Figure: Clustering obtained on a 2-d cloud of points by running the mean shift procedure with various initializations.
Figure: Trajectories of the mean shift procedures drawn over the density estimate computed over the same data set. The peaks retained for final classification are marked with red dots.
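The whitening transform above is defined by a concrete recipe (subtract the sample mean, then multiply by the inverse square root of the covariance matrix), so a short NumPy sketch may help. This is an illustration of that definition, not the toolkit's code; the function name whiten and the small eps regularizer are assumptions.

import numpy as np

def whiten(X, eps=1e-10):
    """Return a whitened copy of X (n_samples x d) with roughly identity covariance."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)            # subtract the sample mean from each point
    cov = np.cov(Xc, rowvar=False)     # d x d sample covariance
    vals, vecs = np.linalg.eigh(cov)   # eigendecomposition of the symmetric covariance
    # Inverse square root of the covariance via its eigendecomposition;
    # eps guards against division by (numerically) zero eigenvalues.
    inv_sqrt_cov = vecs @ np.diag(1.0 / np.sqrt(np.clip(vals, 0.0, None) + eps)) @ vecs.T
    return Xc @ inv_sqrt_cov

The "whitening followed by K-means" and "whitening followed by EM" results in the figure captions correspond to applying a transform of this kind before running the respective clustering algorithm.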
Figure: A 2-d cloud of points where clustering using normal-distribution-based methods could fail, while methods like spectral and geometric clustering algorithms could do a better job.
Figure: MODIS cloud classification over the eastern part of the United States.
Figure: A 3-d cloud of points that could easily be clustered using a parametric method like the Expectation-Maximization (EM) algorithm.
Figure: Code fragment from the clustering toolkit.
Figure: IPython - a MATLAB-like interface for Python.

This research has been funded by NOAA-CREST grant # NA06OAR
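The mean shift figures above show modes of a density estimate being found from many initializations and then combined with the connected components algorithm. Below is a minimal sketch of that combination using a flat kernel; the bandwidth value, the function name mean_shift_labels, and the use of SciPy's connected_components are illustrative assumptions, not the code fragment shown in the poster.

import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components

def mean_shift_labels(X, bandwidth, n_iter=50):
    """Cluster X by mean shift followed by a connected-components merge of the modes."""
    X = np.asarray(X, dtype=float)
    modes = X.copy()
    for _ in range(n_iter):
        for i, m in enumerate(modes):
            # Flat-kernel mean shift step: move the current mode estimate
            # toward the mean of the data points within the bandwidth.
            neighbors = X[np.linalg.norm(X - m, axis=1) < bandwidth]
            if len(neighbors):
                modes[i] = neighbors.mean(axis=0)
    # Merge modes that converged to (nearly) the same peak: link modes closer than
    # the bandwidth and label each point by the connected component of its mode.
    pairwise = np.linalg.norm(modes[:, None, :] - modes[None, :, :], axis=2)
    graph = csr_matrix(pairwise < bandwidth)
    _, labels = connected_components(graph, directed=False)
    return labels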