Nonlinear Dimensionality Reduction by Locally Linear Embedding, Sam T. Roweis and Lawrence K. Saul

Presentation transcript:

Nonlinear Dimensionality Reduction by Locally Linear Embedding. Sam T. Roweis and Lawrence K. Saul. References: "Nonlinear dimensionality reduction by locally linear embedding," Roweis & Saul, Science, 2000; "Think Globally, Fit Locally: Unsupervised Learning of Low Dimensional Manifolds," Lawrence K. Saul and Sam T. Roweis, JMLR, 4(Jun), 2003. Presented by Yueng-tien Lo.

In Memoriam SAM ROWEIS

Outline: Introduction, Algorithm, Examples, Conclusion.

Introduction: How do we judge similarity? Our mental representations of the world are formed by processing large numbers of sensory inputs, including, for example, the pixel intensities of images, the power spectra of sounds, and the joint angles of articulated bodies. While complex stimuli of this form can be represented by points in a high-dimensional vector space, they typically have a much more compact description.

The curse of dimensionality (Hughes effect): the problem caused by the exponential increase in volume associated with adding extra dimensions to a (mathematical) space. (From Wikipedia.)

The problem of nonlinear dimensionality reduction: an unsupervised learning algorithm must discover the global internal coordinates of the manifold without external signals that suggest how the data should be embedded in two dimensions.

Manifold: In mathematics, more specifically in differential geometry and topology, a manifold is a mathematical space that on a small enough scale resembles the Euclidean space of a specific dimension, called the dimension of the manifold. The concept of manifolds is central to many parts of geometry and modern mathematical physics because it allows more complicated structures to be expressed and understood in terms of the relatively well-understood properties of simpler spaces. (From Wikipedia.)

Introduction: Coherent structure in the world leads to strong correlations between inputs (such as between neighboring pixels in images), generating observations that lie on or close to a smooth low-dimensional manifold. To compare and classify such observations (in effect, to reason about the world) depends crucially on modeling the nonlinear geometry of these low-dimensional manifolds.

Introduction: The color coding in the accompanying figure illustrates the neighborhood-preserving mapping discovered by LLE. Unlike LLE, projections of the data by principal component analysis (PCA) or classical MDS map faraway data points to nearby points in the plane, failing to identify the underlying structure of the manifold.

Introduction: Previous approaches to this problem, based on multidimensional scaling (MDS), have computed embeddings that attempt to preserve pairwise distances [or generalized disparities] between data points. Locally linear embedding (LLE) eliminates the need to estimate pairwise distances between widely separated data points. Unlike previous methods, LLE recovers global nonlinear structure from locally linear fits. The problem involves mapping high-dimensional inputs into a low-dimensional "description" space with as many coordinates as observed modes of variability.

Algorithm (overview figure: the three steps of LLE are to select neighbors for each data point, reconstruct each point with linear weights from its neighbors, and map the points to low-dimensional embedded coordinates).

Algorithm, step 1: Suppose the data consist of N real-valued vectors X_i, each of dimensionality D, sampled from some underlying manifold. Provided the manifold is well sampled, we expect each data point and its neighbors to lie on or close to a locally linear patch of the manifold. In the first step, we therefore identify the neighbors of each data point, for example its K nearest neighbors in Euclidean distance.
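
A minimal sketch of this neighbor-selection step in Python with NumPy (the function name, the brute-force distance computation, and the N x D array `X` are illustrative assumptions of this sketch, not details from the paper):

```python
import numpy as np

def nearest_neighbors(X, K):
    """Return, for each row of X, the indices of its K nearest neighbors
    (Euclidean distance, excluding the point itself)."""
    # Pairwise squared Euclidean distances between all points.
    sq_norms = np.sum(X ** 2, axis=1)
    d2 = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    np.fill_diagonal(d2, np.inf)          # a point is not its own neighbor
    return np.argsort(d2, axis=1)[:, :K]  # indices of the K closest points per row
```

For large N, a k-d tree or ball tree would replace the O(N^2) distance matrix, but the brute-force version keeps the step easy to read.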

Algorithm, step 2: Characterize the local geometry of these patches by linear coefficients W_ij that reconstruct each data point from its neighbors. Reconstruction errors are measured by a cost function that adds up the squared distances between all the data points and their reconstructions.
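
The reconstruction cost referred to here is, as in the paper,

```latex
\varepsilon(W) \;=\; \sum_i \Bigl| \vec{X}_i - \sum_j W_{ij}\,\vec{X}_j \Bigr|^{2}
```

where the weight W_ij summarizes the contribution of the j-th data point to the reconstruction of the i-th.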

Algorithm, step 2: Minimize the cost function subject to two constraints. First, each data point is reconstructed only from its neighbors, enforcing W_ij = 0 if X_j does not belong to the set of neighbors of X_i. Second, the rows of the weight matrix sum to one: sum_j W_ij = 1.

Algorithm: a least-squares problem. For a particular data point X_i with reconstruction weights w_j over its neighbors, the reconstruction error to be minimized is epsilon_i = |X_i - sum_j w_j X_j|^2 = sum_jk w_j w_k G_jk (using the constraint sum_j w_j = 1), where we have introduced the "local" Gram matrix G_jk = (X_i - X_j) · (X_i - X_k), with j and k ranging over the neighbors of X_i. By construction, this Gram matrix is symmetric and semipositive definite. The reconstruction error can be minimized analytically, using a Lagrange multiplier to enforce the constraint that sum_j w_j = 1.

Algorithm: a least-squares problem. Third, compute the reconstruction weights: solve the linear system sum_k G_jk w_k = 1 and rescale the weights so that they sum to one (equivalently, w_j = sum_k G^-1_jk / sum_lm G^-1_lm). If the Gram matrix G is nearly singular, it can be conditioned (before inversion) by adding a small multiple of the identity matrix. This amounts to penalizing large weights that exploit correlations beyond some level of precision in the data sampling process.
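
A sketch of this weight computation in Python/NumPy, following the recipe above (the function name, loop structure, and regularization constant `reg` are implementation choices of this sketch, not prescribed by the paper):

```python
import numpy as np

def reconstruction_weights(X, neighbors, reg=1e-3):
    """Solve for weights that best reconstruct each point from its K neighbors,
    subject to the constraint that each row of the weight matrix sums to one."""
    N, K = neighbors.shape
    W = np.zeros((N, N))
    for i in range(N):
        Z = X[neighbors[i]] - X[i]            # neighbors shifted so X[i] is the origin
        G = Z @ Z.T                           # local K x K Gram matrix
        G += reg * np.trace(G) * np.eye(K)    # condition G if it is nearly singular
        w = np.linalg.solve(G, np.ones(K))    # solve sum_k G_jk w_k = 1
        W[i, neighbors[i]] = w / w.sum()      # rescale so the weights sum to one
    return W
```

Used together with the sketch above: `W = reconstruction_weights(X, nearest_neighbors(X, K))`.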

Algorithm: The constrained weights that minimize these reconstruction errors obey an important symmetry: for any particular data point, they are invariant to rotations, rescalings, and translations of that data point and its neighbors. Suppose the data lie on or near a smooth nonlinear manifold of lower dimensionality d << D. To a good approximation, there then exists a linear mapping, consisting of a translation, rotation, and rescaling, that maps the high-dimensional coordinates of each neighborhood to global internal coordinates on the manifold.

Algorithm: By design, the reconstruction weights reflect intrinsic geometric properties of the data that are invariant to exactly such transformations.

Algorithm, step 3: Each high-dimensional observation X_i is mapped to a low-dimensional vector Y_i representing global internal coordinates on the manifold. This is done by choosing d-dimensional coordinates Y_i to minimize the embedding cost function. This cost function, like the previous one, is based on locally linear reconstruction errors, but here we fix the weights W_ij while optimizing the coordinates Y_i.
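
Written out, the embedding cost function is, as in the paper,

```latex
\Phi(Y) \;=\; \sum_i \Bigl| \vec{Y}_i - \sum_j W_{ij}\,\vec{Y}_j \Bigr|^{2}
```

with the same weights W_ij found in step 2.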

Algorithm, step 3: Subject to constraints that make the problem well-posed, it can be minimized by solving a sparse N x N eigenvalue problem, whose bottom d nonzero eigenvectors provide an ordered set of orthogonal coordinates centered on the origin.

Algorithm: an eigenvalue problem. The embedding vectors Y_i are found by minimizing the cost function over the Y_i with the weights W_ij held fixed. The cost is invariant to translations of the embedding, and we remove this degree of freedom by requiring the coordinates to be centered on the origin. To avoid degenerate solutions, we also constrain the embedding vectors to have unit covariance; both constraints are written out below.
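
Explicitly, the two constraints are

```latex
\sum_i \vec{Y}_i = \vec{0},
\qquad
\frac{1}{N} \sum_i \vec{Y}_i \vec{Y}_i^{\top} = I,
```

where I is the d x d identity matrix.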

Algorithm: an eigenvalue problem. Now the cost defines a quadratic form, Phi(Y) = sum_ij M_ij (Y_i · Y_j), involving inner products of the embedding vectors and the symmetric matrix M_ij = delta_ij - W_ij - W_ji + sum_k W_ki W_kj, where delta_ij is 1 if i = j and 0 otherwise. The optimal embedding, up to a global rotation of the embedding space, is found by computing the bottom d+1 eigenvectors of this matrix; the very bottom eigenvector, a unit vector with equal components and eigenvalue zero, is discarded, and the remaining d eigenvectors give the embedding coordinates.
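
A sketch of this final step in Python/NumPy, continuing from the weight matrix `W` computed above (the dense eigendecomposition is for readability only; a practical implementation would exploit the sparsity of M):

```python
import numpy as np

def embed(W, d):
    """Compute the d-dimensional LLE embedding from the N x N weight matrix W."""
    N = W.shape[0]
    # M = (I - W)^T (I - W), i.e. M_ij = delta_ij - W_ij - W_ji + sum_k W_ki W_kj.
    I_minus_W = np.eye(N) - W
    M = I_minus_W.T @ I_minus_W
    # Eigenvectors of the symmetric matrix M, sorted by increasing eigenvalue.
    eigvals, eigvecs = np.linalg.eigh(M)
    # Discard the bottom eigenvector (constant vector, eigenvalue ~ 0) and keep
    # the next d eigenvectors as the embedding coordinates.
    return eigvecs[:, 1:d + 1]
```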

Algorithm: For such implementations of LLE, the algorithm has only one free parameter: the number of neighbors, K.
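
For quick experiments, scikit-learn ships an LLE implementation whose main knob is exactly this K (called `n_neighbors`); a minimal usage sketch on synthetic Swiss-roll data, which here just stands in for real observations:

```python
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import LocallyLinearEmbedding

# Sample points from a 3-D "Swiss roll" manifold.
X, color = make_swiss_roll(n_samples=1500, random_state=0)

# K (n_neighbors) is the free parameter; d (n_components) is the target dimension.
lle = LocallyLinearEmbedding(n_neighbors=12, n_components=2)
Y = lle.fit_transform(X)    # N x 2 embedding coordinates
print(Y.shape)              # (1500, 2)
```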

Effect of neighborhood size on LLE.

Example: Note how LLE colocates words with similar contexts in this continuous semantic space.

Algorithm (summary figure).

Conclusion: LLE is likely to be even more useful in combination with other methods in data analysis and statistical learning. Perhaps the greatest potential lies in applying LLE to diverse problems beyond those considered here. Given the broad appeal of traditional methods, such as PCA and MDS, the algorithm should find widespread use in many areas of science.

Appendix