Isomap Algorithm
Yuri Barseghyan, Yasser Essiarab
University of Joensuu, Dept. of Computer Science
P.O. Box 111, FIN-80101 Joensuu
Tel. +358 13 251 7959, fax +358 13 251 7955
www.cs.joensuu.fi

Linear Methods for Dimensionality Reduction
–PCA (Principal Component Analysis): rotate the data so that the principal axes lie in the directions of maximum variance
–MDS (Multi-Dimensional Scaling): find coordinates that best preserve pairwise distances
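As a minimal sketch of both linear baselines (the function names are ours, assuming plain numpy):

import numpy as np

def pca(X, d):
    # Center the data, then rotate onto the top-d principal axes
    # (the directions of maximum variance).
    Xc = X - X.mean(axis=0)
    vals, vecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    order = np.argsort(vals)[::-1]          # largest eigenvalues first
    return Xc @ vecs[:, order[:d]]

def classical_mds(D, d):
    # Recover d-dimensional coordinates whose pairwise Euclidean
    # distances best match the given distance matrix D.
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n     # centering matrix
    B = -0.5 * J @ (D ** 2) @ J             # double-centered Gram matrix
    vals, vecs = np.linalg.eigh(B)
    order = np.argsort(vals)[::-1][:d]
    return vecs[:, order] * np.sqrt(np.maximum(vals[order], 0))

classical_mds reappears below as the final step of Isomap.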

Limitations of Linear Methods
What if the data does not lie within a linear subspace?
Do all convex combinations of the measurements generate plausible data?
A low-dimensional non-linear manifold may be embedded in a higher-dimensional space.

Non-linear Dimensionality Reduction
What about data that cannot be described by a linear combination of latent variables?
–Ex: swiss roll, s-curve
In the end, linear methods do nothing more than "globally transform" (rotate/translate/scale) the data. Sometimes we need to "unwrap" the data first.
(Figure: PCA applied to such a data set.)

Non-linear Dimensionality Reduction
Unwrapping the data = "manifold learning"
Assume the data can be embedded on a lower-dimensional manifold.
Given a data set X = {x_i}, i = 1…n, find a representation Y = {y_i}, i = 1…n, where Y lies on a lower-dimensional manifold.
Instead of preserving global pairwise distances, non-linear dimensionality reduction tries to preserve only the geometric properties of local neighborhoods.

Isometry
From MathWorld: two Riemannian manifolds M and N are isometric if there is a diffeomorphism such that the Riemannian metric from one pulls back to the metric on the other.
For a complete Riemannian manifold: d(x, y) = geodesic distance between x and y.
Informally, an isometry is a smooth invertible mapping that looks locally like a rotation plus a translation.
Intuitively, in the 2-dimensional case, isometries include whatever physical transformations one can perform on a sheet of paper without introducing tears, holes, or self-intersections.

Trustworthiness [2]
The trustworthiness quantifies how trustworthy a projection of a high-dimensional data set onto a low-dimensional space is. Specifically, a projection is trustworthy if the set of the t nearest neighbors of each data point in the low-dimensional space are also close by in the original space:
M(t) = 1 − 2 / (n t (2n − 3t − 1)) · Σ_i Σ_{j ∈ U_t(i)} (r(i, j) − t)
r(i, j) is the rank of the data point j in the ordering according to the distance from i in the original data space.
U_t(i) denotes the set of those data points that are among the t nearest neighbors of the data point i in the low-dimensional space but not in the original space.
The maximal value that trustworthiness can take is one. The closer M(t) is to one, the better the low-dimensional space describes the original data.
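Under these definitions, a direct O(n²)-memory sketch of M(t) (names are ours; assumes t < n/2 so the normalization is valid):

import numpy as np
from scipy.spatial.distance import cdist

def trustworthiness(X, Y, t):
    # X: original high-dimensional data, Y: its low-dimensional projection.
    n = X.shape[0]
    D_high, D_low = cdist(X, X), cdist(Y, Y)
    # rank[i, j] = r(i, j): rank of j by distance from i in the
    # original space (self gets rank 0, nearest neighbor rank 1).
    rank = np.zeros((n, n), dtype=int)
    for i in range(n):
        rank[i, np.argsort(D_high[i])] = np.arange(n)
    penalty = 0.0
    for i in range(n):
        nn_low = np.argsort(D_low[i])[1:t + 1]          # t-NN in low-dim space
        nn_high = set(np.argsort(D_high[i])[1:t + 1])   # t-NN in original space
        for j in nn_low:
            if j not in nn_high:                        # j is in U_t(i)
                penalty += rank[i, j] - t
    return 1.0 - 2.0 / (n * t * (2 * n - 3 * t - 1)) * penalty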

Several Methods to Learn a Manifold
Two to start:
–Isomap [Tenenbaum 2000]
–Locally Linear Embedding (LLE) [Roweis and Saul, 2000]
More recently:
–Semidefinite Embedding (SDE) [Weinberger and Saul, 2005]

An Important Observation
Small patches on a non-linear manifold look linear.
These locally linear neighborhoods can be defined in two ways (both sketched below):
–k-nearest neighbors: find the k nearest points to a given point, under some metric. Guarantees all items are similarly represented; limits dimension to k−1.
–ε-ball: find all points that lie within ε of a given point, under some metric. Best if the density of items is high and every point has a sufficient number of neighbors.
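Both neighborhood definitions in a few lines (a sketch under the Euclidean metric; the names are ours):

import numpy as np
from scipy.spatial.distance import cdist

def knn_neighbors(X, k):
    # Indices of each point's k nearest neighbors (self excluded).
    D = cdist(X, X)
    return np.argsort(D, axis=1)[:, 1:k + 1]

def epsilon_ball_neighbors(X, eps):
    # Indices of all points strictly within eps of each point.
    # (The row > 0 test drops the point itself; it would also drop
    # exact duplicates, which is acceptable for a sketch.)
    D = cdist(X, X)
    return [np.where((row > 0) & (row < eps))[0] for row in D]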

Isomap
Find coordinates on the lower-dimensional manifold that preserve geodesic distances instead of Euclidean distances.
Key observation: if the goal is to discover the underlying manifold, geodesic distance makes more sense than Euclidean distance. Two points on opposite folds of a manifold can have a small Euclidean distance but a large geodesic distance.

Calculating Geodesic Distance
We know how to calculate Euclidean distance.
Locally linear neighborhoods mean that we can approximate geodesic distance within a neighborhood using Euclidean distance.
A graph is constructed by connecting each point to its k nearest neighbors.
Approximate geodesic distances are then obtained as lengths of shortest paths in this graph, using Dijkstra's algorithm to fill in the remaining distances (see the sketch below).
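With scipy's graph routines, this is a k-nearest-neighbor graph followed by all-pairs shortest paths (a sketch; the function name is ours):

import numpy as np
from scipy.spatial.distance import cdist
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import shortest_path

def geodesic_distances(X, k):
    # Connect each point to its k nearest neighbors, weighting
    # edges by Euclidean distance.
    D = cdist(X, X)
    W = np.zeros_like(D)
    nn = np.argsort(D, axis=1)[:, 1:k + 1]
    for i in range(D.shape[0]):
        W[i, nn[i]] = D[i, nn[i]]
    W = np.maximum(W, W.T)        # symmetrize: undirected graph
    # Dijkstra from every node; disconnected pairs come back as inf.
    return shortest_path(csr_matrix(W), method='D', directed=False)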

Dijkstra's Algorithm
A greedy, best-first algorithm that computes the shortest paths from one point to all other points.
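For reference, a textbook single-source Dijkstra over an adjacency list, using a binary heap:

import heapq

def dijkstra(adj, source):
    # adj: {node: [(neighbor, edge_weight), ...]}
    # Returns shortest-path distances from source to every reachable node.
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float('inf')):
            continue                            # stale heap entry
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float('inf')):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist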

Isomap Algorithm
–Compute the neighborhood of each point (k nearest neighbors or ε-ball).
–Calculate pairwise Euclidean distances within each neighborhood.
–Use Dijkstra's algorithm to compute shortest paths from each point to its non-neighboring points.
–Run MDS on the resulting distance matrix.
The full pipeline is sketched below.
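Putting the four steps together, a self-contained sketch (an illustration, not the reference implementation; it uses the k-nearest-neighbor graph and classical MDS from the earlier sketches):

import numpy as np
from scipy.spatial.distance import cdist
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import shortest_path

def isomap(X, k, d):
    # Steps 1-2: k-NN graph with Euclidean edge weights.
    D = cdist(X, X)
    W = np.zeros_like(D)
    nn = np.argsort(D, axis=1)[:, 1:k + 1]
    for i in range(D.shape[0]):
        W[i, nn[i]] = D[i, nn[i]]
    W = np.maximum(W, W.T)
    # Step 3: all-pairs shortest paths approximate geodesic distances.
    G = shortest_path(csr_matrix(W), method='D', directed=False)
    # Step 4: classical MDS on the geodesic distance matrix.
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (G ** 2) @ J
    vals, vecs = np.linalg.eigh(B)
    order = np.argsort(vals)[::-1][:d]
    return vecs[:, order] * np.sqrt(np.maximum(vals[order], 0))

For example, Y = isomap(X, k=10, d=2) unrolls the swiss roll; k must be large enough to keep the graph connected, or infinite geodesic entries break the MDS step.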

Isomap Algorithm [3]

Time Complexity of Algorithm
The dominant costs, for n points with k neighbors in D input dimensions:
–Neighborhood graph construction: O(D n²) with brute-force distances.
–Shortest paths: O(k n² log n), running Dijkstra (binary heap) from every node.
–Classical MDS eigendecomposition: O(n³).

Isomap Results
Find a 2D embedding of the 3D S-curve.

Residual Fitting Error
Plotting the eigenvalues from MDS (the residual error as a function of embedding dimension) indicates the intrinsic dimensionality of the data: the curve flattens out once enough dimensions are used.
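A sketch of that check: compute the eigenvalue spectrum of the double-centered geodesic distance matrix; the number of eigenvalues that stand clearly above the rest suggests the intrinsic dimensionality (the function name is ours):

import numpy as np

def mds_eigenvalue_spectrum(G):
    # G: matrix of approximate geodesic distances (Isomap step 3).
    n = G.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (G ** 2) @ J
    return np.linalg.eigvalsh(B)[::-1]      # largest eigenvalues first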

Neighborhood Graph

More Isomap Results

Results of projecting the face data set to two dimensions (trustworthiness and continuity) [1]

More Isomap Results

Isomap Failures
Isomap has problems on closed manifolds of arbitrary topology.

Isomap: Advantages
Nonlinear.
Globally optimal:
–Produces a globally optimal low-dimensional Euclidean representation even when the input space is highly folded, twisted, or curved.
Guaranteed asymptotically to recover the true dimensionality.

Isomap: Disadvantages
The guarantee to recover the geometric structure of nonlinear manifolds holds only asymptotically:
–As N increases, pairwise distances provide better approximations to geodesics by "hugging the surface" more closely.
–Graph discreteness overestimates the manifold distance d_M(i, j).
k must be low to avoid "linear shortcuts" near regions of high surface curvature.
There is no direct way to map novel test images into the manifold space (the out-of-sample problem).

Literature
[1] Jarkko Venna and Samuel Kaski, "Nonlinear dimensionality reduction viewed as information retrieval," NIPS 2006 Workshop on Novel Applications of Dimensionality Reduction, 9 Dec 2006.
[2] Claudio Varini, "Visual Exploration of Multivariate Data in Breast Cancer by Dimensional Reduction," dissertation.
[3] Yiming Wu and Kap Luk Chan, "An Extended Isomap Algorithm for Learning Multi-Class Manifold," Proceedings of the 2004 International Conference on Machine Learning and Cybernetics, Aug 2004.