
1 NONLINEAR MAPPING: APPROACHES BASED ON OPTIMIZING AN INDEX OF CONTINUITY AND APPLYING CLASSICAL METRIC MDS TO REVISED DISTANCES By Ulas Akkucuk & J. Douglas Carroll Rutgers Business School – Newark and New Brunswick

2 Outline
– Introduction
– Nonlinear Mapping Algorithms: Parametric Mapping Approach, ISOMAP Approach, Other Approaches
– Experimental Design and Methods: Error Levels, Evaluation of Mapping Performance, Problem of Similarity Transformations
– Results
– Discussion and Future Direction

3 Introduction Problem: to determine a smaller set of variables necessary to account for a larger number of observed variables. PCA and MDS are useful when the relationship is linear; alternative approaches are needed when the relationship is highly nonlinear.

4 Shepard and Carroll (1966)
– Locally monotone analysis of proximities: nonmetric MDS treating large distances as missing. Worked well if the nonlinearities were not too severe (in particular, if the surface is not a closed one such as a circle or sphere)
– Optimization of an index of “continuity” or “smoothness”: incorporated into a computer program called “PARAMAP” and tested on various sets of data

5 20 points on a circle

6 62 regularly spaced points on a sphere, and the azimuthal equidistant projection of the world

7 49 points regularly spaced on a torus embedded in four dimensions

8 In all cases the local structure is preserved, except at points where the shape is “cut open” or “punctured”. Results were successful, but a severe local minimum problem existed. Adding error to the regular spacing made the local minimum problem worse. The current work is stimulated by two articles on nonlinear mapping (Tenenbaum, de Silva, & Langford, 2000; Roweis & Saul, 2000).

9 Nonlinear Mapping Algorithms
– n: number of objects
– M: dimensionality of the input coordinates, in other words of the configuration for which we would like to find an underlying lower-dimensional embedding
– R: dimensionality of the space of the recovered configuration, where R < M
– Y: n × M input matrix
– X: n × R output matrix

10 The distances between point i and point j in the input and output spaces, respectively, are calculated as $\delta_{ij} = \left[ \sum_{m=1}^{M} (y_{im} - y_{jm})^2 \right]^{1/2}$ and $d_{ij} = \left[ \sum_{r=1}^{R} (x_{ir} - x_{jr})^2 \right]^{1/2}$, collected in the matrices $\Delta = [\delta_{ij}]$ and $D = [d_{ij}]$.

11 Parametric Mapping Approach Works via optimizing an index of “continuity” or “smoothness”. Early application in the context of time-series data (von Neumann, Kent, Bellinson, & Hart, 1941; von Neumann, 1941).

12 A more general expression for the numerator is $\sum_{i \ne j} d_{ij}^2 / \delta_{ij}^4$. Generalizing to the multidimensional case we reach the index of continuity $$\kappa = \frac{\sum_{i \ne j} d_{ij}^2 / \delta_{ij}^4}{\left[ \sum_{i \ne j} 1 / \delta_{ij}^2 \right]^2}$$
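To make the index concrete, here is a minimal C++ sketch (not the authors' code) that computes κ in the simple form above for given input and output configurations; the Ce² term and the normalizing constants introduced on the next two slides are omitted:

```cpp
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Squared Euclidean distance between rows i and j of a configuration.
double sqDist(const Matrix& conf, int i, int j) {
    double s = 0.0;
    for (int m = 0; m < (int)conf[i].size(); ++m) {
        double diff = conf[i][m] - conf[j][m];
        s += diff * diff;
    }
    return s;
}

// Index of continuity: kappa = sum_{i!=j} d_ij^2 / delta_ij^4
//                              / [ sum_{i!=j} 1 / delta_ij^2 ]^2,
// where delta_ij are distances in the input Y (n x M) and d_ij in the
// recovered output X (n x R).
double kappa(const Matrix& Y, const Matrix& X) {
    int n = (int)Y.size();
    double num = 0.0, den = 0.0;
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j) {
            if (i == j) continue;
            double delta2 = sqDist(Y, i, j);   // delta_ij^2
            double d2     = sqDist(X, i, j);   // d_ij^2
            num += d2 / (delta2 * delta2);     // d_ij^2 / delta_ij^4
            den += 1.0 / delta2;
        }
    return num / (den * den);
}
```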

13 Several modifications are needed for the minimization procedure:
– $d_{ij}^2 + Ce^2$ is substituted for $d_{ij}^2$, where C is a constant equal to 2/(n − 1) and e takes on values between 0 and 1
– e has the practical effect of accelerating the numerical process
– e can be thought of as an extra “specific” dimension; as e gets closer to 0, points are made to approach the “common” part of the space

14 In the numerator the constant z, and in the denominator the constant $[2/n(n-1)]^2$, are introduced. Final form of the function: $$\kappa = \frac{z \sum_{i \ne j} (d_{ij}^2 + Ce^2) / \delta_{ij}^4}{\left[ \frac{2}{n(n-1)} \sum_{i \ne j} 1 / \delta_{ij}^2 \right]^2}$$

15 Implemented in C++ (GNU GCC compiler). The program takes as input e, the number of repetitions, the dimensionality R to be recovered, and the number of random starts or a starting input configuration. 200 iterations each for 100 different random configurations yields reasonable solutions; the resulting best solution can then be further fine-tuned by performing more iterations.
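A hedged sketch of that multistart loop, reusing the kappa routine above; randomConfiguration and gradientStep are hypothetical stand-ins for PARAMAP's actual initialization and gradient update:

```cpp
#include <limits>
#include <random>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

double kappa(const Matrix& Y, const Matrix& X);           // defined earlier
Matrix randomConfiguration(int n, int R, std::mt19937&);  // hypothetical helper
void gradientStep(const Matrix& Y, Matrix& X);            // hypothetical: one descent step on kappa

// Multistart minimization: 100 random starts, 200 iterations each,
// then further fine-tuning of the best configuration found.
Matrix paramapMultistart(const Matrix& Y, int R) {
    std::mt19937 rng(42);
    int n = (int)Y.size();
    Matrix bestX;
    double bestKappa = std::numeric_limits<double>::infinity();
    for (int start = 0; start < 100; ++start) {
        Matrix X = randomConfiguration(n, R, rng);
        for (int iter = 0; iter < 200; ++iter)
            gradientStep(Y, X);
        double k = kappa(Y, X);
        if (k < bestKappa) { bestKappa = k; bestX = X; }
    }
    for (int iter = 0; iter < 2000; ++iter)  // illustrative fine-tuning budget
        gradientStep(Y, bestX);
    return bestX;
}
```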

16 ISOMAP Approach Tries to overcome difficulties in MDS by replacing the Euclidean metric with a new metric. [Figure from Lee, Lendasse, & Verleysen, 2002]

17 To approximate the “geodesic” distances, ISOMAP constructs a neighborhood graph that connects the closer points
– This is done by connecting the k closest neighbors, or points that are within a distance ε of each other
A shortest-path procedure is then applied to the resulting matrix of modified distances. Finally, classical metric MDS is applied to obtain the configuration in the lower dimensionality.
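A minimal C++ sketch of the geodesic-approximation step, assuming the k-nearest-neighbor rule and Floyd–Warshall shortest paths (ISOMAP implementations often use Dijkstra's algorithm instead):

```cpp
#include <algorithm>
#include <limits>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Approximate geodesic distances: keep only edges to the k nearest
// neighbors of each point, then run Floyd-Warshall shortest paths.
// D is the n x n matrix of Euclidean input-space distances.
Matrix geodesicDistances(const Matrix& D, int k) {
    const double INF = std::numeric_limits<double>::infinity();
    int n = (int)D.size();
    Matrix G(n, std::vector<double>(n, INF));
    for (int i = 0; i < n; ++i) {
        // Sort the other points by distance to i and keep the k closest.
        std::vector<int> idx;
        for (int j = 0; j < n; ++j)
            if (j != i) idx.push_back(j);
        std::sort(idx.begin(), idx.end(),
                  [&](int a, int b) { return D[i][a] < D[i][b]; });
        G[i][i] = 0.0;
        for (int m = 0; m < k && m < (int)idx.size(); ++m) {
            int j = idx[m];
            G[i][j] = G[j][i] = D[i][j];  // keep the graph symmetric
        }
    }
    // Floyd-Warshall: G[i][j] becomes the shortest path length in the graph.
    for (int m = 0; m < n; ++m)
        for (int i = 0; i < n; ++i)
            for (int j = 0; j < n; ++j)
                G[i][j] = std::min(G[i][j], G[i][m] + G[m][j]);
    return G;
}
```

Classical metric MDS, i.e., double-centering followed by an eigendecomposition, is then applied to the returned matrix to obtain the R-dimensional configuration.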

18 Other Approaches Nonmetric MDS: minimizes a cost function (Kruskal's stress, $\left[ \sum_{i<j} (d_{ij} - \hat{d}_{ij})^2 / \sum_{i<j} d_{ij}^2 \right]^{1/2}$, where $\hat{d}_{ij}$ are the monotonically transformed proximities). Needed to implement the locally monotone MDS approach of Shepard (Shepard & Carroll, 1966).

19 Sammon’s mapping: minimizes a mapping error function, $E = \frac{1}{\sum_{i<j} \delta_{ij}} \sum_{i<j} \frac{(\delta_{ij} - d_{ij})^2}{\delta_{ij}}$. Kruskal (1971) indicated that certain options used with nonmetric MDS programs would give the same results.

20 Multidimensional scaling by iterative majorization (Webb, 1995)
Curvilinear Distance Analysis (CDA) (Lee et al., 2002): an analogue of ISOMAP that omits the MDS step, replacing it with a minimization step
Self-organizing map (SOM) (Kohonen, 1990, 1995)
Autoassociative feedforward neural networks (AFN) (Baldi & Hornik, 1989; Kramer, 1991)

21 Experimental Design and Methods Primary focus: 62 points located at the intersections of 5 equally spaced parallels and 12 equally spaced meridians
Two types of error, A and B, controlling the points being irregularly spaced and being inside or outside the sphere, respectively:
– A: 0%, 10%, 20%
– B: ±0.00, ±0.01, ±0.05, ±0.10, ±0.20

22 [figure]

23 To evaluate mapping performance we calculate the “rate of agreement in local structure”, abbreviated “agreement rate” or A
– Similar to the RAND index used to compare partitions (Rand, 1971; Hubert & Arabie, 1985)
– Let $a_i$ stand for the number of points that are in the k-nearest-neighbor list for point i in both X and Y. A will be equal to $A = \frac{1}{nk} \sum_{i=1}^{n} a_i$
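A small C++ sketch of this computation (not the authors' code), taking precomputed distance matrices for the input and output configurations:

```cpp
#include <algorithm>
#include <set>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Indices of the k nearest neighbors of point i under distance matrix D.
std::vector<int> kNearest(const Matrix& D, int i, int k) {
    std::vector<int> idx;
    for (int j = 0; j < (int)D.size(); ++j)
        if (j != i) idx.push_back(j);
    std::sort(idx.begin(), idx.end(),
              [&](int a, int b) { return D[i][a] < D[i][b]; });
    if ((int)idx.size() > k) idx.resize(k);
    return idx;
}

// Agreement rate A = (1/nk) * sum_i a_i, where a_i counts the points that
// appear in the k-nearest-neighbor list of point i in both configurations.
double agreementRate(const Matrix& Dy,  // input-space distances
                     const Matrix& Dx,  // output-space distances
                     int k) {
    int n = (int)Dy.size(), total = 0;
    for (int i = 0; i < n; ++i) {
        std::set<int> inX;
        for (int j : kNearest(Dx, i, k)) inX.insert(j);
        for (int j : kNearest(Dy, i, k))
            if (inX.count(j)) ++total;  // accumulates a_i over all i
    }
    return (double)total / (n * k);
}
```

On the example of the next slide (k = 2 and, apparently, n = 5), this returns 2/10 = 20%.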

24 Example of calculating agreement rate: k = 2, agreement rate = 2/10 or 20%

25 Problem of similarity transformations: we use standard software to rotate the different solutions into optimal congruence with a landmark solution (Rohlf & Slice, 1989). We use the solution for the error-free and regularly spaced sphere as the landmark. We also report VAF (variance accounted for).
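For reference, the textbook orthogonal Procrustes problem underlying this rotation step (the paper relies on the Rohlf & Slice software; here Z is the centered landmark configuration and X a centered solution to be fitted):

```latex
\min_{s,\,T} \; \lVert Z - s\,X T \rVert_F^2
\quad \text{subject to} \quad T^{\top} T = I .
% With the singular value decomposition X^{\top} Z = U \Sigma V^{\top},
% the optimal rotation and scale are
T = U V^{\top},
\qquad
s = \frac{\operatorname{tr}\Sigma}{\operatorname{tr}\!\left(X^{\top} X\right)} .
```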

26 The VAF results may not be very good: the similarity transformation step alone is not enough. An alternating algorithm is needed that reorders the points on each of the five parallels and then finds the optimal similarity transformation. We also provide Shepard-like diagrams.

27 Why is the similarity transformation not enough?

28 Results Agreement rate for the regularly spaced and errorless sphere: 82.9% (k = 5). Over 1000 randomizations of the solution, the average and standard deviation of the agreement rate are 8.1% and 1.9%, respectively; the minimum and maximum are 3.5% and 16.7%.

29 [figure]

30 We can use Chebyshev’s inequality, stated as $P(|X - \mu| \ge t\sigma) \le 1/t^2$. Since 82.9 is about 40 standard deviations away from the mean ((82.9 − 8.1)/1.9 ≈ 39.4), an upper bound on the probability that this event happens by chance is $1/40^2 = 0.000625$, very low!

31 [Figure: panels (a)–(d)]

32 [Figure: panels (e)–(h)]

33 [Figure: panels (i)–(l)]

34 [Figure: panels (m)–(o)]

35 [figure]

36 [Figure] ISOMAP: A = 48.1%; PARAMAP: A = 82.9%

37 Shepard-like Diagrams

38 Swiss roll data – 130 points. Agreement rate: ISOMAP 59.7%, PARAMAP 70.5%

39 Discussion and Future Direction
– Disadvantage of PARAMAP: run time
– Advantage of ISOMAP: noniterative procedure; can be applied to very large data sets with ease
– Disadvantage of ISOMAP: poor performance on closed data sets such as the sphere

40 Improvements in the computational efficiency of PARAMAP should be explored:
– Use of a conjugate gradient algorithm instead of the straight gradient algorithm
– Use of a conjugate gradient with restarts algorithm
– Possible combination of straight gradient and conjugate gradient approaches
Improvements that could benefit both ISOMAP and PARAMAP:
– A wise selection of landmarks and an interpolation or extrapolation scheme to recover the rest of the data