Dimensionality Reduction and Embeddings

Embeddings
- Given a metric distance matrix D, embed the objects in a k-dimensional vector space using a mapping F such that D(i,j) is close to D'(F(i), F(j)).
- Isometric mapping: exact preservation of distances.
- Contractive mapping: D'(F(i), F(j)) <= D(i,j).
- D' is some Lp measure in the target space (for example, Euclidean distance).
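A small sketch of how one might measure the quality of such an embedding, assuming numpy/scipy and that the mapping F is given as an N x k array Y of mapped points (the function name and the tolerance are illustrative, not from the slides):

    import numpy as np
    from scipy.spatial.distance import pdist, squareform

    def embedding_quality(D, Y):
        """D: (N, N) original distance matrix; Y: (N, k) embedded points.
        Returns (is_contractive, max_distortion) under the Euclidean (L2) measure."""
        D_embedded = squareform(pdist(Y))                        # D'(F(i), F(j)) for all pairs
        mask = D > 0                                             # ignore the zero diagonal
        ratios = D_embedded[mask] / D[mask]
        is_contractive = bool(np.all(D_embedded <= D + 1e-12))   # D'(F(i),F(j)) <= D(i,j)
        max_distortion = ratios.max() / ratios.min()             # 1.0 means an isometric mapping
        return is_contractive, max_distortion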

PCA
- Intuition: find the axis that shows the greatest variation, and project all points onto this axis.
[Figure: 2-d points in the original axes f1, f2, with principal directions e1, e2]

SVD: The mathematical formulation
- Normalize the dataset by moving the origin to the center of the dataset (subtract the mean).
- Find the eigenvectors of the data (or covariance) matrix.
- These define the new space.
- Sort the eigenvalues in "goodness" order (largest variance first).
[Figure: the same 2-d points with eigenvectors e1, e2 as the new axes]
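A minimal numpy sketch of this recipe, computing the projection through the SVD of the centered data matrix (the function name and the choice to return the projected points are illustrative, not from the slides):

    import numpy as np

    def pca_svd(X, k):
        """X: (N, n) data matrix, one row per point. Returns the (N, k) projection."""
        X_centered = X - X.mean(axis=0)            # move the origin to the data center
        # The rows of Vt are the eigenvectors of the covariance matrix,
        # already sorted by decreasing singular value ("goodness" order).
        U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
        return X_centered @ Vt[:k].T               # project onto the top-k axes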

SVD Cont’d
- Advantages:
  - Optimal dimensionality reduction (for linear projections).
- Disadvantages:
  - Computationally expensive… but can be improved with random sampling.
  - Sensitive to outliers and non-linearities.

Multi-Dimensional Scaling (MDS)
- Map the items into a k-dimensional space trying to minimize the stress.
- Steepest descent algorithm:
  - Start with an assignment of the points.
  - Minimize the stress by moving the points.
- But the running time is O(N^2), and adding a new item costs O(N).
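A rough numpy sketch of this steepest-descent idea on the raw stress sum over pairs of (D[i,j] - ||y_i - y_j||)^2; the learning rate, iteration count, and random initialization are illustrative choices, not from the slides:

    import numpy as np

    def mds_gradient_descent(D, k=2, lr=0.01, iters=500, seed=0):
        """D: (N, N) target distance matrix. Returns an (N, k) embedding."""
        rng = np.random.default_rng(seed)
        N = D.shape[0]
        Y = rng.normal(size=(N, k))                  # start with a random assignment
        for _ in range(iters):
            diff = Y[:, None, :] - Y[None, :, :]     # pairwise differences, (N, N, k)
            dist = np.linalg.norm(diff, axis=2)      # current embedded distances
            np.fill_diagonal(dist, 1.0)              # avoid division by zero (diff is 0 there)
            grad = ((dist - D) / dist)[:, :, None] * diff
            Y -= lr * grad.sum(axis=1)               # move each point downhill on the stress
        return Y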

FastMap
- What if we have a finite metric space (X, d)?
- Faloutsos and Lin (1995) proposed FastMap as a metric analogue to PCA.
- Imagine that the points are in a Euclidean space:
  - Select two pivot points xa and xb that are far apart.
  - Compute a pseudo-projection of the remaining points along the "line" xa-xb.
  - "Project" the points onto an orthogonal subspace and recurse.

Selecting the Pivot Points
- The pivot points should lie along the principal axes, and hence should be far apart.
- Select any point x0.
- Let x1 be the point furthest from x0.
- Let x2 be the point furthest from x1.
- Return (x1, x2).
[Figure: x0 and the two chosen pivots x1, x2 at opposite ends of the point set]
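In code, this heuristic is just two furthest-point lookups on the distance matrix (a sketch; it assumes the distances are available as an N x N numpy array D and that the starting point is arbitrary):

    import numpy as np

    def choose_pivots(D, start=0):
        """D: (N, N) distance matrix of the finite metric space (X, d)."""
        x1 = int(np.argmax(D[start]))   # the point furthest from an arbitrary x0
        x2 = int(np.argmax(D[x1]))      # the point furthest from x1
        return x1, x2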

Pseudo-Projections
- Given pivots (xa, xb), for any third point y, we use the law of cosines to determine the position of y along the line xa-xb.
- The pseudo-projection for y is cy = (d(xa,y)^2 + d(xa,xb)^2 - d(xb,y)^2) / (2 d(xa,xb)).
- This is the first coordinate.
[Figure: triangle with vertices xa, xb, y and sides d(xa,y), d(xb,y), d(xa,xb); cy is the foot of y on xa-xb]
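The same formula written out as code, reusing the distance matrix D and the pivot indices from the sketch above:

    def pseudo_projection(D, a, b, y):
        """First FastMap coordinate of point y relative to pivots a and b
        (the law-of-cosines formula from the slide)."""
        return (D[a, y] ** 2 + D[a, b] ** 2 - D[b, y] ** 2) / (2.0 * D[a, b])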

“Project to orthogonal plane”
- Given the coordinates along xa-xb, we can compute distances within the “orthogonal hyperplane” using the Pythagorean theorem: d'(y',z')^2 = d(y,z)^2 - (cy - cz)^2.
- Using d'(.,.), recurse until k features have been chosen.
[Figure: y and z projected to y' and z' on the hyperplane orthogonal to xa-xb]
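Putting the pieces together, a sketch of the full recursion that reuses the choose_pivots and pseudo_projection helpers above (the clamping to zero guards against small negative values from floating-point error; the early exit is an implementation choice, not from the slides):

    import numpy as np

    def fastmap(D, k):
        """D: (N, N) distance matrix; returns an (N, k) FastMap embedding."""
        N = D.shape[0]
        X = np.zeros((N, k))
        D = D.astype(float).copy()
        for col in range(k):
            a, b = choose_pivots(D)
            if D[a, b] == 0:                      # all remaining distances are zero
                break
            # one new coordinate per point via the law-of-cosines projection
            c = np.array([pseudo_projection(D, a, b, y) for y in range(N)])
            X[:, col] = c
            # distances within the orthogonal hyperplane (Pythagorean theorem)
            D = np.sqrt(np.maximum(D ** 2 - (c[:, None] - c[None, :]) ** 2, 0.0))
        return X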

Random Projections
- Based on the Johnson-Lindenstrauss lemma: for any (sufficiently large) set S of M points in R^n and k = O(ε^-2 ln M), there exists a linear map f: S -> R^k such that (1 - ε) D(u,v) < D(f(u), f(v)) < (1 + ε) D(u,v) for all u, v in S.
- A random projection satisfies this with constant probability.

Random Projection: Application
- Set k = O(ε^-2 ln M).
- Select k random n-dimensional vectors (one approach is to draw Gaussian-distributed vectors with mean 0 and variance 1, i.e. entries from N(0,1)).
- Project the original points onto the k vectors.
- The resulting k-dimensional space approximately preserves the distances with high probability.
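A minimal sketch of that recipe; the 1/sqrt(k) scaling is the usual normalization that keeps squared distances unchanged in expectation (it is not spelled out on the slide):

    import numpy as np

    def random_projection(X, k, seed=0):
        """X: (M, n) data matrix. Project onto k random Gaussian directions."""
        rng = np.random.default_rng(seed)
        R = rng.normal(loc=0.0, scale=1.0, size=(X.shape[1], k))   # entries drawn from N(0, 1)
        return X @ R / np.sqrt(k)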

Random Projection
- A very useful technique, especially when used in conjunction with another technique (for example SVD).
- Use random projection to reduce the dimensionality from thousands to a few hundred, then apply SVD to reduce the dimensionality further.
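For example, a toy illustration that chains the random_projection and pca_svd sketches above (the data and the target dimensions 300 and 20 are made up for the example):

    import numpy as np

    X = np.random.default_rng(1).normal(size=(1000, 5000))   # toy high-dimensional data
    X_rp = random_projection(X, k=300)                        # thousands -> a few hundred dims
    X_final = pca_svd(X_rp, k=20)                             # then SVD for the final reduction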