
Nonlinear Dimensionality Reduction for Hyperspectral Image Classification
Tim Doster
Advisors: John Benedetto & Wojciech Czaja

Outline
Introduction
Dimensionality Reduction Algorithms
Implementation
Challenges
Validation
Next Semester

High Dimensional Data
Data that contains many readings for each object. Examples include:
– Medical
– Identification and Biometrics
– Polling and Questionnaires
– Imaging: Hyperspectral Images

Hyperspectral Imagery
A black and white digital image contains 1 spectral band
– A numerical value represents intensity, for example an integer between 0 and 255
A color digital image contains 3 spectral bands
– Red, blue and green
A hyperspectral digital image contains hundreds of spectral bands
– Visible, infrared, ultraviolet

Hyperspectral Imagery

Hyperspectral Cube
Each pixel has an associated vector of sensor readings at specific wavelengths.
Typical images are at least 1000x1000 pixels with 150 bands: 150 million data values, or roughly 300 megabytes (assuming about 2 bytes per reading).

Dimensionality Reduction
Because of the volume of data, we seek to reduce the dimensionality of the data, making analysis, transmission, and storage easier, without sacrificing too much of its intrinsic structure.
We also seek to make differences between pixels more pronounced by discarding redundant information.

Techniques For Dimensionality Reduction Used
Local Linear Embedding (LLE)
ISOMAP

Local Linear Embedding
Step 1: Find the k nearest neighbors (KNN) for each point in the data set using the Euclidean metric.
Step 2: Find the linear combination (reconstruction weights), W_i, that best reconstructs each pixel from its KNN.
– We are creating a hyperplane through each point which is invariant to rotation, translation and stretching
– The W_i's form a matrix W called the weight matrix
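A minimal brute-force sketch of Step 1 (illustrative only; the names and data layout are not from the real code, which relies on optimized nearest-neighbor routines):

    #include <algorithm>
    #include <cmath>
    #include <cstddef>
    #include <utility>
    #include <vector>

    using Point = std::vector<double>;

    // Squared Euclidean distance between two points of equal dimension.
    double squaredEuclidean(const Point& a, const Point& b) {
        double s = 0.0;
        for (std::size_t i = 0; i < a.size(); ++i) {
            double d = a[i] - b[i];
            s += d * d;
        }
        return s;
    }

    // For each point, return the indices of its k nearest neighbors
    // (assumes k < data.size()).
    std::vector<std::vector<std::size_t>> knn(const std::vector<Point>& data, std::size_t k) {
        std::vector<std::vector<std::size_t>> neighbors(data.size());
        for (std::size_t i = 0; i < data.size(); ++i) {
            std::vector<std::pair<double, std::size_t>> dist;
            for (std::size_t j = 0; j < data.size(); ++j)
                if (j != i) dist.push_back({squaredEuclidean(data[i], data[j]), j});
            std::partial_sort(dist.begin(), dist.begin() + k, dist.end());
            for (std::size_t m = 0; m < k; ++m) neighbors[i].push_back(dist[m].second);
        }
        return neighbors;
    }

This brute-force search is O(N^2 D) and is only meant to show the computation; for large images a spatial data structure or approximate search is needed.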

Local Linear Embedding
Step 3: We minimize the cost function
Φ(Y) = Σ_i || y_i − Σ_j W_ij y_j ||^2
to compute the lower dimensional embedding Y. It can be shown that this is equivalent to finding the eigenvectors associated with the d smallest nonzero eigenvalues of
M = (I − W)^T (I − W)

Local Linear Embedding

ISOMAP
Step 1: Find the KNN for each point in the data set using the Euclidean metric.
Step 2: Calculate S, the matrix of pairwise shortest geodesic distances between all data points.
Step 3: Minimize the cost function
E(Y) = || τ(S) − τ(D_Y) ||_2,
where Y is the low dimensional embedding, D_Y holds the pairwise Euclidean distances between the embedded points, and τ converts distances to inner products (defined on the next slide).
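The geodesic distances in Step 2 are approximated by shortest paths in the KNN graph (after symmetrizing the neighbor relation). A minimal sketch using the Floyd-Warshall algorithm; repeated Dijkstra is cheaper for large, sparse graphs, and the names here are illustrative:

    #include <algorithm>
    #include <cstddef>
    #include <vector>

    // Approximate geodesic distances as shortest paths in the KNN graph.
    // graph[i][j] holds the Euclidean distance if j is a neighbor of i,
    // and infinity otherwise; the result is the matrix S of Step 2.
    std::vector<std::vector<double>> geodesicDistances(std::vector<std::vector<double>> graph) {
        const std::size_t n = graph.size();
        for (std::size_t k = 0; k < n; ++k)
            for (std::size_t i = 0; i < n; ++i)
                for (std::size_t j = 0; j < n; ++j)
                    graph[i][j] = std::min(graph[i][j], graph[i][k] + graph[k][j]);
        return graph;
    }

Floyd-Warshall is O(N^3), which is another reason the landmark idea discussed later becomes attractive for full images.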

ISOMAP Step 3 – Alternative
This minimization can be shown to be equivalent to solving the eigenproblem for
τ(S) = −(1/2) H S² H,
where S² denotes the matrix of squared geodesic distances and H = I − (1/N) 1 1^T is the centering matrix. The embedding is obtained from the eigenvectors associated with the d largest eigenvalues of τ(S), each scaled by the square root of its eigenvalue.
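A small sketch of the double-centering operation τ, assuming S is stored as a dense matrix of geodesic distances (illustrative only; the row/column/grand-mean form below is equivalent to the matrix product −(1/2) H S² H because S is symmetric):

    #include <cstddef>
    #include <vector>

    // Double-center the squared geodesic distances: tau = -0.5 * H * S2 * H.
    std::vector<std::vector<double>> doubleCenter(const std::vector<std::vector<double>>& S) {
        const std::size_t n = S.size();
        std::vector<std::vector<double>> S2(n, std::vector<double>(n));
        std::vector<double> rowMean(n, 0.0);
        double grandMean = 0.0;
        for (std::size_t i = 0; i < n; ++i)
            for (std::size_t j = 0; j < n; ++j) {
                S2[i][j] = S[i][j] * S[i][j];   // element-wise square
                rowMean[i] += S2[i][j] / n;
            }
        for (std::size_t i = 0; i < n; ++i) grandMean += rowMean[i] / n;
        std::vector<std::vector<double>> tau(n, std::vector<double>(n));
        for (std::size_t i = 0; i < n; ++i)
            for (std::size_t j = 0; j < n; ++j)
                tau[i][j] = -0.5 * (S2[i][j] - rowMean[i] - rowMean[j] + grandMean);
        return tau;
    }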

Comparing the two
ISOMAP: preserves pairwise geodesic distances between points
LLE: preserves local properties only, e.g., relative local distances

Implementation
Initial prototyping in IDL and experimentation with the Matlab DR Toolbox
LLE and ISOMAP were implemented in C++
Programs take as input the number of nearest neighbors (k), the intrinsic dimensionality (dim), and the data (data.in)
The programs return the embedding in a .out data file
Example: >> lle 3 2 data.in

Implementation
Made use of platform-optimized libraries for matrix operations
Tried to use memory efficiently (still a work in progress)
– Calling destructors promptly
– Sparse matrices and solvers
– Good cache management
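Since each row of the LLE weight matrix W has only k nonzero entries, sparse storage saves a great deal of memory. A minimal compressed-sparse-row sketch (the structure and function names are illustrative, not the actual classes used):

    #include <cstddef>
    #include <vector>

    // Compressed sparse row (CSR) storage for the LLE weight matrix W:
    // each row holds only its k nonzero reconstruction weights.
    struct SparseMatrixCSR {
        std::size_t rows = 0, cols = 0;
        std::vector<double> values;       // nonzero entries, row by row
        std::vector<std::size_t> colIdx;  // column index of each nonzero
        std::vector<std::size_t> rowPtr;  // rowPtr[i]..rowPtr[i+1] indexes row i's entries
    };

    // y = A * x for a CSR matrix: the kind of kernel an iterative sparse solver needs.
    std::vector<double> multiply(const SparseMatrixCSR& A, const std::vector<double>& x) {
        std::vector<double> y(A.rows, 0.0);
        for (std::size_t i = 0; i < A.rows; ++i)
            for (std::size_t p = A.rowPtr[i]; p < A.rowPtr[i + 1]; ++p)
                y[i] += A.values[p] * x[A.colIdx[p]];
        return y;
    }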

Libraries Used
Boost Libraries (uBLAS)
– BLAS Level I, II, III calculations
– Dot product, vector norm, matrix-vector product, matrix-matrix product, Ax=b
ARPACK
– Sparse eigenproblem solver
LAPACK
– General eigenproblem solver
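A brief illustration of the kind of operation these libraries supply, here forming the LLE matrix M = (I − W)^T (I − W) with Boost uBLAS (a dense sketch with an illustrative function name; in practice W is kept sparse and the eigenvectors for the d smallest nonzero eigenvalues are computed with ARPACK or LAPACK):

    #include <cstddef>
    #include <boost/numeric/ublas/matrix.hpp>

    namespace ublas = boost::numeric::ublas;

    // Form M = (I - W)^T (I - W), the matrix whose bottom eigenvectors
    // give the LLE embedding.
    ublas::matrix<double> formM(const ublas::matrix<double>& W) {
        const std::size_t n = W.size1();
        ublas::identity_matrix<double> I(n);
        ublas::matrix<double> A = I - W;          // A = I - W
        return ublas::prod(ublas::trans(A), A);   // M = A^T A
    }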

Data files
Contain a header line with the number of rows, columns, and original dimensionality
When using actual images, rows and columns are necessary to re-form the image
Values are comma separated for easy parsing in both C++ and Matlab
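A minimal reader for this format (a sketch that assumes one comma-separated record per line after the "rows,cols,dims" header; function and variable names are illustrative):

    #include <cstddef>
    #include <fstream>
    #include <sstream>
    #include <string>
    #include <vector>

    // Read a data file: header "rows,cols,dims" followed by comma-separated
    // values, one pixel (a vector of 'dims' numbers) per line.
    bool readDataFile(const std::string& path, std::size_t& rows, std::size_t& cols,
                      std::size_t& dims, std::vector<std::vector<double>>& pixels) {
        std::ifstream in(path);
        if (!in) return false;
        char comma;
        in >> rows >> comma >> cols >> comma >> dims;
        std::string rest;
        std::getline(in, rest);                  // consume the rest of the header line
        std::string line;
        while (std::getline(in, line)) {
            if (line.empty()) continue;
            std::vector<double> v;
            std::stringstream ss(line);
            std::string field;
            while (std::getline(ss, field, ','))
                v.push_back(std::stod(field));
            pixels.push_back(v);
        }
        return true;
    }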

Program IO
data.in: header line 1,8,3 (rows, columns, original dimensionality) followed by the comma-separated 3-dimensional input points
data.out: header line followed by the comma-separated coordinates of the low dimensional embedding

Profiling
bash-3.2$ ./lle 12 2 in.data
Data has been read in 1... seconds
Pairwise distance matrix found ... seconds
k nearest neighbors found 8.2e-05 seconds
Reconstruction Weights found ... seconds
Output written to file ... seconds
Program Done
Runtime: ... seconds

Possible Optimizations
Better use of sparse matrices
Memory vs. processor program paths
– Is it better to store intermediate results that will be used again, or to recompute them and save memory?
– The answer can likely be arrived at experimentally

Initial Validation Steps
Exactly match the established Matlab output for a small data set (10 points, 3 dimensions)

Initial Validation Step – LLE
Side-by-side table of embedding coordinates from the C++ implementation and from the Matlab DR Toolbox (the two outputs matched).

Validation
Make use of artificial data sets, commonly used in manifold learning, which are defined in 3 dimensions but actually lie on a 2-dimensional manifold
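For example, the swiss roll is one standard such data set (the specific sets used here are not named on the slide); a small sketch that generates it:

    #include <array>
    #include <cmath>
    #include <cstddef>
    #include <random>
    #include <vector>

    // Generate a "swiss roll": points on a 2-D sheet rolled up in 3-D.
    // The intrinsic coordinates are (t, h); a good 2-D embedding should
    // recover them up to rotation and scaling.
    std::vector<std::array<double, 3>> swissRoll(std::size_t n, unsigned seed = 42) {
        const double pi = 3.141592653589793;
        std::mt19937 gen(seed);
        std::uniform_real_distribution<double> tDist(1.5 * pi, 4.5 * pi);
        std::uniform_real_distribution<double> hDist(0.0, 20.0);
        std::vector<std::array<double, 3>> pts(n);
        for (std::size_t i = 0; i < n; ++i) {
            double t = tDist(gen), h = hDist(gen);
            pts[i][0] = t * std::cos(t);
            pts[i][1] = h;
            pts[i][2] = t * std::sin(t);
        }
        return pts;
    }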

Validation

Validation - Tests

Validation Tests
Trustworthiness, T(k): penalizes points that appear among the k nearest neighbors in the embedding but were not neighbors in the original data (points drawn too close together)
– We want values close to 1
Continuity, C(k): penalizes original neighbors that are pushed apart in the embedding (points placed too far apart)
– We want values close to 1
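For reference, one common formulation (due to Venna and Kaski; the exact variant computed by the Matlab DR Toolbox may differ in detail) is
T(k) = 1 − 2 / (N k (2N − 3k − 1)) · Σ_i Σ_{j ∈ U_k(i)} (r(i, j) − k),
where U_k(i) is the set of points among the k nearest neighbors of point i in the embedding but not in the original space, and r(i, j) is the rank of j among the neighbors of i in the original space. C(k) is defined symmetrically, with the roles of the original and embedded spaces exchanged.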

Challenges Encountered
Using external libraries
– Linking with the program
– Different data structures
– Sparsity
– Matlab is really great at doing these things for you
Matching output to Matlab for large data sets
– Eigensolvers are all slightly different

What is next? Landmark Points
The basic idea is to build the neighborhood graph from a small subset of the available points (the landmarks) and find the low dimensional embedding of those points. Points not chosen as landmarks then use the embeddings of their k nearest landmarks to define their own embedding. We sacrifice embedding accuracy for vastly improved speed and memory usage.
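One simple interpolation scheme, sketched below, places each non-landmark point at a distance-weighted average of the embeddings of its k nearest landmarks (illustrative only; the scheme actually implemented next semester may differ, and all names here are hypothetical):

    #include <algorithm>
    #include <cstddef>
    #include <utility>
    #include <vector>

    // Embed a non-landmark point from its k nearest landmarks.
    // 'distToLandmarks' holds the point's distance to every landmark;
    // 'landmarkEmb' holds the already-computed landmark embeddings.
    std::vector<double> embedFromLandmarks(const std::vector<double>& distToLandmarks,
                                           const std::vector<std::vector<double>>& landmarkEmb,
                                           std::size_t k) {
        std::vector<std::pair<double, std::size_t>> order;
        for (std::size_t j = 0; j < distToLandmarks.size(); ++j)
            order.push_back({distToLandmarks[j], j});
        std::partial_sort(order.begin(), order.begin() + k, order.end());

        const std::size_t d = landmarkEmb[0].size();
        std::vector<double> y(d, 0.0);
        double wSum = 0.0;
        for (std::size_t m = 0; m < k; ++m) {
            double w = 1.0 / (order[m].first + 1e-12);   // inverse-distance weight
            wSum += w;
            for (std::size_t c = 0; c < d; ++c)
                y[c] += w * landmarkEmb[order[m].second][c];
        }
        for (std::size_t c = 0; c < d; ++c) y[c] /= wSum;
        return y;
    }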

Next Semester
January
– Link C++ code with IDL/ENVI
– Optimize code for memory usage
February and March
– Implement landmarks and validate algorithms
April
– Use algorithms with and without landmarks on hyperspectral classification images
May
– Prepare final presentation
If time permits:
– Parallel implementation, other NLDR algorithms, automatic KNN selection

References
An Introduction to Locally Linear Embedding – Saul and Roweis, 2001
A Global Geometric Framework for Nonlinear Dimensionality Reduction – Tenenbaum, de Silva and Langford, 2000
Out-of-Sample Extensions for LLE, Isomap, MDS, Eigenmaps, and Spectral Clustering – Bengio, Paiement and Vincent, 2003
Exploiting Manifold Geometry in Hyperspectral Imagery – Bachmann, Ainsworth and Fusina, 2005
Dimensionality Reduction: A Comparative Review – van der Maaten, Postma and van den Herik, 2008

Any Questions?

Example Data Set – Land Classification
PROBE1 HSI image and ground classification (figures)

Computation Costs
For example, LLE is:
– KNN: O(N log N)
– Finding weights: O(N D K^3)
– Solving eigenproblem: O(d N^2)
Since N is the number of pixels in the image, these calculations can become daunting
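To put rough numbers on this (using the 1000x1000-pixel figure from earlier): N = 10^6, so the O(d N^2) eigenproblem term alone is on the order of 2 x 10^12 operations even for d = 2, and a dense N x N double-precision matrix would need about 8 terabytes, which is what motivates the landmark approach.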

Software Flow
Hyperspectral image OR artificial data set
→ Read into IDL: create data cube
→ Read into C++: DR algorithm (ANN, LAPACK, BLAS, landmarks)
→ Send to IDL: display in ENVI, find measures, image classifier algorithm