Dimensionality Reduction Part 1 of 2

Presentation transcript:

Dimensionality Reduction Part 1 of 2
Emily M. and Greg C.

"Look for the bare necessities / The simple bare necessities / Forget about your worries and your strife / I mean the bare necessities / Old Mother Nature's recipes / That bring the bare necessities of life" – Baloo's song [The Jungle Book]

[Slide image: "The real Baloo" – "Sloth Bear Washington DC" by Asiir, Public Domain via Wikimedia Commons: https://commons.wikimedia.org/wiki/File:Sloth_Bear_Washington_DC.JPG]

http://scikit-learn.org/stable/modules/manifold.html

Dimensionality Reduction: Outline
- Definition and examples
- Principal Component Analysis and Singular Value Decomposition
- Reflections on dimensionality reduction
- "Pset" office hours

Dimensionality Reduction
Each datum is a vector with m values, aka dimensions:
- Image data: reshape the pixel array into one long vector; m = # pixels (256^2)
- Brain imaging data: m = # voxels (10^5)
- Generic feature vectors: m = # features (??)
Dimensionality Reduction: a procedure that decreases a dataset's dimensions from m to n, n < m.
[Slide figure: a small array of values (1, -5, 6, 3, 1, -5) being reshaped into a single vector; clipart from openclipart.org]
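As a concrete sketch of the reshape-then-reduce pipeline (hypothetical images; numpy and scikit-learn assumed, not part of the original demo):

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical dataset: 500 grayscale "images", each 256 x 256 pixels.
images = np.random.rand(500, 256, 256)

# Reshape: each datum becomes one vector with m = 256^2 = 65536 dimensions.
X = images.reshape(len(images), -1)      # shape (500, 65536)

# Dimensionality reduction: go from m dimensions down to n = 10, n < m.
X_reduced = PCA(n_components=10).fit_transform(X)
print(X.shape, "->", X_reduced.shape)    # (500, 65536) -> (500, 10)
```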

Motivation
- Visualization
- Discovering structure
- Data compression
- Noise/artifact detection
[Slide figures: nonlinear dimensionality reduction of a swiss-roll dataset ("Nldr", Wikipedia, Public Domain; "Lle hlle swissroll" by Olivier Grisel, CC BY 3.0), independent component analysis in EEGLAB (by Walej, CC BY-SA 4.0), and the scikit-learn LLE digits example: http://scikit-learn.org/stable/auto_examples/manifold/plot_lle_digits.html]

How to represent data?

How to represent data? Introduce basis

How to represent data? New basis

How to represent data? The same data, shown in the original basis and in the new basis.

How to represent data? Recode the data: express each point in coordinates of the new basis.

How to represent data? New basis
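To make the recoding step concrete, here is a minimal numpy sketch with hypothetical 2-D points and an assumed orthonormal basis (the standard axes rotated by 45 degrees); none of these numbers come from the lecture:

```python
import numpy as np

# Hypothetical 2-D data (rows are data points) expressed in the original basis.
X = np.array([[2.0, 1.0],
              [1.0, 0.5],
              [-1.0, -0.4],
              [3.0, 1.6]])

# A new orthonormal basis: the standard axes rotated by 45 degrees.
theta = np.pi / 4
B = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])   # columns are the new basis vectors

# Recode the data: coordinates of each point with respect to the new basis.
X_new = X @ B            # for an orthonormal basis, recoding is just this projection
print(X_new)

# Mapping back to the original basis recovers the data exactly.
print(np.allclose(X_new @ B.T, X))                # True
```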

How to represent data? PCA finds the directions of greatest variance in your data, by calculating the eigenvectors of the covariance matrix.
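A minimal numpy sketch of that recipe on hypothetical data (mean-subtract, form the covariance matrix, take its eigenvectors, project); the data and the choice of 2 components are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical data: 200 points in 3 dimensions with correlated coordinates.
X = rng.normal(size=(200, 3)) @ np.array([[3.0, 0.0, 0.0],
                                          [1.0, 1.0, 0.0],
                                          [0.5, 0.2, 0.3]])

# 1. Mean-subtract the data.
Xc = X - X.mean(axis=0)

# 2. Covariance matrix (features x features).
C = np.cov(Xc, rowvar=False)

# 3. Eigenvectors of the covariance matrix = directions of greatest variance.
eigvals, eigvecs = np.linalg.eigh(C)      # eigh, since C is symmetric
order = np.argsort(eigvals)[::-1]         # sort by decreasing variance
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# 4. Project onto the top 2 principal components.
X_reduced = Xc @ eigvecs[:, :2]
print(eigvals)                            # variance along each principal direction
print(X_reduced.shape)                    # (200, 2)
```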

The Data: Spike data from monkey motor cortex, recorded while the monkey performed a reaching task (Georgopoulos et al., 1982).

The Data: Spike data from monkey motor cortex, recorded while the monkey performed a reaching task. Each trial has 40 time points; there are 158 different trials (Georgopoulos et al., 1982).

See MATLAB... https://www.dropbox.com/sh/hreuhjzuqfe5rpj/AAC-LydSSpRm9Hce9HnIRtwRa?dl=0
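The linked demo is MATLAB; the sketch below is a rough Python approximation of the same steps, with randomly generated Poisson "spike counts" standing in for the actual recordings (the stand-in data and the choice of 3 components are assumptions, not the demo's):

```python
import numpy as np
from sklearn.decomposition import PCA

# Stand-in for the spike data: 158 trials x 40 time points of Poisson spike counts.
rng = np.random.default_rng(0)
spikes = rng.poisson(lam=5.0, size=(158, 40)).astype(float)

# Reduce each 40-dimensional trial to 3 numbers and check how much variance survives.
pca = PCA(n_components=3)
scores = pca.fit_transform(spikes)       # PCA mean-subtracts the data internally
print(scores.shape)                      # (158, 3): one 3-D point per trial
print(pca.explained_variance_ratio_)     # fraction of variance captured per component
```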

SVD (singular value decomposition)

Rewrite the mean-subtracted data matrix as a weighted sum of rank-1 matrices (one term per singular value).
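A minimal numpy sketch of that idea on hypothetical data: the SVD writes the mean-subtracted matrix as a sum of rank-1 terms s[i] * u_i * v_i^T, and truncating the sum gives a low-rank approximation. The data below are random placeholders shaped like the lecture's (trials x time points):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(158, 40))
X = X - X.mean(axis=0)                               # mean-subtracted data

U, s, Vt = np.linalg.svd(X, full_matrices=False)

# The SVD writes X exactly as a sum of rank-1 matrices s[i] * u_i * v_i^T ...
X_rebuilt = sum(s[i] * np.outer(U[:, i], Vt[i]) for i in range(len(s)))
print(np.allclose(X, X_rebuilt))                     # True

# ... and keeping only the first k terms gives a rank-k approximation of X.
k = 3
X_k = sum(s[i] * np.outer(U[:, i], Vt[i]) for i in range(k))
print(np.linalg.norm(X - X_k) / np.linalg.norm(X))   # relative error of the truncation
```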

PCA can “fail”: PCA discovers the intrinsic structure of the data's variance (1st eigenvector, 2nd eigenvector)... but suppose you know there are two different classes (red and black in the slide's figure). PCA ignores the labels and keeps the highest-variance direction, which can mix the two classes together. Use Linear Discriminant Analysis instead.
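A small scikit-learn sketch of this failure mode, using made-up "red" and "black" classes whose separating direction has low variance (the data and parameters are illustrative assumptions, not the slide's example):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Two made-up classes: huge variance along x, but the classes differ only along y.
rng = np.random.default_rng(0)
red   = np.column_stack([rng.normal(0, 5, 200), rng.normal(+1, 0.3, 200)])
black = np.column_stack([rng.normal(0, 5, 200), rng.normal(-1, 0.3, 200)])
X = np.vstack([red, black])
y = np.array([0] * 200 + [1] * 200)

# PCA (unsupervised) keeps the high-variance x direction, which mixes the classes...
x_pca = PCA(n_components=1).fit_transform(X)

# ...while LDA (supervised) picks the direction that best separates the labels.
x_lda = LinearDiscriminantAnalysis(n_components=1).fit_transform(X, y)

print("PCA class means:", x_pca[y == 0].mean(), x_pca[y == 1].mean())  # nearly equal
print("LDA class means:", x_lda[y == 0].mean(), x_lda[y == 1].mean())  # well separated
```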

Dimensionality Reduction Taxonomy
- Supervised: Fisher LDA, Neural Network. Unsupervised: PCA/SVD, ICA, t-SNE, ISOMAP, Neural Network.
- Linear: PCA/SVD, ICA, LDA. Nonlinear: t-SNE, ISOMAP, MDS.
- Out-of-sample extension (given a new sample, can you reduce its dimension with a pre-learned mapping?): methods that learn a mapping (PCA, ICA, LDA) can; visualization-oriented methods (t-SNE, ISOMAP, MDS) cannot.
https://en.wikipedia.org/wiki/Nonlinear_dimensionality_reduction
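A brief scikit-learn illustration of the out-of-sample distinction (hypothetical data; this assumes scikit-learn's t-SNE, which exposes no transform for unseen points):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 10))
x_new = rng.normal(size=(1, 10))            # a sample that arrives after training

# PCA learns an explicit linear mapping, so the new sample can be projected directly.
pca = PCA(n_components=2).fit(X_train)
print(pca.transform(x_new).shape)           # (1, 2): out-of-sample extension works

# scikit-learn's t-SNE only embeds the points it was fit on; it has no transform()
# for unseen samples, so here it acts as a visualization method rather than a mapping.
embedding = TSNE(n_components=2, perplexity=30).fit_transform(X_train)
print(embedding.shape)                      # (100, 2)
print(hasattr(TSNE, "transform"))           # False
```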

Summary
- Dimensionality reduction: removing information to emphasize information
- PCA and SVD: powerful, unsupervised, linear methods
- Enormous variety of techniques
- Independent component analysis (Thursday)

References & Further Reading
Readings:
- http://research.microsoft.com/pubs/150728/FnT_dimensionReduction.pdf
- https://lvdmaaten.github.io/publications/papers/TR_Dimensionality_Reduction_Review_2009.pdf
- http://infolab.stanford.edu/~ullman/mmds/bookL.pdf
Software:
- https://lvdmaaten.github.io/drtoolbox/
- Python LMNN: http://www.shogun-toolbox.org/static/notebook/current/LMNN.html
- http://www.cs.cmu.edu/~liuy/distlearn.htm