
EE 290A: Generalized Principal Component Analysis
Lecture 2 (by Allen Y. Yang): Extensions of PCA
Sastry & Yang © Spring 2011, EE 290A, University of California, Berkeley

Last time
Challenges in modern data clustering problems.
PCA reduces the dimensionality of the data while retaining as much of the data variation as possible.
Statistical view: the first d PCs are given by the d leading eigenvectors of the sample covariance.
Geometric view: fitting a d-dimensional subspace model to the data via the SVD.
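A minimal numpy sketch of how the two views coincide (the data and sizes below are illustrative, not from the lecture): the leading eigenvectors of the sample covariance and the leading right singular vectors of the centered data matrix span the same d-dimensional subspace.

```python
import numpy as np

rng = np.random.default_rng(0)
# 500 samples in D = 10 dimensions, lying near a 3-dimensional subspace.
X = rng.standard_normal((500, 3)) @ rng.standard_normal((3, 10))
X += 0.01 * rng.standard_normal(X.shape)
Xc = X - X.mean(axis=0)                              # center the data
d = 3

# Statistical view: d leading eigenvectors of the sample covariance.
eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
pcs_cov = eigvecs[:, ::-1][:, :d]                    # eigh returns ascending order

# Geometric view: d leading right singular vectors of the centered data matrix.
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
pcs_svd = Vt[:d].T

# The two bases span the same subspace (compare the projection matrices).
print(np.allclose(pcs_cov @ pcs_cov.T, pcs_svd @ pcs_svd.T, atol=1e-6))
```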

This lecture
Determining an optimal number of PCs, d.
Probabilistic PCA.
Kernel PCA.
(Robust PCA will be discussed later.)

Determining the number of PCs
Choosing the optimal number of PCs in the noise-free case is straightforward: d is the rank of the mean-subtracted data matrix, i.e., the number of its nonzero singular values.

In the noisy case
With noise, all singular values of the data matrix are nonzero, so the rank criterion no longer applies directly. (Figure on the slide: the singular-value spectrum, with d chosen at the "knee point" where the spectrum drops sharply and then flattens.)
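A minimal sketch of two common heuristics for picking d from the singular values (the function name choose_d, the 95% energy threshold, and the synthetic data are illustrative assumptions, not from the lecture): keep enough PCs to explain a fixed fraction of the variance, or take the largest relative drop in the spectrum as the knee.

```python
import numpy as np

def choose_d(X, energy=0.95):
    """Pick the number of PCs from the singular values of the centered data.

    Returns an energy-threshold estimate and a simple knee estimate
    (largest relative gap between consecutive singular values).
    """
    Xc = X - X.mean(axis=0)
    s = np.linalg.svd(Xc, compute_uv=False)           # descending singular values
    var = s**2
    cum = np.cumsum(var) / var.sum()
    d_energy = int(np.searchsorted(cum, energy) + 1)  # smallest d explaining >= energy of the variance
    gaps = s[:-1] / s[1:]                             # relative drops sigma_i / sigma_{i+1}
    d_knee = int(np.argmax(gaps) + 1)                 # knee = largest relative drop
    return d_energy, d_knee

rng = np.random.default_rng(1)
basis = np.linalg.qr(rng.standard_normal((12, 3)))[0]             # orthonormal 3-dim basis in R^12
X = (rng.standard_normal((300, 3)) * [5.0, 4.0, 3.0]) @ basis.T   # three comparable signal directions
X += 0.05 * rng.standard_normal((300, 12))                        # small isotropic noise
print(choose_d(X))                                                # expect (3, 3) for this data
```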

A model selection problem
With even moderate Gaussian noise, keeping 100% fidelity of the data requires preserving all D dimensions. Can we nevertheless find a tradeoff between model complexity (the subspace dimension d) and data fidelity (the residual error)?

More principled conditions
Model-selection criteria make this tradeoff explicit by penalizing the residual fit (data fidelity) with a term that grows with the subspace dimension d (model complexity).

Probabilistic PCA: a generative approach
Model each observation as generated by a low-dimensional latent variable:
x = W y + μ + ε, (*)
where y ∈ R^d is the latent variable, W ∈ R^(D×d) maps it into the data space, μ is the mean, and ε is additive noise.

Given only the sample statistics, the model (*) contains ambiguities: different choices of W, y, and ε can explain the same data.
Assume y is standard normal, y ~ N(0, I), and ε is isotropic Gaussian noise, ε ~ N(0, σ²I), independent of y.
Then each observation is also Gaussian: x ~ N(μ, W Wᵀ + σ²I).
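A minimal sketch of the generative model under these assumptions (the dimensions, W, μ, and σ below are made-up illustrative values): samples drawn from the model have empirical covariance close to W Wᵀ + σ²I.

```python
import numpy as np

rng = np.random.default_rng(2)
D, d, sigma, n = 8, 2, 0.3, 200_000                  # illustrative sizes, not from the lecture
W = rng.standard_normal((D, d))
mu = rng.standard_normal(D)

Y = rng.standard_normal((n, d))                      # latent: y ~ N(0, I)
E = sigma * rng.standard_normal((n, D))              # noise:  eps ~ N(0, sigma^2 I)
X = Y @ W.T + mu + E                                 # observations: x = W y + mu + eps

C_model = W @ W.T + sigma**2 * np.eye(D)             # predicted covariance W W^T + sigma^2 I
C_empirical = np.cov(X, rowvar=False)
print(np.abs(C_empirical - C_model).max())           # small; Monte Carlo error shrinks like 1/sqrt(n)
```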

Determining the principal axes by MLE
Compute the log-likelihood L of the n samples under the Gaussian model above. Setting the gradient of L with respect to the parameters to zero yields the stationary points.

Two nontrivial solutions
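A minimal sketch of the well-known closed-form ML solution for probabilistic PCA (the Tipping–Bishop result; the function name and test data are illustrative): the noise variance is the average of the discarded eigenvalues of the sample covariance, and the columns of W are the leading eigenvectors scaled by the square root of the excess eigenvalue, up to an arbitrary rotation.

```python
import numpy as np

def ppca_ml(X, d):
    """Closed-form ML estimates for probabilistic PCA (W is recovered up to a rotation)."""
    mu = X.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(X, rowvar=False))
    eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]           # sort in descending order
    sigma2 = eigvals[d:].mean()                                  # average of the D - d discarded eigenvalues
    W = eigvecs[:, :d] * np.sqrt(eigvals[:d] - sigma2)           # U_d (Lambda_d - sigma^2 I)^{1/2}
    return W, sigma2, mu

rng = np.random.default_rng(3)
X = rng.standard_normal((5000, 3)) @ rng.standard_normal((3, 8)) + 0.2 * rng.standard_normal((5000, 8))
W, sigma2, mu = ppca_ml(X, d=3)
print(round(sigma2, 3))                                          # close to the true noise variance 0.2**2 = 0.04
```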


Kernel PCA: for nonlinear data
When the data lie near a nonlinear manifold rather than a linear subspace, first map them into a higher-dimensional feature space through a nonlinear embedding, then apply PCA there.

Example

Question: how do we recover the coefficients?
Compute the null space of the embedded data matrix; the polynomial's coefficient vector is the null-space direction.
The special polynomial embedding used here, which stacks the monomials of a given degree, is called the Veronese map.
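A minimal sketch of this idea for a conic in the plane (the circle and the specific embedding below are illustrative, not the lecture's example): embed each 2-D point with all monomials up to degree 2 (equivalently, the degree-2 Veronese map of the homogeneous coordinates) and read the conic's coefficients off the null space of the embedded data matrix.

```python
import numpy as np

rng = np.random.default_rng(4)
t = rng.uniform(0, 2 * np.pi, 100)
x, y = 2 * np.cos(t), 2 * np.sin(t)                  # points on the circle x^2 + y^2 - 4 = 0

# Degree-2 polynomial embedding of each point: [x^2, x*y, y^2, x, y, 1].
V = np.column_stack([x**2, x * y, y**2, x, y, np.ones_like(x)])

# The conic's coefficient vector spans the null space of the embedded data matrix.
_, S, Vt = np.linalg.svd(V)
c = Vt[-1]                                           # right singular vector of the smallest singular value
c /= c[0]                                            # normalize so the x^2 coefficient is 1
print(np.round(c, 3))                                # ~ [1, 0, 1, 0, 0, -4]
```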

Dimensionality issue in the embedding
Given the ambient dimension D and the order n, what is the dimension of the Veronese map? It equals the number of degree-n monomials in D variables, C(D + n − 1, n), which blows up quickly as D or n grows.
Question: can we find the higher-order nonlinear structures without explicitly calling the embedding function?
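A one-function sketch of how quickly this dimension grows (the function name is mine):

```python
from math import comb

def veronese_dim(D, n):
    """Number of degree-n monomials in D variables, i.e., the embedded dimension."""
    return comb(D + n - 1, n)

for D in (3, 10, 50):
    print(D, [veronese_dim(D, n) for n in (2, 3, 4)])
# e.g. D = 50, n = 4 already gives 292,825 embedded dimensions
```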

Nonlinear PCA
The nonlinear PCs are the principal components of the embedded samples in the feature space.

In the case where M is much larger than n
When the embedded dimension M greatly exceeds the number of samples n, it is cheaper to work with the n × n matrix of inner products between embedded samples than with the M × M covariance in the feature space.

Kernel PCA
The computations in NLPCA involve only inner products of the embedded samples, never the embedded samples themselves. Therefore the nonlinear mapping enters the PCA computation implicitly, without ever calling the embedding function. The inner product of two embedded samples is called the kernel function.

Kernel function
k(x, z) = ⟨φ(x), φ(z)⟩, where φ is the nonlinear embedding.

Computing NLPCs via the kernel matrix
Form the n × n kernel matrix K with entries K_ij = k(x_i, x_j), center it, and compute its leading eigenvectors; the coordinates of the samples along the nonlinear PCs follow from these eigenvectors and their eigenvalues.
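A minimal numpy sketch of this computation (the function name kernel_pca, the RBF kernel with its bandwidth, and the two-circles data are illustrative assumptions): build the kernel matrix, double-center it, eigendecompose, and scale the eigenvectors to get the NLPC coordinates.

```python
import numpy as np

def kernel_pca(X, kernel, d):
    """Project the samples onto the top-d nonlinear PCs using only the kernel."""
    n = len(X)
    K = np.array([[kernel(xi, xj) for xj in X] for xi in X])     # n x n kernel (Gram) matrix
    J = np.eye(n) - np.ones((n, n)) / n
    Kc = J @ K @ J                                               # center the embedded data implicitly
    eigvals, eigvecs = np.linalg.eigh(Kc)
    eigvals, eigvecs = eigvals[::-1][:d], eigvecs[:, ::-1][:, :d]
    return eigvecs * np.sqrt(np.maximum(eigvals, 0))             # coordinates of each sample on the NLPCs

rbf = lambda x, z, s=1.0: np.exp(-np.sum((x - z) ** 2) / (2 * s**2))

rng = np.random.default_rng(5)
t = rng.uniform(0, 2 * np.pi, 200)
X = np.column_stack([np.cos(t), np.sin(t)]) * rng.choice([1.0, 3.0], 200)[:, None]  # two concentric circles
Z = kernel_pca(X, rbf, d=2)
print(Z.shape)                                                   # (200, 2) nonlinear PC coordinates
```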

Examples of popular kernels
Polynomial kernel: k(x, z) = (xᵀz + c)^n.
Gaussian kernel (radial basis function): k(x, z) = exp(−‖x − z‖² / (2σ²)).
Intersection kernel (for histograms): k(x, z) = Σ_i min(x_i, z_i).
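A minimal sketch of these three kernels as plain functions (the parameter defaults are illustrative):

```python
import numpy as np

def polynomial_kernel(x, z, degree=2, c=1.0):
    """k(x, z) = (x.z + c)^degree"""
    return (np.dot(x, z) + c) ** degree

def gaussian_kernel(x, z, sigma=1.0):
    """k(x, z) = exp(-||x - z||^2 / (2 sigma^2))"""
    return np.exp(-np.sum((x - z) ** 2) / (2 * sigma**2))

def intersection_kernel(x, z):
    """k(x, z) = sum_i min(x_i, z_i), commonly used on histograms."""
    return np.minimum(x, z).sum()

x, z = np.array([0.2, 0.5, 0.3]), np.array([0.1, 0.6, 0.3])
print(polynomial_kernel(x, z), gaussian_kernel(x, z), intersection_kernel(x, z))
```

Any of them can be passed directly as the kernel argument of the kernel_pca sketch above, since the extra parameters have defaults.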