Dynamic Graphics, Principal Component Analysis
Ker-Chau Li, UCLA Department of Statistics

Xlisp-stat (demo)
- (plot-points x y)
- (scatterplot-matrix (list x y z u w))
- (spin-plot (list x y z))
- Link, remove, select, rescale
- Examples:
  (1) Rubber data (Rice's book)
  (2) Iris data
  (3) Boston Housing data

PCA (principal component analysis)
- A fundamental tool for reducing dimensionality by finding the projections with the largest variance
- (1) Data version
- (2) Population version
- Each has a number of variations
- (3) Let's begin with an illustration using (pca-model (list x y z))

Find a 2-D plane in 4-D space
- Generate 1000 data points in a unit disk on a 2-D plane.
- Generate 1000 data points in a ring outside the unit disk.
- Append the two data sets together in the 2-D plane; this gives the original x and y variables.
- Suppose we are given four variables x1, x2, x3, x4, which are just linear combinations of the original x and y variables.
- We shall use pca-model to find the original 2-D plane.
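For readers without Xlisp-stat, here is a minimal Python/NumPy sketch of the same construction (the 4x2 mixing matrix A below is an arbitrary choice for illustration): only two eigenvalues of the sample covariance matrix are non-negligible, and the top two eigenvectors recover the hidden 2-D plane.

```python
import numpy as np

rng = np.random.default_rng(0)

# 1000 points in the unit disk, 1000 points in a ring outside it
theta = rng.uniform(0, 2 * np.pi, 2000)
r = np.concatenate([np.sqrt(rng.uniform(0, 1, 1000)),   # disk
                    rng.uniform(1.5, 2.0, 1000)])        # ring
xy = np.column_stack([r * np.cos(theta), r * np.sin(theta)])  # original x, y

# four observed variables: linear combinations of the original x and y
A = np.array([[1.0, 0.5],
              [0.3, 2.0],
              [-1.0, 1.0],
              [0.7, -0.4]])          # arbitrary 4x2 mixing matrix
X = xy @ A.T                          # 2000 x 4 data matrix

# PCA: eigen-decomposition of the sample covariance matrix
cov = np.cov(X, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)
print(np.round(eigvals[::-1], 4))     # only two eigenvalues are non-zero

# project onto the top two eigenvectors to recover the 2-D plane
plane = X @ eigvecs[:, ::-1][:, :2]
```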

Data version
1. Construct the sample variance-covariance matrix.
2. Find its eigenvectors.
3. Projection: use each eigenvector to form a linear combination of the original variables.
4. The larger, the better: the k-th principal component is the projection with the k-th largest eigenvalue.
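The four steps translate directly into a short NumPy function. This is an illustrative sketch (not the pca-model code), assuming the data sit in an n-by-p matrix X.

```python
import numpy as np

def pca(X):
    """Data-version PCA: eigenvalues (descending) and principal component scores."""
    # 1. sample variance-covariance matrix
    S = np.cov(X, rowvar=False)
    # 2. eigenvectors (eigh returns them in ascending eigenvalue order)
    eigvals, eigvecs = np.linalg.eigh(S)
    order = np.argsort(eigvals)[::-1]              # largest first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # 3. projection: each eigenvector gives a linear combination
    #    of the (centered) original variables
    scores = (X - X.mean(axis=0)) @ eigvecs
    # 4. the k-th column of `scores` is the k-th principal component,
    #    i.e. the projection with the k-th largest eigenvalue
    return eigvals, scores
```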

Data version (alternative view)
- 1-D data matrix: rank 1
- 2-D data matrix: rank 2
- k-D data matrix: rank k
- Eigenvectors for the 1-D sample covariance matrix: rank 1
- Eigenvectors for the 2-D sample covariance matrix: rank 2
- Eigenvectors for the k-D sample covariance matrix: rank k
- Adding i.i.d. noise
- Connection with automatic basis curve finding (to be discussed later)
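A small numerical sketch of the rank argument, with made-up dimensions: a data matrix built from k underlying columns yields a sample covariance matrix with only k non-zero eigenvalues, and adding i.i.d. noise merely lifts the remaining eigenvalues a little.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, k = 500, 6, 2                    # n cases, p variables, true rank k

B = rng.normal(size=(n, k))            # k underlying columns
W = rng.normal(size=(k, p))            # mixing into p observed variables
X = B @ W                              # rank-k data matrix

S = np.cov(X, rowvar=False)
print(np.round(np.linalg.eigvalsh(S)[::-1], 3))        # only k non-zero eigenvalues

X_noisy = X + 0.1 * rng.normal(size=(n, p))            # add i.i.d. noise
S_noisy = np.cov(X_noisy, rowvar=False)
print(np.round(np.linalg.eigvalsh(S_noisy)[::-1], 3))  # k large + (p - k) small eigenvalues
```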

Population version
- Let the sample size tend to infinity.
- The sample covariance matrix converges to the population covariance matrix (by the law of large numbers).
- The rest of the steps remain the same.
- We shall use the population version for theoretical discussion.
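A quick numerical check of the convergence claim, using an arbitrary 2x2 population covariance matrix chosen for illustration: as the sample size grows, the sample covariance matrix approaches the population covariance matrix.

```python
import numpy as np

rng = np.random.default_rng(2)
Sigma = np.array([[2.0, 0.8],
                  [0.8, 1.0]])          # population covariance (illustrative choice)

for n in (100, 1_000, 100_000):
    X = rng.multivariate_normal(mean=[0, 0], cov=Sigma, size=n)
    S = np.cov(X, rowvar=False)
    print(n, np.round(np.abs(S - Sigma).max(), 4))   # worst-case error shrinks as n grows
```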

Some basic facts
- Variance of a linear combination of random variables:
  var(a x + b y) = a^2 var(x) + b^2 var(y) + 2 a b cov(x, y)
- Easier with matrix notation:
  (B.1) var(m' X) = m' Cov(X) m,
  where m is a p-vector and X consists of the p random variables (x_1, ..., x_p)'.
- From (B.1), it follows that:
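A numerical check of (B.1) on simulated data (the covariance matrix and the vector m below are made up for illustration): the sample variance of m'X and the quadratic form m' Cov(X) m coincide.

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.multivariate_normal([0, 0, 0],
                            [[1.0, 0.3, 0.0],
                             [0.3, 2.0, 0.5],
                             [0.0, 0.5, 1.5]], size=50_000)
m = np.array([0.2, -1.0, 0.5])

lhs = np.var(X @ m, ddof=1)            # var(m'X) computed directly
rhs = m @ np.cov(X, rowvar=False) @ m  # m' Cov(X) m
print(lhs, rhs)                        # identical up to floating-point rounding, as (B.1) promises
```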

Basic facts (cont.)
- Maximizing var(m' X) subject to ||m|| = 1 is the same as maximizing m' Cov(X) m subject to ||m|| = 1
  (here ||m|| denotes the length of the vector m).
- Eigenvalue decomposition:
  (B.2) M v_i = λ_i v_i, where λ_1 ≥ λ_2 ≥ ... ≥ λ_p.
- Basic linear algebra tells us that the first eigenvector will do:
  the solution of max m' M m subject to ||m|| = 1 must satisfy M m = λ_1 m.
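A sketch verifying this variational characterization on a random, covariance-like symmetric matrix: the quadratic form m' M m over unit vectors is maximized at the first eigenvector, where it equals the largest eigenvalue. Random unit vectors stand in here for a full optimization.

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.normal(size=(5, 5))
M = A @ A.T                                   # a symmetric, covariance-like matrix

eigvals, eigvecs = np.linalg.eigh(M)
v1, lam1 = eigvecs[:, -1], eigvals[-1]        # first eigenvector / largest eigenvalue
print(v1 @ M @ v1, lam1)                      # equal: the maximum value is lambda_1

# no random unit vector does better
m = rng.normal(size=(10_000, 5))
m /= np.linalg.norm(m, axis=1, keepdims=True)
print(np.max(np.einsum('ij,jk,ik->i', m, M, m)) <= lam1 + 1e-9)   # True
```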

Basic facts (cont.)
- The covariance matrix is degenerate (i.e., some eigenvalues are zero) if the data are confined to a lower-dimensional space S.
- Rank of the covariance matrix = number of non-zero eigenvalues = dimension of the space S.
- This explains why PCA works for our first example.
- Why can small errors be tolerated?
- Large i.i.d. errors are fine too.
- Heterogeneity is harmful, and so are correlated errors.
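One way to see why large i.i.d. errors are tolerable while correlated errors are not (a population-level sketch with made-up numbers): i.i.d. noise adds sigma^2 I to the covariance matrix, shifting every eigenvalue by sigma^2 without moving the eigenvectors, whereas correlated noise moves the principal directions themselves.

```python
import numpy as np

Sigma = np.diag([4.0, 1.0, 0.0])   # population covariance: data confined to a 2-D subspace

# i.i.d. errors add sigma^2 * I: every eigenvalue shifts by sigma^2,
# but the eigenvectors (the principal directions) stay the same
sigma2 = 2.0
vals, vecs = np.linalg.eigh(Sigma + sigma2 * np.eye(3))
print(vals)        # [2. 3. 6.]  (the original 0, 1, 4 each shifted by 2)
print(vecs)        # still the coordinate axes

# correlated errors change the eigenvectors themselves
E = np.array([[1.0, 0.9, 0.0],
              [0.9, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
print(np.linalg.eigh(Sigma + E)[1][:, -1])   # top direction is no longer axis-aligned
```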

Further discussion
- No guarantee of finding nonlinear structure such as clusters, curves, etc.
- In fact, sampling properties for PCA are mostly developed for normal data (Mardia, Kent and Bibby 1979, Multivariate Analysis. New York: Academic Press).
- Still useful.
- Scaling problem.
- Projection pursuit: guided; random.
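The scaling problem can be made concrete with a small sketch (the numbers and "units" are made up): rescaling one variable, say by recording it in different units, changes the leading eigenvector, which is why the variables are often standardized first, equivalently running PCA on the correlation matrix.

```python
import numpy as np

rng = np.random.default_rng(5)
X = rng.multivariate_normal([0, 0], [[1.0, 0.6], [0.6, 1.0]], size=10_000)

def top_pc(X):
    vals, vecs = np.linalg.eigh(np.cov(X, rowvar=False))
    return vecs[:, -1]                       # leading eigenvector (up to sign)

print(top_pc(X))                             # roughly (0.71, 0.71): both variables contribute
X_rescaled = X * np.array([1.0, 100.0])      # second variable recorded in different units
print(top_pc(X_rescaled))                    # roughly (0, 1): the rescaled variable dominates

# standardizing first (equivalently, PCA on the correlation matrix) removes the effect
Z = X_rescaled / X_rescaled.std(axis=0, ddof=1)
print(top_pc(Z))                             # back to roughly (0.71, 0.71)
```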