
1 Dynamic graphics, Principal Component Analysis
Ker-Chau Li, UCLA Department of Statistics

2 Xlisp-stat (demo)
- (plot-points x y)
- (scatterplot-matrix (list x y z u w))
- (spin-plot (list x y z))
- Link, remove, select, rescale
- Examples:
  (1) Rubber data (Rice’s book)
  (2) Iris data
  (3) Boston Housing data

3 PCA (principal component analysis)
- A fundamental tool for reducing dimensionality by finding the projections with the largest variance
- (1) Data version
- (2) Population version
- Each has a number of variations
- (3) Let’s begin with an illustration using (pca-model (list x y z))

4 Find a 2-D plane in 4-D space
- Generate 1000 data points in a unit disk on a 2-D plane.
- Generate 1000 data points in a ring outside the unit disk.
- Append the two data sets together in the 2-D plane; this gives the original x and y variables.
- Suppose we are given four variables x1, x2, x3, x4, which are just some linear combinations of the original x and y variables.
- We shall use pca-model to find the original 2-D plane (a sketch of this construction follows).
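A minimal sketch of this construction in Python/NumPy (the original demo used Xlisp-stat's pca-model; the mixing matrix A and all names below are illustrative assumptions, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# 1000 points uniform in the unit disk (sqrt of a uniform gives uniform area)
r = np.sqrt(rng.uniform(0, 1, n))
t = rng.uniform(0, 2 * np.pi, n)
disk = np.column_stack([r * np.cos(t), r * np.sin(t)])

# 1000 points in a ring outside the unit disk (radii between 1.5 and 2)
r = rng.uniform(1.5, 2.0, n)
t = rng.uniform(0, 2 * np.pi, n)
ring = np.column_stack([r * np.cos(t), r * np.sin(t)])

xy = np.vstack([disk, ring])            # the original x and y variables

# Four observed variables x1..x4: linear combinations of x and y
A = np.array([[1.0, 0.5, -0.3, 2.0],   # arbitrary rank-2 mixing matrix
              [0.2, -1.0, 0.7, 1.0]])
X = xy @ A                              # 2000 x 4 observed data

# PCA recovers the plane: only two eigenvalues are (numerically) nonzero,
# and the corresponding eigenvectors span the original 2-D plane
eigvals, eigvecs = np.linalg.eigh(np.cov(X, rowvar=False))
print(np.round(eigvals, 8))             # two zeros, two positive
plane = X @ eigvecs[:, -2:]             # plotting this shows the disk-and-ring
                                        # structure, up to a linear change of
                                        # coordinates
```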

5 Data version
1. Construct the sample variance-covariance matrix.
2. Find the eigenvectors.
3. Projection: use each eigenvector to form a linear combination of the original variables.
4. The larger, the better: the k-th principal component is the projection with the k-th largest eigenvalue (see the sketch after this list).
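These four steps, transcribed directly into Python/NumPy (a sketch; the function and variable names are my own, not part of the original demo):

```python
import numpy as np

def pca(X, k):
    """Return the first k principal components of the rows of X."""
    # 1. Construct the sample variance-covariance matrix
    S = np.cov(X, rowvar=False)
    # 2. Find the eigenvectors (eigh returns eigenvalues in ascending order)
    eigvals, eigvecs = np.linalg.eigh(S)
    # 4. "The larger, the better": reorder so position k holds the k-th
    #    largest eigenvalue
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # 3. Projection: each eigenvector defines a linear combination of the
    #    original variables; column j below is the j-th principal component
    scores = (X - X.mean(axis=0)) @ eigvecs[:, :k]
    return scores, eigvals
```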

6 Data version (alternative view)
- 1-D data matrix: rank 1
- 2-D data matrix: rank 2
- k-D data matrix: rank k
- Eigenvectors for a 1-D sample covariance matrix: rank 1
- Eigenvectors for a 2-D sample covariance matrix: rank 2
- Eigenvectors for a k-D sample covariance matrix: rank k
- Adding i.i.d. noise (illustrated below)
- Connection with automatic basis curve finding (to be discussed later)
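An illustrative check of the rank view (my own example, not from the slides): a rank-2 data matrix yields a rank-2 sample covariance matrix, and adding i.i.d. noise shifts every eigenvalue up by roughly the noise variance:

```python
import numpy as np

rng = np.random.default_rng(1)
B = rng.standard_normal((1000, 2)) @ rng.standard_normal((2, 5))  # rank-2 data
print(np.round(np.linalg.eigvalsh(np.cov(B, rowvar=False)), 4))
# -> three eigenvalues are (numerically) zero

noisy = B + 0.1 * rng.standard_normal(B.shape)   # i.i.d. noise, sigma = 0.1
print(np.round(np.linalg.eigvalsh(np.cov(noisy, rowvar=False)), 4))
# -> the three zero eigenvalues move up to about sigma^2 = 0.01
```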

7 Population version
- Let the sample size tend to infinity.
- The sample covariance matrix converges to the population covariance matrix (by the law of large numbers; stated in symbols below).
- The rest of the steps remain the same.
- We shall use the population version for theoretical discussion.
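One standard way to write this limit (a textbook statement, consistent with the slide's claim):

```latex
\hat{\Sigma}_n
  = \frac{1}{n}\sum_{i=1}^{n} (X_i - \bar{X}_n)(X_i - \bar{X}_n)'
  \;\longrightarrow\;
  \Sigma = \operatorname{Cov}(X)
  \qquad \text{almost surely as } n \to \infty .
```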

8 Some basic facts
- Variance of a linear combination of random variables:
  var(a x + b y) = a^2 var(x) + b^2 var(y) + 2 a b cov(x, y)
- Easier if using the matrix representation:
  (B.1) var(m' X) = m' Cov(X) m,
  where m is a p-vector and X consists of the p random variables (x_1, ..., x_p)' (a one-line derivation is given below)
- From (B.1), it follows that ...
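For reference, (B.1) follows in one line from the definition of the covariance matrix (writing mu = E(X)):

```latex
\operatorname{var}(m'X)
  = E\!\left[\bigl(m'(X-\mu)\bigr)^{2}\right]
  = E\!\left[m'(X-\mu)(X-\mu)'m\right]
  = m'\operatorname{Cov}(X)\,m .
```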

9 Basic facts (cont.)
- Maximizing var(m' X) subject to ||m|| = 1 is the same as maximizing m' Cov(X) m subject to ||m|| = 1 (here ||m|| denotes the length of the vector m)
- Eigenvalue decomposition:
  (B.2) M v_i = λ_i v_i, where λ_1 ≥ λ_2 ≥ … ≥ λ_p
- Basic linear algebra tells us that the first eigenvector will do: the solution of max m' M m subject to ||m|| = 1 must satisfy M m = λ_1 m (spelled out below)
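The linear-algebra step can be spelled out with a Lagrange multiplier (a standard argument, included here for completeness):

```latex
\nabla_m \left[\, m'Mm - \lambda\,(m'm - 1) \,\right]
  = 2Mm - 2\lambda m = 0
  \;\Longrightarrow\; Mm = \lambda m .
```

At any such stationary point, m' M m = λ m' m = λ, so the constrained maximum equals the largest eigenvalue λ_1 and is attained at m = v_1.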

10 Basic facts (cont.)
- The covariance matrix is degenerate (i.e., some eigenvalues are zero) if the data are confined to a lower-dimensional space S
- Rank of the covariance matrix = number of nonzero eigenvalues = dimension of the space S
- This explains why PCA works for our first example
- Why can small errors be tolerated?
- Large i.i.d. errors are fine too (see the covariance identity below)
- Heterogeneity is harmful, and so are correlated errors
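A standard population-level identity backs up the last three bullets: if ε has i.i.d. coordinates with variance σ^2 and is independent of X, then

```latex
\operatorname{Cov}(X + \varepsilon) \;=\; \Sigma + \sigma^{2} I ,
```

which has the same eigenvectors as Σ with every eigenvalue shifted up by σ^2, so the principal directions are unchanged whatever the size of σ. Heterogeneous or correlated errors replace σ^2 I with a general matrix Ω, which can tilt the eigenvectors.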

11 Further discussion
- No guarantee of finding nonlinear structure such as clusters, curves, etc.
- In fact, sampling properties for PCA are mostly developed for normal data (Mardia, Kent, and Bibby, 1979, Multivariate Analysis, New York: Academic Press)
- Still useful
- Scaling problem (see the illustration below)
- Projection pursuit: guided; random
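The scaling problem in a small illustration (my own example): changing the units of one variable changes the principal components, which is why PCA is often run on standardized variables, i.e., on the correlation matrix, instead:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(0, 1, 500)
y = 2 * x + rng.normal(0, 1, 500)        # correlated with x
X = np.column_stack([x, 100 * y])        # same data, y recorded in other units

top = lambda M: np.linalg.eigh(M)[1][:, -1]   # leading eigenvector
print(top(np.cov(X, rowvar=False)))      # dominated by the rescaled variable
Z = (X - X.mean(0)) / X.std(0)           # standardize -> correlation matrix
print(top(np.cov(Z, rowvar=False)))      # roughly (0.707, 0.707)
```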

