Out-of-Sample Extension of PCA, Kernel PCA, and MDS
Wilson A. Florero-Salinas, Dan Li
MATH 285, Fall 2015
Outline
1) What is an out-of-sample extension?
2) Out-of-sample extension of
   a) PCA
   b) KPCA
   c) MDS
What is an out-of-sample extension?
o Suppose we perform dimensionality reduction on a data set.
o New data becomes available.
o Two options:
  o Option 1: Re-train the model, including the newly available data.
  o Option 2: Embed the new data into the existing space obtained from the training data set. This is the out-of-sample extension.
Question: Why is this important?
Principal Component Analysis (PCA)
Out-of-sample PCA
o Suppose new data becomes available.
Out-of-sample PCA (continued)
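A minimal sketch in Python of the standard out-of-sample PCA step (the function and variable names here are illustrative, not taken from the slides): center the new points with the training mean and project them onto the principal directions learned from the training set, with no re-training.

    import numpy as np

    def fit_pca(X, k):
        """Fit PCA on training data X (n x d); keep the top k principal directions."""
        mu = X.mean(axis=0)                        # training mean, reused for new data
        U, S, Vt = np.linalg.svd(X - mu, full_matrices=False)
        W = Vt[:k].T                               # d x k principal directions
        return mu, W

    def pca_out_of_sample(X_new, mu, W):
        """Embed new points using the stored mean and directions (no re-training)."""
        return (X_new - mu) @ W

    # usage
    rng = np.random.default_rng(0)
    X_train = rng.normal(size=(100, 5))
    mu, W = fit_pca(X_train, k=2)
    Y_new = pca_out_of_sample(rng.normal(size=(10, 5)), mu, W)   # 10 x 2 embedding

The key point is that mu and W come entirely from the training data, so embedding new points is a single matrix multiplication.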
Kernel PCA
Main idea: Apply PCA in the feature space. Solve the eigenvalue problem for the kernel matrix, centered in feature space, too.
Kernel PCA
Main idea: Apply PCA in a higher-dimensional space.
o Data can become linearly separable when mapped to a higher-dimensional space.
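A minimal sketch of kernel PCA on the training set, assuming a Gaussian (RBF) kernel for illustration (the slides do not fix a particular kernel, and these function names are not from the talk): build the kernel matrix, center it, and solve the eigenvalue problem.

    import numpy as np

    def rbf_kernel(A, B, gamma=1.0):
        """Gaussian kernel matrix between the rows of A and the rows of B."""
        sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
        return np.exp(-gamma * sq)

    def kpca_fit(X, k, gamma=1.0):
        """Kernel PCA: eigen-decompose the centered kernel matrix of the training data."""
        n = X.shape[0]
        K = rbf_kernel(X, X, gamma)
        J = np.eye(n) - np.ones((n, n)) / n        # centering matrix
        Kc = J @ K @ J                             # centering in feature space
        vals, vecs = np.linalg.eigh(Kc)            # ascending eigenvalues
        vals, vecs = vals[::-1][:k], vecs[:, ::-1][:, :k]
        alphas = vecs / np.sqrt(vals)              # normalized expansion coefficients
        Y_train = Kc @ alphas                      # training embedding (n x k)
        return K, alphas, Y_train

Centering the kernel matrix plays the role of subtracting the mean in feature space, which cannot be done explicitly because the feature map is only accessed through the kernel.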
Out-of-sample Kernel PCA
o Once in the higher-dimensional space, proceed as in the PCA case: center the data, then embed the new data.
Out-of-sample Kernel PCA
o Project the new data into the feature space obtained from the training data set.
o Apply the kernel trick: the projection requires only kernel evaluations between the new points and the training points. A sketch follows below.
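A minimal sketch of this embedding step, continuing the kpca_fit sketch above (K, alphas, and gamma are the quantities produced there; the names are illustrative): the kernel rows of the new points are centered with statistics of the training kernel matrix, then combined with the expansion coefficients.

    import numpy as np

    def kpca_out_of_sample(X_new, X_train, K_train, alphas, gamma=1.0):
        """Embed new points with a kernel PCA model fitted on X_train (see kpca_fit above)."""
        K_new = rbf_kernel(X_new, X_train, gamma)          # m x n kernel evaluations
        # center using *training* kernel statistics, as in the fitting step
        Kc_new = (K_new
                  - K_new.mean(axis=1, keepdims=True)      # (1/n) sum_j k(x, x_j)
                  - K_train.mean(axis=0, keepdims=True)    # (1/n) sum_j k(x_j, x_i)
                  + K_train.mean())                        # (1/n^2) sum_jl k(x_j, x_l)
        return Kc_new @ alphas                             # m x k embedding

Only kernel evaluations k(x_new, x_i) against the training points are needed; the feature map itself is never computed.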
Out-of-sample Kernel PCA Demo
Green points are new data.
Multidimensional Scaling (MDS)
o MDS visualizes a set of high-dimensional data in lower dimensions based on their pairwise distances.
o The idea is to make the pairwise distances of the embedded data close to the original pairwise distances.
o In other words, two points that are far apart in the high-dimensional space stay far apart in the reduced dimension; similarly, points that are close together are mapped close together in the reduced dimension.
Comparison of PCA and MDS
o The purpose of both methods is to find an accurate representation of the data in a lower-dimensional space.
o MDS preserves the pairwise 'similarities' (distances) of the original data set.
o In comparison, PCA preserves most of the variance of the data.
Multidimensional Scaling (MDS)
Main idea: The coordinates of the embedded data can be derived from the eigenvalue decomposition of the doubly centered squared-distance matrix B = -1/2 J D J, where D = [d_ij^2] and J = I - (1/n) 1 1^T.
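A minimal sketch of classical MDS via this eigenvalue decomposition (a standard formulation; the function name is illustrative):

    import numpy as np

    def classical_mds(D2, k):
        """Classical MDS from the n x n matrix of squared pairwise distances D2."""
        n = D2.shape[0]
        J = np.eye(n) - np.ones((n, n)) / n
        B = -0.5 * J @ D2 @ J                          # doubly centered matrix
        vals, vecs = np.linalg.eigh(B)
        vals, vecs = vals[::-1][:k], vecs[:, ::-1][:, :k]
        Y = vecs * np.sqrt(np.maximum(vals, 0))        # n x k coordinates
        return Y, vals, vecs

The rows of Y are the low-dimensional coordinates; their pairwise Euclidean distances approximate the original distances.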
Out-of-sample MDS
Similar to Kernel PCA, we can project the new data as follows:
o A new point comes in with distances d = [d_1, d_2, ..., d_n] to the training points; D = [d_ij^2] is the squared-distance matrix of the training data.
Out-of-sample MDS (continued)
Main idea: center the new point's squared distances with the training-set statistics, then project onto the eigenvectors of B, scaled by the corresponding eigenvalues.
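A minimal sketch of this out-of-sample step, assuming the Gower/Bengio-style formula that mirrors the kernel PCA centering and continues the classical_mds sketch above (d2_new, the vector of the new point's squared distances to the training points, is an illustrative name):

    import numpy as np

    def mds_out_of_sample(d2_new, D2, vals, vecs):
        """Embed one new point from its squared distances d2_new (length n) to the training set."""
        # center the new squared distances with training-set statistics
        b = -0.5 * (d2_new - D2.mean(axis=0) - d2_new.mean() + D2.mean())
        # project onto the MDS eigenvectors, scaled by 1/sqrt(eigenvalue)
        return (vecs.T @ b) / np.sqrt(np.maximum(vals, 1e-12))

If the "new" point coincides with a training point, this formula reproduces that point's row of Y from classical_mds.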
Out-of-sample MDS Demo 1: Chinese cities data
Out-of-sample MDS Demo 2: Chinese cities data
Questions?
Out-of-sample MDS Demo 2: Seeds data