Out-of-Sample Extension of PCA, Kernel PCA, and MDS
Wilson A. Florero-Salinas, Dan Li
MATH 285, Fall 2015
Outline
1) What is an out-of-sample extension?
2) Out-of-sample extension of
   a) PCA
   b) KPCA
   c) MDS
What is an out-of-sample extension?
o Suppose we perform dimensionality reduction on a data set.
o New data becomes available.
o Two options:
  o Option 1: Re-train the model, including the newly available data.
  o Option 2: Embed the new data into the existing space obtained from the training data set. This is the out-of-sample extension.
Question: Why is this important?
Principal Component Analysis (PCA)
Out-of-sample PCA
o Suppose new data becomes available.
Out-of-sample PCA (continued)
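A minimal sketch in Python of the standard out-of-sample PCA step (the function and variable names here are illustrative, not taken from the slides): center the new points with the training mean and project them onto the principal directions learned from the training set, with no re-training.

    import numpy as np

    def fit_pca(X, k):
        """Fit PCA on training data X (n x d); keep the top k principal directions."""
        mu = X.mean(axis=0)                        # training mean, reused for new data
        U, S, Vt = np.linalg.svd(X - mu, full_matrices=False)
        W = Vt[:k].T                               # d x k principal directions
        return mu, W

    def pca_out_of_sample(X_new, mu, W):
        """Embed new points using the stored mean and directions (no re-training)."""
        return (X_new - mu) @ W

    # usage
    rng = np.random.default_rng(0)
    X_train = rng.normal(size=(100, 5))
    mu, W = fit_pca(X_train, k=2)
    Y_new = pca_out_of_sample(rng.normal(size=(10, 5)), mu, W)   # 10 x 2 embedding

The key point is that mu and W come entirely from the training data, so embedding new points is a single matrix multiplication.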
Kernel PCA
Main idea: Apply PCA in the feature space. Solve the eigenvalue problem for the kernel matrix, centered in feature space, too.
Kernel PCA
Main idea: Apply PCA in a higher-dimensional space.
o Data can become linearly separable when mapped to a higher-dimensional space.
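A minimal sketch of kernel PCA on the training set, assuming a Gaussian (RBF) kernel for illustration (the slides do not fix a particular kernel, and these function names are not from the talk): build the kernel matrix, center it, and solve the eigenvalue problem.

    import numpy as np

    def rbf_kernel(A, B, gamma=1.0):
        """Gaussian kernel matrix between the rows of A and the rows of B."""
        sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
        return np.exp(-gamma * sq)

    def kpca_fit(X, k, gamma=1.0):
        """Kernel PCA: eigen-decompose the centered kernel matrix of the training data."""
        n = X.shape[0]
        K = rbf_kernel(X, X, gamma)
        J = np.eye(n) - np.ones((n, n)) / n        # centering matrix
        Kc = J @ K @ J                             # centering in feature space
        vals, vecs = np.linalg.eigh(Kc)            # ascending eigenvalues
        vals, vecs = vals[::-1][:k], vecs[:, ::-1][:, :k]
        alphas = vecs / np.sqrt(vals)              # normalized expansion coefficients
        Y_train = Kc @ alphas                      # training embedding (n x k)
        return K, alphas, Y_train

Centering the kernel matrix plays the role of subtracting the mean in feature space, which cannot be done explicitly because the feature map is only accessed through the kernel.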
Out-of-sample Kernel PCA
o Once in the higher-dimensional space, proceed as in the PCA case: center the data, then embed the new data.
Out-of-sample Kernel PCA
o Project the new data into the feature space obtained from the training data set.
o Apply the kernel trick: the projection requires only kernel evaluations between the new points and the training points. A sketch follows below.
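A minimal sketch of this embedding step, continuing the kpca_fit sketch above (K, alphas, and gamma are the quantities produced there; the names are illustrative): the kernel rows of the new points are centered with statistics of the training kernel matrix, then combined with the expansion coefficients.

    import numpy as np

    def kpca_out_of_sample(X_new, X_train, K_train, alphas, gamma=1.0):
        """Embed new points with a kernel PCA model fitted on X_train (see kpca_fit above)."""
        K_new = rbf_kernel(X_new, X_train, gamma)          # m x n kernel evaluations
        # center using *training* kernel statistics, as in the fitting step
        Kc_new = (K_new
                  - K_new.mean(axis=1, keepdims=True)      # (1/n) sum_j k(x, x_j)
                  - K_train.mean(axis=0, keepdims=True)    # (1/n) sum_j k(x_j, x_i)
                  + K_train.mean())                        # (1/n^2) sum_jl k(x_j, x_l)
        return Kc_new @ alphas                             # m x k embedding

Only kernel evaluations k(x_new, x_i) against the training points are needed; the feature map itself is never computed.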
Out-of-sample Kernel PCA Demo
Green points are new data.
Multidimensional Scaling (MDS)
o MDS visualizes a set of high-dimensional data in lower dimensions based on their pairwise distances.
o The idea is to make the pairwise distances of the embedded data close to the original pairwise distances.
o In other words, two points that are far apart in the high-dimensional space stay far apart in the reduced dimension; similarly, points that are close together are mapped close together in the reduced dimension.
Comparison of PCA and MDS
o The purpose of both methods is to find an accurate representation of the data in a lower-dimensional space.
o MDS preserves the pairwise 'similarities' (distances) of the original data set.
o In comparison, PCA preserves most of the variance of the data.
Multidimensional Scaling (MDS)
Main idea: The coordinates of the embedded data can be derived from the eigenvalue decomposition of the doubly centered squared-distance matrix B = -1/2 J D J, where D = [d_ij^2] and J = I - (1/n) 1 1^T.
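A minimal sketch of classical MDS via this eigenvalue decomposition (a standard formulation; the function name is illustrative):

    import numpy as np

    def classical_mds(D2, k):
        """Classical MDS from the n x n matrix of squared pairwise distances D2."""
        n = D2.shape[0]
        J = np.eye(n) - np.ones((n, n)) / n
        B = -0.5 * J @ D2 @ J                          # doubly centered matrix
        vals, vecs = np.linalg.eigh(B)
        vals, vecs = vals[::-1][:k], vecs[:, ::-1][:, :k]
        Y = vecs * np.sqrt(np.maximum(vals, 0))        # n x k coordinates
        return Y, vals, vecs

The rows of Y are the low-dimensional coordinates; their pairwise Euclidean distances approximate the original distances.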
Out-of-sample MDS
Similar to Kernel PCA, we can project the new data as follows:
o A new point comes in with distances d = [d_1, d_2, ..., d_n] to the training points; D = [d_ij^2] is the squared-distance matrix of the training data.
Out-of-sample MDS (continued)
Main idea: center the new point's squared distances with the training-set statistics, then project onto the eigenvectors of B, scaled by the corresponding eigenvalues.
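A minimal sketch of this out-of-sample step, assuming the Gower/Bengio-style formula that mirrors the kernel PCA centering and continues the classical_mds sketch above (d2_new, the vector of the new point's squared distances to the training points, is an illustrative name):

    import numpy as np

    def mds_out_of_sample(d2_new, D2, vals, vecs):
        """Embed one new point from its squared distances d2_new (length n) to the training set."""
        # center the new squared distances with training-set statistics
        b = -0.5 * (d2_new - D2.mean(axis=0) - d2_new.mean() + D2.mean())
        # project onto the MDS eigenvectors, scaled by 1/sqrt(eigenvalue)
        return (vecs.T @ b) / np.sqrt(np.maximum(vals, 1e-12))

If the "new" point coincides with a training point, this formula reproduces that point's row of Y from classical_mds.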
Out-of-sample MDS Demo 1: Chinese cities data
Out-of-sample MDS Demo 2: Chinese cities data
Questions?
Out-of-sample MDS Demo 2: Seeds data