Dimensionality reduction CISC 5800 Professor Daniel Leeds.


1 Dimensionality reduction CISC 5800 Professor Daniel Leeds

2 The benefits of extra dimensions
  Finds existing complex separations between classes

3 The risks of too many dimensions
  High dimensions with kernels over-fit the outlier data
  Two dimensions ignore the outlier data

4 Training vs. testing
  [Figure: error vs. training epochs]

5 Goal: High Performance, Few Parameters
  “Information criterion”: a trade-off between performance and number of parameters
  Variables to consider:
    L: likelihood of the training data after learning
    k: number of parameters (e.g., number of features)
    m: number of training data points
  Popular information criteria (in the form written here, higher is better):
    Akaike information criterion (AIC): log(L) - k
    Bayesian information criterion (BIC): log(L) - 0.5 k log(m)
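The two criteria on this slide can be computed directly. Below is a minimal sketch, assuming log means the natural logarithm; the function names and the example numbers (L = 0.1, k = 40, m = 1000) are hypothetical and chosen only for illustration.

```python
import math


def aic_score(likelihood: float, k: int) -> float:
    """AIC in the slide's form: log(L) - k (higher is better)."""
    return math.log(likelihood) - k


def bic_score(likelihood: float, k: int, m: int) -> float:
    """BIC in the slide's form: log(L) - 0.5 * k * log(m) (higher is better)."""
    return math.log(likelihood) - 0.5 * k * math.log(m)


# Hypothetical inputs: likelihood 0.1, 40 parameters, 1000 training points.
example_aic = aic_score(0.1, 40)
example_bic = bic_score(0.1, 40, 1000)
print(f"AIC = {example_aic:.1f}, BIC = {example_bic:.1f}")
```

In this form, BIC charges 0.5 log(m) per parameter where AIC charges 1, so BIC is the stricter penalty whenever m is larger than about 7 points.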

6 Goal: |Data| > |Parameters|

7 Decreasing parameters
  Force parameter values to 0
    L1 regularization
    Support Vector selection
    Feature selection/removal
  Consolidate feature space
    Component analysis

8 Feature removal
  Start with feature set: F = {x_1, …, x_k}
  Find classifier performance with set F: perform(F)
  Loop:
    Find classifier performance for removing each feature x_1, x_2, …, x_k: argmax_i perform(F - x_i)
    Remove the feature that causes the least decrease in performance: F = F - x_i
  Repeat, using AIC or BIC as the termination criterion
  AIC: log(L) - k
  BIC: log(L) - 0.5 k log(m)
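The removal loop can be sketched as follows. This is an illustrative sketch, not the course's implementation: `log_likelihood` is a hypothetical callable standing in for "train a classifier on this feature subset and return log(L)", the toy weights are invented, and stopping as soon as log(L) - k fails to improve is one possible reading of "termination criterion".

```python
from typing import Callable, FrozenSet


def backward_removal(
    features: FrozenSet[str],
    log_likelihood: Callable[[FrozenSet[str]], float],
) -> FrozenSet[str]:
    """Greedy feature removal; stops when log(L) - k no longer improves."""
    current = frozenset(features)
    current_score = log_likelihood(current) - len(current)
    while len(current) > 1:
        # Try removing each feature; keep the removal that hurts log(L) least.
        best = max(
            (current - {f} for f in sorted(current)), key=log_likelihood
        )
        best_score = log_likelihood(best) - len(best)
        if best_score <= current_score:
            break  # the AIC-style score stopped improving
        current, current_score = best, best_score
    return current


# Toy stand-in for a trained classifier (hypothetical): each feature adds
# a fixed amount to log(L), so "a" and "b" matter and "c", "d" barely do.
WEIGHTS = {"a": 3.0, "b": 2.0, "c": 0.2, "d": 0.1}


def toy_log_likelihood(subset: FrozenSet[str]) -> float:
    return -10.0 + sum(WEIGHTS[f] for f in subset)


kept = backward_removal(frozenset(WEIGHTS), toy_log_likelihood)
print(sorted(kept))
```

With these toy weights, dropping "d" and then "c" each costs less than 1 in log(L), so the score improves; dropping "a" or "b" would cost more than the 1 gained from the smaller k, so the loop stops.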

9 AIC testing: log(L) - k
  Features                         k (num features)   L (likelihood)   AIC
  F                                40                 0.1              -42.3
  F - {x_3}                        39                 0.03             -41.5
  F - {x_3, x_24}                  38                 0.005            -41.3
  F - {x_3, x_24, x_32}            37                 0.001            -40.9
  F - {x_3, x_24, x_32, x_15}      36                 0.0001           -41.2

10 Feature selection
  Find classifier performance for each set of just 1 feature: argmax_i perform({x_i})
  Add the feature with the highest performance: F = {x_i}
  Loop:
    Find classifier performance for adding one new feature: argmax_i perform(F + {x_i})
    Add to F the feature with the highest performance increase: F = F + {x_i}
  Repeat, using AIC or BIC as the termination criterion
  AIC: log(L) - k
  BIC: log(L) - 0.5 k log(m)
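The selection loop mirrors feature removal, growing the set instead of shrinking it. The sketch below makes the same assumptions: `log_likelihood` is a hypothetical callable for "train on this subset and return log(L)", the toy weights are invented, and the loop stops the first time log(L) - k fails to improve.

```python
from typing import Callable, FrozenSet, Iterable, Optional


def forward_selection(
    candidates: Iterable[str],
    log_likelihood: Callable[[FrozenSet[str]], float],
) -> FrozenSet[str]:
    """Greedy feature selection; stops when log(L) - k no longer improves."""
    selected: FrozenSet[str] = frozenset()
    selected_score: Optional[float] = None
    remaining = set(candidates)
    while remaining:
        # Try adding each unused feature; keep the addition with highest log(L).
        best = max(
            (selected | {f} for f in sorted(remaining)), key=log_likelihood
        )
        best_score = log_likelihood(best) - len(best)
        # The first feature is always added; later ones must improve the score.
        if selected_score is not None and best_score <= selected_score:
            break
        selected, selected_score = best, best_score
        remaining -= best
    return selected


# Toy stand-in for a trained classifier (hypothetical weights).
WEIGHTS = {"a": 3.0, "b": 2.0, "c": 0.2, "d": 0.1}


def toy_log_likelihood(subset: FrozenSet[str]) -> float:
    return -10.0 + sum(WEIGHTS[f] for f in subset)


chosen = forward_selection(WEIGHTS, toy_log_likelihood)
print(sorted(chosen))
```

On this toy problem the forward and backward procedures agree, but in general greedy selection and greedy removal can end at different feature sets.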

11 Defining new feature axes

12 Defining data points with new axes

13 Component analysis

14 Component analysis: examples
  [Figure: example components and data]

15 Component analysis: examples
  “Eigenfaces”: components learned from a set of face images
  u: nine components
  x_i: reconstructed data
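Reconstructing data from a small number of components can be sketched with principal component analysis, the technique behind eigenfaces. This is a minimal sketch using NumPy's SVD on synthetic data rather than face images; the function names, the data, and the choice of two components are assumptions made for illustration.

```python
import numpy as np


def pca_components(X: np.ndarray, n_components: int):
    """Return (mean, U); rows of U are the top principal directions."""
    mean = X.mean(axis=0)
    # SVD of the centered data: rows of Vt are orthonormal directions,
    # ordered from most to least variance explained.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components]


def reconstruct(X: np.ndarray, mean: np.ndarray, U: np.ndarray) -> np.ndarray:
    Z = (X - mean) @ U.T  # coordinates of each point on the component axes
    return mean + Z @ U   # approximate data rebuilt from those coordinates


rng = np.random.default_rng(0)
# Hypothetical data: 200 points in 5-D that vary mostly along 2 directions.
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 5))
X = latent @ mixing + 0.01 * rng.normal(size=(200, 5))

mean, U = pca_components(X, n_components=2)
X_hat = reconstruct(X, mean, U)
rel_error = np.linalg.norm(X - X_hat) / np.linalg.norm(X - mean)
print(f"relative reconstruction error: {rel_error:.4f}")
```

Each point is stored as two coordinates instead of five, and the reconstruction error stays small because the discarded directions carry only noise.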

16 Types of component analysis

17 Principal component analysis (PCA)

18 Independent component analysis (ICA)

19 Evaluating components

