1 Model Selection

2 Outline
Motivation
Overfitting
Structural Risk Minimization
Cross Validation
Minimum Description Length

3 Motivation
Suppose we have a class of infinite VC-dimension
We have too few examples
How can we find the best hypothesis?
Alternatively: usually we choose the hypothesis class ourselves
How should we go about doing it?

4 Overfitting
Concept class: intervals on a line
Can classify any training set
Zero training error: the only goal?!

5 Overfitting: Intervals
Can always get zero error
Are we interested?!
Recall Occam's Razor!

6 Overfitting: Intervals

7 Overfitting
Is it a simple concept plus noise, or a very complex concept?
–with an insufficient number of examples + a noise rate of 1/3, the two look alike
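To make the intervals example concrete, here is a minimal sketch (not from the slides): the target is a single threshold on [0, 1], labels are flipped with probability 1/3, and a hypothesis that places a tiny interval around every positive training point reaches zero training error while its true error stays far above the noise rate. The function names (true_concept, fit_intervals, ...) are invented for illustration.

```python
import random

random.seed(0)

def true_concept(x):
    # simple target: a single threshold on [0, 1]
    return 1 if x >= 0.5 else 0

def noisy_sample(m, noise=1/3):
    # draw m points uniformly on [0, 1] and flip each label with probability `noise`
    data = []
    for _ in range(m):
        x = random.random()
        y = true_concept(x)
        if random.random() < noise:
            y = 1 - y
        data.append((x, y))
    return data

def fit_intervals(data, eps=1e-6):
    # one tiny interval around every positively labelled training point:
    # this hypothesis memorizes the sample and gets zero training error
    return [(x - eps, x + eps) for x, y in data if y == 1]

def predict(intervals, x):
    return 1 if any(a <= x <= b for a, b in intervals) else 0

def error(h, data):
    return sum(predict(h, x) != y for x, y in data) / len(data)

train = noisy_sample(50)
test = noisy_sample(5000)
h = fit_intervals(train)
print("training error:", error(h, train))  # 0.0: the sample is memorized
print("test error:", error(h, test))       # roughly 0.5, far above the 1/3 noise rate
```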

8 Theoretical Model
Nested hypothesis classes
–H_1 ⊆ H_2 ⊆ H_3 ⊆ … ⊆ H_i ⊆ …
–Let VC-dim(H_i) = i
–For simplicity, |H_i| = 2^i
There is a target function c(x)
–For some i, c ∈ H_i
–e(h) = Pr[h ≠ c]
–e_i = min_{h ∈ H_i} e(h)
–e* = min_i e_i

9 Theoretical Model
Training error
–obs(h) = Pr[h ≠ c] measured on the training sample
–obs_i = min_{h ∈ H_i} obs(h)
Complexity of h
–d(h) = min { i : h ∈ H_i }
Add a penalty for d(h)
Minimize: obs(h) + penalty(h)
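The penalized objective can be motivated by the standard finite-class argument (textbook reasoning stated here as a sketch, not taken from the slide): Hoeffding's inequality plus a union bound over H_i, with |H_i| = 2^i, give

```latex
% with probability at least 1 - \delta_i, simultaneously for all h \in H_i:
e(h) \;\le\; \mathrm{obs}(h) + \sqrt{\frac{\ln|H_i| + \ln(1/\delta_i)}{2m}}
     \;=\; \mathrm{obs}(h) + \sqrt{\frac{i\ln 2 + \ln(1/\delta_i)}{2m}}
```

so a penalty that grows with d(h) charges exactly for the looser guarantee available in the richer classes.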

10 Structural Risk Minimization
Penalty based
Choose the hypothesis which minimizes:
–obs(h) + penalty(h)
SRM penalty:
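The slide's exact penalty formula is not in the transcript. Below is a minimal sketch of SRM over nested interval classes, assuming a Hoeffding-style penalty of the form sqrt((d·ln 2 + ln(1/δ)) / (2m)); the learner best_in_class is a simple heuristic, not necessarily the empirical-risk minimizer, and all names are invented for illustration.

```python
import math
import random

random.seed(1)

def noisy_sample(m, noise=1/3):
    # same distribution as in the earlier sketch: threshold target plus label noise
    data = []
    for _ in range(m):
        x = random.random()
        y = 1 if x >= 0.5 else 0
        if random.random() < noise:
            y = 1 - y
        data.append((x, y))
    return data

def positive_runs(data):
    # maximal runs of consecutive positively labelled points, after sorting by x
    runs, cur = [], []
    for x, y in sorted(data):
        if y == 1:
            cur.append(x)
        elif cur:
            runs.append(cur)
            cur = []
    if cur:
        runs.append(cur)
    return runs

def best_in_class(d, data):
    # heuristic learner for H_d = unions of at most d intervals:
    # cover the d positive runs that contain the most training points
    runs = sorted(positive_runs(data), key=len, reverse=True)[:d]
    return [(min(r), max(r)) for r in runs]

def predict(h, x):
    return 1 if any(a <= x <= b for a, b in h) else 0

def obs(h, data):
    # observed (training) error of hypothesis h
    return sum(predict(h, x) != y for x, y in data) / len(data)

def penalty(d, m, delta=0.05):
    # assumed Hoeffding-style penalty for |H_d| = 2^d (not the slide's exact formula)
    return math.sqrt((d * math.log(2) + math.log(1 / delta)) / (2 * m))

train = noisy_sample(200)
m = len(train)
candidates = {d: best_in_class(d, train) for d in range(1, 31)}
srm_d = min(candidates, key=lambda d: obs(candidates[d], train) + penalty(d, m))
print("SRM picks d =", srm_d, "with training error", obs(candidates[srm_d], train))
```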

11 SRM: Performance
Theorem
–With probability 1-δ:
–h*: the best hypothesis
–g*: the SRM choice
–e(h*) ≤ e(g*) ≤ e(h*) + 2·penalty(h*)
Claim: the theorem is "tight"
–H_i includes 2^i coins

12 Proof
Bounding the error within H_i
Bounding the error across the H_i
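A sketch of the two steps named on the slide, following the standard argument (an assumption about the slide body, which is not in the transcript): a uniform bound within each H_i, then a union bound across classes with confidence budget δ_i.

```latex
% Step 1 (within H_i): Hoeffding + union bound over |H_i| = 2^i give, w.p. at least 1 - \delta_i,
\forall h \in H_i:\quad |e(h) - \mathrm{obs}(h)| \;\le\; \sqrt{\tfrac{i\ln 2 + \ln(2/\delta_i)}{2m}} \;=\; \mathrm{penalty}(h)
% Step 2 (across the H_i): set \delta_i = \delta / 2^i, so that \sum_i \delta_i \le \delta.
% Then for the SRM choice g^* and the best hypothesis h^*:
e(g^*) \le \mathrm{obs}(g^*) + \mathrm{penalty}(g^*)
       \le \mathrm{obs}(h^*) + \mathrm{penalty}(h^*)
       \le e(h^*) + 2\,\mathrm{penalty}(h^*)
```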

13 Cross Validation
Separate the sample into a training part and a selection part
Using the training part
–select from each H_i a candidate g_i
Using the selection sample
–select between g_1, …, g_m
The split size
–(1-γ)m training set
–γm selection set
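A minimal sketch of the split-and-select scheme (a sketch under assumptions, not the lecture's code): the nested family used here is a simple grid-threshold family with |H_d| = 2^d, the split fraction is written γ as above, and the helper names are invented.

```python
import random

random.seed(2)

def noisy_sample(m, noise=1/3):
    # threshold target on [0, 1] plus label noise, as in the earlier sketches
    data = []
    for _ in range(m):
        x = random.random()
        y = 1 if x >= 0.5 else 0
        if random.random() < noise:
            y = 1 - y
        data.append((x, y))
    return data

def err(t, data):
    # error of the threshold hypothesis h_t(x) = 1 iff x >= t
    return sum((1 if x >= t else 0) != y for x, y in data) / len(data)

def best_in_class(d, data):
    # ERM over H_d = thresholds on the grid {k / 2^d}: |H_d| = 2^d and H_d is nested in H_{d+1}
    grid = [k / 2 ** d for k in range(2 ** d)]
    return min(grid, key=lambda t: err(t, data))

m, gamma = 300, 0.3
sample = noisy_sample(m)
split = int((1 - gamma) * m)
train, select = sample[:split], sample[split:]

# using the training part: pick one candidate g_d from each class H_d
candidates = {d: best_in_class(d, train) for d in range(1, 11)}
# using the selection part: choose among g_1, ..., g_10
cv_d = min(candidates, key=lambda d: err(candidates[d], select))
print("CV picks d =", cv_d, "with threshold", candidates[cv_d])
```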

14 Cross Validation: Performance
Errors
–e_cv(m), e_A(m)
Theorem: with probability 1-δ …
Is CV always near-optimal?!
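The theorem's formula is not in the transcript; what follows from the setup on slide 13 alone is the cost of the selection step (Hoeffding plus a union bound over the finitely many candidates), sketched here as an assumption about the intended bound.

```latex
% Choosing among candidates g_1, ..., g_k on an independent selection set of size \gamma m:
% with probability at least 1 - \delta, the chosen g_{cv} satisfies
e(g_{cv}) \;\le\; \min_i e(g_i) \;+\; 2\sqrt{\frac{\ln(2k/\delta)}{2\gamma m}}
% The candidates themselves were trained on only (1-\gamma)m examples, which is the
% trade-off behind the question of whether CV is always near-optimal.
```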

15 Minimum Description Length
Penalty: the size of h
Related to MAP
–size of h: -log(Pr[h])
–errors: -log(Pr[D|h])
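Writing out the MAP correspondence stated on the slide (a standard identity, not additional material):

```latex
\arg\max_h \Pr[h \mid D]
  \;=\; \arg\max_h \Pr[h]\,\Pr[D \mid h]
  \;=\; \arg\min_h \bigl(-\log \Pr[h] \;-\; \log \Pr[D \mid h]\bigr)
% i.e. minimize (description length of h) + (description length of the data/errors given h),
% which is the MDL penalty: the size of h plus an encoding of its mistakes.
```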

