Information Criterion for Model Selection
Romain Hugues
Outline: Description, Theoretical Background, In Practice

Problem description
Important parameters
We have N data points a, each an m-vector, sampled from an m’-dimensional manifold A. We want to estimate a d-dimensional manifold S. S is parameterized by an n-vector u constrained to lie in an n’-dimensional manifold U.
A model can be described by:
- d: dimension
- r (= m’ - d): codimension
- n’: degrees of freedom
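As a concrete illustration of these three numbers, the sketch below tabulates them for a few familiar geometric models. The class name `ManifoldModel` and the example models are assumptions chosen for illustration; they do not come from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ManifoldModel:
    """Hypothetical container for the parameters that describe a model S."""
    name: str
    d: int        # dimension of the manifold S
    m_prime: int  # dimension m' of the data manifold A
    n_prime: int  # degrees of freedom n' of the parameter u

    @property
    def r(self) -> int:
        # Codimension of S inside A: r = m' - d
        return self.m_prime - self.d


# A line in the plane: 1-dimensional, fixed by 2 free parameters
# (for example an angle and a distance from the origin).
line_2d = ManifoldModel("line in the plane", d=1, m_prime=2, n_prime=2)

# A plane in 3-space: 2-dimensional, fixed by 3 free parameters
# (a unit normal has 2, the distance from the origin adds 1).
plane_3d = ManifoldModel("plane in space", d=2, m_prime=3, n_prime=3)

# A single point in the plane: 0-dimensional, fixed by its 2 coordinates.
point_2d = ManifoldModel("point in the plane", d=0, m_prime=2, n_prime=2)

for model in (line_2d, plane_3d, point_2d):
    print(f"{model.name}: d={model.d}, r={model.r}, n'={model.n_prime}")
```

A lower-dimensional model (the point) has a larger codimension, so every datum contributes more constraint equations to the residual.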
Minimization and expected residual
- The maximum-likelihood solution of the problem is obtained by minimizing the residual J.
- A new notation denotes the residual with respect to a model S.
- Residuals are also defined for future data a*.
- The expected residual of model S is I(S).
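For orientation, these quantities are usually written as follows in the geometric fitting literature (for instance in Kanatani's formulation of the geometric information criterion). This is a sketch under assumptions: independent Gaussian noise with covariance matrices $V[a_\alpha]$, true positions $\bar{a}_\alpha$ on S, projected data $\hat{a}_\alpha$ on the fitted manifold $\hat{S}$, and the bracket notation $J[S]$. None of these symbols is confirmed by the slides.

```latex
% Residual minimized by maximum likelihood
% (V[a_\alpha]^{-} denotes the generalized inverse of the covariance):
J = \sum_{\alpha=1}^{N}
    (a_\alpha - \bar{a}_\alpha)^{\top} V[a_\alpha]^{-} (a_\alpha - \bar{a}_\alpha),
\qquad \bar{a}_\alpha \in S .

% Residual of the fitted model with respect to the observed data:
J[\hat{S}] = \sum_{\alpha=1}^{N}
    (a_\alpha - \hat{a}_\alpha)^{\top} V[a_\alpha]^{-} (a_\alpha - \hat{a}_\alpha) .

% Residual with respect to future data a^*_\alpha, drawn independently
% from the same distribution as the observed data:
J^{*}[\hat{S}] = \sum_{\alpha=1}^{N}
    (a^{*}_\alpha - \hat{a}_\alpha)^{\top} V[a_\alpha]^{-} (a^{*}_\alpha - \hat{a}_\alpha) .

% Expected residual of the model, averaged over both present and future data:
I(S) = E^{*}\!\left[ E\!\left[ J^{*}[\hat{S}] \right] \right] .
```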
Mahalanobis projection of data
Each datum is projected onto the manifold S at the point closest to it in Mahalanobis distance.
Optimally fitted manifold
The optimally fitted manifold is the member of the model that minimizes the residual J.
Evaluation of expected residual
I(S) involves future data a* that have not been observed, so it cannot be computed directly: we need to estimate I(S).
Geometric Information Criterion
- AIC(S) is an unbiased estimator of I(S).
- The noise level ε is extracted from the covariance.
- Dividing by the squared noise level gives the normalized residual.
- The same normalization gives the normalized AIC, written AIC_0(S).
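A minimal sketch of the criterion, assuming the standard form of the geometric AIC, AIC(S) = J + 2(dN + n')ε², and its normalized version AIC_0(S) = J/ε² + 2(dN + n'). The slides do not preserve the exact expressions, so treat these formulas and the function names as assumptions.

```python
def geometric_aic(J: float, N: int, d: int, n_prime: int, eps2: float) -> float:
    """Geometric AIC, assuming the form J + 2 (d N + n') eps^2.

    J       : residual of the fitted model
    N       : number of data points
    d       : dimension of the manifold S
    n_prime : degrees of freedom n' of the model
    eps2    : squared noise level eps^2
    """
    return J + 2.0 * (d * N + n_prime) * eps2


def normalized_aic(J: float, N: int, d: int, n_prime: int, eps2: float) -> float:
    """Normalized AIC: the geometric AIC divided by eps^2."""
    return J / eps2 + 2.0 * (d * N + n_prime)


# Example: a line fitted to N = 100 points in the plane (d = 1, n' = 2).
aic = geometric_aic(J=10.0, N=100, d=1, n_prime=2, eps2=0.01)
aic0 = normalized_aic(J=10.0, N=100, d=1, n_prime=2, eps2=0.01)
print(aic, aic0)
```

The penalty term 2(dN + n') grows with both the manifold dimension and the number of free parameters, so a more flexible model must reduce the residual enough to pay for its extra freedom.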
Model selection
S1 is "better" than S2 if AIC_0(S1) < AIC_0(S2).
If model S1 is correct, the noise level can be estimated from its residual.
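The noise estimate can be sketched as follows, assuming that for a correct model the normalized residual J/ε² behaves, to a first approximation, like a chi-square variable with rN - n' degrees of freedom. Under that assumption J/(rN - n') estimates ε². The function name is hypothetical.

```python
def estimate_noise_level_squared(J: float, N: int, r: int, n_prime: int) -> float:
    """Estimate eps^2 from the residual J of a model assumed to be correct.

    Uses E[J / eps^2] ~ r N - n' (chi-square mean), so eps^2 ~ J / (r N - n').
    """
    dof = r * N - n_prime
    if dof <= 0:
        raise ValueError("need r * N > n' to estimate the noise level")
    return J / dof


# Example: a line in the plane (r = 1, n' = 2) fitted to N = 100 points
# with residual J = 0.49 gives eps^2 = 0.49 / 98.
eps2_hat = estimate_noise_level_squared(J=0.49, N=100, r=1, n_prime=2)
print(eps2_hat)
```

The estimate is only meaningful when the chosen model really does contain the true manifold; using the residual of a wrong model inflates the noise level.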
Model comparison
S1 is "better" than S2 if AIC_0(S1) < AIC_0(S2).
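When the noise level is not known in advance, the comparison can be rewritten as a test on the ratio of residuals. The derivation below is a sketch under two assumptions that the slides do not confirm: the normalized AIC has the form $J/\varepsilon^2 + 2(dN + n')$, and $\varepsilon^2$ is estimated from S1, which is taken to be correct. Subscripts 1 and 2 label the parameters of S1 and S2.

```latex
% Noise level estimated from the model assumed correct:
\hat{\varepsilon}^2 = \frac{J_1}{r_1 N - n'_1}

% Normalized AIC of each model, i = 1, 2:
\mathrm{AIC}_0(S_i) = \frac{J_i}{\hat{\varepsilon}^2} + 2\,(d_i N + n'_i)

% S_1 is better than S_2 when AIC_0(S_1) < AIC_0(S_2):
\frac{J_2 - J_1}{\hat{\varepsilon}^2} > 2\left[(d_1 - d_2)N + (n'_1 - n'_2)\right]

% Substituting the noise estimate and dividing by r_1 N - n'_1:
\frac{J_2}{J_1} > 1 + \frac{2\left[(d_1 - d_2)N + (n'_1 - n'_2)\right]}{r_1 N - n'_1}
```

In this form the unknown noise level cancels out: only the two residuals and the model parameters are needed.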
What should be done in practice?
1. Collect data.
2. Estimate the manifold and the true positions for each model.
3. Compute the residual for each model.
4. If one model is always "correct", estimate the noise level from the residual of this model.
5. Compare the two models using the normalized AIC.
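The steps above can be sketched end to end on a toy problem. Everything here is an assumption made for illustration: synthetic 2-D data, isotropic noise (so the Mahalanobis distance reduces to the Euclidean one), a general line as S1 (d = 1, r = 1, n' = 2), a line through the origin as S2 (d = 1, r = 1, n' = 1), and the normalized AIC taken as J/ε² + 2(dN + n').

```python
import numpy as np


def fit_line(points, through_origin=False):
    """Total least squares line fit under isotropic noise.

    Returns the residual J (sum of squared orthogonal distances) and the
    projections of the data onto the fitted line (estimated true positions).
    """
    anchor = np.zeros(2) if through_origin else points.mean(axis=0)
    centered = points - anchor
    # The dominant right singular vector is the direction of the line.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    direction = vt[0]
    projected = anchor + np.outer(centered @ direction, direction)
    J = float(np.sum((points - projected) ** 2))
    return J, projected


def normalized_aic(J, eps2, N, d, n_prime):
    # Assumed form of the normalized geometric AIC.
    return J / eps2 + 2 * (d * N + n_prime)


# 1. Collect data: noisy samples of the line y = 0.5 x + 1 (not through the origin).
rng = np.random.default_rng(0)
N, sigma = 200, 0.05
x = rng.uniform(-1.0, 1.0, N)
data = np.column_stack([x, 0.5 * x + 1.0]) + rng.normal(0.0, sigma, size=(N, 2))

# 2-3. Fit each model, estimate true positions, compute residuals.
J1, positions1 = fit_line(data)                       # S1: general line
J2, positions2 = fit_line(data, through_origin=True)  # S2: line through the origin

# 4. S1 contains S2, so S1 is always correct; estimate eps^2 from it (r = 1, n' = 2).
eps2_hat = J1 / (1 * N - 2)

# 5. Compare the two models.
aic1 = normalized_aic(J1, eps2_hat, N, d=1, n_prime=2)
aic2 = normalized_aic(J2, eps2_hat, N, d=1, n_prime=1)
best = "S1" if aic1 < aic2 else "S2"
print(f"J1={J1:.3f}, J2={J2:.3f}, eps^2={eps2_hat:.5f}, preferred model: {best}")
```

Because the data were generated from a line with a clear offset, the constrained model S2 leaves a much larger residual and S1 is preferred. If the true line passed through the origin, the two residuals would be close and the smaller penalty of S2 would usually decide the comparison.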
In which situations is this useful?