Cross-Validation vs. Bootstrap Estimates of Prediction Error in Statistical Modeling
Kaniz Rashid Lubana Mamun, MS Student, CSU Hayward
Dr. Eric A. Suess, Assistant Professor of Statistics
Regression Analysis
To find the regression line for data (xi, yi), minimize the sum of squared residuals:
  SSE = Σi (yi − (b0 + b1 xi))²
Estimates linear relationships between dependent and independent variables.
Applications: prediction and forecasting.
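The least-squares criterion above can be sketched numerically. A minimal example (the x, y values below are made up purely for illustration):

```python
import numpy as np

# Hypothetical (xi, yi) data, made up for illustration.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Least squares: choose b0, b1 to minimize sum((y - (b0 + b1*x))**2).
X = np.column_stack([np.ones_like(x), x])      # design matrix [1, x]
b0, b1 = np.linalg.lstsq(X, y, rcond=None)[0]  # least-squares solution
print(b0, b1)                                  # b0 ≈ 0.14, b1 ≈ 1.96
```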
Classical Regression Procedure
Choose a model: y = b0 + b1 x1 + b2 x2 + e.
Verify assumptions: normality of the data.
Fit the model, checking for significance of the parameters.
Check the model's predictive capability.
Mean Squared Error of Prediction
MSEP measures how well a model predicts the response value of a future observation.
For our regression model, the MSEP of a new observation yn+1 is
  MSEP = E[(yn+1 − ŷn+1)²]
Small values of MSEP indicate good predictive capability.
What is Cross-Validation?
Divide the data into two sub-samples: a training set (to fit the model) and a validation set (to assess predictive value).
Non-parametric approach: mainly used when the normality assumption is not met.
Criterion for the model's prediction ability: usually the MSEP statistic.
CV For Linear Regression: The “Withhold-1” Algorithm
Use the model: y = b0 + b1 x1 + b2 x2 + e.
Withhold one observation (x1i, x2i, yi).
Fit the regression model to the remaining n − 1 observations.
For each i, calculate the squared prediction error for the withheld observation:
  (yi − ŷ(−i))², where ŷ(−i) is the prediction for observation i from the model fit without it.
Finally, calculate
  MSEP_CV = (1/n) Σi (yi − ŷ(−i))².
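The steps above can be sketched as follows; `loocv_msep` is a hypothetical helper name, not from the original slides:

```python
import numpy as np

def loocv_msep(X, y):
    """Withhold-1 (leave-one-out) CV estimate of MSEP for the
    linear model y = b0 + b1*x1 + b2*x2 + e."""
    n = len(y)
    A = np.column_stack([np.ones(n), X])  # design matrix with intercept
    sq_errs = []
    for i in range(n):
        keep = np.arange(n) != i          # withhold observation i
        b = np.linalg.lstsq(A[keep], y[keep], rcond=None)[0]
        y_hat = A[i] @ b                  # predict the withheld y_i
        sq_errs.append((y[i] - y_hat) ** 2)
    return np.mean(sq_errs)               # MSEP_CV = (1/n) * sum of sq. errors
```

If the data lie exactly on a plane, every withheld point is predicted perfectly and the estimate is essentially zero; noise in y shows up directly as a larger MSEP_CV.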
What is the Bootstrap?
The bootstrap is a computationally intensive technique involving simulation and resampling.
Used here to assess the accuracy of statistical estimates for a model: confidence intervals, standard errors, and an estimate of MSEP.
Algorithm For a Bootstrap
From a data set of size n, randomly draw B samples with replacement, each of size n.
Find the estimate of MSEP, θ̂(xb*), for each of the B samples.
Average these B estimates of θ to obtain the overall bootstrap estimate:
  θ̂_boot = (1/B) Σb θ̂(xb*).
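A sketch of the algorithm above for the regression setting. The slides do not spell out how MSEP is computed within each resample; this version refits the model on each bootstrap sample and evaluates its errors on the full data set, which is one reasonable reading. `bootstrap_msep`, the seed, and B=200 are all illustrative choices:

```python
import numpy as np

def bootstrap_msep(X, y, B=200, seed=0):
    """Bootstrap estimate of MSEP: refit the regression on each of B
    resamples (drawn with replacement, each of size n) and average
    the B per-sample MSEP estimates."""
    rng = np.random.default_rng(seed)
    n = len(y)
    A = np.column_stack([np.ones(n), X])  # design matrix with intercept
    estimates = []
    for _ in range(B):
        idx = rng.integers(0, n, size=n)  # draw n rows with replacement
        b = np.linalg.lstsq(A[idx], y[idx], rcond=None)[0]
        resid = y - A @ b                 # assess the fit on the full data
        estimates.append(np.mean(resid ** 2))
    return np.mean(estimates)             # overall bootstrap estimate
```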
Schematic Diagram of the Bootstrap: population F → data X = (x1, x2, …, xn) (sampling variability) → bootstrap samples X1*, X2*, …, XB* (resampling variability) → estimates Θ(X1*), Θ(X2*), …, Θ(XB*).
Application: Heart Measurements on Children
Study: catheterize 12 children with heart defects and take measurements.
Variables measured:
  y: observed catheter length in cm
  w: patient's weight in pounds
  h: patient's height in inches
Goal: predict y from w and h.
Difficulties: small n, non-normal data.
Model and Fitted Model
Model: y = b0 + b1 w + b2 h + e.
Fitted model: ŷ = 25.6 + 0.277 w.
Parameter estimates for the heart data: b0 estimated as 25.6, b1 estimated as 0.277; the b2 (height) term was eliminated from the model (not useful).
Regression Results
Both parameters b0 and b1 are significantly different from 0 (important to the model): p-values 0.000 (for b0) and 0.000 (for b1).
R² = 80% (of variation in y explained).
Once weight is known, height provides no additional useful information.
Example: for a child weighing 50 lbs., the estimated catheter length is 25.6 + 0.277 × 50 = 39.45 cm.
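The worked prediction can be checked directly from the fitted coefficients reported above:

```python
b0, b1 = 25.6, 0.277       # fitted intercept and weight coefficient
weight = 50                # child's weight in pounds
y_hat = b0 + b1 * weight   # predicted catheter length in cm
print(round(y_hat, 2))     # 39.45
```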
Comparison of CV and Bootstrap
MSEP estimates: CV: MSEP = 18.05; Bootstrap: MSEP = 12.04 (smaller = better).
For this example, the bootstrap has the better prediction capability.
In general: CV methods work well for large samples; the bootstrap is effective even for small samples.