Prediction Confidence Intervals, Cross-Validation, and Predictor Selection

Skill Set Why is the confidence interval for an individual point larger than for the regression line? Describe the steps in forward (backward, stepwise, blockwise, all possible regressions) predictor selection. What is cross-validation? Why is it important? What are the main problems, as far as R-square and prediction are concerned, with forward (backward, stepwise, blockwise, all possible regressions)?

Prediction v. Explanation Prediction is important for practice, e.g., WWII pilot training, where success was predicted from ability tests (e.g., eye-hand coordination) and items such as having built an airplane that flew, fear of heights, and favorite flavor of ice cream; or age and driving accidents. Explanation is crucial for theory. Highly correlated variables may not help predict, but may help explain, e.g., team outcomes as a function of team resources and team backup.

Confidence Intervals CI for the line, i.e., the mean score at a given X: Y' ± t(α/2, df) sqrt( MSR [ 1/N + (X − Mx)² / Σ(Xi − Mx)² ] ), where MSR is the mean square residual (the variance of the residuals), N is the sample size, and the df for t are the df for MSR, N − k − 1. The CI for a single person's score adds the full residual variance inside the bracket: Y' ± t(α/2, df) sqrt( MSR [ 1 + 1/N + (X − Mx)² / Σ(Xi − Mx)² ] ). Note the shape: both intervals are curved, flaring out as X moves away from the mean of X.

Computing Confidence Intervals Suppose N = 20 and k = 1, so df = N − k − 1 = 20 − 1 − 1 = 18. Find the CI for the line (mean) at X = 1; it is the narrower interval. For an individual at X = 1, the CI is 3.81 to 7.79, noticeably wider than the interval for the mean at the same X.
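
The interval arithmetic is easy to script. Below is a minimal sketch in Python (the data and variable names are made up for illustration, not from the slides) that computes both intervals for a simple regression using the MSR-based formulas above.

```python
import numpy as np
from scipy import stats

# Made-up data for illustration; N = 20 so df = N - k - 1 = 18.
x = np.linspace(0.0, 9.5, 20)
y = 4.0 + 1.5 * x + np.random.default_rng(1).normal(0, 2, size=20)

N, k = len(x), 1
b1 = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)   # OLS slope
b0 = y.mean() - b1 * x.mean()                          # OLS intercept
resid = y - (b0 + b1 * x)
ms_res = resid @ resid / (N - k - 1)                   # MSR, variance of residuals
t_crit = stats.t.ppf(0.975, df=N - k - 1)

def ci95(x0, individual=False):
    """95% CI for the mean at x0, or for a single person if individual=True."""
    leverage = 1.0 / N + (x0 - x.mean()) ** 2 / ((N - 1) * np.var(x, ddof=1))
    se = np.sqrt(ms_res * ((1.0 if individual else 0.0) + leverage))
    yhat = b0 + b1 * x0
    return yhat - t_crit * se, yhat + t_crit * se

print(ci95(1.0))                    # narrower: CI for the line (mean)
print(ci95(1.0, individual=True))   # wider: CI for an individual's score
```

Evaluating ci95 at several values of x0 also shows why the bands curve: the leverage term grows as x0 moves away from the mean of X.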

Review Why is the confidence interval for the individual wider than a similar interval for the regression line? Why are the confidence intervals for regression curved rather than straight lines?

Shrinkage R² is biased (the sample value is too large) because we capitalize on chance to minimize SSe in the sample. If the population value of R² is zero, the expected value in the sample is R² = k/(N − 1), where k is the number of predictors and N is the number of people in the sample. So if you have many predictors, you can make R² as large as you want; there is an ethical issue here. What is the expected value of R-square if N = 101 and k = 10? A common adjustment or shrinkage formula is Adj R² = 1 − (1 − R²)(N − 1)/(N − k − 1). This is reported by SAS (PROC REG) under 'Adj R-Sq.' It adjusts for k, for N, and for the size of the initial R².

Shrinkage Examples Suppose R² is .6405 with k = 4 predictors and a sample size of 30. Then Adj R² = 1 − (1 − .6405)(29/25) = .583. With the same N and k but an initial R² of .30, Adj R² = .188. Note that a small N means lots of shrinkage, and a smaller initial R² also shrinks (proportionally) more.
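
Both the chance-level expectation and the adjustment are one-line formulas. A small Python sketch (function names ours) that reproduces the quiz answer from the previous slide and the two examples above:

```python
def expected_r2_null(n, k):
    """Expected sample R^2 when the population R^2 is zero."""
    return k / (n - 1)

def adj_r2(r2, n, k):
    """Adjusted (shrunken) R^2, as reported by SAS PROC REG under Adj R-Sq."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

print(expected_r2_null(101, 10))         # 0.10 for N = 101, k = 10
print(round(adj_r2(0.6405, 30, 4), 3))   # 0.583
print(round(adj_r2(0.30, 30, 4), 3))     # 0.188
```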

Cross-Validation Compute a and b(s) (there can be one or more IVs) on an initial sample. Find a new sample; do not re-estimate a and b, but use them to find Y'. Compute the correlation between Y and Y' in the new sample; square it. Ta da! That is the cross-validation R². Cross-validation R² does not capitalize on chance and estimates the operational R².
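
A minimal Python sketch of the procedure, with simulated calibration and validation samples standing in for real data: estimate the weights once, apply them unchanged, correlate, and square.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_sample(n):
    """Simulated data with 2 predictors, standing in for two real samples."""
    X = rng.normal(size=(n, 2))
    y = 0.5 * X[:, 0] + 0.3 * X[:, 1] + rng.normal(size=n)
    return X, y

X_cal, y_cal = make_sample(100)   # initial (calibration) sample
X_val, y_val = make_sample(100)   # new (validation) sample

# Estimate a and b's on the calibration sample only.
A = np.column_stack([np.ones(len(y_cal)), X_cal])
coef, *_ = np.linalg.lstsq(A, y_cal, rcond=None)

# Do NOT re-estimate; just compute Y' in the new sample with the old weights.
y_pred = np.column_stack([np.ones(len(y_val)), X_val]) @ coef

r = np.corrcoef(y_val, y_pred)[0, 1]
print("cross-validated R^2:", r ** 2)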

Cross-Validation (2) Variations: double cross-validation, data splitting, and expert judgment weights (don't try this at home). There are also formula-based math estimates. For fixed predictors, a commonly cited form (Lord/Nicholson) is R²cv = 1 − (1 − R²)(N + k + 1)/(N − k − 1); for random predictors, Stein's estimate is R²cv = 1 − (1 − R²)[(N − 1)/(N − k − 1)][(N − 2)/(N − k − 2)][(N + 1)/N].
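
Assuming the Lord/Nicholson and Stein forms given above, here is a short Python sketch of the formula-based estimates; no second sample is required.

```python
def cv_fixed(r2, n, k):
    """Estimated cross-validity for fixed predictors (Lord/Nicholson form)."""
    return 1 - (1 - r2) * (n + k + 1) / (n - k - 1)

def cv_random(r2, n, k):
    """Estimated cross-validity for random predictors (Stein form)."""
    factor = ((n - 1) / (n - k - 1)) * ((n - 2) / (n - k - 2)) * ((n + 1) / n)
    return 1 - (1 - r2) * factor

# Using the shrinkage example from earlier (R^2 = .6405, N = 30, k = 4):
print(round(cv_fixed(0.6405, 30, 4), 3))
print(round(cv_random(0.6405, 30, 4), 3))
```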

Review What is shrinkage in the context of multiple regression? What are the things that affect the expected amount of shrinkage? What is cross-validation? Why is it important?

Predictor Selection Widely misunderstood and widely misused. Algorithms labeled forward, backward, stepwise, etc. NEVER use for work involving theory or explanation (hint: this clearly means your thesis and dissertation). NEVER use for estimating importance of variables. Use SOLELY for economy (toss predictors).

All Possible Regressions [Correlation matrix of GPA (Y) with GREQ, GREV, MAT, and AR, plus means and SDs for each variable; e.g., r(GPA, GREQ) = .611.] Data from a Pedhazur example. GPA is grade point average. GREQ is Graduate Record Exam, Quantitative. GREV is GRE Verbal. MAT is the Miller Analogies Test. AR is an Arithmetic Reasoning test.

All Possible Regressions (2)

k   R²     Variables in Model
1   .385   AR
1   .384   GREQ
1   .365   MAT
1   .338   GREV
2   .583   GREQ MAT
2   .515   GREV AR
2   .503   GREQ AR
2   .493   GREV MAT
2   .492   MAT AR
2   .485   GREQ GREV
3   .617   GREQ GREV MAT
3   .610   GREQ MAT AR
3   .572   GREV MAT AR
3   .572   GREQ GREV AR
4   .640   GREQ GREV MAT AR

Note how easy it is to choose the model with the highest R² for any given number of predictors. In predictor selection, you also need to worry about cost: you get both V and Q GRE in one test. Also consider what a change in R² means, e.g., accuracy in prediction of dropout.
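
Generating a table like this is mechanical. A sketch of all-possible-regressions in Python, assuming X is an N x k array of predictor scores and y the criterion (the names and data are placeholders, not the Pedhazur data):

```python
from itertools import combinations
import numpy as np

def all_subsets_r2(X, y, names):
    """Fit every subset of predictors by OLS and return (size, R^2, names)."""
    ss_tot = (y - y.mean()) @ (y - y.mean())
    results = []
    for size in range(1, X.shape[1] + 1):
        for cols in combinations(range(X.shape[1]), size):
            A = np.column_stack([np.ones(len(y)), X[:, list(cols)]])
            coef, *_ = np.linalg.lstsq(A, y, rcond=None)
            resid = y - A @ coef
            r2 = 1 - resid @ resid / ss_tot
            results.append((size, round(r2, 3), [names[c] for c in cols]))
    # Sort within each subset size so the best model of each size is obvious.
    return sorted(results, key=lambda t: (t[0], -t[1]))

# for row in all_subsets_r2(X, y, ["GREQ", "GREV", "MAT", "AR"]):
#     print(row)
```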

Predictor Selection Algorithms Forward: build up from the start, entering the variable with the best p value at each step; end when no remaining variable meets PIN (the p-to-enter criterion). May include duds. Backward: start with all variables and pull them out by POUT (the p-to-remove criterion); may lose gems. Stepwise: run forward, but check backward at each step; not guaranteed to give the best R². Blockwise: not used much; forward by blocks, then any method (e.g., stepwise) within each block to choose the best predictors. A sketch of the forward algorithm follows.
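
Below is a sketch of forward selection with a PIN-style p-to-enter rule, again assuming X (N x k) and y arrays; it illustrates the logic, not any particular package's implementation.

```python
import numpy as np
from scipy import stats

def r2_of(X, y, cols):
    """R^2 for an OLS model with an intercept and the given predictor columns."""
    A = np.column_stack([np.ones(len(y))] + [X[:, c] for c in cols])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return 1 - resid @ resid / ((y - y.mean()) @ (y - y.mean()))

def forward_select(X, y, pin=0.05):
    """Enter the best remaining predictor until none meets the PIN criterion."""
    chosen, remaining, r2_old = [], list(range(X.shape[1])), 0.0
    while remaining:
        # Try each remaining predictor; keep the one with the biggest R^2 gain.
        r2_new, best = max((r2_of(X, y, chosen + [c]), c) for c in remaining)
        df2 = len(y) - (len(chosen) + 1) - 1
        f = (r2_new - r2_old) / ((1 - r2_new) / df2)   # F for a 1-df increment
        if stats.f.sf(f, 1, df2) > pin:                # stop: nothing meets PIN
            break
        chosen.append(best)
        remaining.remove(best)
        r2_old = r2_new
    return chosen

# chosen = forward_select(X, y)   # indices of the selected predictors
```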

Things to Consider in PS The algorithms consider statistical significance, but you also have to consider practical significance and cost, i.e., the algorithms alone don't work well. Surviving variables are often there by chance: do the analysis again and you would choose a different set. That is OK for prediction, but the value of correlated variables is quite different when considered in path analysis and SEM.

Hierarchical Regression Alternative to predictor selection algorithms: theory-based (a priori) tests of increments to R-square.

Example of Hierarchical Reg Does personality increase prediction of med school success beyond that afforded by cognitive ability? Collect data on 250 med students for the first two years. Model 1 (cognitive ability measures): R² = .10, p < .05. Model 2 (adding the personality measures): R² = .13, p < .05. Model test for the increment: F(2, 245) = 4.22, p < .05, so personality adds a significant increment.
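
The model test is the usual F for an increment to R². A quick Python check, assuming (consistently with the degrees of freedom shown) that Model 1 has 2 predictors and Model 2 has 4:

```python
def increment_f(r2_full, r2_reduced, n, k_full, k_reduced):
    """F test for the increment in R^2 from a reduced to a full model."""
    df1 = k_full - k_reduced
    df2 = n - k_full - 1
    f = ((r2_full - r2_reduced) / df1) / ((1 - r2_full) / df2)
    return f, df1, df2

print(increment_f(0.13, 0.10, 250, 4, 2))   # ~ (4.22, 2, 245), as on the slide
```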

Review Describe the steps in forward (backward, stepwise, blockwise, all possible regressions) predictor selection. What are the main problems, as far as R-square and prediction are concerned, with forward (backward, stepwise, blockwise, all possible regressions)? Why avoid predictor selection algorithms when doing substantive research (when you want to explain variance in the DV)?