EPI809/Spring Testing Individual Coefficients
Test of Slope Coefficient $\beta_p$
1. Tests whether there is a linear relationship between one X and Y
2. Involves a single population slope $\beta_p$
3. Hypotheses: $H_0: \beta_p = 0$ vs. $H_a: \beta_p \neq 0$
Test of Slope Coefficient $\beta_p$: Test Statistic
$t = \hat{\beta}_p / s_{\hat{\beta}_p}$, with $n-k-1$ degrees of freedom
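Test of Slope Coefficient: Rejection Rule
- Reject $H_0$ in favor of $H_a$ if t falls in the colored area, that is, if $|t| > t_{1-\alpha/2}(n-k-1)$
- Equivalently, reject $H_0$ for $H_a$ if P-value $= 2P(T > |t|) < \alpha$, where $T \sim t(n-k-1)$
[Figure: density of $t(n-k-1)$ centered at 0, with colored rejection regions of area $\alpha/2$ below $-t_{1-\alpha/2}(n-k-1)$ and above $t_{1-\alpha/2}(n-k-1)$]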
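As a worked instance of the rejection rule, take the cow example used in the SAS slides of this deck: n = 6 observations (the N reported in the PROC CORR output) and k = 2 predictors (food and weight). The level α = 0.05 is an assumption made here for illustration only.

```latex
\[
\text{df} = n - k - 1 = 6 - 2 - 1 = 3, \qquad
t_{1-\alpha/2}(3) = t_{0.975}(3) \approx 3.182
\]
\[
\text{Reject } H_0 : \beta_p = 0 \quad \text{if} \quad |t| > 3.182 .
\]
```

With only 3 error degrees of freedom, the critical value is much larger than the familiar 1.96 from the normal distribution.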
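Individual Coefficients: SAS Output
Parameter Estimates
Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
Intercept        $\hat{\beta}_0$
Food             $\hat{\beta}_1$
weight           $\hat{\beta}_2$
Annotations: the t Value column is $\hat{\beta}_p / s_{\hat{\beta}_p}$; the Pr > |t| column is the P-value.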
Testing Model Portions
Testing Model Portions
1. Tests the contribution of a set of X variables to the relationship with Y
2. Null hypothesis $H_0: \beta_{g+1} = \dots = \beta_k = 0$
   - Variables in the set do not significantly improve the model when all other variables are included
3. Used in selecting X variables or models
Testing Model Portions: Nested Models
$H_0$: Reduced model ($\beta_{g+1} = \dots = \beta_k = 0$)
$H_a$: Full model
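F-Test for Nested Models
$F = \dfrac{(SSE_R - SSE_F)/(k-g)}{SSE_F/(n-(k+1))}$, where $SSE_R$ and $SSE_F$ are the error sums of squares of the reduced and full models
- Numerator: reduction in SSE from the additional parameters; df = $k-g$ = number of additional parameters
- Denominator: SSE of the full model; df = $n-(k+1)$ = error df of the full model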
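In PROC REG, a nested-model F test can be requested with a TEST statement placed after the MODEL statement. The sketch below reuses the Cow data set and the milk = food weight model from the SAS slides in this deck; the label `both` is an arbitrary name chosen here for illustration. Because it sets both slopes to zero (g = 0, k = 2), this particular test coincides with the overall F test of the model. With more predictors in the model, listing only a subset of them tests that subset.

```sas
proc reg data=Cow;
   model milk = food weight;
   /* H0: beta_food = beta_weight = 0 (reduced model has the intercept only) */
   both: test food = 0, weight = 0;
run;
quit;
```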
Selecting Variables in Model Building
Model Building with Computer Searches
1. Rule: use as few X variables as possible
2. Stepwise regression
   - Computer selects the X variable most highly correlated with Y
   - Continues to add or remove variables depending on SSE
3. Best subset approach
   - Computer examines all possible sets
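Both search strategies are available through the SELECTION= option of the MODEL statement in PROC REG. The sketch below assumes the Cow data set used elsewhere in this deck; the entry and stay thresholds of 0.15 are values chosen here for illustration. Note that PROC REG decides entry and removal by the significance level of each variable's F statistic, which is one way of acting on the change in SSE.

```sas
/* Stepwise selection: variables enter (slentry) and leave (slstay)
   according to the significance of their F statistics */
proc reg data=Cow;
   model milk = food weight / selection=stepwise slentry=0.15 slstay=0.15;
run;
quit;

/* Best-subset style search: lists candidate subsets ordered by R-square,
   with adjusted R-square and Mallows' Cp printed for comparison */
proc reg data=Cow;
   model milk = food weight / selection=rsquare adjrsq cp;
run;
quit;
```

With only two candidate predictors the search is trivial; the same statements apply unchanged to longer variable lists.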
Residual Analysis for Goodness of Fit
Residual (Estimated Errors) Analysis
1. Graphical analysis of residuals
   - Plot estimated errors vs. $X_i$ values (or predicted values)
   - Plot a histogram or stem-and-leaf display of the residuals
2. Purposes
   - Examine functional form (linear vs. nonlinear model)
   - Evaluate violations of assumptions (to ensure the validity of the statistical tests on the β's)
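One way to obtain the histogram and stem-and-leaf display is to save the residuals from PROC REG and pass them to PROC UNIVARIATE. This is a minimal sketch assuming the Cow data set from this deck; the output data set name `resid` and the variable names `res` and `pred` are hypothetical names introduced here.

```sas
proc reg data=Cow;
   model milk = food weight;
   /* Save residuals (r=) and predicted values (p=) to a new data set */
   output out=resid r=res p=pred;
run;
quit;

/* PLOT requests stem-and-leaf (or bar chart), box plot, and normal
   probability plot; NORMAL requests tests of normality */
proc univariate data=resid plot normal;
   var res;
   histogram res / normal;   /* histogram with a fitted normal curve */
run;
```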
Recall: Linear Regression Assumptions
1. Mean of the distribution of error is 0
2. Distribution of error has constant variance
3. Distribution of error is normal
4. Errors are independent
Residual Plot for Functional Form
[Figure: two residual plots, one showing a nonlinear pattern and one showing correct specification]
Residual Plot for Equal Variance
[Figure: two residual plots, one showing unequal variance (fan-shaped) and one showing correct specification]
Standardized residuals are typically used (residual divided by its standard error).
Residual Plot for Independence
[Figure: two residual plots, one showing residuals that are not independent and one showing correct specification]
Residuals Diagnostics in SAS
   /* Plot residuals against predicted values */
   symbol v=dot h=2 c=green;
   PROC REG data=Cow;
      model milk = food weight;
      plot residual.*predicted. / cHREF=red cframe=ligr;
   run;
Check for Outlying Observations and Influence Analysis
   /* Plot studentized residuals by observation, with reference lines at -2 and 2 */
   symbol v=dot h=2 c=green;
   proc reg data=cow;
      model milk = food weight / influence;
      plot rstudent.*obs. / vref=-2 2 cvref=blue lvref=2
                            HREF=0 to 7 by 1 cHREF=red cframe=ligr;
   run;
Influence Analysis of Each Observation
The REG Procedure
Model: MODEL1
Dependent Variable: Milk
Output Statistics
Obs   Residual   RStudent   Hat Diag H   Cov Ratio   DFFITS   DFBETAS (Intercept, Food, weight)
Multicollinearity
1. High correlation between X variables
2. Coefficients measure combined effect
3. Leads to unstable coefficients, depending on which X variables are in the model
4. Always exists to some degree
5. Example: using both age and height of children as independent variables in the same model
Detecting Multicollinearity
1. Examine the correlation matrix
   - Warning sign: correlations between pairs of X variables are larger than their correlations with the Y variable
2. Examine the variance inflation factor (VIF)
   - If $VIF_j > 5$ (or 10 according to most references), multicollinearity exists
3. Few remedies
   - Obtain new sample data
   - Eliminate one correlated X variable
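For reference, the standard definition of the variance inflation factor connects the two detection methods. Here $R_j^2$ denotes the coefficient of determination from regressing $X_j$ on the other X variables, and $r_{12}$ the correlation between the two predictors; this notation is introduced here for illustration.

```latex
\[
VIF_j = \frac{1}{1 - R_j^2}
\]
\[
VIF_j > 5 \iff R_j^2 > 0.8, \qquad VIF_j > 10 \iff R_j^2 > 0.9
\]
\[
\text{With two predictors: } R_1^2 = R_2^2 = r_{12}^2
\;\Rightarrow\; VIF_1 = VIF_2 = \frac{1}{1 - r_{12}^2}
\]
```

So in a two-predictor model such as milk = food weight, the VIF carries the same information as the single correlation between the predictors; it adds information beyond the correlation matrix only when there are three or more X variables.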
SAS Code: Vet Example
   PROC CORR data=vet;
      VAR milk food weight;
   run;
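Correlation Matrix: SAS Computer Output
Pearson Correlation Coefficients, N = 6
Prob > |r| under H0: Rho=0
            Milk       Food       weight
Milk        1          $r_{Y1}$   $r_{Y2}$
Food        $r_{Y1}$   1          $r_{12}$
weight      $r_{Y2}$   $r_{12}$   1
Annotations: the diagonal entries are all 1's; $r_{Y1}$ and $r_{Y2}$ are the correlations of Y with each X; $r_{12}$ is the correlation between the two X variables.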
Variance Inflation Factors: SAS Code
   /* VIF measures the inflation in the variances of the parameter
      estimates due to collinearity among the regressors
      (independent variables) */
   PROC REG data=Cow;
      model milk = food weight / VIF;
   run;
Variance Inflation Factors: Computer Output
Parameter Estimates
Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|   Variance Inflation
Intercept
Food
weight
Annotation on the Variance Inflation column: $VIF_1$ is compared with the threshold 5.
Types of Regression Models Viewed from the Explanatory Variables Standpoint
Regression Models Based on a Single Quantitative Explanatory Variable
Types of Regression Models
First-Order Model With 1 Independent Variable
1. Relationship between 1 dependent and 1 independent variable is linear: $E(Y) = \beta_0 + \beta_1 X_1$
2. Used when the expected rate of change in Y per unit change in X is stable
First-Order Model Relationships
[Figure: two panels of Y versus $X_1$, one with $\beta_1 < 0$ and one with $\beta_1 > 0$]
First-Order Model Worksheet
Run regression with Y, $X_1$.
Second-Order Model With 1 Independent Variable
1. Relationship between 1 dependent and 1 independent variable is a quadratic function
2. Useful first model if a nonlinear relationship is suspected
3. Model: $E(Y) = \beta_0 + \beta_1 X_1 + \beta_2 X_1^2$, where $\beta_1 X_1$ is the linear effect and $\beta_2 X_1^2$ is the curvilinear effect
Second-Order Model Relationships
[Figure: two panels of Y versus $X_1$, one with $\beta_2 > 0$ and one with $\beta_2 < 0$]
Second-Order Model Worksheet
Create an $X_1^2$ column. Run regression with Y, $X_1$, $X_1^2$.
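In SAS, the worksheet steps can be sketched with a DATA step followed by PROC REG. The data set names `mydata` and `quad` and the variable names `y`, `x1`, and `x1sq` are hypothetical placeholders, not names from the course data.

```sas
data quad;
   set mydata;        /* hypothetical input data set containing y and x1 */
   x1sq = x1**2;      /* squared term carrying the curvilinear effect */
run;

proc reg data=quad;
   model y = x1 x1sq; /* the t test on x1sq tests H0: beta_2 = 0 */
run;
quit;
```

The squared column has to be created beforehand because the MODEL statement in PROC REG accepts variable names only, not expressions.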
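Third-Order Model With 1 Independent Variable
1. Relationship between 1 dependent and 1 independent variable has a 'wave'
2. Used if there is 1 reversal in curvature
3. Model: $E(Y) = \beta_0 + \beta_1 X_1 + \beta_2 X_1^2 + \beta_3 X_1^3$, where $\beta_1 X_1$ is the linear effect and $\beta_2 X_1^2$, $\beta_3 X_1^3$ are the curvilinear effects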
Third-Order Model Relationships
[Figure: two panels of Y versus $X_1$, one with $\beta_3 < 0$ and one with $\beta_3 > 0$]
Third-Order Model Worksheet
Multiply $X_1$ by $X_1$ to get $X_1^2$. Multiply $X_1$ by $X_1$ by $X_1$ to get $X_1^3$. Run regression with Y, $X_1$, $X_1^2$, $X_1^3$.
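As an alternative to building the power columns by hand, PROC GLM accepts products of a continuous variable directly in the MODEL statement. This is a sketch only, using a different procedure from the PROC REG examples in this deck; `mydata`, `y`, and `x1` are hypothetical names.

```sas
proc glm data=mydata;
   /* x1*x1 and x1*x1*x1 are the quadratic and cubic terms of x1 */
   model y = x1 x1*x1 x1*x1*x1;
run;
quit;
```

PROC GLM reports Type I (sequential) sums of squares by default, so the line for `x1*x1*x1` indicates what the cubic term adds once the linear and quadratic terms are already in the model.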