EPI 809/Spring Probability Distribution of Random Error

Regression Modeling Steps
1. Hypothesize Deterministic Component
2. Estimate Unknown Model Parameters
3. Specify Probability Distribution of Random Error Term
   - Estimate Standard Deviation of Error
4. Evaluate Model
5. Use Model for Prediction & Estimation

Linear Regression Assumptions
Assumptions of errors (Gauss-Markov condition):
1. Independent errors
2. Mean of probability distribution of errors is 0
3. Errors have constant variance $\sigma^2$, for which an estimator is $S^2$
4. Probability distribution of error is normal
5. Potential violation of G-M condition

Error Probability Distribution

Random Error Variation
1. Variation of Actual Y from Predicted Y
2. Measured by Standard Error of Regression Model
   - Sample standard deviation of $\hat{\varepsilon}$, denoted $s$
3. Affects Several Factors
   - Parameter Significance
   - Prediction Accuracy
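
The formula for $s$ did not survive on this slide. A minimal sketch of the usual simple-regression estimator follows, assuming one predictor so that the error degrees of freedom are $n-2$. The numbers plugged in are the SSE = 1.1 and df = 3 quoted in the Obstetrics example of this deck.

```latex
\[
s^2 = \frac{SSE}{n-2} = \frac{\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2}{n-2},
\qquad
s = \sqrt{s^2}
\]
\[
\text{Obstetrics example: } s^2 = \frac{1.1}{5-2} \approx 0.367,
\qquad s \approx 0.606
\]
```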

Evaluating the Model: Testing for Significance

Test of Slope Coefficient
1. Shows If There Is a Linear Relationship Between X & Y
2. Involves Population Slope $\beta_1$
3. Hypotheses
   - $H_0$: $\beta_1 = 0$ (No Linear Relationship)
   - $H_a$: $\beta_1 \ne 0$ (Linear Relationship)
4. Theoretical basis of the test statistic is the sampling distribution of the slope

Sampling Distribution of Sample Slopes
All possible sample slopes:
   Sample 1: 2.5
   Sample 2: 1.6
   Sample 3: 1.8
   Sample 4: 2.1
   ...
   Very large number of sample slopes
[Figure: sampling distribution of $\hat{\beta}_1$, centered at $\beta_1$ with standard error $S_{\hat{\beta}_1}$]

Slope Coefficient Test Statistic
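
The statistic itself is missing from the slide text. A sketch of the standard form is given below, consistent with the "t = estimate / standard error" annotation on the computer-output slide and with the $t_{(n-2)}$ reference distribution in the rejection rule. The symbol $SS_{xx}$ is introduced here for the corrected sum of squares of X and is an assumption about the deck's notation.

```latex
\[
t = \frac{\hat{\beta}_1 - 0}{S_{\hat{\beta}_1}},
\qquad
S_{\hat{\beta}_1} = \frac{s}{\sqrt{SS_{xx}}},
\qquad
SS_{xx} = \sum_{i=1}^{n} (X_i - \bar{X})^2,
\qquad
\text{df} = n - 2
\]
```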
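
Test of Slope Coefficient: Rejection Rule
- Reject $H_0$ in favor of $H_a$ if $t$ falls in the colored area, that is, if $|t| > t_{1-\alpha/2,\,(n-2)}$
- Equivalently, reject $H_0$ for $H_a$ if P-value $= P(|T| > |t|) < \alpha$, where $T \sim t_{(n-2)}$
[Figure: $t_{(n-2)}$ density centered at 0, with a rejection region of area $\alpha/2$ in each tail, below $-t_{1-\alpha/2,\,(n-2)}$ and above $t_{1-\alpha/2,\,(n-2)}$]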

Test of Slope Coefficient Example
Reconsider the Obstetrics example with the following data:

   Estriol (mg/24h)   B.w. (g/1000)
   1                  1
   2                  1
   3                  2
   4                  2
   5                  4

Is the linear relationship between Estriol & Birthweight significant at the .05 level?

Solution Table for $\beta$'s

Solution Table for SSE
Columns: Birth weight $= y$; Estriol $= x$; Predicted $= \hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$; (Obs - pred)$^2 = (y - \hat{y})^2$
SSE = 1.1
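
The two solution tables can be reproduced numerically. The sketch below is one way to do it in plain Python, assuming the five (estriol, birthw) pairs listed in the SAS data step of this deck; the variable names are illustrative and not taken from the slides.

```python
# Data transcribed from the SAS data step in this deck (assumed to be
# the pairs 1 1, 2 1, 3 2, 4 2, 5 4).
x = [1, 2, 3, 4, 5]   # estriol (mg/24h)
y = [1, 1, 2, 2, 4]   # birth weight (g/1000)

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Corrected sum of squares of x and sum of cross-products.
ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Least-squares estimates of the slope and intercept.
b1 = ss_xy / ss_xx
b0 = y_bar - b1 * x_bar

# Predicted values and the sum of squared errors.
y_hat = [b0 + b1 * xi for xi in x]
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))

print(f"b0 = {b0:.2f}, b1 = {b1:.2f}, SSE = {sse:.2f}")
```

With these data the computed SSE agrees with the SSE = 1.1 shown on the slide.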

Test of Slope Parameter Solution
$H_0$: $\beta_1 = 0$
$H_a$: $\beta_1 \ne 0$
$\alpha = .05$
df = 3
Critical Value(s): $\pm t_{.975,\,3} = \pm 3.182$
Test Statistic: $t = \hat{\beta}_1 / S_{\hat{\beta}_1}$

Test Statistic Solution (from table)
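
Test of Slope Parameter
$H_0$: $\beta_1 = 0$
$H_a$: $\beta_1 \ne 0$
$\alpha = .05$
df = 3
Critical Value(s): $\pm t_{.975,\,3} = \pm 3.182$
Test Statistic: $t = \hat{\beta}_1 / S_{\hat{\beta}_1}$
Decision: Reject $H_0$ at $\alpha = .05$
Conclusion: There is evidence of a linear relationship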
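
As a rough check of the decision above, the test can be worked through in a few lines of Python. This is a sketch under two assumptions: the data are the five pairs from the SAS data step in this deck, and the critical value 3.182 is read from a standard t table (two-sided, $\alpha = .05$, df = 3) rather than computed.

```python
import math

x = [1, 2, 3, 4, 5]   # estriol, assumed from the SAS data step
y = [1, 1, 2, 2, 4]   # birth weight, assumed from the SAS data step
T_CRIT = 3.182        # t table value: two-sided alpha = .05, df = 3

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Least-squares fit.
b1 = ss_xy / ss_xx
b0 = y_bar - b1 * x_bar

# Standard error of regression, s = sqrt(SSE / (n - 2)).
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s = math.sqrt(sse / (n - 2))

# Standard error of the slope and the t statistic for H0: beta1 = 0.
se_b1 = s / math.sqrt(ss_xx)
t_stat = b1 / se_b1

reject = abs(t_stat) > T_CRIT
print(f"t = {t_stat:.2f}, critical value = {T_CRIT}, reject H0: {reject}")
```

The statistic comes out near 3.66, which exceeds 3.182, in line with the slide's decision to reject $H_0$.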
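
Test of Slope Parameter: Computer Output
Parameter Estimates

   Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
   Intercept
   Estriol

Annotations: Parameter Estimate $= \hat{\beta}_k$; Standard Error $= S_{\hat{\beta}_k}$; t Value $= \hat{\beta}_k / S_{\hat{\beta}_k}$; Pr > |t| = P-value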

Measures of Variation in Regression
1. Total Sum of Squares ($SS_{yy}$)
   - Measures Variation of Observed $Y_i$ Around the Mean $\bar{Y}$
2. Explained Variation (SSR)
   - Variation Due to Relationship Between X & Y
3. Unexplained Variation (SSE)
   - Variation Due to Other Factors

Variation Measures
- Total sum of squares: $\sum (Y_i - \bar{Y})^2$
- Unexplained sum of squares: $\sum (Y_i - \hat{Y}_i)^2$
- Explained sum of squares: $\sum (\hat{Y}_i - \bar{Y})^2$

Coefficient of Determination
1. Proportion of Variation 'Explained' by Relationship Between X & Y
   $0 \le r^2 \le 1$
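
The defining ratio is not preserved in the slide text. A sketch of the usual definition follows, written with the three sums of squares named on the preceding slides (total $SS_{yy}$, explained SSR, unexplained SSE).

```latex
\[
r^2 = \frac{\text{Explained variation}}{\text{Total variation}}
    = \frac{SSR}{SS_{yy}}
    = \frac{SS_{yy} - SSE}{SS_{yy}}
    = 1 - \frac{SSE}{SS_{yy}}
\]
```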

Coefficient of Determination Examples
[Figure: scatterplots illustrating $r^2 = 1$, $r^2 = .8$, and $r^2 = 0$]

Coefficient of Determination Example
Reconsider the Obstetrics example. Interpret a coefficient of determination of about .82.
Answer: About 82% of the total variation of birthweight is explained by the mother's Estriol level.
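
The 82% figure can be checked from quantities already on the slides. The sketch below assumes the birth weights from the SAS data step in this deck and takes SSE = 1.1 from the SSE solution table; the names are illustrative.

```python
y = [1, 1, 2, 2, 4]   # birth weight, assumed from the SAS data step
SSE = 1.1             # unexplained sum of squares, from the SSE slide

y_bar = sum(y) / len(y)

# Total sum of squares around the mean of y.
ss_yy = sum((yi - y_bar) ** 2 for yi in y)

# Proportion of total variation explained by the regression.
r_squared = 1 - SSE / ss_yy

print(f"SS_yy = {ss_yy:.1f}, r^2 = {r_squared:.4f}")
```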

$r^2$ Computer Output

   Root MSE          R-Square
   Dependent Mean    Adj R-Sq
   Coeff Var

Annotations: Root MSE $= s$; R-Square $= r^2$; Adj R-Sq $= r^2$ adjusted for number of explanatory variables & sample size

Using the Model for Prediction & Estimation

Prediction With Regression Models
What Is Predicted?
- Population Mean Response E(Y) for Given X
  - Point on Population Regression Line
- Individual Response ($Y_i$) for Given X

What Is Predicted?

Confidence Interval Estimate of Mean Y
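
The interval formula is missing from the slide text. A sketch of the standard form for simple linear regression is shown below; $X_p$ denotes the X value at which the mean is estimated, and $SS_{xx}$ is notation introduced here as an assumption.

```latex
\[
\hat{Y} \pm t_{1-\alpha/2,\,n-2} \cdot S_{\hat{Y}},
\qquad
S_{\hat{Y}} = s \sqrt{\frac{1}{n} + \frac{(X_p - \bar{X})^2}{SS_{xx}}},
\qquad
SS_{xx} = \sum_{i=1}^{n} (X_i - \bar{X})^2
\]
```

Each term maps onto a width factor: the $t$ quantile carries the confidence level, $s$ the data dispersion, $1/n$ the sample size, and $(X_p - \bar{X})^2$ the distance from the mean.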

Factors Affecting Interval Width
1. Level of Confidence ($1 - \alpha$)
   - Width Increases as Confidence Increases
2. Data Dispersion ($s$)
   - Width Increases as Variation Increases
3. Sample Size
   - Width Decreases as Sample Size Increases
4. Distance of $X_p$ from Mean $\bar{X}$
   - Width Increases as Distance Increases

Why Distance from Mean?
[Figure: fitted lines plotted against X; at an X value far from $\bar{X}$ the predicted values show greater dispersion than at $X_1$]

Confidence Interval Estimate Example
Reconsider the Obstetrics example with the following data:

   Estriol (mg/24h)   B.w. (g/1000)
   1                  1
   2                  1
   3                  2
   4                  2
   5                  4

Estimate the mean BW and a subject's BW response when the Estriol level is 4, at the .05 level.

Solution Table

Confidence Interval Estimate Solution: Mean BW
[Formula annotation: $X_p$ is the X value to be predicted]
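
The numerical solution is not preserved on this slide. The following sketch works the mean-BW interval at Estriol = 4 under stated assumptions: the data are the five pairs from the SAS data step in this deck, the t quantile 3.182 comes from a t table (df = 3), and the helper `mean_response_ci` is a hypothetical name introduced for illustration.

```python
import math

x = [1, 2, 3, 4, 5]   # estriol, assumed from the SAS data step
y = [1, 1, 2, 2, 4]   # birth weight, assumed from the SAS data step
T_CRIT = 3.182        # t table value: 95% confidence, df = 3


def mean_response_ci(x, y, x_p, t_crit):
    """Return (estimate, lower, upper) for the mean response E(Y) at x_p."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = ss_xy / ss_xx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))
    # Standard error of the estimated mean response at x_p.
    se_mean = s * math.sqrt(1 / n + (x_p - x_bar) ** 2 / ss_xx)
    y_hat = b0 + b1 * x_p
    return y_hat, y_hat - t_crit * se_mean, y_hat + t_crit * se_mean


estimate, lower, upper = mean_response_ci(x, y, x_p=4, t_crit=T_CRIT)
print(f"mean BW at estriol 4: {estimate:.2f} ({lower:.2f}, {upper:.2f})")
```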

Prediction Interval of Individual Response
Note!

Why the Extra 'S'?
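
The prediction-interval formula and the figure explaining it did not survive extraction. A sketch of the standard result is given below, on the assumption that the "extra" term the slide title refers to is the additional $s^2$ (the 1 under the square root) that accounts for an individual's scatter around the regression line.

```latex
\[
\hat{Y} \pm t_{1-\alpha/2,\,n-2} \cdot S_{(Y - \hat{Y})},
\qquad
S_{(Y - \hat{Y})} = s \sqrt{1 + \frac{1}{n} + \frac{(X_p - \bar{X})^2}{SS_{xx}}}
\]
\[
\underbrace{S_{(Y - \hat{Y})}^2}_{\text{individual response}}
= \underbrace{s^2}_{\text{scatter of } Y \text{ about } E(Y)}
+ \underbrace{s^2 \left( \frac{1}{n} + \frac{(X_p - \bar{X})^2}{SS_{xx}} \right)}_{\text{uncertainty in the estimated mean, } S_{\hat{Y}}^2}
\]
```

Because of the added $s^2$, the prediction interval is always wider than the confidence interval for the mean at the same $X_p$.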

SAS code for computing mean and prediction intervals

   Data BW;   /* Reading data in SAS */
     input estriol birthw;
     cards;
   1 1
   2 1
   3 2
   4 2
   5 4
   ;
   run;

   /* Fitting a linear regression model.                         */
   /* CLM: confidence limits for the mean response.              */
   /* CLI: prediction limits for an individual response.         */
   PROC REG data=BW;
     model birthw = estriol / CLI CLM alpha=.05;
   run;

Interval Estimate from SAS Output
The REG Procedure
Dependent Variable: y
Output Statistics

   Obs   Dep Var y   Predicted Value   Std Error Mean Predict   95% CL Mean   95% CL Predict   Residual

Annotations: Predicted Value = predicted Y when X = 3; Std Error Mean Predict $= S_{\hat{Y}}$; 95% CL Mean = confidence interval; 95% CL Predict = prediction interval
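
The numeric rows of the output table are not recoverable from the slide text. As a rough stand-in, the sketch below computes the same columns in Python for each observation. It assumes the five data pairs from the SAS data step and a t-table quantile of 3.182 (df = 3); it is not the SAS output itself, and the dictionary keys are illustrative names.

```python
import math

x = [1, 2, 3, 4, 5]   # estriol, assumed from the SAS data step
y = [1, 1, 2, 2, 4]   # birth weight, assumed from the SAS data step
T_CRIT = 3.182        # t table value: 95% confidence, df = 3

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = ss_xy / ss_xx
b0 = y_bar - b1 * x_bar
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s = math.sqrt(sse / (n - 2))

rows = []
for xi, yi in zip(x, y):
    pred = b0 + b1 * xi
    h = 1 / n + (xi - x_bar) ** 2 / ss_xx   # 1/n plus distance-from-mean term
    se_mean = s * math.sqrt(h)              # std error of the mean prediction
    se_indiv = s * math.sqrt(1 + h)         # std error for an individual
    rows.append({
        "x": xi,
        "y": yi,
        "predicted": pred,
        "se_mean": se_mean,
        "cl_mean": (pred - T_CRIT * se_mean, pred + T_CRIT * se_mean),
        "cl_predict": (pred - T_CRIT * se_indiv, pred + T_CRIT * se_indiv),
        "residual": yi - pred,
    })

for r in rows:
    print(f"x={r['x']}  pred={r['predicted']:.2f}  "
          f"CL mean=({r['cl_mean'][0]:.2f}, {r['cl_mean'][1]:.2f})  "
          f"CL predict=({r['cl_predict'][0]:.2f}, {r['cl_predict'][1]:.2f})")
```

Both intervals are narrowest at X = 3, the mean of X, and widen toward the ends of the data, which is the pattern behind the hyperbolic bands.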

Hyperbolic Interval Bands

Correlation Models

Types of Probabilistic Models

Correlation vs. regression
- Both variables are treated the same in correlation; in regression there is a predictor and a response
- In regression the x variable is assumed non-random or measured without error
- Correlation is used in looking for relationships, regression for prediction

Correlation Models
1. Answer 'How Strong Is the Linear Relationship Between 2 Variables?'
2. Coefficient of Correlation Used
   - Population Correlation Coefficient Denoted $\rho$ (Rho)
   - Values Range from -1 to +1
   - Measures Degree of Association
3. Used Mainly for Understanding

Sample Coefficient of Correlation
1. Pearson Product Moment Coefficient of Correlation between x and y
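
The formula for $r$ is missing from the slide text. A sketch of the standard Pearson definition follows; the $SS$ symbols are notation introduced here as an assumption, chosen to match the $SS_{yy}$ used for the total sum of squares earlier in the deck.

```latex
\[
r = \frac{SS_{xy}}{\sqrt{SS_{xx}\, SS_{yy}}},
\qquad
SS_{xy} = \sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y}),
\quad
SS_{xx} = \sum_{i=1}^{n} (X_i - \bar{X})^2,
\quad
SS_{yy} = \sum_{i=1}^{n} (Y_i - \bar{Y})^2
\]
```

In simple linear regression the square of this $r$ equals the coefficient of determination, and $r$ has the same sign as the fitted slope.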

Coefficient of Correlation Values
[Figure: scale of $r$ running from -1 to +1]
- $r = -1$: Perfect Negative Correlation
- $r = 0$: No Correlation
- $r = +1$: Perfect Positive Correlation
- From 0 toward -1: increasing degree of negative correlation
- From 0 toward +1: increasing degree of positive correlation

Coefficient of Correlation Examples
[Figure: scatterplots illustrating $r = 1$, $r = -1$, $r = .89$, and $r = 0$]

Test of Coefficient of Correlation
1. Shows If There Is a Linear Relationship Between 2 Numerical Variables
2. Same Conclusion as Testing Population Slope $\beta_1$
3. Hypotheses
   - $H_0$: $\rho = 0$ (No Correlation)
   - $H_a$: $\rho \ne 0$ (Correlation)

Sample t-Test on Correlation Coefficient
Hypotheses
- $H_0$: $\rho = 0$ (No Correlation)
- $H_a$: $\rho \ne 0$ (Correlation)
Test statistic, under $H_0$:
   $t = \dfrac{r\sqrt{n-2}}{\sqrt{1-r^2}} \sim t_{(n-2)}$
Reject $H_0$ if $|t| > t_{1-\alpha/2,\,n-2}$
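
To illustrate the statistic, and the earlier point that this test reaches the same conclusion as the slope test, here is a short Python sketch. It assumes the five data pairs from the SAS data step in this deck and a t-table critical value of 3.182 (two-sided, $\alpha = .05$, df = 3).

```python
import math

x = [1, 2, 3, 4, 5]   # estriol, assumed from the SAS data step
y = [1, 1, 2, 2, 4]   # birth weight, assumed from the SAS data step
T_CRIT = 3.182        # t table value: two-sided alpha = .05, df = 3

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_yy = sum((yi - y_bar) ** 2 for yi in y)
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Pearson product-moment correlation coefficient.
r = ss_xy / math.sqrt(ss_xx * ss_yy)

# t statistic for H0: rho = 0, with n - 2 degrees of freedom.
t_stat = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

reject = abs(t_stat) > T_CRIT
print(f"r = {r:.4f}, t = {t_stat:.2f}, reject H0: {reject}")
```

For these data the statistic is about 3.66, the same value obtained from dividing the estimated slope by its standard error.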

Sample Z-Test on Correlation Coefficient (Fisher)
Hypotheses
- $H_0$: $\rho = 0$
- $H_a$: $\rho \ne 0$
Test statistic: under $H_0$
Reject $H_0$ if $|z| > z_{1-\alpha/2}$
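
The test statistic is missing from the slide text. A common form of Fisher's test uses the transformation $z(r) = \tfrac{1}{2}\ln\frac{1+r}{1-r}$ and the statistic $\left(z(r) - z(\rho_0)\right)\sqrt{n-3}$, which is approximately standard normal under $H_0$: $\rho = \rho_0$. The sketch below assumes that form; the function name `fisher_z_test` and its `rho0` parameter are hypothetical, with `rho0 = 0` matching the hypotheses as transcribed.

```python
import math
from statistics import NormalDist


def fisher_z_test(r, n, rho0=0.0, alpha=0.05):
    """Two-sided Fisher z-test of H0: rho = rho0 (requires n > 3)."""
    # Fisher transformation: atanh(r) = 0.5 * ln((1 + r) / (1 - r)).
    z_r = math.atanh(r)
    z_0 = math.atanh(rho0)
    # Approximately N(0, 1) under H0.
    stat = (z_r - z_0) * math.sqrt(n - 3)
    p_value = 2 * (1 - NormalDist().cdf(abs(stat)))
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return stat, p_value, abs(stat) > z_crit


# Illustration with the sample correlation of the Obstetrics data
# (r = 7 / sqrt(60), n = 5, assumed from the SAS data step).
stat, p_value, reject = fisher_z_test(7 / math.sqrt(60), n=5)
print(f"z = {stat:.2f}, p = {p_value:.3f}, reject H0: {reject}")
```

The normal approximation is intended for larger samples, so with $n = 5$ the result is only illustrative.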

Conclusion
1. Describe the Linear Regression Model
2. State the Regression Modeling Steps
3. Explain Ordinary Least Squares
4. Compute Regression Coefficients
5. Understand and check model assumptions
6. Predict Response Variable
7. Comments on SAS Output
8. Correlation Models
9. Test of Coefficient of Correlation