Published by Marylou Ellis. Modified over 9 years ago.
1
EPI 809/Spring 2008: Probability Distribution of Random Error
2
Regression Modeling Steps
1. Hypothesize Deterministic Component
2. Estimate Unknown Model Parameters
3. Specify Probability Distribution of Random Error Term
   - Estimate Standard Deviation of Error
4. Evaluate Model
5. Use Model for Prediction & Estimation
3
Linear Regression Assumptions
Assumptions of errors (Gauss-Markov condition):
1. Independent errors
2. Mean of probability distribution of errors is 0
3. Errors have constant variance $\sigma^2$, for which an estimator is $S^2$
4. Probability distribution of error is normal
5. Potential violation of G-M condition
4
Error Probability Distribution
5-8
Random Error Variation
1. Variation of Actual Y from Predicted Y
2. Measured by Standard Error of Regression Model
   - Sample Standard Deviation of $\hat{\varepsilon}$, s
3. Affects Several Factors
   - Parameter Significance
   - Prediction Accuracy
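The standard error of the regression model named above can be written out as follows. This is a sketch for simple linear regression (one predictor), which is an assumption consistent with the $n - 2$ degrees of freedom used later in the deck; the symbols $\hat{y}_i$ and SSE follow the SSE solution table.

```latex
s = \sqrt{\frac{\mathrm{SSE}}{n - 2}}
  = \sqrt{\frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n - 2}}
```

A larger $s$ means the observed Y values scatter more widely around the fitted line, which in turn widens the standard errors of the parameter estimates and of predictions.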
9
Evaluating the Model: Testing for Significance
10
Regression Modeling Steps
1. Hypothesize Deterministic Component
2. Estimate Unknown Model Parameters
3. Specify Probability Distribution of Random Error Term
   - Estimate Standard Deviation of Error
4. Evaluate Model
5. Use Model for Prediction & Estimation
11
Test of Slope Coefficient
1. Shows If There Is a Linear Relationship Between X & Y
2. Involves Population Slope $\beta_1$
3. Hypotheses
   - $H_0: \beta_1 = 0$ (No Linear Relationship)
   - $H_a: \beta_1 \neq 0$ (Linear Relationship)
4. Theoretical basis of the test statistic is the sampling distribution of slope
12-15
Sampling Distribution of Sample Slopes
All Possible Sample Slopes
   - Sample 1: 2.5
   - Sample 2: 1.6
   - Sample 3: 1.8
   - Sample 4: 2.1
   - ... (very large number of sample slopes)
[Figure: sampling distribution of the sample slopes $\hat{\beta}_1$, with $\beta_1$ marked on the axis and the spread labeled $S_{\hat{\beta}_1}$]
16
Slope Coefficient Test Statistic
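The formula on this slide did not survive extraction. A plausible reconstruction, assuming the usual simple linear regression setup, is shown below; it agrees with the annotation $t = \hat{\beta}_k / S_{\hat{\beta}_k}$ on the computer-output slide and with the $t_{(n-2)}$ reference distribution on the rejection-rule slide. The notation $SS_{xx} = \sum (x_i - \bar{x})^2$ is an assumption, not taken from the source.

```latex
t = \frac{\hat{\beta}_1 - 0}{S_{\hat{\beta}_1}} \sim t_{(n-2)}
\quad \text{under } H_0: \beta_1 = 0,
\qquad
S_{\hat{\beta}_1} = \frac{s}{\sqrt{SS_{xx}}}
```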
17
Test of Slope Coefficient Rejection Rule
   - Reject $H_0$ in favor of $H_a$ if t falls in the colored area
   - Reject $H_0$ for $H_a$ if P-value = $P(|T| > |t|) < \alpha$
[Figure: density of $T \sim t_{(n-2)}$ centered at 0, with rejection regions of area $\alpha/2$ below $-t_{1-\alpha/2,\,(n-2)}$ and above $t_{1-\alpha/2,\,(n-2)}$]
18
Test of Slope Coefficient Example
Reconsider the Obstetrics example with the following data:
Estriol (mg/24h)   B.w. (g/1000)
1                  1
2                  1
3                  2
4                  2
5                  4
Is the linear relationship between Estriol & Birthweight significant at the .05 level?
19
Solution Table for β's
20
Solution Table for SSE
Birth weight (y)   Estriol (x)   Predicted ($\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$)   (Obs-pred)$^2$ = $(y - \hat{y})^2$
1                  1             0.6                                                      0.16
1                  2             1.3                                                      0.09
2                  3             2                                                        0
2                  4             2.7                                                      0.49
4                  5             3.4                                                      0.36
Total: 10          15            -                                                        SSE = 1.1
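The solution tables can be reproduced with a few lines of arithmetic. The sketch below assumes ordinary least squares for a single predictor and uses only the five data points from the Obstetrics example; the variable names are illustrative and do not come from the slides.

```python
# Estriol (x) and birthweight (y) values from the Obstetrics example.
x = [1, 2, 3, 4, 5]
y = [1, 1, 2, 2, 4]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# Least-squares slope and intercept (simple linear regression).
ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = ss_xy / ss_xx
b0 = y_bar - b1 * x_bar

# Fitted values, squared residuals, and their sum (SSE).
fitted = [b0 + b1 * xi for xi in x]
sq_resid = [(yi - fi) ** 2 for yi, fi in zip(y, fitted)]
sse = sum(sq_resid)

for xi, yi, fi, ri in zip(x, y, fitted, sq_resid):
    print(f"x={xi}  y={yi}  predicted={fi:.1f}  (obs-pred)^2={ri:.2f}")
print(f"b0={b0:.2f}  b1={b1:.2f}  SSE={sse:.2f}")
```

The estimates should agree with the intercept (-0.1) and slope (0.7) in the computer output and with SSE = 1.1 in the table.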
21
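Test of Slope Parameter Solution
   - $H_0: \beta_1 = 0$
   - $H_a: \beta_1 \neq 0$
   - $\alpha = .05$
   - df = 5 - 2 = 3
   - Critical Value(s): $\pm t_{.975,\,3} = \pm 3.182$
   - Test Statistic: $t = \hat{\beta}_1 / S_{\hat{\beta}_1} = 0.70 / 0.19149 = 3.66$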
22
Test Statistic Solution
[Worked equation not recovered; annotation: "From Table"]
23
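Test of Slope Parameter
   - $H_0: \beta_1 = 0$
   - $H_a: \beta_1 \neq 0$
   - $\alpha = .05$
   - df = 5 - 2 = 3
   - Critical Value(s): $\pm t_{.975,\,3} = \pm 3.182$
   - Test Statistic: $t = \hat{\beta}_1 / S_{\hat{\beta}_1} = 0.70 / 0.19149 = 3.66$
   - Decision: Reject at $\alpha = .05$
   - Conclusion: There is evidence of a linear relationship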
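The decision above can be checked numerically. The following is a minimal sketch, assuming the standard error of the slope is $s / \sqrt{SS_{xx}}$ and hard-coding the two-sided critical value 3.182 from a t table (3 df, $\alpha$ = .05) rather than computing it; all names are illustrative.

```python
import math

# Obstetrics example data: estriol (x) and birthweight (y).
x = [1, 2, 3, 4, 5]
y = [1, 1, 2, 2, 4]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = ss_xy / ss_xx
b0 = y_bar - b1 * x_bar

# Standard error of the regression (n - 2 degrees of freedom).
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s = math.sqrt(sse / (n - 2))

# Standard error of the slope and the test statistic for H0: beta1 = 0.
se_b1 = s / math.sqrt(ss_xx)
t_stat = b1 / se_b1

# Assumption: critical value copied from a t table, not computed here.
T_CRIT = 3.182
reject_h0 = abs(t_stat) > T_CRIT

print(f"S_b1={se_b1:.5f}  t={t_stat:.2f}  reject H0: {reject_h0}")
```

Hard-coding the table value keeps the sketch free of third-party dependencies; a statistics library would normally supply both the critical value and the P-value.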
24
Test of Slope Parameter Computer Output
Parameter Estimates
Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
Intercept   1    -0.10000             0.63509          -0.16     0.8849
Estriol     1    0.70000              0.19149          3.66      0.0354
Annotations: Parameter Estimate is $\hat{\beta}_k$; Standard Error is $S_{\hat{\beta}_k}$; t Value is $t = \hat{\beta}_k / S_{\hat{\beta}_k}$; Pr > |t| is the P-value.
25
Measures of Variation in Regression
1. Total Sum of Squares ($SS_{yy}$)
   - Measures Variation of Observed $Y_i$ Around the Mean $\bar{Y}$
2. Explained Variation (SSR)
   - Variation Due to Relationship Between X & Y
3. Unexplained Variation (SSE)
   - Variation Due to Other Factors
26
Variation Measures
[Figure: an observed point $Y_i$ with its deviations from the fitted line and from the mean]
   - Total sum of squares: $(Y_i - \bar{Y})^2$
   - Unexplained sum of squares: $(Y_i - \hat{Y}_i)^2$
   - Explained sum of squares: $(\hat{Y}_i - \bar{Y})^2$
27
Coefficient of Determination
1. Proportion of Variation ‘Explained’ by Relationship Between X & Y
$0 \le r^2 \le 1$
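The defining formula on this slide was lost in extraction. A reconstruction in the deck's own notation ($SS_{yy}$ for total variation, SSE for unexplained variation) is sketched below; the numerical line uses SSE = 1.1 from the solution table and $SS_{yy} = 6$, which is computed here from the five birthweight values rather than quoted from the slides.

```latex
r^2 = \frac{\text{Explained variation}}{\text{Total variation}}
    = \frac{SS_{yy} - \mathrm{SSE}}{SS_{yy}}
    = 1 - \frac{\mathrm{SSE}}{SS_{yy}}
\qquad\text{e.g.}\qquad
r^2 = 1 - \frac{1.1}{6} \approx 0.8167
```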
28
Coefficient of Determination Examples
[Figure: scatterplots with $r^2 = 1$, $r^2 = .8$, and $r^2 = 0$]
29
Coefficient of Determination Example
Reconsider the Obstetrics example. Interpret a coefficient of determination of 0.8167.
Answer: About 82% of the total variation of birthweight is explained by the mother’s Estriol level.
30
$r^2$ Computer Output
Root MSE         0.60553    R-Square   0.8167
Dependent Mean   2.00000    Adj R-Sq   0.7556
Coeff Var        30.27650
Annotations: Root MSE is S; R-Square is $r^2$; Adj R-Sq is $r^2$ adjusted for number of explanatory variables & sample size.
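Each figure in this output follows from SSE and the birthweight values. The sketch below assumes the usual definitions (adjusted $r^2$ as one minus the ratio of mean squares, and the coefficient of variation as 100 times Root MSE over the dependent mean); these definitions are stated as assumptions because the slide does not spell them out.

```python
import math

y = [1, 1, 2, 2, 4]   # birthweight values from the Obstetrics example
sse = 1.1             # unexplained sum of squares, from the SSE solution table
n = len(y)
k = 1                 # number of explanatory variables (estriol only)

y_bar = sum(y) / n
ss_yy = sum((yi - y_bar) ** 2 for yi in y)   # total sum of squares

r2 = 1 - sse / ss_yy
# Assumed definition: compare mean squares instead of raw sums of squares.
adj_r2 = 1 - (sse / (n - k - 1)) / (ss_yy / (n - 1))
root_mse = math.sqrt(sse / (n - k - 1))
# Assumed definition: Root MSE as a percentage of the dependent mean.
coeff_var = 100 * root_mse / y_bar

print(f"Root MSE={root_mse:.5f}  R-Square={r2:.4f}")
print(f"Dependent Mean={y_bar:.5f}  Adj R-Sq={adj_r2:.4f}")
print(f"Coeff Var={coeff_var:.5f}")
```

With one predictor and only five observations, the adjustment lowers $r^2$ noticeably, which is why the two values in the output differ.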
31
Using the Model for Prediction & Estimation
32
Regression Modeling Steps
1. Hypothesize Deterministic Component
2. Estimate Unknown Model Parameters
3. Specify Probability Distribution of Random Error Term
   - Estimate Standard Deviation of Error
4. Evaluate Model
5. Use Model for Prediction & Estimation
33
Prediction With Regression Models
What Is Predicted?
   - Population Mean Response E(Y) for Given X
      - Point on Population Regression Line
   - Individual Response ($Y_i$) for Given X
34
What Is Predicted?
35
Confidence Interval Estimate of Mean Y
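The interval formula on this slide was an image and is not in the transcript. The standard form for simple linear regression is sketched below as a hedged reconstruction; $x_p$ denotes the X value at which the mean is estimated, $S_{\hat{Y}}$ matches the "Std Error Mean Predict" column of the SAS output, and $SS_{xx} = \sum (x_i - \bar{x})^2$ is assumed notation.

```latex
\hat{y} \;\pm\; t_{1-\alpha/2,\,(n-2)} \, S_{\hat{Y}},
\qquad
S_{\hat{Y}} = s \sqrt{\frac{1}{n} + \frac{(x_p - \bar{x})^2}{SS_{xx}}}
```

Each term under the square root corresponds to a width factor: $1/n$ shrinks with sample size, and $(x_p - \bar{x})^2$ grows with distance from the mean of X.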
36
Factors Affecting Interval Width
1. Level of Confidence $(1 - \alpha)$
   - Width Increases as Confidence Increases
2. Data Dispersion (s)
   - Width Increases as Variation Increases
3. Sample Size
   - Width Decreases as Sample Size Increases
4. Distance of $X_p$ from Mean $\bar{X}$
   - Width Increases as Distance Increases
37
Why Distance from Mean?
[Figure: fitted lines plotted against X with $\bar{X}$ marked on the axis; annotation: "Greater dispersion than $X_1$"]
38
Confidence Interval Estimate Example
Reconsider the Obstetrics example with the following data:
Estriol (mg/24h)   B.w. (g/1000)
1                  1
2                  1
3                  2
4                  2
5                  4
Estimate the mean BW and a subject’s BW response when the Estriol level is 4 at the .05 level.
39
Solution Table
40
Confidence Interval Estimate Solution - Mean BW
[Worked equation not recovered; annotation: "X to be predicted"]
41
Prediction Interval of Individual Response
[Equation not recovered; annotation: "Note!"]
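The prediction-interval formula is missing from the transcript. The usual form for simple linear regression is sketched here as an assumption; it differs from the confidence interval for the mean only by the leading 1 under the square root, which is presumably what the "Note!" callout highlights. The symbols $x_p$ and $SS_{xx}$ are assumed notation.

```latex
\hat{y} \;\pm\; t_{1-\alpha/2,\,(n-2)} \, s
\sqrt{1 + \frac{1}{n} + \frac{(x_p - \bar{x})^2}{SS_{xx}}}
```

The extra 1 accounts for the variance $\sigma^2$ of an individual response around its mean, so a prediction interval is always wider than the confidence interval for the mean at the same $x_p$.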
42
Why the Extra ‘S’?
43
SAS code for computing mean and prediction intervals
data BW;                      /* Reading data in SAS */
    input estriol birthw;
    cards;
1 1
2 1
3 2
4 2
5 4
;
run;

proc reg data=BW;             /* Fitting a linear regression model */
    /* CLM: confidence limits for the mean; CLI: prediction limits for an individual */
    model birthw = estriol / cli clm alpha=.05;
run;
44
Interval Estimate from SAS Output
The REG Procedure
Dependent Variable: y
Output Statistics
Obs   Dep Var y   Predicted Value   Std Error Mean Predict   95% CL Mean          95% CL Predict       Residual
1     1.0000      0.6000            0.4690                   -0.8927   2.0927     -1.8376   3.0376     0.4000
2     1.0000      1.3000            0.3317                    0.2445   2.3555     -0.8972   3.4972     -0.3000
3     2.0000      2.0000            0.2708                    1.1382   2.8618     -0.1110   4.1110     0
4     2.0000      2.7000            0.3317                    1.6445   3.7555      0.5028   4.8972     -0.7000
5     4.0000      3.4000            0.4690                    1.9073   4.8927      0.9624   5.8376     0.6000
Annotations: Predicted Value is $\hat{Y}$ (callout: "Predicted Y when X = 3"); Std Error Mean Predict is $S_{\hat{Y}}$; 95% CL Mean is the Confidence Interval; 95% CL Predict is the Prediction Interval.
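The two intervals for an estriol level of 4 (observation 4 in the output) can be recomputed by hand. This sketch assumes the standard simple-regression interval formulas and hard-codes the t quantile for 3 df from a table; `intervals` is a hypothetical helper, not something from the slides or from SAS.

```python
import math

# Obstetrics example data: estriol (x) and birthweight (y).
x = [1, 2, 3, 4, 5]
y = [1, 1, 2, 2, 4]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = ss_xy / ss_xx
b0 = y_bar - b1 * x_bar
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s = math.sqrt(sse / (n - 2))

# Assumption: 0.975 quantile of the t distribution with 3 df, from a table.
T_CRIT = 3.182446


def intervals(xp):
    """Return (y_hat, CI for the mean response, PI for an individual) at xp."""
    y_hat = b0 + b1 * xp
    leverage = 1 / n + (xp - x_bar) ** 2 / ss_xx
    half_mean = T_CRIT * s * math.sqrt(leverage)       # mean response
    half_pred = T_CRIT * s * math.sqrt(1 + leverage)   # individual response
    return (y_hat,
            (y_hat - half_mean, y_hat + half_mean),
            (y_hat - half_pred, y_hat + half_pred))


y_hat, ci_mean, pi_indiv = intervals(4)
print(f"predicted={y_hat:.4f}")
print(f"95% CL Mean:    {ci_mean[0]:.4f} to {ci_mean[1]:.4f}")
print(f"95% CL Predict: {pi_indiv[0]:.4f} to {pi_indiv[1]:.4f}")
```

The results should agree with the row for observation 4 to about four decimal places; calling `intervals` at other X values shows how both intervals widen away from the mean estriol level of 3.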
45
Hyperbolic Interval Bands
46
Correlation Models
47
Types of Probabilistic Models
48
Correlation vs. regression
   - Both variables are treated the same in correlation; in regression there is a predictor and a response
   - In regression the x variable is assumed non-random or measured without error
   - Correlation is used in looking for relationships, regression for prediction
49
Correlation Models
1. Answer ‘How Strong Is the Linear Relationship Between 2 Variables?’
2. Coefficient of Correlation Used
   - Population Correlation Coefficient Denoted $\rho$ (Rho)
   - Values Range from -1 to +1
   - Measures Degree of Association
3. Used Mainly for Understanding
50
Sample Coefficient of Correlation
1. Pearson Product Moment Coefficient of Correlation between x and y:
[Equation not recovered]
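The formula itself is absent from the transcript. The standard definition of the Pearson product moment coefficient is sketched below as a reconstruction; the sum-of-squares notation $SS_{xy}$, $SS_{xx}$ is assumed, chosen to match the $SS_{yy}$ used on the variation-measures slide.

```latex
r = \frac{SS_{xy}}{\sqrt{SS_{xx}\, SS_{yy}}}
  = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \; \sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

In simple linear regression, $r$ has the same sign as the fitted slope and its square equals the coefficient of determination.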
51-56
Coefficient of Correlation Values
[Figure, built up over six slides: a scale marked -1.0, -.5, 0, +.5, +1.0]
   - -1.0: Perfect Negative Correlation
   - 0: No Correlation
   - +1.0: Perfect Positive Correlation
   - Increasing degree of negative correlation from 0 toward -1.0
   - Increasing degree of positive correlation from 0 toward +1.0
57
Coefficient of Correlation Examples
[Figure: scatterplots with r = 1, r = -1, r = .89, and r = 0]
58
Test of Coefficient of Correlation
1. Shows If There Is a Linear Relationship Between 2 Numerical Variables
2. Same Conclusion as Testing Population Slope $\beta_1$
3. Hypotheses
   - $H_0: \rho = 0$ (No Correlation)
   - $H_a: \rho \neq 0$ (Correlation)
59
1 Sample t-Test on Correlation Coefficient
Hypotheses
   - $H_0: \rho = 0$ (No Correlation)
   - $H_a: \rho \neq 0$ (Correlation)
Test statistic: under $H_0$, $t = \dfrac{r\,(n-2)^{1/2}}{(1-r^2)^{1/2}} \sim t_{(n-2)}$
Reject $H_0$ if $|t| > t_{1-\alpha/2,\,n-2}$
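Applying this statistic to the Obstetrics data illustrates the earlier remark that the correlation test reaches the same conclusion as the slope test. The sketch below computes r with the standard Pearson formula (an assumption, since the slide's formula was not recovered) and compares the result with the t Value of 3.66 reported for Estriol in the computer output.

```python
import math

# Obstetrics example data: estriol (x) and birthweight (y).
x = [1, 2, 3, 4, 5]
y = [1, 1, 2, 2, 4]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_yy = sum((yi - y_bar) ** 2 for yi in y)
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Pearson sample correlation coefficient.
r = ss_xy / math.sqrt(ss_xx * ss_yy)

# t statistic for H0: rho = 0, with n - 2 degrees of freedom.
t_stat = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

print(f"r={r:.4f}  r^2={r ** 2:.4f}  t={t_stat:.2f}")
```

Because the two statistics are algebraically identical in simple linear regression, the same critical value and the same decision apply to both tests.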
60
1 Sample Z-Test on Correlation Coefficient (Fisher)
Hypotheses
   - $H_0: \rho = 0$
   - $H_a: \rho \neq 0$
Test statistic: under $H_0$: [equation not recovered]
Reject $H_0$ if $|z| > z_{1-\alpha/2}$
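The test statistic on this slide was not recovered. One common form of Fisher's z-test is sketched below as an assumption, not as the slide's exact formula: transform r with $z = \tfrac{1}{2}\ln\frac{1+r}{1-r}$, subtract the transformed null value, and scale by $\sqrt{n-3}$. The function name, the `rho0` parameter, and the input r = 0.9037 (the positive square root of the reported R-Square) are illustrative.

```python
import math


def fisher_z_stat(r, n, rho0=0.0):
    """Approximate standard-normal statistic for H0: rho = rho0.

    Assumed form: Fisher's transformation z = atanh(r), which has
    approximate variance 1 / (n - 3) for bivariate normal data.
    """
    z_r = math.atanh(r)        # equals 0.5 * ln((1 + r) / (1 - r))
    z_0 = math.atanh(rho0)
    return (z_r - z_0) * math.sqrt(n - 3)


# Assumption: 0.975 quantile of the standard normal, from a table.
Z_CRIT = 1.96

# Illustrative input: r from the Obstetrics example, n = 5 observations.
z_stat = fisher_z_stat(0.9037, 5)
reject_h0 = abs(z_stat) > Z_CRIT

print(f"z={z_stat:.2f}  reject H0: {reject_h0}")
```

The normal approximation behind this statistic is rough for a sample as small as five, so the example only illustrates the mechanics; the t-test is the more usual choice for testing $\rho = 0$.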
61
Conclusion
1. Describe the Linear Regression Model
2. State the Regression Modeling Steps
3. Explain Ordinary Least Squares
4. Compute Regression Coefficients
5. Understand and Check Model Assumptions
6. Predict Response Variable
7. Comment on SAS Output
62
Conclusion …
8. Correlation Models
9. Test of Coefficient of Correlation