Simultaneous inference
Estimating (or testing) more than one thing at a time (such as β0 and β1) and feeling confident about it …
Simultaneous inference we’ll be concerned about …
– Estimating β0 and β1 jointly.
– Estimating more than one mean response, E(Y), at a time.
– Predicting more than one new observation at a time.
Why simultaneous inference is important
– A 95% confidence interval implies a 95% chance that the interval contains β0.
– A 95% confidence interval implies a 95% chance that the interval contains β1.
– If the two intervals were independent, there would be only a (0.95 × 0.95) × 100 = 90.25% chance that both intervals are correct.
– (The intervals are not actually independent, but the point stands.)
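The arithmetic on this slide can be sketched in a few lines of Python. The function name is made up for illustration, and the calculation relies on the same independence assumption the slide flags as not strictly true.

```python
def family_level_if_independent(statement_level: float, g: int) -> float:
    """Chance that all g intervals are correct, ASSUMING they are independent."""
    return statement_level ** g

# Two 95% intervals give the 90.25% from the slide; more intervals do worse.
for g in (1, 2, 3, 5, 10):
    level = family_level_if_independent(0.95, g)
    print(f"g = {g:2d}: family level = {level:.4f}")
```

The family level drops quickly as g grows, which is why the statement levels need to be adjusted.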
Terminology
– Family of estimates (or tests): a set of estimates (or tests) that you want to be simultaneously correct.
– Statement confidence level: the confidence level as you already know it, that is, for just one parameter.
– Family confidence level: the confidence level of the whole family of interval estimates (or tests).
Examples
– A 95% confidence interval for β0: the 95% is a statement confidence level.
– A 95% confidence interval for β1: the 95% is a statement confidence level.
– Consider the family of interval estimates for β0 and β1. If there is a 90.25% chance that both intervals are simultaneously correct, then 90.25% is the family confidence level.
Bonferroni joint confidence intervals for β0 and β1
GOAL: To formulate joint confidence intervals for β0 and β1 with a specified family confidence level.
BASIC IDEA:
– Make the statement confidence level for β0 higher.
– Make the statement confidence level for β1 higher.
– So that the family confidence level for (β0, β1) is at least (1 − α) × 100%.
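Recall: Original confidence intervals
– For β0: b0 ± t(1 − α/2, n − 2) × s(b0)
– For β1: b1 ± t(1 − α/2, n − 2) × s(b1)
Here b0 and b1 are the least squares estimates and s(b0), s(b1) are their estimated standard errors.
The goal is to adjust the t-multiples so that the family confidence coefficient is at least 1 − α. That is, we need to find the α* to put in place of α in the formulas above to achieve the desired family confidence coefficient of 1 − α.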
A little derivation
– Let A1 = the event that the first confidence interval does not contain β0 (i.e., is incorrect).
– So A1^C = the event that the first confidence interval contains β0 (i.e., is correct).
– P(A1) = α and P(A1^C) = 1 − α
A little derivation (cont’d)
– Let A2 = the event that the second confidence interval does not contain β1 (i.e., is incorrect).
– So A2^C = the event that the second confidence interval contains β1 (i.e., is correct).
– P(A2) = α and P(A2^C) = 1 − α
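Becoming a not so little derivation…
[Venn diagram: events A1 and A2; their union is “A1 or A2”, and the region outside both is “A1^C and A2^C”.]
We want P(A1^C and A2^C) to be at least 1 − α.
P(A1^C and A2^C) = 1 − P(A1 or A2)
= 1 − [P(A1) + P(A2) − P(A1 and A2)]
= 1 − P(A1) − P(A2) + P(A1 and A2)
≥ 1 − P(A1) − P(A2)
= 1 − α − α
= 1 − 2α
So if each interval uses statement level 1 − α, the family level is only guaranteed to be at least 1 − 2α. To guarantee a family level of at least 1 − α, we need α* to be set to α/2.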
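The bound above can be checked empirically with a small simulation. This is only a sketch under stated assumptions: the true parameter values, the error standard deviation, and the x values are all hypothetical, and the multiplier 2.201 is the standard t-table value t(0.975, 11), which corresponds to α* = α/2 with α = 0.10 and n = 13.

```python
import math
import random

T_0975_11 = 2.201  # t(0.975, 11) from a standard t table (alpha* = 0.05 per interval)

def fit_line(x, y):
    """Least squares fit; returns b0, b1 and their estimated standard errors."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    b0 = ybar - b1 * xbar
    mse = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y)) / (n - 2)
    return b0, b1, math.sqrt(mse * (1 / n + xbar ** 2 / sxx)), math.sqrt(mse / sxx)

def joint_coverage(n_sims=4000, seed=1):
    """Fraction of simulated data sets in which BOTH intervals are correct."""
    rng = random.Random(seed)
    beta0, beta1, sigma = 15.0, 0.9, 16.0   # hypothetical true values
    x = [110 + 5 * i for i in range(13)]    # hypothetical x values, n = 13
    hits = 0
    for _ in range(n_sims):
        y = [beta0 + beta1 * xi + rng.gauss(0, sigma) for xi in x]
        b0, b1, s_b0, s_b1 = fit_line(x, y)
        ok0 = abs(b0 - beta0) <= T_0975_11 * s_b0
        ok1 = abs(b1 - beta1) <= T_0975_11 * s_b1
        hits += ok0 and ok1
    return hits / n_sims

print(f"Estimated family confidence level: {joint_coverage():.3f}")
```

The estimated family level should come out at or above the guaranteed 90%, since the Bonferroni inequality drops the nonnegative term P(A1 and A2).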
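Bonferroni joint confidence intervals
– For β0: b0 ± B × s(b0)
– For β1: b1 ± B × s(b1)
where B = t(1 − α/4, n − 2), because each interval now uses α* = α/2.
Typically, the t-multiple in this setting is called the Bonferroni multiple and is denoted by the letter B.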
Example: 90% family confidence interval
The regression equation is punt = … + … leg
Predictor  Coef  SE Coef  T  P
Constant   …     …        …  …
leg        …     …        …  …
n = 13 punters and α = 0.10, so 1 − α/4 = 0.975 and B = t(0.975, 11) = 2.201
We are at least 90% confident that β0 is between … and 83.9 and that β1 is between 0.44 and 1.36.
A couple more points about Bonferroni intervals
– Bonferroni intervals are most useful when there are only a few interval estimates in the family (otherwise, the intervals get too wide).
– Different statement confidence levels can be specified for different intervals, as long as they combine to give the desired family confidence level.
– The Bonferroni technique easily extends to g interval estimates: set each statement confidence level at 1 − (α/g), so the t-value to look up is t(1 − α/(2g), n − 2).
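The bookkeeping for g intervals is easy to get wrong by a factor of 2, so here is a minimal sketch of it. The function name is hypothetical; it only converts a family level and g into the statement level and the t-table percentile, and leaves the table lookup itself to the reader.

```python
def bonferroni_levels(family_level: float, g: int) -> tuple[float, float]:
    """Return (statement confidence level, t percentile to look up) for g intervals."""
    alpha = 1 - family_level
    statement_level = 1 - alpha / g      # each interval's own confidence level
    t_percentile = 1 - alpha / (2 * g)   # two-sided interval: split alpha/g in half
    return statement_level, t_percentile

for family, g in [(0.90, 2), (0.94, 3), (0.95, 5)]:
    statement, pct = bonferroni_levels(family, g)
    print(f"family {family:.2f}, g = {g}: statement level {statement:.3f}, look up t({pct:.3f}, n - 2)")
```

For a 90% family level with g = 2, this gives a 95% statement level and the 0.975 percentile; for a 94% family level with g = 3, it gives a 98% statement level and the 0.99 percentile.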
Bonferroni intervals for more than one mean response at a time
To estimate the mean response E(Yh) for g different Xh values with family confidence coefficient 1 − α:
Ŷh ± B × s(Ŷh)
where:
– B = t(1 − α/(2g), n − 2)
– s(Ŷh) is the estimated standard error of the fitted value at Xh
– g is the number of confidence intervals in the family
Example: Mean punting distance for leg strengths of 140, 150, 160 lbs.
Predicted Values for New Observations
New Obs  Fit  SE Fit  95.0% CI            95.0% PI
1        …    …       (130.55, 152.01)    (103.23, 179.33)
2        …    …       (140.13, 160.49)    (112.41, 188.20)
3        …    …       (147.72, 170.95)    (121.03, 197.64)
n = 13 punters, g = 3, B = t(0.99, 11) = 2.718
We are at least 94% confident that the mean responses for leg strengths of 140, 150, 160 pounds are …
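One way to obtain Bonferroni intervals from output like this is to back out the fit and its standard error from the printed 95% intervals and then rescale. The sketch below does that under several assumptions: the three rows correspond to leg strengths 140, 150, and 160 in that order, the multipliers 2.201 and 2.718 are standard t-table values for 11 degrees of freedom, and the results inherit the rounding of the printed limits. The function name is made up for illustration.

```python
T_0975_11 = 2.201  # t(0.975, 11): multiplier behind the printed 95% intervals
T_0990_11 = 2.718  # t(0.99, 11): Bonferroni multiple B for g = 3, 94% family level

# Printed 95% confidence intervals for the mean response (from the output above).
printed_95_ci = {
    140: (130.55, 152.01),
    150: (140.13, 160.49),
    160: (147.72, 170.95),
}

def rescale_interval(interval, t_printed, new_multiple):
    """Recover fit and SE from a symmetric interval, then apply a new multiple."""
    lower, upper = interval
    fit = (lower + upper) / 2                 # midpoint of the printed interval
    se = (upper - lower) / 2 / t_printed      # half-width divided by its multiplier
    half_width = new_multiple * se
    return fit - half_width, fit + half_width

bonferroni_ci = {
    leg: rescale_interval(ci, T_0975_11, T_0990_11)
    for leg, ci in printed_95_ci.items()
}
for leg, (lo, hi) in bonferroni_ci.items():
    print(f"leg strength {leg} lb: ({lo:.2f}, {hi:.2f})")
```

Each Bonferroni interval is wider than the corresponding printed 95% interval, which is the price of the family-level guarantee.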
Two procedures for predicting g new observations simultaneously
– Bonferroni procedure
– Scheffé procedure
Use the procedure that gives the narrower prediction limits.
Bonferroni intervals for predicting more than one new obs’n at a time
To predict g new observations Yh for g different Xh values with family confidence coefficient 1 − α:
Ŷh ± B × s(pred)
where:
– B = t(1 − α/(2g), n − 2)
– s(pred) is the estimated standard error of the prediction at Xh
– g is the number of prediction intervals in the family
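Scheffé intervals for predicting more than one new obs’n at a time
To predict g new observations Yh for g different Xh values with family confidence coefficient 1 − α:
Ŷh ± S × s(pred)
where:
– S = √(g × F(1 − α; g, n − 2))
– s(pred) is the estimated standard error of the prediction at Xh
– g is the number of prediction intervals in the family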
Example: Punting distance for leg strengths of 140 and 150 lbs.
n = 13 punters, g = 2. Suppose we want a 90% family confidence level.
– Bonferroni multiple: B = t(1 − 0.10/(2 × 2), 11) = t(0.975, 11) = 2.201
– Scheffé multiple: S = √(2 × F(0.90; 2, 11)) = √(2 × 2.86) ≈ 2.39
Since B is smaller than S, the Bonferroni prediction intervals will be narrower … so use them here instead of the Scheffé intervals.
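The comparison of the two multiples can be reproduced without a statistics library in the special case g = 2, because an F distribution with 2 numerator degrees of freedom has a closed-form quantile. The sketch below relies on that special case only (it does not generalize to other g), takes B = 2.201 from a standard t table, and uses function names invented for this illustration.

```python
import math

def f_quantile_numerator_df_2(p: float, denom_df: int) -> float:
    """Quantile of F(2, denom_df); closed form valid ONLY for numerator df = 2.

    Uses P(F > x) = (1 + 2x/d)^(-d/2), solved for x.
    """
    return (denom_df / 2) * ((1 - p) ** (-2 / denom_df) - 1)

def scheffe_multiple_g2(family_level: float, denom_df: int) -> float:
    """Scheffe multiple S = sqrt(g * F(1 - alpha; g, n - 2)) for g = 2."""
    return math.sqrt(2 * f_quantile_numerator_df_2(family_level, denom_df))

B = 2.201                          # t(0.975, 11) from a standard t table
S = scheffe_multiple_g2(0.90, 11)  # n - 2 = 11 for the 13 punters

print(f"Bonferroni multiple B = {B:.3f}")
print(f"Scheffe multiple    S = {S:.3f}")
print("Use Bonferroni" if B < S else "Use Scheffe")
```

For other values of g, the F percentile would have to come from a table or a statistics package instead.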
Example: Punting distance for leg strengths of 140 and 150 lbs.
Predicted Values for New Observations
New Obs  Fit  SE Fit  95.0% CI            95.0% PI
1        …    …       (130.55, 152.01)    (103.23, 179.33)
2        …    …       (140.13, 160.49)    (112.41, 188.20)
n = 13 punters
s(pred(140)) = …
s(pred(150)) = 17.21
There is at least a 90% chance that the punting distances for leg strengths of 140 and 150 pounds will be …
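As a rough check on this example, the Bonferroni prediction interval at a leg strength of 150 can be rebuilt from the quantities on the slide. This sketch assumes the fitted value is the midpoint of the printed 95% prediction interval (the Fit column itself is not available here), takes B = 2.201 from a standard t table, and uses a helper name invented for illustration.

```python
B = 2.201  # Bonferroni multiple t(0.975, 11) for g = 2 and a 90% family level

def prediction_interval(fit: float, s_pred: float, multiple: float) -> tuple[float, float]:
    """Interval fit +/- multiple * s(pred)."""
    half_width = multiple * s_pred
    return fit - half_width, fit + half_width

printed_pi_150 = (112.41, 188.20)     # printed 95% PI, second row of the output
fit_150 = sum(printed_pi_150) / 2     # assumed fit: midpoint of the printed PI
s_pred_150 = 17.21                    # s(pred(150)) given on the slide

lower, upper = prediction_interval(fit_150, s_pred_150, B)
print(f"Bonferroni PI at 150 lb: ({lower:.2f}, {upper:.2f})")
print(f"Printed 95% PI at 150 lb: {printed_pi_150}")
```

The two intervals agree up to rounding. With g = 2 and a 90% family level, the Bonferroni multiple is the same t(0.975, 11) used for an individual 95% interval, so the printed 95% prediction intervals already serve as the Bonferroni family intervals here.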
Simultaneous prediction in Minitab
– Stat >> Regression >> Regression …
– Specify predictor and response.
– Under Options …, in the “Prediction intervals for new observations” box, specify a column name containing multiple X values.
– Specify confidence level. (For Bonferroni intervals, enter the statement confidence level 1 − α/g rather than the family level.)
– Click on OK.
– Results appear in session window.