1
12.1
2
Inference for Linear Regression
Today we will apply inference procedures to linear regression. We have been using inference procedures (confidence intervals and hypothesis tests) for the past several chapters. We covered linear regression in chapter 3, so it has been a while.
3
Linear Regression Refresher
The idea behind linear regression is to estimate a line of best fit between two variables: an independent variable and a dependent variable. The slope tells us how many units the dependent variable changes when the independent variable changes by one unit.
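As a refresher, here is a minimal sketch of how a least-squares line can be computed by hand. The function name `least_squares` and the five data points are made up for illustration; they are not from the slides.

```python
def least_squares(xs, ys):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of squared deviations of x, and sum of cross-deviations of x and y
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx            # slope: change in y per one-unit change in x
    a = y_bar - b * x_bar    # intercept: the line passes through (x_bar, y_bar)
    return a, b

# Hypothetical data, for illustration only
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = least_squares(xs, ys)
print(f"y-hat = {a:.2f} + {b:.2f}x")
```

The slope `b` is the quantity the rest of this section builds confidence intervals and tests for.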
4
Linear Regression Refresher
Old Faithful: duration of an eruption vs. the time before the next eruption. The slope is 10.36 and the y-intercept is 33.97.
ŷ = 33.97 + 10.36x
interval = 33.97 + 10.36(duration)
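A small sketch of using the fitted Old Faithful line for prediction. The slope and intercept are the ones quoted on the slide; the function name `predict_interval` and the example durations are assumptions chosen for illustration, in whatever units the original data use.

```python
def predict_interval(duration):
    """Predicted time before the next eruption, from the slide's fitted line."""
    intercept = 33.97  # y-intercept quoted on the slide
    slope = 10.36      # slope quoted on the slide
    return intercept + slope * duration

# Hypothetical eruption durations, for illustration only
print(predict_interval(3.0))
# Raising the duration by one unit raises the prediction by exactly the slope
print(predict_interval(4.0) - predict_interval(3.0))
```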
8
Inference
When we do a linear regression using a sample of data, we are really ESTIMATING the true population values (slope and y-intercept). We don't know the population values, but the estimates from the regression are unbiased estimators of them. That does not mean that they are exactly correct. A regression on sample data gives us a regression line ŷ = a + bx, where a is an unbiased estimator of the true population y-intercept (sometimes called α) and b is an unbiased estimator of the true population slope (sometimes called β).
9
Sample vs Population
Sample regression equation: ŷ = a + bx
Population regression equation: y = α + βx
a estimates α; b estimates β
10
Sampling Distribution
If we want a sampling distribution for the slope, we already have our unbiased estimate of its mean: whatever our estimated slope is. But we also need the standard deviation of the sampling distribution. Because we don't know it, we estimate it. This estimate is called a standard error.
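To make the idea of a sampling distribution concrete, the following simulation sketch draws many samples from a made-up population line and fits a slope to each. The values of `ALPHA`, `BETA`, `SIGMA`, the x-values, and the number of repetitions are all assumptions chosen for illustration.

```python
import random
import statistics

def fit_slope(xs, ys):
    """Least-squares slope b for the line y-hat = a + b*x."""
    x_bar = statistics.fmean(xs)
    y_bar = statistics.fmean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return sxy / sxx

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population: y = ALPHA + BETA*x + normal noise with sd SIGMA
ALPHA, BETA, SIGMA = 1.0, 2.0, 1.0
xs = list(range(10))

slopes = []
for _ in range(2000):
    ys = [ALPHA + BETA * x + random.gauss(0, SIGMA) for x in xs]
    slopes.append(fit_slope(xs, ys))

mean_slope = statistics.fmean(slopes)  # should land close to BETA (unbiased)
sd_slope = statistics.stdev(slopes)    # spread of the sampling distribution
print(f"mean of sample slopes: {mean_slope:.3f}")
print(f"sd of sample slopes:   {sd_slope:.3f}")
```

In the simulation we can see the spread directly because we control the population. With a single real sample we cannot, which is why the standard deviation has to be estimated by a standard error.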
11
Standard Error of the Slope
The good news is that we rarely need to compute this by hand. When we perform a regression, the standard error is included in the computer output, but not when you do it on your calculator.
12
Standard Error of the Slope
SE_b =
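The right-hand side of the formula did not survive in this transcript. A standard textbook form, which may differ in notation from the original slide, is SE_b = s / sqrt(Σ(x - x̄)²), where s = sqrt(Σ(residual²) / (n - 2)) is the standard deviation of the residuals. A minimal sketch, with a hypothetical function name and made-up data:

```python
import math

def slope_standard_error(xs, ys):
    """Return (b, SE_b) for the least-squares line y-hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_bar - b * x_bar
    # Residual standard deviation s, using n - 2 degrees of freedom
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(sse / (n - 2))
    return b, s / math.sqrt(sxx)

# Hypothetical data, for illustration only
b, se_b = slope_standard_error([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(f"b = {b:.3f}, SE_b = {se_b:.4f}")
```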
13
Confidence Interval for the Slope
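The interval used in the examples that follow has the usual form b ± t*·SE_b, with t* taken from a t distribution with n - 2 degrees of freedom. The sketch below finds t* using only the Python standard library, by numerically integrating the t density and then bisecting. The function names and the numerical method are assumptions made for illustration; in practice a table or calculator gives the same value.

```python
import math

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(t * t / df))

def t_cdf(t, df, steps=1000):
    """P(T <= t): Simpson's rule on [0, |t|] plus symmetry (steps must be even)."""
    upper = abs(t)
    if upper == 0.0:
        return 0.5
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # P(0 <= T <= |t|)
    return 0.5 + area if t > 0 else 0.5 - area

def t_critical(conf, df):
    """t* such that the middle `conf` of the distribution lies in (-t*, t*)."""
    target = 1 - (1 - conf) / 2
    lo, hi = 0.0, 50.0
    for _ in range(60):  # bisection on the CDF
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def slope_confidence_interval(b, se_b, n, conf):
    """Return (low, high) for b +/- t* x SE_b with n - 2 degrees of freedom."""
    margin = t_critical(conf, n - 2) * se_b
    return b - margin, b + margin

# The critical values quoted in the examples correspond to these settings:
print(round(t_critical(0.90, 14), 3))  # 90% confidence, 14 degrees of freedom
print(round(t_critical(0.95, 12), 3))  # 95% confidence, n = 14 so df = 12
```

The two printed values should come out close to the 1.761 and 2.179 used in the worked examples.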
14
Example
15
Example 90% confidence interval:
± (1.761)( ) ( , )
Interpretation: We are 90% confident that one additional calorie of non-exercise activity corresponds to a decrease in fat gain of between kg and kg.
16
You try
The following regression relates the number of Peruvian anchovies caught (millions of metric tons) per year to the price (US $) of fish meal in that year. Calculate a 95% confidence interval for the true slope of the regression line.
Predictor   Coef   SE Coef   T   P
Constant
Catch
S=   R-Sq= 73.5%   n=14
17
Answers
± (2.179)(5.091)
±
( , )
20
Example
Hypotheses: H0: β = 0, Ha: β > 0
Check conditions: for now let's assume that they are met (we'll talk about this in a minute).
Test statistic: t = 3.07
P-value: tcdf(3.07, BIG, 36) = 0.002
21
Look back at the regression output.
P-value = 0.002. The output gives us the test statistic, and it gives us a (wrong) p-value. Why is the p-value wrong?
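One likely answer, assuming the output comes from standard regression software: such output normally reports a two-sided p-value (for Ha: β ≠ 0), while the alternative here is one-sided (Ha: β > 0). When the estimate falls on the side the alternative predicts, the one-sided p-value is half the two-sided one. The sketch below reproduces the calculator's tcdf(3.07, BIG, 36) with the standard library. The function names and the numerical integration are assumptions made for illustration.

```python
import math

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(t * t / df))

def upper_tail(t, df, steps=2000):
    """P(T >= t), using Simpson's rule on [0, |t|] (steps must be even)."""
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # P(0 <= T <= |t|)
    return 0.5 - area if t >= 0 else 0.5 + area

one_sided = upper_tail(3.07, 36)  # matches Ha: beta > 0
two_sided = 2 * one_sided         # what a two-sided test would report
print(f"one-sided p-value: {one_sided:.4f}")
print(f"two-sided p-value: {two_sided:.4f}")
```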
22
Back to the Anchovies
Does the number of fish caught affect the price?
What is the test statistic? What is the p-value? What is our interpretation?
23
Back to the Anchovies
What is the test statistic? t = -5.78
What is the p-value? Very small.
What is our interpretation? We reject the null hypothesis that the true slope is zero and conclude instead that the true slope (the effect) is different from zero.
24
Anchovies Again (last example)
Now let's test a null hypothesis with a slope other than zero. Let's test whether the slope is below -20.
H0: β = -20, Ha: β < -20
Test statistic: t = -1.847
P-value: 0.0448
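The standard test statistic for a slope is t = (b - β0) / SE_b, where β0 is the slope stated in the null hypothesis. The anchovy slope estimate itself is missing from this transcript, so the sketch below back-calculates an implied value from numbers that do appear (SE_b = 5.091 and t = -1.847) and checks it against the earlier test of β = 0. The variable `implied_b` is an approximation derived from rounded figures, not a value taken from the slides.

```python
SE_B = 5.091      # standard error of the slope, from the "Answers" slide
BETA0 = -20.0     # slope under the null hypothesis on this slide
T_SLIDE = -1.847  # test statistic reported on this slide

def t_statistic(b, beta0, se_b):
    """Test statistic for H0: beta = beta0."""
    return (b - beta0) / se_b

# Rearranging t = (b - beta0) / SE_b gives the slope the slide must have used
implied_b = BETA0 + T_SLIDE * SE_B
print(f"implied slope estimate: {implied_b:.3f}")

# Consistency check: testing beta = 0 with the same slope should give
# roughly the t reported for the earlier anchovy test (-5.78)
t_zero = t_statistic(implied_b, 0.0, SE_B)
print(f"t for H0: beta = 0: {t_zero:.2f}")
```

Moving the null value from 0 to -20 only changes the numerator. The standard error and the degrees of freedom (n - 2 = 12) stay the same.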