1
Confidence Intervals. Underlying model: unknown parameter. We know how to calculate point estimates, e.g. by regression analysis, but different data would change our estimates, so we treat our estimates as random variables. We want a measure of how confident we are in our estimate, so we calculate a “confidence interval”.
2
What is it? If we know how the data were sampled, we can construct a confidence interval (C.I.) for an unknown parameter, θ. A 95% C.I. gives a range such that the true θ is in the interval 95% of the time. A 100(1-α)% C.I. captures the true θ a fraction (1-α) of the time. The smaller α is, the more sure we are that the true θ falls in the interval, but the wider the interval.
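The "95% of the time" statement can be checked by simulation. The sketch below is illustrative only: it assumes normally distributed data with a known standard deviation (so a normal critical value stands in for the t critical value), and the function name `coverage` and all numbers in it are hypothetical, not taken from the slides.

```python
import random
from statistics import NormalDist, mean


def coverage(true_mean=50.0, sigma=5.0, n=8, alpha=0.05, trials=2000, seed=0):
    """Fraction of simulated 100(1-alpha)% intervals that contain true_mean.

    ASSUMPTION: normal data with KNOWN sigma, so the critical value comes
    from the standard normal rather than the t-distribution.
    """
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96 for alpha = 0.05
    half_width = z * sigma / n ** 0.5          # z * standard error of the mean
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        centre = mean(sample)
        if centre - half_width <= true_mean <= centre + half_width:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    print(coverage(alpha=0.05))  # should be close to 0.95
    print(coverage(alpha=0.10))  # should be close to 0.90
```

Lowering α widens each interval, so the same simulated samples are captured at least as often.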
3
Example 1: Lead in Water. Lead in drinking water causes serious health problems. To test for contamination, we require a control site. Problem: what is the lead concentration at the control site? Estimate a 95% confidence interval.
4
Example 2: Gas Market. Recall the U.S. gas market question: by how much does gas consumption decrease when price increases? From our linear model, the estimate of β1 is -.04237. How confident are we in this estimate? Construct a 90% C.I. for this estimate.
5
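If Data ~ N(μ, σ²): since we don’t know σ, use the t-distribution. 95% C.I. for μ: $\bar{x} \pm t_{97.5} \cdot s$, where $\bar{x}$ is the sample mean, s is the standard error of the mean, and t_{97.5} is the critical value of the t-distribution. Draw on board (upper-tail Prob = 2.5%).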
6
t-distribution. Similar to the Normal distribution, but requires “degrees of freedom” (d.f.). d.f. = (# data points) - (# variables). E.g. mean of lead concentration, 8 samples, one variable: d.f. = 7. The higher the d.f., the closer the t-distribution is to the Normal distribution.
7
If Distribution Unknown. Can use “Bootstrapping”:
1. Draw a large sample with replacement
2. Calculate the mean
3. Repeat many times
4. Draw a histogram of the sample means
5. Calculate the empirical 95% C.I.
Requires no previous knowledge of the underlying process.
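The bootstrap steps can be sketched as follows. This is a minimal percentile-bootstrap example, not the presenter's code: it assumes each resample has the same size as the observed data (the usual convention), skips the histogram step, and runs on made-up measurements. The name `bootstrap_ci` and its parameters are hypothetical.

```python
import random
from statistics import mean


def bootstrap_ci(data, alpha=0.05, n_resamples=5000, seed=0):
    """Percentile-bootstrap 100(1-alpha)% C.I. for the mean of `data`."""
    rng = random.Random(seed)
    n = len(data)
    # Steps 1-3: resample with replacement, take the mean, repeat many times.
    means = sorted(
        mean(rng.choices(data, k=n)) for _ in range(n_resamples)
    )
    # Step 5: read the interval off the empirical distribution of the means.
    lo_idx = int((alpha / 2) * n_resamples)
    hi_idx = int((1 - alpha / 2) * n_resamples) - 1
    return means[lo_idx], means[hi_idx]


if __name__ == "__main__":
    # HYPOTHETICAL measurements, made up for illustration (not the slide data).
    lead = [38.2, 61.0, 47.5, 55.1, 70.3, 42.8, 49.9, 46.3]
    print(bootstrap_ci(lead))
```

Fixing the seed makes the interval reproducible. In practice one would also vary the seed or raise `n_resamples` to see how stable the endpoints are.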
8
Lead Concentration. 8 lead measurements: mean = 51.39, s = 5.75, t_{97.5} = 2.365. Lower = 51.39 - (5.75)(2.365) = 37.8. Upper = 51.39 + (5.75)(2.365) = 65.0. C.I. = [37.8, 65.0]. Using bootstrapped samples: C.I. = [40.8, 62.08].
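The t-based interval is simple enough to check by hand or in a few lines of code. The helper below is a hypothetical illustration (the name `t_interval` is not from the slides). It takes the critical value as an input rather than computing it, and uses the numbers quoted on the slide.

```python
def t_interval(estimate, std_error, t_crit):
    """Return (lower, upper) = estimate -/+ t_crit * std_error."""
    half_width = t_crit * std_error
    return estimate - half_width, estimate + half_width


if __name__ == "__main__":
    # Values quoted on the slide: mean, standard error s, t_97.5 for d.f. = 7.
    lower, upper = t_interval(51.39, 5.75, 2.365)
    print(f"[{lower:.1f}, {upper:.1f}]")  # prints [37.8, 65.0]
```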
9
Gas Regression: S-Plus
Coefficients:
Term         Value       Std. Error  t value     Pr(>|t|)
(Intercept)  -0.0898134  0.0507787   -1.7687217  0.0867802
PG           -0.0423712  0.0098406   -4.3057672  0.0001551
Y             0.0001587  0.0000068   23.4188561  0.0000000
PNC          -0.1013809  0.0617077   -1.6429209  0.1105058
PUC          -0.0432496  0.0241442   -1.7913093  0.0830122
Residual standard error: 0.02680668 on 31 degrees of freedom
Multiple R-Squared: 0.9678838
F-statistic: 233.5615 on 4 and 31 degrees of freedom, the p-value is 0
10
Gas Price Response. b_2 = -.04237, s = .00984. 90% C.I.: t_{95} = 1.695 (d.f. = 36-5 = 31, matching the 31 residual degrees of freedom in the S-Plus output). C.I. = [-.0591, -.0256]. Using bootstrapped samples: C.I. = [-.063, -.026]. Response is probably between 2.5 gallons and 6 gallons.
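The same calculation can be applied to every coefficient in the S-Plus table at once. The sketch below is illustrative: the estimates and standard errors are copied from the regression output, the critical value 1.695 is the one quoted on the slide, and the names `COEFFICIENTS`, `T_95`, and `coefficient_intervals` are hypothetical. For PG it agrees with the slide's interval to about three decimal places.

```python
# Coefficient estimates and standard errors copied from the S-Plus output.
COEFFICIENTS = {
    "(Intercept)": (-0.0898134, 0.0507787),
    "PG": (-0.0423712, 0.0098406),
    "Y": (0.0001587, 0.0000068),
    "PNC": (-0.1013809, 0.0617077),
    "PUC": (-0.0432496, 0.0241442),
}
T_95 = 1.695  # critical value quoted on the slide for the 90% C.I.


def coefficient_intervals(coefs, t_crit):
    """Map each term to (lower, upper) = estimate -/+ t_crit * std. error."""
    return {
        name: (est - t_crit * se, est + t_crit * se)
        for name, (est, se) in coefs.items()
    }


if __name__ == "__main__":
    for name, (lo, hi) in coefficient_intervals(COEFFICIENTS, T_95).items():
        excludes_zero = lo > 0 or hi < 0
        print(f"{name:12s} [{lo:.4f}, {hi:.4f}]  excludes 0: {excludes_zero}")
```

An interval that excludes zero corresponds to a coefficient whose two-sided p-value is below 0.10, which is a quick consistency check against the Pr(>|t|) column.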
11
Interpretation & Other Facts. There is a 95% chance that the true average lead concentration lies in this range. There is a 90% chance that the true value of β1 lies in this range. We can also calculate a “confidence region” for 2 or more variables.