Confidence Intervals Underlying model: Unknown parameter We know how to calculate point estimates E.g. regression analysis But different data would change.


1 Confidence Intervals. Underlying model: unknown parameter. We know how to calculate point estimates (e.g. by regression analysis), but different data would change our estimates. So we treat our estimates as random variables. We want a measure of how confident we are in an estimate: calculate a “confidence interval”.

2 What is it? If we know how the data were sampled, we can construct a confidence interval (C.I.) for an unknown parameter. A 95% C.I. gives a range such that the true parameter value is in the interval 95% of the time. A $100(1-\alpha)\%$ C.I. captures the true value a fraction $(1-\alpha)$ of the time. Smaller $\alpha$: more sure the true value falls in the interval, but the interval is wider.
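The "captures the true value (1-α) of the time" statement can be checked by simulation. The sketch below is illustrative only and is not from the slides: it assumes Normal data with a known standard deviation (so a Normal critical value is used instead of t), and the true mean, sigma, sample size, and trial count are all made-up values.

```python
import random
from statistics import NormalDist, mean

def coverage(true_mean=50.0, sigma=5.0, n=8, alpha=0.05, trials=2000, seed=0):
    """Fraction of simulated intervals that contain the true mean."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided Normal critical value
    half_width = z * sigma / n ** 0.5        # sigma is assumed known here
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        m = mean(sample)
        if m - half_width <= true_mean <= m + half_width:
            hits += 1
    return hits / trials

print(coverage(alpha=0.05))  # should land near 0.95
print(coverage(alpha=0.10))  # should land near 0.90, from a narrower interval
```

Each simulated data set gives a different interval; the parameter stays fixed, and it is the interval that lands on it or misses it.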

3 Example 1: Lead in Water. Lead in drinking water causes serious health problems. To test for contamination, we require a control site. Problem: what is the lead concentration at the control site? Estimate a 95% confidence interval.

4 Example 2: Gas Market. Recall the U.S. gas market question: by how much does gas consumption decrease when price increases? From our linear model, the estimate of $\beta_1$ is -.04237. How confident are we in this estimate? Construct a 90% C.I. for this estimate.

5 If Data ~ $N(\mu, \sigma^2)$. Since we don’t know $\sigma$, use the t-distribution. 95% C.I. for $\mu$: $\bar{x} \pm t_{97.5}\, s$, where $s$ is the standard error of the mean and $t_{97.5}$ is the critical value of the t distribution. Draw on board (Prob = 2.5%).

6 t-distribution. Similar to the Normal distribution. Requires “degrees of freedom”: d.f. = (# data points) - (# variables). E.g. mean of lead concentration, 8 samples, one variable: d.f. = 7. The higher the d.f., the closer t is to the Normal distribution.

7 If Distribution Unknown. Can use “bootstrapping”: 1. Draw a large sample with replacement. 2. Calculate its mean. 3. Repeat many times. 4. Draw a histogram of the sample means. 5. Calculate the empirical 95% C.I. Requires no previous knowledge of the underlying process.
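The five bootstrap steps above can be sketched as follows. This is a minimal percentile bootstrap under stated assumptions, not the slides' own code: the function name `bootstrap_ci`, the resample count, and the choice to draw each resample at the same size as the data are illustrative, and the measurements are hypothetical (the slides do not list the raw lead data).

```python
import random
from statistics import mean

def bootstrap_ci(data, level=0.95, n_resamples=5000, seed=0):
    """Percentile bootstrap confidence interval for the mean of `data`."""
    rng = random.Random(seed)
    n = len(data)
    # Steps 1-3: resample with replacement, take the mean, repeat many times.
    means = sorted(mean(rng.choices(data, k=n)) for _ in range(n_resamples))
    # Steps 4-5: read the empirical interval off the sorted resample means.
    tail = (1 - level) / 2
    lower = means[int(tail * n_resamples)]
    upper = means[int((1 - tail) * n_resamples) - 1]
    return lower, upper

# Hypothetical measurements, NOT the lead data from the slides.
sample = [42.1, 55.3, 48.7, 61.0, 39.8, 53.4, 57.2, 50.6]
lower, upper = bootstrap_ci(sample)
print(f"bootstrap 95% C.I. for the mean: [{lower:.2f}, {upper:.2f}]")
```

The sorted list of resample means plays the role of the histogram in step 4; the interval is simply its 2.5th and 97.5th percentiles.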

8 Lead Concentration. 8 lead measurements: mean = 51.39, s = 5.75, $t_{97.5}$ = 2.365. Lower = 51.39 - (5.75)(2.365); Upper = 51.39 + (5.75)(2.365). C.I. = [37.8, 65.0]. Using bootstrapped samples: C.I. = [40.8, 62.08].
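The t-based arithmetic on this slide can be reproduced in a few lines. The helper name `t_interval` is illustrative; the mean, standard error, and critical value are taken directly from the slide rather than computed from raw data.

```python
def t_interval(estimate, std_error, t_crit):
    """Return (lower, upper) = estimate -/+ t_crit * std_error."""
    half_width = t_crit * std_error
    return estimate - half_width, estimate + half_width

# Values from the slide: mean = 51.39, s = 5.75, t_97.5 = 2.365 (7 d.f.).
lower, upper = t_interval(51.39, 5.75, 2.365)
print(f"95% C.I. = [{lower:.1f}, {upper:.1f}]")  # 95% C.I. = [37.8, 65.0]
```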

9 Gas Regression: S-Plus
Coefficients:
                 Value  Std. Error     t value   Pr(>|t|)
(Intercept) -0.0898134   0.0507787  -1.7687217  0.0867802
         PG -0.0423712   0.0098406  -4.3057672  0.0001551
          Y  0.0001587   0.0000068  23.4188561  0.0000000
        PNC -0.1013809   0.0617077  -1.6429209  0.1105058
        PUC -0.0432496   0.0241442  -1.7913093  0.0830122
Residual standard error: 0.02680668 on 31 degrees of freedom
Multiple R-Squared: 0.9678838
F-statistic: 233.5615 on 4 and 31 degrees of freedom, the p-value is 0

10 Gas Price Response. $b_2$ = -.04237 (the coefficient on PG), s = .00984. 90% C.I.: $t_{95}$ = 1.695 (d.f. = 36 - 5 = 31, matching the regression output). C.I. = [-.0591, -.0257]. Using bootstrapped samples: C.I. = [-.063, -.026]. Response is probably between 2.5 gallons and 6 gallons.
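The same estimate -/+ t × (standard error) recipe gives the 90% interval for the gas price coefficient. In this sketch the helper name `t_interval` is illustrative; the coefficient and standard error are the PG row of the S-Plus output, and the critical value 1.695 is the one quoted on the slide (the regression output reports 31 residual degrees of freedom).

```python
def t_interval(estimate, std_error, t_crit):
    """Return (lower, upper) = estimate -/+ t_crit * std_error."""
    half_width = t_crit * std_error
    return estimate - half_width, estimate + half_width

# PG row of the regression output, with the slide's 90% two-sided critical value.
pg_coef, pg_se, t_95 = -0.0423712, 0.0098406, 1.695
lower, upper = t_interval(pg_coef, pg_se, t_95)
print(f"90% C.I. = [{lower:.3f}, {upper:.3f}]")  # 90% C.I. = [-0.059, -0.026]
```

Both bounds are negative, so at the 90% level the interval excludes zero: the data support a decrease in consumption as price rises.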

11 Interpretation & Other Facts. We are 95% confident that the true average lead concentration lies in this range, and 90% confident that the true value of $\beta_1$ lies in this range: intervals built this way capture the true value that fraction of the time. Can also calculate a “confidence region” for 2 or more variables.

