We are interested in methods that produce an interval: Common interval methods for: Confidence intervals Prediction intervals Tolerance intervals Credibility/Probability.


1 Interval Estimation. We are interested in methods that produce an interval. Common interval methods: confidence intervals, prediction intervals, tolerance intervals, and credibility/probability intervals (Bayesian). Given that the assumptions of the method are satisfied, the interval covers the true value of the parameter with (approximate) probability at least 1 − α.

2 Confidence Intervals. θ is a parameter we are interested in, and we assume we don't know its true value, e.g. a mean, an sd, a proportion, etc. Consider an experiment that will collect a sample of data. Then BEFORE we collect the data, we can devise a procedure such that $P(\hat{\theta}_{lo} \le \theta \le \hat{\theta}_{hi}) \ge 1 - \alpha$, where the endpoints $\hat{\theta}_{lo}$ and $\hat{\theta}_{hi}$ are estimates we will get from the sample we have yet to collect.

3 Confidence Intervals. In order to get actual numerical values for the lower and upper endpoints $\hat{\theta}_{lo}$ and $\hat{\theta}_{hi}$, we perform the experiment and plug in the data. The outcomes for this experiment are: the realized interval covers θ, or it does not. Under the frequentist definition, probabilities (other than 0 or 1) only exist for outcomes of experiments that haven't happened yet.

4 Confidence Intervals. Once the data is collected we cannot say that θ is in the specific, realized interval with probability greater than or equal to 1 − α. The "probability" of the outcome is now 0 (outcome did not happen) or 1 (outcome did happen). This is true even if you don't know what the outcome was. For realized CIs something happened. We just can't tell what the outcome was if we don't know the true value of θ. What we could say is: "considering the data we've collected, $[\hat{\theta}_{lo}, \hat{\theta}_{hi}]$ is a set of plausible values for θ". But that's a mouthful, so let's make up a new word: confidence.

5 Confidence Intervals. Given a sample of data, the (1 − α)×100% confidence interval for a parameter estimate on the sample is $[\hat{\theta}_{lo}, \hat{\theta}_{hi}]$, and we say: we are (1 − α)×100% confident that the true value of θ is covered by $[\hat{\theta}_{lo}, \hat{\theta}_{hi}]$. The CI's level of confidence, (1 − α)×100%, is the same "number" as the CI method's probability of producing an interval that covers θ, but confidence is not probability. Confidence says something about the "plausibility" of θ being one of the values in the measured interval. The "amount of plausibility" is the confidence.
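The distinction between the method's coverage probability and a single realized interval can be illustrated by simulation. The sketch below is not from the slides: it assumes a normal population with made-up values `theta = 10` and `sigma = 2`, repeats the "experiment" many times, and counts how often a large-sample 95% interval for the mean covers the true value. Each realized interval either covers θ or it doesn't; only the long-run fraction is close to 0.95.

```python
import random
from statistics import NormalDist, mean, stdev


def z_interval(sample, alpha=0.05):
    """Large-sample two-sided (1 - alpha) interval for the mean."""
    n = len(sample)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    centre = mean(sample)
    half_width = z * stdev(sample) / n ** 0.5
    return centre - half_width, centre + half_width


def coverage(theta=10.0, sigma=2.0, n=50, experiments=2000, alpha=0.05, seed=1):
    """Fraction of repeated experiments whose interval covers theta.

    theta, sigma, n and the seed are illustrative assumptions.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(experiments):
        sample = [rng.gauss(theta, sigma) for _ in range(n)]
        lo, hi = z_interval(sample, alpha)
        hits += lo <= theta <= hi  # 1 if this realized interval covers theta
    return hits / experiments


rate = coverage()
print(f"fraction of intervals covering theta: {rate:.3f}")
```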

6 Confidence Intervals. So how do we compute a (1 − α)×100% confidence interval given a set of data? Conceptually, if we are trying to estimate a parameter θ with some estimator $\hat{\theta}$, we have to know something about the sampling distribution of the estimator. For large IID samples, one can show that $\hat{\theta}$ is approximately normal: $\hat{\theta} \sim N(\theta, \sigma_{\hat{\theta}}^2)$ (approximate sampling distribution of an estimator, large IID sample assumed).

7 Confidence Intervals. Since we don't know θ or $\sigma_{\hat{\theta}}$, we can plug in their sample estimates once we've collected a sample. The plug-in distribution $N(\hat{\theta}, \hat{\sigma}_{\hat{\theta}}^2)$ is a "plausible" approximate sampling distribution considering the sample collected. So where do (say) 95% of the "plausible samples" fall?

8 Confidence Intervals. So where do (say) 95% of the "plausible samples" fall? Say symmetrically around the estimate: about 95% of a normal density lies within about two standard deviations of its center. The area between $\hat{\theta} - 1.96\,\hat{\sigma}_{\hat{\theta}}$ and $\hat{\theta} + 1.96\,\hat{\sigma}_{\hat{\theta}}$ is ≈ 0.95. This gives the two-sided, equal-tailed CI.

9 Confidence Intervals. So where do (say) 95% of the "plausible samples" fall? Say up to the first 95% most plausible: the area between −∞ and $\hat{\theta} + z_{0.95}\,\hat{\sigma}_{\hat{\theta}}$ is ≈ 0.95. This gives a one-sided CI: 95% of the "plausible samples" are lower than $\hat{\theta} + z_{0.95}\,\hat{\sigma}_{\hat{\theta}}$.

10 Confidence Intervals. So where do (say) 95% of the "plausible samples" fall? Say the highest 95% most plausible: the area between $\hat{\theta} - z_{0.95}\,\hat{\sigma}_{\hat{\theta}}$ and ∞ is ≈ 0.95. This gives a one-sided CI: 95% of the "plausible samples" are higher than $\hat{\theta} - z_{0.95}\,\hat{\sigma}_{\hat{\theta}}$.

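11 Confidence Intervals. By "standardizing": $Z = (\hat{\theta} - \theta)/\sigma_{\hat{\theta}}$. Z gives the number of s.d.s $\hat{\theta}$ is from θ. For a standard normal Z: $P(z_{\alpha/2} \le Z \le z_{1-\alpha/2}) = 1 - \alpha$, where $z_p$ is the p-th quantile of N(0,1) and $z_{\alpha/2} = -z_{1-\alpha/2}$.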
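The standard normal quantiles used for the two-sided and one-sided intervals can be checked numerically. This sketch uses Python's standard-library `statistics.NormalDist`, whose `inv_cdf` plays the role of R's `qnorm`; the variable names are illustrative only.

```python
from statistics import NormalDist

std_normal = NormalDist()  # N(0, 1)
alpha = 0.05

# Two-sided, equal-tailed: alpha/2 in each tail.
z_two_sided = std_normal.inv_cdf(1 - alpha / 2)
# One-sided: all of alpha in a single tail.
z_one_sided = std_normal.inv_cdf(1 - alpha)

# Area captured by each choice under the standard normal density.
area_two_sided = std_normal.cdf(z_two_sided) - std_normal.cdf(-z_two_sided)
area_one_sided = std_normal.cdf(z_one_sided)

print(f"z for two-sided 95%: {z_two_sided:.6f}, area {area_two_sided:.4f}")
print(f"z for one-sided 95%: {z_one_sided:.6f}, area {area_one_sided:.4f}")
```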

12 Compute the Confidence Intervals. The mass of an unknown powder was determined 30 times. The results are shown below (units: mg):
4.11, 3.70, 3.36, 3.68, 4.42, 3.23, 4.03, 4.03, 3.52, 4.75, 5.09, 3.47, 3.02, 4.24, 4.74, 4.51, 2.90, 4.15, 3.54, 3.81, 2.98, 3.82, 4.32, 3.06, 4.00, 4.05, 3.19, 3.17, 3.67, 4.37
Compute:
a. The sample mean: $\bar{x}$
b. The sample sd: $s$
c. The estimated standard error of the mean: $\widehat{se} = s/\sqrt{n}$
d. The number of estimated standard errors that cover 95% of the sampling distribution symmetrically about the sample mean: $z_{\alpha/2}$ or $z_{1-\alpha/2}$

13 Compute the Confidence Intervals. a. Sample mean = 3.83. b. Sample sd = 0.58. c. Est. se of mean = 0.11. d. For 95%, α = 0.05. For the 95% spread symmetrically about the mean we want $z_{0.025}$ and $z_{0.975}$ = ±1.959964.

14 Compute the Confidence Intervals. e. Compute the two-sided 95% CI for the mean given this data: $[\bar{x} - z_{0.975}\,\widehat{se},\ \bar{x} + z_{0.975}\,\widehat{se}]$. Same thing but easier to typeset: [ 3.83 – 1.96*0.11, 3.83 + 1.96*0.11 ] ≈ [3.61, 4.05].
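Steps a through e for the powder data can be reproduced in a few lines. This is a minimal sketch using only Python's standard library; the variable names are illustrative.

```python
from statistics import NormalDist, mean, stdev

# The 30 powder mass measurements from the slide (units: mg).
masses_mg = [
    4.11, 3.70, 3.36, 3.68, 4.42, 3.23, 4.03, 4.03, 3.52, 4.75,
    5.09, 3.47, 3.02, 4.24, 4.74, 4.51, 2.90, 4.15, 3.54, 3.81,
    2.98, 3.82, 4.32, 3.06, 4.00, 4.05, 3.19, 3.17, 3.67, 4.37,
]

n = len(masses_mg)
x_bar = mean(masses_mg)          # a. sample mean
s = stdev(masses_mg)             # b. sample sd (n - 1 denominator)
se = s / n ** 0.5                # c. estimated standard error of the mean
z = NormalDist().inv_cdf(0.975)  # d. z quantile for a two-sided 95% CI

# e. two-sided 95% CI for the mean
ci = (x_bar - z * se, x_bar + z * se)

print(f"n={n}, mean={x_bar:.2f}, sd={s:.2f}, se={se:.2f}")
print(f"95% CI: [{ci[0]:.2f}, {ci[1]:.2f}]")
```

Carrying full precision through the calculation gives an interval slightly narrower than the one obtained from the rounded inputs 3.83 and 0.11.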

15 Confidence Intervals. Points of interest:
(1 − α) is called the level of confidence and is between 0 and 1. Common standard choices are 0.95, 0.99, 0.9.
α is called the significance level and is between 0 and 1. Common standard choices are 0.05, 0.01, 0.1.
Estimate the standard error of $\hat{\theta}$ with the bootstrap if: the sampling distribution of $\hat{\theta}$ is not known; the sample size is small; the algorithm for $\hat{\theta}$ is very complicated. Why not? You can always do it!
For small sample sizes: use Student-t based formulas (coming up), or bootstrap the required estimates (below).

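16 Confidence Intervals. So how do we compute a (1 − α)×100% confidence interval given a set of data? Case 1a: (1 − α)×100% CIs for the mean μ. Large sample n (at least 30), sd $\sigma_X$ known:
Two sided: $[\bar{x} - z_{1-\alpha/2}\,\sigma_X/\sqrt{n},\ \bar{x} + z_{1-\alpha/2}\,\sigma_X/\sqrt{n}]$
One sided, lower bound: $[\bar{x} - z_{1-\alpha}\,\sigma_X/\sqrt{n},\ \infty)$
One sided, upper bound: $(-\infty,\ \bar{x} + z_{1-\alpha}\,\sigma_X/\sqrt{n}]$
N(0,1) quantiles: qnorm(1-a/2) or qnorm(1-a)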

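17 Confidence Intervals. So how do we compute a (1 − α)×100% confidence interval given a set of data? Case 1b: (1 − α)×100% CIs for the mean μ. Large sample n (at least 30), sd $\sigma_X$ unknown, so the sample sd $s_X$ is plugged in:
Two sided: $[\bar{x} - z_{1-\alpha/2}\,s_X/\sqrt{n},\ \bar{x} + z_{1-\alpha/2}\,s_X/\sqrt{n}]$
One sided, lower bound: $[\bar{x} - z_{1-\alpha}\,s_X/\sqrt{n},\ \infty)$
One sided, upper bound: $(-\infty,\ \bar{x} + z_{1-\alpha}\,s_X/\sqrt{n}]$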
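Cases 1a and 1b differ only in whether a known sd or the sample sd is used, so one helper can cover both. The sketch below is an illustration, not the presenter's code: the function name `mean_ci_z`, its `side` and `sigma` arguments, and the synthetic sample are all assumptions.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def mean_ci_z(sample, alpha=0.05, side="two-sided", sigma=None):
    """Large-sample normal-quantile CI for the mean.

    sigma=None  -> Case 1b (plug in the sample sd)
    sigma=value -> Case 1a (population sd known)
    """
    n = len(sample)
    sd = stdev(sample) if sigma is None else sigma
    se = sd / math.sqrt(n)
    x_bar = mean(sample)
    if side == "two-sided":
        z = NormalDist().inv_cdf(1 - alpha / 2)
        return (x_bar - z * se, x_bar + z * se)
    z = NormalDist().inv_cdf(1 - alpha)
    if side == "lower":   # one-sided, lower bound
        return (x_bar - z * se, math.inf)
    if side == "upper":   # one-sided, upper bound
        return (-math.inf, x_bar + z * se)
    raise ValueError("side must be 'two-sided', 'lower' or 'upper'")


# Synthetic data for illustration only (n = 40, so the large-sample case).
rng = random.Random(42)
sample = [rng.gauss(100.0, 15.0) for _ in range(40)]

two_sided = mean_ci_z(sample)
lower = mean_ci_z(sample, side="lower")
upper = mean_ci_z(sample, side="upper")
known_sd = mean_ci_z(sample, sigma=15.0)
print(two_sided, lower, upper, known_sd)
```

At the same α, the one-sided bounds sit closer to the sample mean than the matching two-sided endpoints, because all of α is placed in one tail.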

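18 Confidence Intervals. So how do we compute a (1 − α)×100% confidence interval given a set of data? Case 1c: (1 − α)×100% CIs for the mean μ. Small sample n, sd $\sigma_X$ unknown:
Two sided: $[\bar{x} - t_{1-\alpha/2,\,n-1}\,s_X/\sqrt{n},\ \bar{x} + t_{1-\alpha/2,\,n-1}\,s_X/\sqrt{n}]$
One sided, lower bound: $[\bar{x} - t_{1-\alpha,\,n-1}\,s_X/\sqrt{n},\ \infty)$
One sided, upper bound: $(-\infty,\ \bar{x} + t_{1-\alpha,\,n-1}\,s_X/\sqrt{n}]$
Student-t(n-1) quantiles: qt(1-a/2,df=n-1) or qt(1-a,df=n-1)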

19 Example: Confidence Intervals. A suspect, one Mr. B. Mayhew, is captured by law enforcement officials in possession of 50 mini-Ziploc baggies containing what is determined to be very pure methamphetamine ("meth"). Under Federal statute 21 USC §§ 841(a), 841(b)(1)(B); § 2D1.1 the mandatory minimum sentence for possession and intent to distribute is 10 years if the amount is greater than or equal to 50 g but 5 years for less than 50 g. Considering the sentence differential, it is important to determine the total mass as accurately as possible. The baggies are emptied and collected into one mass of crystals. 10 mass measurements are taken:
49.9996 g, 49.9994 g, 49.9993 g, 49.9996 g, 49.9995 g, 49.9994 g, 49.9995 g, 49.9994 g
The lab's analytical balances have uncertainty in the 4th decimal place. The lab policy is to round up if the fourth decimal place is greater than or equal to 5, e.g. 45.6785 g will be reported as 45.679 g while 45.6784 g will be reported as 45.678 g.
a. Compute the two-sided 99% CI for the mean mass.
b. Compute the one-sided 99% CI for the lower bound on the mean mass.
c. Compute the one-sided 99% CI for the upper bound on the mean mass.

20 Example: Confidence Intervals a.

21 Example: Confidence Intervals
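A sketch of parts a to c for the Mayhew example, using the small-sample Student-t formulas. Two assumptions are worth stating loudly: the slide mentions 10 measurements but lists only 8 values, and the code uses just those 8; and because Python's standard library has no counterpart to R's `qt`, the helper `t_quantile` below is a hypothetical stand-in that inverts a numerically integrated t density by bisection.

```python
import math
from statistics import mean, stdev

# Only the 8 values actually listed on the slide (units: g).
masses_g = [49.9996, 49.9994, 49.9993, 49.9996,
            49.9995, 49.9994, 49.9995, 49.9994]


def t_pdf(x, df):
    """Student-t density with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, panels=2000):
    """CDF for x >= 0: 0.5 plus composite Simpson integration over [0, x]."""
    if x == 0:
        return 0.5
    h = x / panels
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, panels):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def t_quantile(p, df):
    """Upper quantile (p > 0.5) by bisection; stand-in for R's qt(p, df)."""
    lo, hi = 0.0, 200.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


n = len(masses_g)
x_bar = mean(masses_g)
se = stdev(masses_g) / math.sqrt(n)
alpha = 0.01

t_two = t_quantile(1 - alpha / 2, n - 1)  # two-sided quantile
t_one = t_quantile(1 - alpha, n - 1)      # one-sided quantile

ci_two_sided = (x_bar - t_two * se, x_bar + t_two * se)  # part a
ci_lower = (x_bar - t_one * se, math.inf)                # part b
ci_upper = (-math.inf, x_bar + t_one * se)               # part c

print(f"mean = {x_bar:.7f} g, se = {se:.2e} g")
print(f"a. two-sided 99% CI: [{ci_two_sided[0]:.6f}, {ci_two_sided[1]:.6f}]")
print(f"b. lower bound: {ci_lower[0]:.6f}")
print(f"c. upper bound: {ci_upper[1]:.6f}")
```

With these 8 values, every interval endpoint falls below 50.0000 g.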

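22 Confidence Intervals. So how do we compute a (1 − α)×100% confidence interval given a set of data? Case 2a: (1 − α)×100% CIs for a proportion p. Large sample n (at least 30), p not too close to 0 or 1:
Two sided: $[\hat{p} - z_{1-\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n},\ \hat{p} + z_{1-\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}]$
One sided, lower bound: $[\hat{p} - z_{1-\alpha}\sqrt{\hat{p}(1-\hat{p})/n},\ 1]$
One sided, upper bound: $[0,\ \hat{p} + z_{1-\alpha}\sqrt{\hat{p}(1-\hat{p})/n}]$
Remember: $\hat{p} = x/n$, the observed fraction of "successes" x out of n.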
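A minimal sketch of the large-sample interval for a proportion. The function name `proportion_ci` and the counts (40 successes out of 200) are made up for illustration and do not come from the slides.

```python
import math
from statistics import NormalDist


def proportion_ci(x, n, alpha=0.05, side="two-sided"):
    """Large-sample normal-approximation CI for a proportion p = x / n."""
    p_hat = x / n
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    if side == "two-sided":
        z = NormalDist().inv_cdf(1 - alpha / 2)
        return (p_hat - z * se, p_hat + z * se)
    z = NormalDist().inv_cdf(1 - alpha)
    if side == "lower":   # one-sided, lower bound
        return (p_hat - z * se, 1.0)
    if side == "upper":   # one-sided, upper bound
        return (0.0, p_hat + z * se)
    raise ValueError("side must be 'two-sided', 'lower' or 'upper'")


# Hypothetical counts: 40 "successes" in 200 trials.
two_sided = proportion_ci(40, 200)
lower = proportion_ci(40, 200, side="lower")
upper = proportion_ci(40, 200, side="upper")
print(two_sided, lower, upper)
```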

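23 Confidence Intervals. So how do we compute a (1 − α)×100% confidence interval given a set of data? Case 2b: (1 − α)×100% CIs for a proportion p. Small sample n and/or p close to 0 or 1 (Agresti, Coull). Define $\tilde{n} = n + z_{1-\alpha/2}^2$ and $\tilde{p} = (x + z_{1-\alpha/2}^2/2)/\tilde{n}$. Two sided: $[\tilde{p} - z_{1-\alpha/2}\sqrt{\tilde{p}(1-\tilde{p})/\tilde{n}},\ \tilde{p} + z_{1-\alpha/2}\sqrt{\tilde{p}(1-\tilde{p})/\tilde{n}}]$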

24 Example: Confidence Intervals. Saunders, Davis and Buscaglia define random match probability (RMP) in handwriting analysis as "[T]he chance of randomly selecting two individuals from some relevant population and then randomly selecting two writing samples, one from each individual's available body of handwriting, that are declared to 'match' on the basis of the chosen comparison procedure." Say a suspect is apprehended in a case and is alleged to have written a threatening letter. A database search yields 100 "best matching" individuals (one writing sample each). Assume this serves as a sample from a "relevant population". It is known that none were actually produced by the suspect, with the exception of the writing sample they produced. Each item in the sample is compared to the others (n = 4950 comparisons) and two pairs are found to "match". The estimated RMP is thus 2/4950 = 0.0004. Compute the estimated two-sided CI (neglecting correlations) for this RMP at the 95% level of confidence.

25 Example: Confidence Intervals
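The RMP estimate is very close to 0, so the Agresti-Coull adjustment for proportions near 0 or 1 is the natural choice. The sketch below applies the standard Agresti-Coull formulas to x = 2 matches in n = 4950 comparisons and, for contrast, also evaluates the plain large-sample interval. Function names are illustrative assumptions.

```python
import math
from statistics import NormalDist


def wald_ci(x, n, alpha=0.05):
    """Plain large-sample interval for a proportion."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    p_hat = x / n
    half = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return (p_hat - half, p_hat + half)


def agresti_coull_ci(x, n, alpha=0.05):
    """Agresti-Coull interval: add z^2/2 successes and z^2/2 failures."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    n_tilde = n + z * z
    p_tilde = (x + z * z / 2) / n_tilde
    half = z * math.sqrt(p_tilde * (1 - p_tilde) / n_tilde)
    return (p_tilde - half, p_tilde + half)


matches, comparisons = 2, 4950  # from the example: 100 choose 2 = 4950

wald_lo, wald_hi = wald_ci(matches, comparisons)
ac_lo, ac_hi = agresti_coull_ci(matches, comparisons)

print(f"estimated RMP:        {matches / comparisons:.6f}")
print(f"large-sample 95% CI:  [{wald_lo:.6f}, {wald_hi:.6f}]")
print(f"Agresti-Coull 95% CI: [{ac_lo:.6f}, {ac_hi:.6f}]")
```

For these counts the plain large-sample interval has a negative lower endpoint, which is impossible for a probability, while the Agresti-Coull interval stays inside [0, 1]. As in the example statement, correlations between comparisons are neglected.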

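26 Confidence Intervals. So how do we compute a (1 − α)×100% confidence interval given a set of data? Case 3: (1 − α)×100% CIs for a Poisson mean count λ. Large sample n (at least 30). Two sided: $[\hat{\lambda} - z_{1-\alpha/2}\sqrt{\hat{\lambda}/n},\ \hat{\lambda} + z_{1-\alpha/2}\sqrt{\hat{\lambda}/n}]$, where $\hat{\lambda}$ is the sample mean of the counts.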
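A short sketch of a large-sample two-sided interval for a Poisson mean. It assumes the usual normal approximation, in which the sample mean of the counts estimates λ and the standard error is sqrt(λ̂/n) because a Poisson variance equals its mean. The counts are invented for illustration.

```python
import math
from statistics import NormalDist, fmean


def poisson_mean_ci(counts, alpha=0.05):
    """Large-sample two-sided CI for a Poisson mean via the normal approximation."""
    n = len(counts)
    lam_hat = fmean(counts)          # estimate of the Poisson mean
    se = math.sqrt(lam_hat / n)      # Poisson variance equals the mean
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return lam_hat, (lam_hat - z * se, lam_hat + z * se)


# Hypothetical counts from 30 observation periods.
counts = [3, 5, 4, 2, 6, 4, 3, 5, 4, 4,
          2, 3, 5, 6, 4, 3, 4, 5, 2, 4,
          3, 4, 5, 4, 3, 6, 4, 2, 5, 4]

lam_hat, (lo, hi) = poisson_mean_ci(counts)
print(f"lambda-hat = {lam_hat:.3f}, 95% CI = [{lo:.3f}, {hi:.3f}]")
```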

27 Bootstrap Confidence Intervals. So how do we compute a (1 − α)×100% confidence interval given a set of data? For any parameter, you can try to obtain bootstrap-based CIs. For a sample of size n: obtain a bootstrap sampling distribution for $\hat{\theta}$: boot.reps. Then find the (1 − α)×100% empirical percentiles:
Two sided: quantile(boot.reps, probs=c(a/2, 1-a/2))
One sided, lower bound: quantile(boot.reps, probs=c(a))
One sided, upper bound: quantile(boot.reps, probs=c(1-a))
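The percentile-bootstrap recipe can be sketched with the standard library alone. This is an illustration under stated assumptions rather than the presenter's code: `bootstrap_reps` and `quantile` are hypothetical helper names, `quantile` interpolates linearly between order statistics (the same rule as R's default `quantile`), the number of replicates and the seed are arbitrary, and the data are the 30 powder masses from the earlier example.

```python
import random
from statistics import fmean


def bootstrap_reps(sample, statistic, reps=5000, seed=0):
    """Resample with replacement and recompute the statistic each time."""
    rng = random.Random(seed)
    n = len(sample)
    return [statistic(rng.choices(sample, k=n)) for _ in range(reps)]


def quantile(values, p):
    """Empirical quantile with linear interpolation between order statistics."""
    xs = sorted(values)
    h = (len(xs) - 1) * p
    lo = int(h)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (h - lo) * (xs[hi] - xs[lo])


masses_mg = [
    4.11, 3.70, 3.36, 3.68, 4.42, 3.23, 4.03, 4.03, 3.52, 4.75,
    5.09, 3.47, 3.02, 4.24, 4.74, 4.51, 2.90, 4.15, 3.54, 3.81,
    2.98, 3.82, 4.32, 3.06, 4.00, 4.05, 3.19, 3.17, 3.67, 4.37,
]

a = 0.05
boot_reps = bootstrap_reps(masses_mg, fmean)

two_sided = (quantile(boot_reps, a / 2), quantile(boot_reps, 1 - a / 2))
lower_bound = quantile(boot_reps, a)      # one-sided, lower bound
upper_bound = quantile(boot_reps, 1 - a)  # one-sided, upper bound

print(f"two-sided 95% CI: [{two_sided[0]:.3f}, {two_sided[1]:.3f}]")
print(f"one-sided bounds: lower {lower_bound:.3f}, upper {upper_bound:.3f}")
```

Passing a different `statistic` (for example a median or a standard deviation function) gives a percentile interval for that quantity without changing anything else.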

28 Example: Bootstrap Confidence Intervals. Consider again the case of Mr. B. Mayhew with seizure mass measurements of:
49.9996 g, 49.9994 g, 49.9993 g, 49.9996 g, 49.9995 g, 49.9994 g, 49.9995 g, 49.9994 g
a. Compute the 99% CI for the mean mass via the bootstrap.
b. What is your bootstrap standard error estimate for the estimated mean?
c. Approximately, with what level of confidence can you report that the mean measurement is equal to or exceeds 50.0000 g?
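A sketch of parts a to c via the bootstrap, using the 8 listed measurements. The replicate count and seed are arbitrary assumptions; `statistics.quantiles(..., n=200, method="inclusive")` is used to read off the 0.5% and 99.5% empirical percentiles; and part c is approximated here by the fraction of bootstrap means at or above 50.0000 g, which is one possible reading of the question.

```python
import random
from statistics import fmean, quantiles, stdev

masses_g = [49.9996, 49.9994, 49.9993, 49.9996,
            49.9995, 49.9994, 49.9995, 49.9994]

rng = random.Random(2024)  # arbitrary seed, for reproducibility only
reps = 10000
n = len(masses_g)

# Bootstrap sampling distribution of the mean.
boot_means = [fmean(rng.choices(masses_g, k=n)) for _ in range(reps)]

# a. 99% percentile CI: cut points at 1/200 = 0.005 and 199/200 = 0.995.
cuts = quantiles(boot_means, n=200, method="inclusive")
ci_99 = (cuts[0], cuts[-1])

# b. Bootstrap standard error: sd of the bootstrap means.
boot_se = stdev(boot_means)

# c. Fraction of bootstrap means at or above the 50 g threshold.
frac_at_or_above_50 = sum(m >= 50.0 for m in boot_means) / reps

print(f"a. 99% bootstrap CI: [{ci_99[0]:.6f}, {ci_99[1]:.6f}]")
print(f"b. bootstrap se: {boot_se:.2e} g")
print(f"c. fraction of bootstrap means >= 50 g: {frac_at_or_above_50}")
```

Every listed measurement is below 50 g, so no resample can have a mean of 50 g or more, and the bootstrap gives essentially zero confidence that the mean reaches the threshold at four-decimal precision.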

29 Example: Bootstrap Confidence Intervals Look at what happens by just demanding a little more precision:

