1 Stat 301 – Day 36 Bootstrapping (4.5)

2 Last Time – CI for Odds Ratio
- Often the parameter of interest is the population odds ratio
  - Especially with case-control studies
- Turns out the sample log-odds ratio is often well modeled by a normal distribution with a known standard deviation formula
- So can estimate the population log odds ratio using: sample log odds ratio ± z* × √(1/a + 1/b + 1/c + 1/d), where a, b, c, d are the four cell counts of the 2×2 table
- Exponentiate to get endpoints for the population odds ratio

3 PP 5.1.3 (p. 426)
Sample odds ratio: 2.587
ln(2.587) ± 1.645 × √(1/65 + 1/464 + 1/30 + 1/554)
= 0.950 ± 1.645(0.230) = (0.572, 1.328)
(e^0.572, e^1.328) = (1.77, 3.77)
I am 90% confident that the odds of having a crash for New Zealand drivers who had less than 5 hours of sleep are between 1.77 and 3.77 times the odds of having a crash for New Zealand drivers who had more than 5 hours of sleep.
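The arithmetic on this slide can be checked with a short sketch. The four counts come from the slide's standard-error term; which count sits in which cell of the 2×2 table is an assumption here, chosen because it reproduces the stated sample odds ratio of 2.587. The variable names are illustrative only.

```python
import math

# Counts taken from the slide's standard-error term.
# Assumed layout (not stated on the slide): a*d / (b*c) is the odds ratio.
a, b, c, d = 65, 464, 30, 554
z_star = 1.645  # critical value used on the slide for 90% confidence

odds_ratio = (a * d) / (b * c)
log_or = math.log(odds_ratio)

# Standard error of the sample log odds ratio.
se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)

# Interval on the log scale, then exponentiate each endpoint.
log_lo = log_or - z_star * se_log_or
log_hi = log_or + z_star * se_log_or
ci = (math.exp(log_lo), math.exp(log_hi))

print(f"odds ratio = {odds_ratio:.3f}")
print(f"log-scale interval = ({log_lo:.3f}, {log_hi:.3f})")
print(f"odds-ratio interval = ({ci[0]:.2f}, {ci[1]:.2f})")
```

Because the interval is built on the log scale and then exponentiated, it is not symmetric around 2.587 on the odds-ratio scale.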

4 What we have done so far…
Analyzing one sample compared to a claim about the population parameter
- Categorical: One-sample z-procedures if sample size is large, or binomial
Comparing two proportions (independent samples or randomized experiment)
- Categorical: Two-sample z-procedures if sample sizes are large, or Fisher’s Exact Test (experiment)
With quantitative data, no “small sample” alternative

5 Bootstrapping
Relatively new approach that allows us to estimate aspects of the sampling distribution of the statistic in cases where the Central Limit Theorem does not apply
- Small samples
- Statistics other than the mean
Previously we considered the randomization distribution
- One random sample? Two independent samples?
- Confidence interval

6 Investigation 4.5.1 (p. 365) If I take a random sample of 10 words from the population, what do I know about the sampling distribution of the sample mean?

7 Investigation 4.5.1
A bootstrap sample resamples the data from the existing sample, drawing n observations, but with replacement
GettysburgSample.mtw
- Random sample of 10 words from population
- Answer through part (h)
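A minimal sketch of drawing one bootstrap sample, for readers without Minitab. The word lengths below are hypothetical stand-ins, not the contents of GettysburgSample.mtw, and the function name is illustrative.

```python
import random

# Hypothetical word lengths for a sample of 10 words.
# These are NOT the values in GettysburgSample.mtw.
sample = [4, 2, 5, 3, 9, 2, 6, 4, 3, 7]


def bootstrap_sample(data, rng):
    """Draw len(data) observations from data, with replacement."""
    return rng.choices(data, k=len(data))


rng = random.Random(301)  # seeded only so the run is repeatable
resample = bootstrap_sample(sample, rng)
resample_mean = sum(resample) / len(resample)

print("original sample: ", sample)
print("bootstrap sample:", resample)
print("bootstrap mean:  ", resample_mean)
```

Since the draws are with replacement, some observed values will typically appear more than once in the bootstrap sample and others not at all.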

8 Bootstrap Distribution
Sampling from the existing sample with replacement (that is, as if each observed value were repeated infinitely many times; demo)
- Centers around sample statistic
- Estimates the standard deviation of the sampling distribution
- How the statistic varies around the parameter!
If sampling distribution is symmetric, CI would then be statistic ± t* × SE(statistic)
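The following sketch builds a bootstrap distribution of the sample mean and uses its standard deviation as the SE in the symmetric interval. The data are hypothetical, the 5000 repetitions are an arbitrary choice, and the t* value of 2.262 assumes 95% confidence with n - 1 = 9 degrees of freedom, which the slide does not specify.

```python
import random
import statistics

# Hypothetical word lengths for a sample of 10 words (not the course data).
sample = [4, 2, 5, 3, 9, 2, 6, 4, 3, 7]
rng = random.Random(301)  # seeded only so the run is repeatable


def bootstrap_means(data, reps, rng):
    """Return the mean of each of `reps` bootstrap samples of data."""
    n = len(data)
    return [sum(rng.choices(data, k=n)) / n for _ in range(reps)]


boot_means = bootstrap_means(sample, 5000, rng)

sample_mean = sum(sample) / len(sample)
boot_center = statistics.mean(boot_means)  # should sit near sample_mean
boot_se = statistics.stdev(boot_means)     # estimates SE of the sample mean

# Assumed critical value: 95% confidence, df = n - 1 = 9.
t_star = 2.262
ci = (sample_mean - t_star * boot_se, sample_mean + t_star * boot_se)

print(f"sample mean = {sample_mean:.2f}, bootstrap center = {boot_center:.2f}")
print(f"bootstrap SE = {boot_se:.3f}")
print(f"symmetric interval = ({ci[0]:.2f}, {ci[1]:.2f})")
```

Note that the bootstrap distribution centers on the sample statistic, not on the unknown parameter; what carries over is its spread.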

9 And without symmetry?
Key Result: How bootstrap samples vary around the statistic mimics how statistics vary around the parameter
- E.g., expected sampling error

10 Without symmetry
Look at middle 95% of bootstrap values (between 2.5th percentile and 97.5th percentile; here 3.6 and 6.1, with statistic 4.8)
Statistic should not be any more than (4.8 - 3.6) below parameter
Statistic should not be any more than (6.1 - 4.8) above parameter (max overestimate)
Parameter < statistic + (4.8 - 3.6)
Parameter > statistic - (6.1 - 4.8)

11 “Percentile” confidence interval
(Sample mean - (6.1 - 4.8), sample mean + (4.8 - 3.6))
Upper bound: Estimate + (estimate - 2.5th percentile)
Lower bound: Estimate - (97.5th percentile - estimate)
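The two bound formulas can be sketched as a small function and checked against the slide's numbers (estimate 4.8, percentiles 3.6 and 6.1). An interval built this way, by reflecting the bootstrap percentiles around the estimate, is often called the basic or reverse percentile bootstrap interval. The function name, the hypothetical data, and the use of `statistics.quantiles` to get the percentiles are choices made for this illustration.

```python
import random
import statistics


def reflected_interval(estimate, pct_2_5, pct_97_5):
    """Lower = estimate - (97.5th pct - estimate); upper = estimate + (estimate - 2.5th pct)."""
    lower = estimate - (pct_97_5 - estimate)
    upper = estimate + (estimate - pct_2_5)
    return lower, upper


# Check with the numbers on the slides: about (3.5, 6.0).
slide_lo, slide_hi = reflected_interval(4.8, 3.6, 6.1)
print(f"slide interval = ({slide_lo:.1f}, {slide_hi:.1f})")

# Same recipe on hypothetical data (not the course data set).
sample = [4, 2, 5, 3, 9, 2, 6, 4, 3, 7]
rng = random.Random(301)  # seeded only so the run is repeatable
n = len(sample)
boot_means = [sum(rng.choices(sample, k=n)) / n for _ in range(5000)]

# With n=40, the cut points are the 2.5th, 5th, ..., 97.5th percentiles.
cuts = statistics.quantiles(boot_means, n=40)
pct_2_5, pct_97_5 = cuts[0], cuts[-1]

sample_mean = sum(sample) / n
boot_lo, boot_hi = reflected_interval(sample_mean, pct_2_5, pct_97_5)
print(f"bootstrap percentiles = ({pct_2_5:.2f}, {pct_97_5:.2f})")
print(f"interval = ({boot_lo:.2f}, {boot_hi:.2f})")
```

When the bootstrap distribution is skewed, the longer tail ends up on the opposite side of the estimate in the resulting interval, which is the point of the reflection.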

12 Investigation 4.5.2 (p. 371)

13 Investigation 4.5.3 (p. 372) (a)-(c)

14 For Tuesday
Continuing HW 8
- All but last problem?

