Exam 1 is two weeks from today (March 9 th ) in class 15% of your grade Covers chapters 1-6 and the central limit theorem. I will put practice problems,


1 Announcement: Exam 1 is two weeks from today (March 9th), in class. It is worth 15% of your grade and covers chapters 1-6 and the central limit theorem. I will put practice problems, old exams, and the specific sections that are not included on the web by the end of this week. You will be allowed to bring in one page of notes and a calculator. I'll provide normal probability tables. Today: Continue with the central limit theorem.

2 Central Limit Theorem Example One:
– Drive-through window at a bank.
– Consider transaction times: $X_i$ = transaction time for person $i$.
– $E(X_i) = 6$ minutes and $\mathrm{Var}(X_i) = 3^2$ minutes$^2$. The transaction time for each person is independent.
– Thirty customers show up on Saturday morning.
1. What is the probability that the total of all the transaction times is greater than 200 minutes?
2. What is the probability that the average transaction time is between 5.9 and 6.1 minutes?
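The slide leaves both questions open. A minimal sketch of how they could be worked with the normal approximation, assuming the CLT applies with n = 30; the helper `normal_cdf` and the variable names are illustrative and not part of the original slides:

```python
from math import erf, sqrt

def normal_cdf(x, mean=0.0, sd=1.0):
    """CDF of a Normal(mean, sd^2) distribution evaluated at x."""
    return 0.5 * (1.0 + erf((x - mean) / (sd * sqrt(2.0))))

n = 30        # customers
mu = 6.0      # E(X_i), in minutes
sigma = 3.0   # std dev of X_i, in minutes (Var(X_i) = 3^2)

# Total time is approximately N(n*mu, n*sigma^2).
total_mean = n * mu
total_sd = sqrt(n) * sigma
p_total = 1.0 - normal_cdf(200.0, total_mean, total_sd)

# Average time is approximately N(mu, sigma^2 / n).
avg_sd = sigma / sqrt(n)
p_avg = normal_cdf(6.1, mu, avg_sd) - normal_cdf(5.9, mu, avg_sd)

print(f"Pr(total > 200)         ~ {p_total:.3f}")
print(f"Pr(5.9 < average < 6.1) ~ {p_avg:.3f}")
```

The same two standardizations can be done by hand with a normal table: divide by about 16.43 for the total and by about 0.548 for the average.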

3 The “general interpretation” of the Central Limit Theorem suggests that measurements that are the result of a large number of factors tend to be normally distributed. Are heights and weights normally distributed in the adult US population? Use data to see: create a histogram and superimpose a normal distribution over it. “Control” for gender by doing this separately for each gender.

4 Dataset: NHANES (National Health and Nutrition Examination Survey). About 16,000 adults were examined in “mobile examination centers”. The adults were sampled to reflect the demographics of the US. Hundreds of measurements were made on each person.

5 Histogram is from data. Blue lines are normal pdfs with means = 173.3 and 159.9 and std devs = 7.6 and 7.3 (the means and std devs come from the data). Medians are 173.3 and 159.9. The data appear to be normally distributed (approximately)… We’ll see other ways to assess normality later in the semester.

6 Histogram is from data. Blue lines are normal pdfs with means = 80.4 and 70.8 and std devs = 16.7 and 17.9 (the means and std devs come from the data). Medians are 78.4 and 67.5. The data do not appear to be normally distributed… (note the difference between mean and median). Why wouldn’t you expect normality here?
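The mean-versus-median comparison used on these two slides can be illustrated with a short sketch. The samples below are synthetic draws chosen only for illustration (a symmetric normal sample and a right-skewed lognormal sample); they are not the NHANES data, and the parameters are hypothetical:

```python
import random
from math import log
from statistics import fmean, median

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical symmetric sample: mean and median should nearly agree.
symmetric = [rng.gauss(170.0, 7.5) for _ in range(10000)]

# Hypothetical right-skewed sample: the long right tail pulls the mean
# above the median.
skewed = [rng.lognormvariate(log(75.0), 0.25) for _ in range(10000)]

sym_gap = fmean(symmetric) - median(symmetric)
skew_gap = fmean(skewed) - median(skewed)

print(f"symmetric sample: mean - median = {sym_gap:.2f}")
print(f"skewed sample:    mean - median = {skew_gap:.2f}")
```

A mean that sits clearly above the median, as on slide 6, is a quick sign of right skew, which a normal curve cannot reproduce.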

7 Central Limit Theorem Example:
– 5 chemists independently synthesize a compound 1 time each.
– Each reaction should produce 10 ml of a substance.
– Historically, the amount produced by each reaction has been normally distributed with std dev 0.5 ml.
1. What’s the probability that less than 49.8 ml of the substance is made in total?
2. What’s the probability that the average amount produced is more than 10.1 ml?
3. Suppose the average amount produced is more than 11.0 ml. Is that a rare event? Why or why not? If more than 11.0 ml are made on average, what might that suggest?

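8 Answer: Central limit theorem: If $E(X_i) = \mu$ and $\mathrm{Var}(X_i) = \sigma^2$ for all $i$, and the $X_i$ are independent, then (approximately for large $n$, and exactly when the $X_i$ are themselves normal):
$X_1 + \dots + X_n \sim N(n\mu,\ n\sigma^2)$
$(X_1 + \dots + X_n)/n \sim N(\mu,\ \sigma^2/n)$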
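A small simulation can make the statement concrete. The sketch below is an illustration only: the choice of Exponential(1) draws (which have mean 1 and variance 1), n = 30, and 20,000 trials are assumptions for the demo, not something from the slides.

```python
import random
from statistics import fmean, pvariance

rng = random.Random(0)  # fixed seed so the run is repeatable

n = 30          # number of terms in each sum
mu = 1.0        # mean of one Exponential(1) draw
sigma2 = 1.0    # variance of one Exponential(1) draw
trials = 20000

# Each entry is one realization of X_1 + ... + X_n.
sums = [sum(rng.expovariate(1.0) for _ in range(n)) for _ in range(trials)]

sum_mean = fmean(sums)       # should be close to n * mu
sum_var = pvariance(sums)    # should be close to n * sigma2

# For a normal distribution about 68% of the mass lies within one
# std dev of the mean; check how close the simulated sums come.
sd = (n * sigma2) ** 0.5
within_one_sd = sum(abs(s - n * mu) < sd for s in sums) / trials

print(f"mean of sums     = {sum_mean:.2f} (theory: {n * mu})")
print(f"variance of sums = {sum_var:.2f} (theory: {n * sigma2})")
print(f"share within 1 sd = {within_one_sd:.3f}")
```

Even though a single exponential draw is strongly skewed, the sums behave much like draws from $N(n\mu, n\sigma^2)$.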

9 Lab:
1. Let Y = total amount made. $Y \sim N(5 \cdot 10,\ 5 \cdot 0.5^2) = N(50, 1.25)$ (by CLT), so the std dev of Y is $\sqrt{1.25} \approx 1.118$. Pr(Y < 49.8) = Pr[(Y - 50)/1.118 < (49.8 - 50)/1.118] = Pr(Z < -0.18) = 0.43
2. Let W = average amount made. $W \sim N(10,\ 0.5^2/5) = N(10, 0.05)$ (by CLT), so the std dev of W is $\sqrt{0.05} \approx 0.2236$. Pr(W > 10.1) = Pr[Z > (10.1 - 10)/0.2236] = Pr(Z > 0.45) = 0.33

10 Lab (continued)
3. One definition of rare: it’s a rare event if Pr(W > 11.0) is small (i.e. if the probability of seeing 11.0 or something more extreme is small). Pr(W > 11) = Pr[Z > (11 - 10)/0.2236] = Pr(Z > 4.47) = approximately zero. This suggests that perhaps either the true mean is not 10 or the true std dev is not 0.5 (or the amounts are not normally distributed…)
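The three lab probabilities can be checked numerically. This is a sketch that takes the problem statement at face value, treating 0.5 ml as the standard deviation of each reaction (so each variance is 0.25 ml²); `normal_cdf` and the variable names are illustrative:

```python
from math import erf, sqrt

def normal_cdf(x, mean=0.0, sd=1.0):
    """CDF of a Normal(mean, sd^2) distribution evaluated at x."""
    return 0.5 * (1.0 + erf((x - mean) / (sd * sqrt(2.0))))

n = 5         # chemists, one reaction each
mu = 10.0     # expected ml per reaction
sigma = 0.5   # assumption: 0.5 ml is the std dev per reaction

total_sd = sqrt(n) * sigma   # std dev of the total Y
avg_sd = sigma / sqrt(n)     # std dev of the average W

p_total = normal_cdf(49.8, n * mu, total_sd)    # Pr(Y < 49.8)
p_avg = 1.0 - normal_cdf(10.1, mu, avg_sd)      # Pr(W > 10.1)
p_rare = 1.0 - normal_cdf(11.0, mu, avg_sd)     # Pr(W > 11.0)

print(f"Pr(Y < 49.8) = {p_total:.3f}")
print(f"Pr(W > 10.1) = {p_avg:.3f}")
print(f"Pr(W > 11.0) = {p_rare:.2e}")
```

Because each reaction is assumed normal, the sum and the average are exactly normal here, so these values are not just large-sample approximations.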

11 Sample size: 1006 (source: gallup.com)

12 Let $X_i = 1$ if person $i$ thinks the President is hiding something and 0 otherwise. Suppose $E(X_i) = p$ and $\mathrm{Var}(X_i) = p(1-p)$ and each person’s opinion is independent. Let Y = total number of “yeses” = $X_1 + \dots + X_{1006}$, so $Y \sim \mathrm{Bin}(1006, p)$. Suppose p = 0.36 (this is the estimate…). What is Pr(Y < 352)? Note that this definition turns three outcomes into two outcomes.

13 Normal Approximation to the binomial CDF
Pr(Y < 352) = Pr(Y = 0) + … + Pr(Y = 351), where $\Pr(Y = k) = \binom{1006}{k}\, 0.36^{k}\, 0.64^{1006-k}$
– Even with computers, as n gets large, computing things like this can become difficult. (1006 is OK, but how about 1,000,000?)
– Idea: Use the central limit theorem to approximate this probability.
– Y is approximately N[1006*0.36, 1006*(0.36)*(0.64)] = N(362.16, 231.8) (by the central limit theorem), so its std dev is $\sqrt{231.8} \approx 15.2$.
Pr(Y < 352) ≈ Pr[(Y - 362.16)/15.2 < (352 - 362.16)/15.2] = Pr(Z < -0.67) = 0.25
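For n = 1006 the exact binomial sum is still cheap, so it can be compared directly with the CLT approximation. A sketch under the slide's assumptions (n = 1006, p = 0.36); the function names are illustrative, and the pmf is computed on the log scale with `lgamma` only to avoid overflow for larger n:

```python
from math import erf, exp, lgamma, log, sqrt

def normal_cdf(x, mean=0.0, sd=1.0):
    """CDF of a Normal(mean, sd^2) distribution evaluated at x."""
    return 0.5 * (1.0 + erf((x - mean) / (sd * sqrt(2.0))))

def binom_pmf(k, n, p):
    """Pr(Y = k) for Y ~ Bin(n, p), computed via logs for stability."""
    log_pmf = (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
               + k * log(p) + (n - k) * log(1.0 - p))
    return exp(log_pmf)

n, p = 1006, 0.36

# Exact: Pr(Y < 352) = Pr(Y = 0) + ... + Pr(Y = 351).
exact = sum(binom_pmf(k, n, p) for k in range(352))

# CLT approximation without any correction: Y ~ N(np, np(1-p)).
mean = n * p
sd = sqrt(n * p * (1.0 - p))
approx = normal_cdf(352.0, mean, sd)

print(f"exact binomial sum   = {exact:.4f}")
print(f"normal approximation = {approx:.4f}")
```

The two values agree to roughly two decimal places; the remaining gap is mostly due to approximating a discrete distribution with a continuous one.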

14 Normal Approximation to the binomial CDF. The black “step function” is a plot of the Bin(1006, 0.36) pmf versus Y (integers). The blue line is a plot of the Normal(362.16, 231.8) pdf.

15 Normal Approximation to the binomial CDF. The area under the blue curve to the left of 352 is approximately equal to the sum of the areas of the rectangles (black step function) to the left of 352.

16 Comments about the normal approximation to the binomial:
– Rule of thumb is that it’s OK if np > 5 and n(1-p) > 5.
– “Continuity correction”: Y is binomial, so it takes only integer values. If we use the normal approximation to the probability that Y ≤ k, we should calculate Pr(Y < k + 0.5). If we use the normal approximation to the probability that Y ≥ k, we should calculate Pr(Y > k - 0.5). For a strict inequality, rewrite it first: Pr(Y < k) = Pr(Y ≤ k - 1), so use k - 0.5. (See picture on board.)
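The continuity correction and the rule of thumb can be written as small helpers. This is a sketch, not the lecture's own code: the function names are hypothetical, and the convention used is that Pr(Y ≤ k) takes the normal area to the left of k + 0.5 while Pr(Y ≥ k) takes the area to the right of k - 0.5.

```python
from math import erf, sqrt

def normal_cdf(x, mean=0.0, sd=1.0):
    """CDF of a Normal(mean, sd^2) distribution evaluated at x."""
    return 0.5 * (1.0 + erf((x - mean) / (sd * sqrt(2.0))))

def rule_of_thumb_ok(n, p):
    """Rule of thumb from the slide: np > 5 and n(1-p) > 5."""
    return n * p > 5 and n * (1.0 - p) > 5

def approx_le(k, n, p):
    """Normal approximation to Pr(Y <= k) with continuity correction."""
    return normal_cdf(k + 0.5, n * p, sqrt(n * p * (1.0 - p)))

def approx_ge(k, n, p):
    """Normal approximation to Pr(Y >= k) with continuity correction."""
    return 1.0 - normal_cdf(k - 0.5, n * p, sqrt(n * p * (1.0 - p)))

n, p = 1006, 0.36
# Pr(Y < 352) is the same event as Pr(Y <= 351) for integer-valued Y.
p_less_than_352 = approx_le(351, n, p)

print(rule_of_thumb_ok(n, p))
print(f"Pr(Y < 352) with correction ~ {p_less_than_352:.4f}")
```

Writing a strict inequality as a non-strict one on integers before correcting keeps the two complementary events, Y ≤ 351 and Y ≥ 352, summing to one.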

