Download presentation
Presentation is loading. Please wait.
Published byПетар Матовић Modified over 5 years ago
1
Sampling Distributions of Proportions section 7.2
2
Toss a penny 20 times and record the number of heads.
Now, Really think about it for a minute: We are tossing exactly 20 times, expecting that the probability of a head will be .5 each time and that each toss will be independent SOUND FAMILIAR???????
3
Okay now, imagine 1000 people lining up and tossing the penny 20 times each.
If we were to record each sample proportion and histogram the results what would expect the shape to be? What shape do you think the dot plot will have?
4
A Sampling Distribution Model for a Proportion
A proportion is no longer just a computation from a set of data. It is now a random variable quantity that has a probability distribution. This distribution is called the sampling distribution model for proportions. Even though we depend on sampling distribution models, we never actually get to see them. We never actually take repeated samples from the same population and make a histogram. We only imagine or simulate them.
5
The Sampling Distribution of Proportions
Consider the approximate sampling distributions generated by a simulation in which SRSs of Reese’s Pieces are drawn from a population whose proportion of orange candies is either 0.45 or 0.15. What do you notice about the shape, center, and spread of each?
6
The Sampling Distribution of Proportions
What did you notice about the shape, center, and spread of each sampling distribution? Sample Proportions
7
As sample size increases, the spread decreases.
The Sampling Distribution of Proportions In Chapter 6, we learned that the mean and standard deviation of a binomial random variable X are Sample Proportions As sample size increases, the spread decreases.
8
The Sampling Distribution Model for a Proportion (cont.)
Provided that the sampled values are independent and the sample size is large enough, the sampling distribution of is very much like the Binomial Distribution and for large sample sizes is modeled by a Normal model with Mean: Standard deviation:
9
Modeling the Distribution of Sample Proportions (cont.)
A picture of what we just discussed is as follows:
10
How Good Is the Normal Model?
The Normal model gets better as a good model for the distribution of sample proportions as the sample size gets bigger. Just how big of a sample do we need? This will soon be revealed…
11
Assumptions and Conditions
Most models are useful only when specific assumptions are true. There are two assumptions in the case of the model for the distribution of sample proportions: The Independence Assumption: The sampled values must be independent of each other. The Sample Size Assumption: The sample size, n, must be large enough.
12
Assumptions and Conditions (cont.)
Assumptions are hard—often impossible—to check. That’s why we assume them. Still, we need to check whether the assumptions are reasonable by checking conditions that provide information about the assumptions. The corresponding conditions to check before using the Normal to model the distribution of sample proportions are the Randomization Condition, the 10% Condition and the Success/Failure Condition.
13
Assumptions and Conditions (cont.)
Randomization Condition: The sample should be a simple random sample of the population. 10% Condition: the sample size, n, must be no larger than 10% of the population. Success/Failure Condition: The sample size has to be big enough so that both np (number of successes) and nq (number of failures) are at least 10. …So, we need a large enough sample that is not too large.
14
Remember back to binomial distributions
Why does the third assumption insure an approximate normal distribution? Remember back to binomial distributions Suppose n = 10 & p = 0.1 (probability of a success), a histogram of this distribution is strongly skewed right! np > 10 & n(1-p) > 10 insures that the sample size is large enough to have a normal approximation! Now use n = 100 & p = 0.1 (Now np > 10!) While the histogram is still strongly skewed right – look what happens to the tail!
15
Consider the following situation:
Suppose we have a population of six people: Alice, Ben, Charles, Denise, Edward, & Frank What is the proportion of females? What is the parameter of interest in this population? Draw samples of two from this population. How many different samples are possible? 1/3 gender 6C2 =15
16
Find the 15 different samples that are possible & find the sample proportion of the number of females in each sample. Ben & Frank 0 Charles & Denise .5 Charles & Edward 0 Charles & Frank 0 Denise & Edward .5 Denise & Frank .5 Edward & Frank 0 Alice & Ben .5 Alice & Charles .5 Alice & Denise 1 Alice & Edward .5 Alice & Frank .5 Ben & Charles 0 Ben & Denise .5 Ben & Edward 0 How does the mean of the sampling distribution (mp-hat) compare to the population parameter (p)? mp-hat = p Find the mean & standard deviation of all p-hats.
17
But WAIT! We said that the standard deviation should equal
WHY did this happen? We are sampling more than 10% of our population! So – in order to calculate the standard deviation of the sampling distribution, we MUST be sure that our sample size is less than 10% of the population!
18
Assumptions (Rules of Thumb)
Must start with a Simple Random Sample Sample size must be less than 10% of the population (independence) Sample size must be large enough to insure a normal approximation can be used. np > 10 & n (1 – p) > 10
19
Using the Normal Approximation for
Sample Proportions A polling organization asks an SRS of 1500 first-year college students how far away their home is. Suppose that 35% of all first-year students actually attend college within 50 miles of home. What is the probability that the random sample of 1500 students will give a result within 2 percentage points of this true value? STATE: We want to find the probability that the sample proportion falls between 0.33 and 0.37 (within 2 percentage points, or 0.02, of 0.35). PLAN: We have an SRS of size n = 1500 drawn from a population in which the proportion p = 0.35 attend college within 50 miles of home. DO: Since np = 1500(0.35) = 525 and n(1 – p) = 1500(0.65)=975 are both greater than 10, we’ll standardize and then use Table A to find the desired probability. CONCLUDE: About 90% of all SRSs of size 1500 will give a result within 2 percentage points of the truth about the population.
20
Based on past experience, a bank believes that 7% of the people who receive loans will not make payments on time. The bank recently approved 200 loans. What are the mean and standard deviation of the proportion of clients in this group who may not make payments on time? Are assumptions met? What is the probability that over 10% of these clients will not make payments on time? Yes – np = 200(.07) = 14 n(1 - p) = 200(.93) = 186 Ncdf(.10, 1E99, .07, ) = .0482
21
Suppose one student tossed a coin 200 times and found only 42% heads
Suppose one student tossed a coin 200 times and found only 42% heads. Do you believe that this is likely to happen? No – since there is approximately a 1% chance of this happening, I do not believe the student did this.
22
Assume that 30% of the students at MSU wear contacts
Assume that 30% of the students at MSU wear contacts. In a sample of 100 students, what is the probability that more than 35% of them wear contacts? Check assumptions! mp-hat = & sp-hat = np = 100(.3) = 30 & n(1-p) =100(.7) = 70 Ncdf(.35, 1E99, .3, ) = .1376
23
Section 7.2 Sample Proportions
Summary In this section, we learned that… In practice, use this Normal approximation when both np ≥ 10 and n(1 - p) ≥ 10 (the Normal condition).
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.