Sampling Distributions of Proportions
Sampling Distribution
The distribution of the possible values of a statistic over all possible samples of the same size from the same population.
In the case of the pennies, it's the distribution of all possible sample proportions (p-hat).
We will use p for the population proportion and p-hat for the sample proportion.
Suppose we have a population of six people: Alice, Ben, Charles, Denise, Edward, & Frank.
What is the proportion of females? 1/3
What is the parameter of interest in this population? The proportion of females
Draw samples of two from this population. How many different samples are possible? 6 C 2 = 15
Find the 15 different samples that are possible & find the sample proportion of females in each sample.
Sample               p-hat
Alice & Ben          .5
Alice & Charles      .5
Alice & Denise       1
Alice & Edward       .5
Alice & Frank        .5
Ben & Charles        0
Ben & Denise         .5
Ben & Edward         0
Ben & Frank          0
Charles & Denise     .5
Charles & Edward     0
Charles & Frank      0
Denise & Edward      .5
Denise & Frank       .5
Edward & Frank       0
Find the mean & standard deviation of all p-hats. μ_p-hat = 5/15 = 1/3 & σ_p-hat ≈ .2981
How does the mean of the sampling distribution (μ_p-hat) compare to the population parameter (p)? μ_p-hat = p
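The enumeration above can be checked with a short Python sketch. It assumes Alice and Denise are the two females (as the p-hat values in the list imply); the variable names are illustrative only.

```python
from itertools import combinations
from statistics import mean, pstdev

# The six-person population from the slide; True marks a female (assumed: Alice, Denise).
population = {
    "Alice": True, "Ben": False, "Charles": False,
    "Denise": True, "Edward": False, "Frank": False,
}

# Every possible sample of size 2 (6 C 2 = 15 of them).
samples = list(combinations(population, 2))

# Sample proportion of females (p-hat) for each sample.
p_hats = [sum(population[name] for name in pair) / 2 for pair in samples]

p = sum(population.values()) / len(population)  # population proportion
mean_p_hat = mean(p_hats)
# pstdev (population SD) is used because the list holds ALL possible samples.
sd_p_hat = pstdev(p_hats)

print(len(samples), p, mean_p_hat, sd_p_hat)
```

The mean of the 15 p-hats should match p, while the standard deviation comes out a little below 0.3.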
Formulas:
μ_p-hat = p
σ_p-hat = sqrt(p(1 - p) / n)
These are found on the formula chart!
Does the standard deviation of the sampling distribution equal the value from the formula?
NO. WHY? We are sampling more than 10% of our population!
If we use the correction factor, we will see that the formula is correct.
Correction factor: multiply by sqrt((N - n) / (N - 1)), where N is the population size and n is the sample size.
So, in order to calculate the standard deviation of the sampling distribution with the formula, we MUST be sure that our sample size is less than 10% of the population!
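As a quick check for the six-person example, here is a minimal sketch that assumes the usual finite population correction, sqrt((N - n) / (N - 1)). The standard deviation computed directly from the 15 p-hats listed earlier is about 0.2981.

```python
from math import sqrt

N = 6        # population size
n = 2        # sample size
p = 2 / 6    # population proportion of females

sd_formula = sqrt(p * (1 - p) / n)      # uncorrected formula value
correction = sqrt((N - n) / (N - 1))    # finite population correction (assumed form)
sd_corrected = sd_formula * correction

print(round(sd_formula, 4), round(sd_corrected, 4))  # 0.3333 0.2981
```

Only the corrected value agrees with the standard deviation of the 15 p-hats, which is why the uncorrected formula is reserved for samples smaller than 10% of the population.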
Assumptions (Rules of Thumb)
Sample size must be less than 10% of the population (independence).
Sample size must be large enough to ensure a normal approximation can be used: np > 10 & n(1 - p) > 10
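The two rules of thumb can be written as a small helper. This is only a sketch: `conditions_met` and its parameters are hypothetical names, and the 10% check is skipped when the population size is not known.

```python
def conditions_met(n, p, population_size=None):
    """Return (ten_percent_ok, large_enough) for a sample of size n."""
    # 10% condition: can only be checked when the population size is known.
    if population_size is None:
        ten_percent_ok = None
    else:
        ten_percent_ok = n < 0.10 * population_size
    # Large-sample condition, using the slide's threshold of 10.
    large_enough = n * p > 10 and n * (1 - p) > 10
    return ten_percent_ok, large_enough

print(conditions_met(2, 1 / 3, population_size=6))  # six-person example
print(conditions_met(200, 0.07))                    # 200 loans, 7% late, population unknown
```

The six-person example fails both checks, while a sample of 200 with p = 0.07 passes the large-sample check.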
Why does the second assumption ensure an approximately normal distribution?
Suppose n = 10 & p = 0.1 (probability of a success). A histogram of this distribution is strongly skewed right! Remember back to binomial distributions.
Now use n = 100 & p = 0.1 (now np = 10, right at the threshold). The histogram is still somewhat skewed right, but look what happens to the tail: it is much thinner, and the bulk of the distribution looks close to bell-shaped.
np > 10 & n(1 - p) > 10 ensures that the sample size is large enough to use a normal approximation!
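To put numbers on the change in shape, the sketch below compares the two binomial distributions using the standard skewness formula (1 - 2p) / sqrt(np(1 - p)). The function names are illustrative.

```python
from math import comb, sqrt

def binomial_pmf(n, p):
    """Exact binomial probabilities P(X = k) for k = 0..n."""
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

def skewness(n, p):
    """Skewness of a binomial distribution: (1 - 2p) / sqrt(np(1 - p))."""
    return (1 - 2 * p) / sqrt(n * p * (1 - p))

for n in (10, 100):
    pmf = binomial_pmf(n, 0.1)
    # Most likely number of successes (the peak of the histogram).
    mode = max(range(n + 1), key=pmf.__getitem__)
    print(f"n={n}: np={n * 0.1:.0f}, most likely count={mode}, "
          f"skewness={skewness(n, 0.1):.2f}")
```

The skewness shrinks by a factor of sqrt(10) when n goes from 10 to 100, which is the effect the histograms illustrate.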
Based on past experience, a bank believes that 7% of the people who receive loans will not make payments on time. The bank recently approved 200 loans.
What are the mean and standard deviation of the proportion of clients in this group who may not make payments on time? μ_p-hat = .07 & σ_p-hat = sqrt(.07(.93)/200) ≈ .01804
Are assumptions met? Yes: np = 200(.07) = 14 & n(1 - p) = 200(.93) = 186, and both are > 10.
What is the probability that over 10% of these clients will not make payments on time? Ncdf(.10, 1E99, .07, .01804) = .0482
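The same calculation can be sketched in Python, with the standard library's `statistics.NormalDist` standing in for the calculator's Ncdf. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, p = 200, 0.07

mean_p_hat = p
sd_p_hat = sqrt(p * (1 - p) / n)

# Upper-tail probability P(p-hat > 0.10) under the normal approximation.
prob_over_10 = 1 - NormalDist(mean_p_hat, sd_p_hat).cdf(0.10)

print(round(sd_p_hat, 5), round(prob_over_10, 4))
```

The printed values should agree with the .01804 and .0482 on the slide.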
Suppose one student tossed a coin 200 times and found only 42% heads. Do you believe that this is likely to happen?
np = 200(.5) = 100 & n(1 - p) = 200(.5) = 100. Since both > 10, I can use a normal curve!
Find μ_p-hat & σ_p-hat using the formulas: μ_p-hat = .5 & σ_p-hat = sqrt(.5(.5)/200) ≈ .03536
Ncdf(-1E99, .42, .5, .03536) ≈ .012
No. Since there is only about a 1% chance of getting 42% heads or fewer, I do not believe the student did this.
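A simulation offers an independent check on the "about 1%" figure. The sketch below is an assumption-laden illustration: the number of simulated students (10,000) and the seed are arbitrary choices, not part of the original problem.

```python
import random

rng = random.Random(0)   # fixed seed so the run is repeatable
trials = 10_000          # hypothetical number of simulated students
tosses = 200
cutoff = 84              # 84 heads out of 200 is 42%

low_runs = 0
for _ in range(trials):
    # 200 random bits stand in for 200 fair tosses; count the 1 bits as heads.
    heads = bin(rng.getrandbits(tosses)).count("1")
    if heads <= cutoff:
        low_runs += 1

fraction = low_runs / trials
print(fraction)
```

The simulated fraction should land in the neighborhood of 1%, in line with the normal-curve answer, though it will not match exactly because the simulation counts whole numbers of heads.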
Assume that 30% of the students at BHS wear contacts. In a sample of 100 students, what is the probability that more than 35% of them wear contacts?
Check assumptions! np = 100(.3) = 30 & n(1 - p) = 100(.7) = 70, and both are > 10.
μ_p-hat = .3 & σ_p-hat = sqrt(.3(.7)/100) ≈ .04583
Ncdf(.35, 1E99, .3, .04583) = .1376
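For comparison, the sketch below computes the normal-approximation answer alongside the exact binomial probability. The exact calculation is an addition for illustration, not part of the original solution, and it treats "more than 35%" of 100 students as 36 or more.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 100, 0.3

sd_p_hat = sqrt(p * (1 - p) / n)
normal_approx = 1 - NormalDist(p, sd_p_hat).cdf(0.35)

# Exact binomial: more than 35% of 100 students means 36 or more wear contacts.
exact = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(36, n + 1))

print(round(sd_p_hat, 5), round(normal_approx, 4), round(exact, 4))
```

The two probabilities are close but not identical, a reminder that the normal curve is an approximation to the underlying binomial count.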