Engineering Statistics Chapter 3 Distribution of Samples Distribution of sample statistics 3C - Proportions and Difference between Proportions.


Proportion of a property
When a sample is collected to study a property, it is important to know whether the proportion observed in the sample is reasonable. For example, when we interview a group of job candidates, we would like to know whether the proportions by gender, age, race, etc. are typical of the population. How far a sample proportion can stray from the population proportion depends strongly on the sample size. In a small sample, an unusual proportion is not surprising. As the sample size increases, we expect the sample proportion to be closer to that of the population.

Distribution of proportions
If the proportion of a property in a population is $\pi$, and we take samples of size $n$, then the sample proportion $p$ is expected to follow, approximately, the normal distribution with mean $\pi$ and variance $\pi(1-\pi)/n$, that is, $p \sim N(\pi,\ \pi(1-\pi)/n)$. The variance decreases as the sample size increases, so when $n$ is large we expect the proportion of the property in the sample to be very close to that of the population.
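As a quick numerical illustration of how the spread shrinks with the sample size, the following Python sketch evaluates the mean, variance and standard deviation stated above. The function name `proportion_distribution` and the value 0.3 for the population proportion are illustrative assumptions, not part of the slides.

```python
import math

def proportion_distribution(pi, n):
    """Return (mean, variance, standard deviation) of the sample proportion p."""
    variance = pi * (1 - pi) / n
    return pi, variance, math.sqrt(variance)

# Illustrative population proportion 0.3 (an assumption for this sketch).
for n in (10, 40, 160, 640):
    mean, variance, sd = proportion_distribution(0.3, n)
    print(f"n={n:4d}  mean={mean:.2f}  variance={variance:.6f}  sd={sd:.4f}")
```

Because the standard deviation is proportional to $1/\sqrt{n}$, quadrupling the sample size halves it.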

Need for Continuity Adjustment
Since the proportion is a ratio $m/n$ in which the count $m$ is an integer, $p$ can take only discrete values. To avoid bias when it is approximated by the continuous normal distribution, a correction of ½ unit is applied to $m$. This is the same continuity correction used in discrete-to-continuous approximation. Thus we shall treat
$p > m/n$ as $p > (m + \tfrac{1}{2})/n$,
$p \ge m/n$ as $p \ge (m - \tfrac{1}{2})/n$,
$p < m/n$ as $p < (m - \tfrac{1}{2})/n$, and
$p \le m/n$ as $p \le (m + \tfrac{1}{2})/n$.
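The four rules above can be captured in a small lookup. This is only a sketch; the helper name `adjusted_cutoff` and the string codes for the relations are hypothetical choices made for illustration.

```python
def adjusted_cutoff(m, n, relation):
    """Continuity-adjusted cutoff for the event 'p <relation> m/n'."""
    # '>' and '<=' move the boundary up by half a unit; '>=' and '<' move it down.
    shift = {">": 0.5, "<=": 0.5, ">=": -0.5, "<": -0.5}[relation]
    return (m + shift) / n

print(adjusted_cutoff(10, 40, "<="))  # p <= 10/40 is treated as p <= 10.5/40
print(adjusted_cutoff(84, 120, ">"))  # p > 84/120 is treated as p > 84.5/120
```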

Example 1
30% of the customers of a fast-food restaurant are senior citizens, who are given discounts. During a short period, the restaurant serves 40 customers. What is the probability that the percentage of senior citizens is not more than 25%?
Solution: $p \sim N(0.3,\ 0.3 \times (1 - 0.3)/40) = N(0.3,\ 0.00525)$. Since 25% of 40 is 10,
$P(p \le 0.25) \approx P(p \le 10.5/40)$ [Continuity adjustment]
$= P(z \le [0.2625 - 0.3]/\sqrt{0.00525})$
$= P(z \le -0.52)$
$= 0.5 - 0.1985 = 0.3015$.
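The steps of Example 1 can be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later) in place of a printed z-table; because it does not round $z$ to two decimals, its answer may differ slightly in the third decimal place from a table-based value.

```python
import math
from statistics import NormalDist

pi, n = 0.30, 40            # population proportion and sample size
m = 0.25 * n                # 25% of 40 customers = 10
cutoff = (m + 0.5) / n      # continuity adjustment for "not more than"
sd = math.sqrt(pi * (1 - pi) / n)

z = (cutoff - pi) / sd
prob = NormalDist().cdf(z)  # P(Z <= z) for the standard normal

print(f"z = {z:.2f}")
print(f"P(p <= 0.25) is approximately {prob:.4f}")
```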

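Example 2
A furniture factory claims that less than 12% of its executive chairs have defects. An office has just ordered 25 such chairs. What is the probability that the percentage of defective chairs exceeds 15%?
Solution: Taking the claimed rate as $\pi = 0.12$, $p \sim N(0.12,\ 0.12 \times (1 - 0.12)/25) = N(0.12,\ 0.004224)$. Since 15% of 25 is 3.75,
$P(p > 0.15) \approx P(p > 4.25/25)$ [Continuity adjustment]
$= P(z > [0.17 - 0.12]/\sqrt{0.004224})$
$= P(z > 0.77)$
$= 0.5 - 0.2794 = 0.2206$.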

Example 3
It is estimated that 65% of students in the Faculty of Education (FoE) are ladies. A class in FoE has 120 students. What is the probability that the proportion of ladies in the class exceeds 70%?
Solution: Let $p$ represent the proportion of ladies; then $p \sim N(0.65,\ [0.65 \times (1 - 0.65)]/120)$. Since 70% of 120 is 84, after continuity correction,
$P(p > 0.70) \approx P(p > 84.5/120)$
$= P(z > [0.7042 - 0.65]/\sqrt{0.65 \times 0.35/120})$
$= P(z > 1.24)$
$= 0.5 - 0.3925 = 0.1075$.

Alternative: Binomial distribution
The same question can be solved using the binomial distribution. Let $X$ represent the number of ladies, so that $X \sim \mathrm{Bin}(120,\ 0.65)$. As $n > 30$, $X$ is approximated by the normal distribution: $X \sim N(120 \times 0.65,\ 120 \times 0.65 \times 0.35) = N(78,\ 27.3)$. 70% of 120 is 84, so we are looking for $P(X > 84)$. By continuity adjustment, this becomes
$P(X > 84.5) = P(z > [84.5 - 78]/\sqrt{27.3}) = P(z > 1.24) = 0.1075$, as we obtained earlier.
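To see how good the normal approximation is here, one can compare it with the exact binomial tail. The sketch below is an illustrative check rather than part of the original solution; it sums the binomial probabilities directly with `math.comb` and uses `statistics.NormalDist` for the approximation (both in the Python 3.8+ standard library).

```python
import math
from statistics import NormalDist

n, pi = 120, 0.65
threshold = 84  # 70% of 120

# Exact binomial tail: P(X > 84) = P(X = 85) + ... + P(X = 120)
exact = sum(
    math.comb(n, k) * pi**k * (1 - pi) ** (n - k)
    for k in range(threshold + 1, n + 1)
)

# Normal approximation with the half-unit continuity adjustment: P(X > 84.5)
mean = n * pi
sd = math.sqrt(n * pi * (1 - pi))
approx = 1 - NormalDist(mean, sd).cdf(threshold + 0.5)

print(f"exact binomial tail : {exact:.4f}")
print(f"normal approximation: {approx:.4f}")
```

The two values should agree closely for a sample of this size, which is the justification for using the normal model in the first place.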

Example 4
18% of students withdraw half-way through a course. In a class of 45 students, what is the probability that fewer than 15% will withdraw?
Solution: $p \sim N(0.18,\ 0.18 \times (1 - 0.18)/45) = N(0.18,\ 0.00328)$. After continuity adjustment, the event $p < 0.15$ becomes $p < 0.15 - 0.5/45 = 0.1389$.
$P(p < 0.1389) = P(z < [0.1389 - 0.18]/\sqrt{0.00328})$
$= P(z < -0.72)$
$= 0.5 - 0.2642 = 0.2358$.

Binomial Alternative
Let $W$ represent the number of students who withdraw. Then $W \sim \mathrm{Bin}(45,\ 0.18)$. 15% of 45 is 6.75, so the event is $W < 6.75$. Even though this number is not an integer, we still make the same continuity adjustment, and so we look for $W < 6.75 - 0.5 = 6.25$. As $n > 30$, we use the approximation $W \sim N(45 \times 0.18,\ 45 \times 0.18 \times 0.82) = N(8.1,\ 6.642)$.
$P(W < 6.25) = P(z < [6.25 - 8.1]/\sqrt{6.642}) = P(z < -0.72) = 0.2358$, as found above.
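The agreement between the proportion route and the binomial route is not a coincidence. Writing $m'$ for the adjusted count (6.25 in this example, a symbol introduced here only for illustration), multiplying numerator and denominator by $n$ shows that both routes give the same $z$:

```latex
z = \frac{m'/n - \pi}{\sqrt{\pi(1-\pi)/n}}
  = \frac{(m' - n\pi)/n}{\sqrt{n\pi(1-\pi)}\,/\,n}
  = \frac{m' - n\pi}{\sqrt{n\pi(1-\pi)}}
```

With $m' = 6.25$, $n = 45$ and $\pi = 0.18$, the left-hand form is $(0.1389 - 0.18)/\sqrt{0.00328}$ and the right-hand form is $(6.25 - 8.1)/\sqrt{6.642}$; both are approximately $-0.72$.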

Difference between proportions
The rules for the distribution of the difference between means also apply to the difference between proportions. Thus if $\pi_1$ and $\pi_2$ are the proportions of the same property in two populations, and we take samples of sizes $n_1$ and $n_2$ from those two populations respectively, then we expect the difference $p_1 - p_2$ between the sample proportions to satisfy
$p_1 - p_2 \sim N(\pi_1 - \pi_2,\ \pi_1(1 - \pi_1)/n_1 + \pi_2(1 - \pi_2)/n_2)$.

Example 5
In the 1985 cohort, it is known that 20% of non-graduates and 14% of graduates remain unemployed 6 months after coming on to the job market. A survey tracks 80 non-graduates and 50 graduates of the cohort. Find the probability that the percentage of non-graduates who remain unemployed exceeds that of graduates by at least 10 percentage points.

Solution: Let $p_n$ represent the proportion of unemployed non-graduates and $p_g$ that of unemployed graduates.
$p_n - p_g \sim N(0.20 - 0.14,\ 0.2 \times 0.8/80 + 0.14 \times 0.86/50)$
$P(p_n - p_g > 0.1) = P(z > [0.1 - 0.06]/\sqrt{0.2 \times 0.8/80 + 0.14 \times 0.86/50})$
$= P(z > 0.60)$
$= 0.5 - 0.2257 = 0.2743$.
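The calculation for a difference of proportions follows the same pattern every time, so it can be wrapped in one function. The sketch below is a minimal illustration using `statistics.NormalDist` (Python 3.8+); the function name `prob_difference_exceeds` is a hypothetical choice, and no continuity adjustment is applied, matching the worked solution above.

```python
import math
from statistics import NormalDist

def prob_difference_exceeds(pi1, n1, pi2, n2, d):
    """P(p1 - p2 > d) under the normal model for a difference of sample proportions."""
    mean = pi1 - pi2
    sd = math.sqrt(pi1 * (1 - pi1) / n1 + pi2 * (1 - pi2) / n2)
    return 1 - NormalDist(mean, sd).cdf(d)

# Example 5: 20% of 80 non-graduates versus 14% of 50 graduates, gap of 0.10
prob = prob_difference_exceeds(0.20, 80, 0.14, 50, 0.10)
print(f"P(p_n - p_g > 0.10) is approximately {prob:.4f}")
```

The result may differ from the table-based figure in the third decimal place because $z$ is not rounded to 0.60 before the lookup.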

Example 6
The Transport Ministry believes that 35% of express buses exceed the speed limit on the highway. On a certain day, two teams track express buses going in opposite directions. The team for north-bound traffic monitors 60 buses, while the south-bound team has 75 buses on record. What is the probability that the percentage of speeding buses for north-bound traffic exceeds that for south-bound traffic by at least 4 percentage points?
Solution: Let $p_n$ represent the proportion of north-bound buses which speed, and $p_s$ the same proportion for south-bound buses.

$p_n - p_s \sim N(0.35 - 0.35,\ 0.35 \times 0.65/60 + 0.35 \times 0.65/75)$
$P(p_n - p_s > 0.04) = P(z > [0.04 - 0]/\sqrt{0.35 \times 0.65/60 + 0.35 \times 0.65/75})$
$= P(z > 0.48)$
$= 0.5 - 0.1844 = 0.3156$.
So there is a probability of 0.3156 that the north-bound speeding percentage exceeds that of south-bound by 4 percentage points or more. Note that, because the distribution of $p_n - p_s$ is symmetric about 0 in this case, there is the same probability that the proportion of south-bound speeders exceeds that of north-bound by 4 percentage points or more!

Confidence Interval for Proportion
When the proportion of a property in the population is known, we expect the proportion in a sample to follow the normal distribution. Hence we may apply the same procedure as for the mean to estimate the $(1 - \alpha)100\%$ confidence interval. We shall use two examples to illustrate the method.

Example 7
The Tourism Department's report says that 32% of tourists are foreigners. A group of 150 tourists is visiting the Royal Museum. What is the 95% confidence interval for the percentage of foreign tourists in the group?
Solution: $p \sim N(0.32,\ 0.32 \times 0.68/150)$, i.e. $p \sim N(0.32,\ 0.001451)$. At 95% confidence, $\alpha = 0.05$, $\alpha/2 = 0.025$ and $z_{0.025} = 1.96$. Hence the 95% confidence interval for the proportion of foreign tourists is
$0.32 - 1.96 \times \sqrt{0.001451} \le p \le 0.32 + 1.96 \times \sqrt{0.001451}$
$\Rightarrow 0.2453 \le p \le 0.3947$,
i.e. 24.53% to 39.47% of the tourists are foreigners.

Example 8
The records of a bank show that 17% of its customers are business customers, but that this group accounts for 75% of transactions. During a certain hour, there were 50 customers and 400 transactions. Find the 90% confidence interval for the percentage of
(i) business customers;
(ii) business transactions.

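Solution: Let $p_1$ = proportion of business customers and $p_2$ = proportion of business transactions. Then $p_1 \sim N(0.17,\ 0.17 \times 0.83/50) = N(0.17,\ 0.002822)$ and $p_2 \sim N(0.75,\ 0.75 \times 0.25/400) = N(0.75,\ 0.0004688)$. At 90% confidence, $\alpha = 0.1$, $\alpha/2 = 0.05$ and $z_{0.05} = 1.645$. The confidence intervals are:
$0.17 - 1.645 \times \sqrt{0.002822} \le p_1 \le 0.17 + 1.645 \times \sqrt{0.002822} \Rightarrow 0.0826 \le p_1 \le 0.2574$; and
$0.75 - 1.645 \times \sqrt{0.0004688} \le p_2 \le 0.75 + 1.645 \times \sqrt{0.0004688} \Rightarrow 0.7144 \le p_2 \le 0.7856$.
Hence the range is 8.26% to 25.74% for business customers, and 71.44% to 78.56% for business transactions.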
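The interval procedure used in Examples 7 and 8 can be sketched in a few lines. Instead of reading the critical value from a table, the sketch obtains it from `NormalDist().inv_cdf` in the Python standard library (Python 3.8+); the function name `proportion_interval` is a hypothetical choice for this illustration.

```python
import math
from statistics import NormalDist

def proportion_interval(pi, n, confidence):
    """Two-sided interval for a sample proportion when the population proportion pi is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # critical value, e.g. about 1.645 at 90%
    half_width = z * math.sqrt(pi * (1 - pi) / n)
    return pi - half_width, pi + half_width

# Example 8: 50 customers and 400 transactions, 90% confidence
customers = proportion_interval(0.17, 50, 0.90)
transactions = proportion_interval(0.75, 400, 0.90)
print(f"business customers   : {customers[0]:.4f} to {customers[1]:.4f}")
print(f"business transactions: {transactions[0]:.4f} to {transactions[1]:.4f}")
```

Note how the larger sample of 400 transactions gives a much narrower interval than the sample of 50 customers.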

Confidence Interval From Sample
When the proportion is derived from sample data, we expect that the same normal distribution can be used to model the population proportion, using the sample proportion as the estimator. The result will be good only if the sample size is reasonably large. For small samples, the proportion obtained is not a reliable guide to the population proportion.

Example 9
In a survey on the cleanliness of eating stalls, it was found that only 55 out of 140 stalls checked follow proper procedures to maintain a hygienic environment. Based on this, estimate the 95% confidence interval for the percentage of clean eating stalls nationwide.

Solution: Even though only the sample data are available, we can reasonably assume that the proportion from such a large sample is a good estimator for the nationwide proportion. Hence we shall use the normal distribution to estimate the proportion for the nation:
$p \sim N(55/140,\ [55/140 \times 85/140]/140)$.
At 95%, $\alpha = 0.05$, $\alpha/2 = 0.025$ and $z_{0.025} = 1.96$. So the 95% interval for the population proportion of clean eateries is
$55/140 - 1.96\sqrt{[55/140 \times 85/140]/140}$ to $55/140 + 1.96\sqrt{[55/140 \times 85/140]/140}$
$\Rightarrow 0.3120 \le p \le 0.4738$, or 31.20% to 47.38%.
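When only counts are available, the same interval can be computed directly from them. The sketch below is illustrative: the function name `sample_proportion_interval` is hypothetical, and the "at least 5 in each category" check is a common rule of thumb for "reasonably large" that is assumed here, not something stated in the slides.

```python
import math
from statistics import NormalDist

def sample_proportion_interval(successes, n, confidence=0.95):
    """Interval for the population proportion, using the sample proportion as estimator."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    # Assumed rule of thumb (not from the slides): both counts should be at least 5.
    large_enough = successes >= 5 and (n - successes) >= 5
    return p_hat - half_width, p_hat + half_width, large_enough

# Example 9: 55 clean stalls out of 140 checked
low, high, ok = sample_proportion_interval(55, 140, 0.95)
print(f"95% interval: {low:.4f} to {high:.4f} (sample large enough: {ok})")
```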

Example 10
During a screening process, it was found that 20 out of 80 boys in a certain age group and 30 out of 100 girls of the same age group are overweight. Based on this study, find the probability that the proportion of overweight girls exceeds that of boys by 2 percentage points or more.
NOTE: In this case, we have only the sample proportions. However, as the sample sizes are large enough, we can use these data to project the likely distribution of the difference of proportions.

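Solution: Note that 20/80 = 0.25 and 30/100 = 0.3.
$p_b \sim N(0.25,\ 0.25 \times 0.75/80)$; $p_g \sim N(0.3,\ 0.3 \times 0.7/100)$;
$p_g - p_b \sim N(0.3 - 0.25,\ 0.25 \times 0.75/80 + 0.3 \times 0.7/100)$
$P(p_g - p_b > 0.02) = P(z > [0.02 - 0.05]/\sqrt{0.25 \times 0.75/80 + 0.3 \times 0.7/100})$
$= P(z > -0.45)$
$= 0.5 + 0.1736 = 0.6736$.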

Difference of proportions
Using the distribution of the difference between proportions, we can find probabilities for the difference between sample proportions (Exs 11 & 12). When the sample sizes are large, we can also use the sample proportions to estimate the interval for the difference between population proportions; the same procedure as before is used to determine the confidence interval for the difference in proportions (Ex 13).

Example 11
On average, 37% of men and 18% of women in the country smoke. A survey is taken of 50 men and 60 women. What is the probability that the proportion of men who smoke exceeds that of women by at least 20 percentage points?
Solution: Let $p_m$ and $p_w$ represent the proportions of men and women in the survey who smoke. Then $p_m \sim N(0.37,\ 0.37 \times 0.63/50)$; $p_w \sim N(0.18,\ 0.18 \times 0.82/60)$.

Example 11 (Solution)
This means that $p_m - p_w \sim N(0.37 - 0.18,\ 0.37 \times 0.63/50 + 0.18 \times 0.82/60)$. So
$P(p_m \ge p_w + 0.20) = P(p_m - p_w \ge 0.20)$
$= P(z \ge [0.20 - 0.19]/\sqrt{0.37 \times 0.63/50 + 0.18 \times 0.82/60})$
$= P(z \ge 0.12)$
$= 0.5 - 0.0478 = 0.4522$.

Example 12
65% of those achieving good results in the STPM exam and 55% of those achieving good results in the Matriculation exam are admitted to the universities of their choice. A check is made on 72 students successful at STPM and 45 successful at Matriculation. What is the probability that the university admission success rate for those from Matriculation is at least as good as that for those from STPM?
Solution: Let $p_s$ be the proportion of STPM candidates who are successful and $p_m$ that of Matriculation candidates.

Example 12 (Solution)
Then we have $p_s \sim N(0.65,\ 0.65 \times 0.35/72)$ and $p_m \sim N(0.55,\ 0.55 \times 0.45/45)$, and so $p_m - p_s \sim N(0.55 - 0.65,\ 0.65 \times 0.35/72 + 0.55 \times 0.45/45)$. Hence
$P(p_m \ge p_s) = P(p_m - p_s \ge 0.0)$
$= P(z \ge [0.0 - (-0.10)]/\sqrt{0.65 \times 0.35/72 + 0.55 \times 0.45/45})$
$= P(z \ge 1.07)$
$= 0.5 - 0.3577 = 0.1423$.

Example 13
Out of 75 sticks of LajuMaut cigarettes, 20 are found to have nicotine exceeding danger levels. Out of 60 sticks of CepatMaut cigarettes, 15 are also found to have nicotine exceeding danger levels. What is the 90% confidence interval of $p_L - p_C$, where $p_L$ and $p_C$ represent the proportions of LajuMaut and CepatMaut cigarettes with excessive levels of nicotine?

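From the data given, $p_L = 20/75 = 0.2667$ and $p_C = 15/60 = 0.25$. By theory, $p_L - p_C \sim N(0.2667 - 0.25,\ 0.2667 \times 0.7333/75 + 0.25 \times 0.75/60)$. At 90% confidence, $\alpha = 0.1$, $\alpha/2 = 0.05$ and $z_{0.05} = 1.645$. Hence the confidence interval for the difference in proportions is from
$0.0167 - 1.645 \times \sqrt{0.2667 \times 0.7333/75 + 0.25 \times 0.75/60}$ to $0.0167 + 1.645 \times \sqrt{0.2667 \times 0.7333/75 + 0.25 \times 0.75/60}$,
i.e. approximately $-0.108$ to $0.141$.
NOTE: The negative left boundary indicates that $p_L$ may actually be less than $p_C$.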
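The interval for a difference of proportions can be checked with a short sketch that works from the raw counts. The function name `difference_interval` is a hypothetical choice, and the critical value comes from `statistics.NormalDist` (Python 3.8+) rather than a table, so the last decimal place may differ slightly from a hand calculation.

```python
import math
from statistics import NormalDist

def difference_interval(x1, n1, x2, n2, confidence):
    """Interval for the difference of two proportions, estimated from sample counts."""
    p1, p2 = x1 / n1, x2 / n2
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    sd = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z * sd, diff + z * sd

# Example 13: 20 of 75 LajuMaut sticks versus 15 of 60 CepatMaut sticks, 90% confidence
low, high = difference_interval(20, 75, 15, 60, 0.90)
print(f"90% interval for p_L - p_C: {low:.4f} to {high:.4f}")
print("interval contains 0:", low < 0 < high)
```

Since the interval contains 0, the data do not rule out the two brands having the same proportion.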

Multiple Groups
When we want to compare the proportions of multiple (3 or more) groups in a population, the method using the normal distribution becomes ineffective. An alternative is to take the differences between the expected and the observed values and treat them as variations. The sum of squares of these differences can be modeled using the $\chi^2$ distribution. However, as $\chi^2$ distribution tables list critical values rather than probabilities, we shall look at these cases only in hypothesis testing (see 4C).
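As a rough preview only, the quantity described above can be computed as follows. The sketch assumes the usual Pearson form, in which each squared difference is divided by its expected count (the slide mentions only the sum of squares), and the counts are made up purely for illustration; the function name `chi_square_statistic` is hypothetical.

```python
def chi_square_statistic(observed, expected_proportions):
    """Sum of (observed - expected)^2 / expected over all groups (assumed Pearson form)."""
    total = sum(observed)
    statistic = 0.0
    for count, proportion in zip(observed, expected_proportions):
        expected = total * proportion
        statistic += (count - expected) ** 2 / expected
    return statistic

# Hypothetical data: 100 items in three groups, expected split 25% / 50% / 25%.
observed = [30, 50, 20]
print(chi_square_statistic(observed, [0.25, 0.50, 0.25]))
```

How this statistic is compared against a critical value belongs to the hypothesis-testing material in 4C.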