1 Math 4030 – 10b Inferences Concerning Proportions
2 Population proportion $p$: $100p\%$ of the subjects in the population have the property of interest; equivalently, if we randomly select one subject from the population, the probability that the subject has the property is $p$. If we take a sample of size $n$, of which $X$ subjects have the property of interest, then the sample proportion is $\hat{p} = X/n$.
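3 Distribution of the sample proportion $X/n$: for $n \ge 30$, $X/n$ is approximately normal with mean $p$ and variance $p(1-p)/n$.
Confidence interval for $p$ (Sec. 10.1): $\frac{x}{n} \pm z_{\alpha/2}\sqrt{\frac{(x/n)(1 - x/n)}{n}}$
Maximum error of estimate for $p$: $E = z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}}$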
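As a rough illustration of the large-sample interval on this slide, the sketch below computes $x/n \pm z_{\alpha/2}\sqrt{(x/n)(1-x/n)/n}$. The function name `proportion_ci` and the counts (36 out of 100) are hypothetical and chosen only for the demonstration; they do not come from the course material.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(x, n, confidence=0.95):
    """Large-sample confidence interval for a population proportion."""
    p_hat = x / n                                        # sample proportion X/n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # z_{alpha/2}
    margin = z * sqrt(p_hat * (1 - p_hat) / n)           # maximum error of estimate E
    return p_hat - margin, p_hat + margin


# Hypothetical data: 36 of 100 sampled subjects have the property.
low, high = proportion_ci(36, 100)
print(f"95% CI for p: ({low:.4f}, {high:.4f})")
```

The normal approximation is only reasonable for large $n$ (the slide uses $n \ge 30$), so this sketch should not be used for small samples.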
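4 Sample size calculation: $n = p(1-p)\left(\frac{z_{\alpha/2}}{E}\right)^2$. Since $p$ is unknown, either use $p$ from a similar population, or use $\frac{1}{4}$, the maximum of $p(1-p)$, which gives $n = \frac{1}{4}\left(\frac{z_{\alpha/2}}{E}\right)^2$. If $\alpha = 0.05$, then $z_{\alpha/2} = 1.96 \approx 2$ and we may use $n = 1/E^2$.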
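The sample size rule can be sketched as follows. The function name `sample_size` and the target error $E = 0.05$ are assumptions made for the example; when no prior value of $p$ is supplied, the code falls back to the conservative bound $p(1-p) \le 1/4$.

```python
from math import ceil
from statistics import NormalDist


def sample_size(max_error, confidence=0.95, p=None):
    """Smallest n so that the maximum error of estimate is at most max_error."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # z_{alpha/2}
    pq = 0.25 if p is None else p * (1 - p)              # 1/4 is the maximum of p(1-p)
    return ceil(pq * (z / max_error) ** 2)


E = 0.05  # hypothetical target maximum error
print("conservative n:", sample_size(E))                 # uses p(1-p) = 1/4
print("n with prior p = 0.2:", sample_size(E, p=0.2))    # p borrowed from a similar population
print("rule of thumb 1/E^2:", round(1 / E ** 2))         # z_{alpha/2} rounded to 2, alpha = 0.05
```

The rule of thumb $1/E^2$ is slightly larger than the conservative value because it rounds 1.96 up to 2.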
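5 Hypothesis testing for one proportion (Sec. 10.2): for $H_0: p = p_0$ and large $n$, the test statistic is $Z = \frac{X - np_0}{\sqrt{np_0(1-p_0)}}$, which is approximately standard normal under $H_0$.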
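A minimal sketch of a large-sample test for a single proportion, assuming the usual statistic $Z = (X - np_0)/\sqrt{np_0(1-p_0)}$. The function name `one_proportion_z` and the numbers (48 successes in 100 trials, $p_0 = 0.6$) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z(x, n, p0):
    """Return the z statistic and its lower- and upper-tail p-values for H0: p = p0."""
    z = (x - n * p0) / sqrt(n * p0 * (1 - p0))
    lower_tail = NormalDist().cdf(z)     # p-value for H1: p < p0
    return z, lower_tail, 1 - lower_tail  # last entry: p-value for H1: p > p0


# Hypothetical data: 48 successes in 100 trials, testing H0: p = 0.6.
z, p_lower, p_upper = one_proportion_z(48, 100, 0.6)
print(f"z = {z:.3f}, lower-tail p-value = {p_lower:.4f}")
```

For a two-sided alternative, one would double the smaller of the two tail probabilities.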
6 A new method is under development for making disks of a superconducting material. 50 disks are made by each method (new and old), and they are checked for superconductivity when cooled with liquid nitrogen. Compare 2 proportions:

                  Old (Method 1)   New (Method 2)   Total
Superconductors   31               42               73
Failures          19               8                27
Total             50               50               100

We want to show that the new method is an improvement.
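7 $p_1 = p_2$ or $p_1 - p_2 = 0$, where $p_1$ and $p_2$ are the population proportions of superconductors for the old and new methods.
Sample proportions: $\hat{p}_1 = X_1/n_1$, $\hat{p}_2 = X_2/n_2$
Distribution under the assumption $p_1 = p_2 = p$:
Distribution of Sample Proportion Difference: $\hat{p}_1 - \hat{p}_2$ is approximately normal with mean $0$ and variance $p(1-p)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)$, where $p$ is estimated by the pooled proportion $\hat{p} = \frac{X_1 + X_2}{n_1 + n_2}$.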
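8 Hypothesis Testing (with $p_1$, $p_2$ the proportions of superconductors under the old and new methods):
Null hypothesis: $H_0: p_1 = p_2$
Alternative hypothesis: $H_1: p_1 < p_2$
Level of significance: $\alpha$
Critical value and critical region: for large samples, we use the z-test and reject $H_0$ if $Z < -z_{\alpha}$
Sample statistic calculation: $Z = \frac{X_1/n_1 - X_2/n_2}{\sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$ with $\hat{p} = \frac{X_1 + X_2}{n_1 + n_2}$; for the disk data, $Z = \frac{0.62 - 0.84}{\sqrt{(0.73)(0.27)(0.04)}} \approx -2.48$
Conclusion: Reject the null hypothesis, …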
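The z-test for the two methods can be checked numerically. The sketch below uses the counts from the disk example (50 disks per method, 19 failures for the old method and 8 for the new one) and takes $p$ to be the proportion of superconductors. The function name `two_proportion_z` and the level $\alpha = 0.05$ are assumptions made here, since the slide does not state the level used.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2):
    """Return (z, lower-tail p-value) for H0: p1 = p2 against H1: p1 < p2."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)                        # pooled proportion under H0
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))  # standard error under H0
    z = (p1_hat - p2_hat) / se
    return z, NormalDist().cdf(z)


n_old = n_new = 50
x_old = n_old - 19   # superconductors, old method
x_new = n_new - 8    # superconductors, new method

z, p_value = two_proportion_z(x_old, n_old, x_new, n_new)
alpha = 0.05         # assumed level of significance (not given on the slide)
print(f"z = {z:.3f}, one-sided p-value = {p_value:.4f}")
print("reject H0" if p_value < alpha else "do not reject H0")
```

Reporting the p-value alongside $z$ lets the reader apply whichever level of significance the course actually used.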
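9 Confidence interval for the difference $p_1 - p_2$: $\frac{x_1}{n_1} - \frac{x_2}{n_2} \pm z_{\alpha/2}\sqrt{\frac{(x_1/n_1)(1 - x_1/n_1)}{n_1} + \frac{(x_2/n_2)(1 - x_2/n_2)}{n_2}}$. In words: the two proportions differ by more than the smaller limit (in absolute value) and by up to the larger limit.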
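A sketch of the interval for the difference between the two methods, using the disk counts from the example (31 of 50 superconductors for the old method, 42 of 50 for the new one). The 95% confidence level and the function name `difference_ci` are assumptions; the slide's own level is not recoverable. The difference is taken as new minus old so that a positive interval reads as an improvement.

```python
from math import sqrt
from statistics import NormalDist


def difference_ci(x_a, n_a, x_b, n_b, confidence=0.95):
    """Large-sample confidence interval for p_a - p_b (unpooled standard error)."""
    pa, pb = x_a / n_a, x_b / n_b
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # z_{alpha/2}
    se = sqrt(pa * (1 - pa) / n_a + pb * (1 - pb) / n_b)
    diff = pa - pb
    return diff - z * se, diff + z * se


# New method minus old method, assumed 95% level.
low, high = difference_ci(42, 50, 31, 50)
print(f"improvement is more than {low:.3f} and up to {high:.3f}")
```

Unlike the test statistic, the interval does not pool the two samples, because it does not assume $p_1 = p_2$.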
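10 Compare Several Proportions (Sec. 10.3):

            Sample 1       Sample 2       …   Sample k       Total
Successes   $x_1$          $x_2$          …   $x_k$          $x$
Failures    $n_1 - x_1$    $n_2 - x_2$    …   $n_k - x_k$    $n - x$
Total       $n_1$          $n_2$          …   $n_k$          $n$

From $k$ independent samples from $k$ populations, we have
11 $Z_j = \frac{X_j - n_j p_j}{\sqrt{n_j p_j (1 - p_j)}} \approx N(0, 1)$ for each $j$, for large samples (the normal distribution approximates the binomial), where $p_1, \dots, p_k$ are the $k$ population proportions. Combined: $\chi^2 = \sum_{j=1}^{k} Z_j^2$; when the common proportion is estimated by the pooled sample proportion, this statistic has approximately a chi-square distribution with df = $k - 1$.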
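12 Observed frequency: $o_{ij}$, the count in row $i$ (1 = successes, 2 = failures) and column $j$ (sample $j$) of the table.
Expected frequency: $e_{ij} = \frac{(\text{row } i \text{ total}) \times (\text{column } j \text{ total})}{\text{grand total}}$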
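13 Hypothesis Testing:
Null hypothesis: $H_0: p_1 = p_2 = \dots = p_k$
Alternative hypothesis: $H_1$: $p_1, p_2, \dots, p_k$ are not all equal
Sample statistic: $\chi^2 = \sum_{i=1}^{2}\sum_{j=1}^{k} \frac{(o_{ij} - e_{ij})^2}{e_{ij}}$ with df = $k - 1$, where
$\hat{p} = \frac{x_1 + x_2 + \dots + x_k}{n_1 + n_2 + \dots + n_k}$ (Pooled proportion)
$e_{1j} = n_j \hat{p}$, $e_{2j} = n_j (1 - \hat{p})$ (Expected Cell Frequency)
$o_{1j} = x_j$, $o_{2j} = n_j - x_j$ (Observed Cell Frequency)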
14 Example. Four methods are under development for making disks of a superconducting material. 30, 40, 60, and 70 disks are made by the 4 methods, respectively, and they are checked for superconductivity when cooled with liquid nitrogen.

                  Method 1   Method 2   Method 3   Method 4   Total
Superconductors   21         32         32         45         130
Failures          9          8          28         25         70
Total             30         40         60         70         200
15 First we need to know whether the 4 methods differ at all.
Null hypothesis: $H_0: p_1 = p_2 = p_3 = p_4$
Alternative hypothesis: $H_1$: $p_1, p_2, p_3, p_4$ are not all equal.
Level of significance: $\alpha = 0.05$
Critical region: With df = 4 – 1 = 3, we have $\chi^2_{0.05} = 7.815$. The critical region is $(7.815, \infty)$.
Statistic from sample: We need to calculate the expected frequencies.
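16
                  Method 1    Method 2   Method 3   Method 4    Total
Superconductors   21 (19.5)   32 (26)    32 (39)    45 (45.5)   130
Failures          9 (10.5)    8 (14)     28 (21)    25 (24.5)   70
Total             30          40         60         70          200

Expected frequencies (in parentheses): $e_{ij} = \frac{(\text{row total}) \times (\text{column total})}{200}$, for example $e_{11} = \frac{130 \times 30}{200} = 19.5$.
$\chi^2 = \sum \frac{(o_{ij} - e_{ij})^2}{e_{ij}} \approx 7.89$
Conclusion: Since the sample statistic falls in the critical region ($7.89 > 7.815$), we reject the null hypothesis. The four methods are not all the same.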
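The expected frequencies and the chi-square statistic for this table can be verified with a few lines of code. The sketch below uses the observed counts from the table and takes the critical value 7.815 (df = 3, $\alpha = 0.05$) from the slides rather than computing it; the variable names are illustrative only.

```python
# Observed counts from the four-method example.
successes = [21, 32, 32, 45]   # superconductors per method
failures = [9, 8, 28, 25]      # failures per method
observed = [successes, failures]

col_totals = [s + f for s, f in zip(successes, failures)]   # disks per method
row_totals = [sum(row) for row in observed]                 # 130 and 70
grand_total = sum(row_totals)                               # 200

# Expected frequency = row total * column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Chi-square statistic: sum of (o - e)^2 / e over all 8 cells.
chi2 = sum(
    (o - e) ** 2 / e
    for o_row, e_row in zip(observed, expected)
    for o, e in zip(o_row, e_row)
)

critical_value = 7.815   # chi-square table value, df = 3, alpha = 0.05
print("expected:", expected)
print(f"chi-square = {chi2:.3f}")
print("reject H0" if chi2 > critical_value else "do not reject H0")
```

The statistic exceeds the critical value only narrowly, so the conclusion is sensitive to the chosen level of significance.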
17 How do these methods differ? Compute a confidence interval for each of the 4 population (method) proportions. Using Excel, we find
18 Method 1 Method 2 Method 3 Method p
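The per-method intervals can also be computed outside Excel. The sketch below is one possible version, using the counts from the four-method example and the large-sample interval from Sec. 10.1. The 95% confidence level is an assumption, since the level used on the slide is not shown, so the output need not match the Excel figures.

```python
from math import sqrt
from statistics import NormalDist

# Superconductors and total disks per method, from the example table.
successes = {"Method 1": 21, "Method 2": 32, "Method 3": 32, "Method 4": 45}
totals = {"Method 1": 30, "Method 2": 40, "Method 3": 60, "Method 4": 70}

confidence = 0.95                                      # assumed level
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)     # z_{alpha/2}

intervals = {}
for method, x in successes.items():
    n = totals[method]
    p_hat = x / n                                      # sample proportion
    margin = z * sqrt(p_hat * (1 - p_hat) / n)         # maximum error of estimate
    intervals[method] = (p_hat, p_hat - margin, p_hat + margin)

for method, (p_hat, low, high) in intervals.items():
    print(f"{method}: p_hat = {p_hat:.3f}, CI = ({low:.3f}, {high:.3f})")
```

Comparing which intervals overlap gives an informal picture of where the methods differ.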