Engineering Statistics Chapter 4 Hypothesis Testing 4C Testing on Proportions & Difference Between Proportions
Testing proportions
In Ch 3C, we looked at how sample proportions are distributed. For a population proportion $\pi$ and a sample of size $n$, we use the normal distribution and set $p \sim N(\pi,\ \pi(1-\pi)/n)$. When testing the proportion in a sample, we want to find out whether the proportion of a property is usual or abnormal. Another reason to test a proportion is to see whether it has changed over a certain period. If so, some action may need to be taken.
Example 1
A hotel usually provides two towels in a room, even when only one guest checks in. To maintain hygiene, both towels have to be washed after the guest checks out, even if it appears that only one towel was used. This is wasteful. The management may therefore want to find out what proportion of its single-occupant guests use two towels. If the proportion is low, it may be better to put only one towel in each room, with a note telling the guest that he/she may ask for a second. This would save on unnecessary washing.
How to carry out the test
Let us assume that the management is willing to risk offending some guests if the proportion using two towels is less than 1/6. The manager puts a card in each room where a single guest checks in, asking the guest to state whether he/she agrees to have only one towel at check-in. An explanation is given for the move. After two weeks, the hotel gets 97 responses. If all the guests agree, then of course the hotel can go ahead, confident that it is not going to lose any potential customers. But what if 8 guests disagree? The sample proportion, 8/97 ≈ 0.082, is lower than 1/6 ≈ 0.167. Dare the manager, on the strength of such a sample, introduce the practice?
Analysis
As usual, we need to set a confidence level. Let us put it at 95%. Also, let us use p to represent the proportion who disagree. Then $p \sim N(0.1667,\ 0.1667 \times 0.8333/97)$. The steps are:
Test Procedure
Null hypothesis: $p = 0.1667$; Alternative hypothesis: $p < 0.1667$.
Test statistic: $z = (p - 0.1667)/\sqrt{0.1667 \times 0.8333/97}$.
This is a one-tail test. At 95% confidence, $\alpha = 0.05$ and $z_{0.05} = 1.6449$. The test being on the left tail, we set the critical value as $-1.6449$. We shall accept the null hypothesis if $z_{\text{calculated}} \ge -1.6449$, and reject it otherwise.
$z_{\text{calculated}} = (8/97 - 0.1667)/\sqrt{0.1667 \times 0.8333/97} \approx -2.23 < z_{\text{critical}}$.
So we reject the null hypothesis. This means the proportion who disagree is less than 1/6, and the hotel should probably introduce the new rule.
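The left-tail procedure for Example 1 can be sketched in a few lines of Python. This is only an illustration of the steps on the slide, not part of the original notes; the function name `left_tail_proportion_test` is made up for this sketch, and the standard normal quantile comes from `statistics.NormalDist` in the Python standard library.

```python
from math import sqrt
from statistics import NormalDist


def left_tail_proportion_test(count, n, pi0, confidence=0.95):
    """One-sample z-test of H0: p = pi0 against H1: p < pi0.

    Hypothetical helper for illustration; it follows the slide's steps.
    """
    p_hat = count / n                          # sample proportion
    std_error = sqrt(pi0 * (1 - pi0) / n)      # standard deviation under H0
    z_calc = (p_hat - pi0) / std_error         # test statistic
    alpha = 1 - confidence
    z_crit = NormalDist().inv_cdf(alpha)       # left-tail critical value (negative)
    return z_calc, z_crit, z_calc < z_crit     # True in last slot means reject H0


# Example 1: 8 of 97 guests disagree, hypothesised proportion 0.1667
z_calc, z_crit, reject = left_tail_proportion_test(8, 97, 0.1667)
print(f"z calculated = {z_calc:.2f}, z critical = {z_crit:.4f}, reject H0: {reject}")
```

The function returns the decision alongside both z values so that the comparison made on the slide stays visible.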
Example 2
The proportion of doctors who quit to go into private practice after compulsory service or a scholarship bond is 72%. The Ministry of Health (MoH) introduces a new scheme to persuade such doctors to stay in the government hospitals. After this, from a group of 120 doctors, 80 leave government service when they can. At 90% confidence, test the hypothesis that the percentage has gone down.
Example 2 (Analysis)
80/120 = 66.67%. So it appears that the proportion has gone down below 72%. However, is this small reduction significant enough for the MoH to claim success of the scheme? This is where the test comes in.
Solution: Let q represent the proportion of doctors quitting government service. Then $q \sim N(0.72,\ 0.72 \times 0.28/120)$. We are testing whether q has gone down, so this is a one-tail test on the left.
Example 2 (Solution)
Null hypothesis: $q = 0.72$; Alternative hypothesis: $q < 0.72$.
Test statistic: $z = (q - 0.72)/\sqrt{0.72 \times 0.28/120}$.
This is a one-tail test. At 90% confidence, $\alpha = 0.1$ and $z_{0.1} = 1.2816$. The test being on the left tail, we set the critical value as $-1.2816$. We shall accept the null hypothesis if $z_{\text{calculated}} \ge -1.2816$, and reject it otherwise.
$z_{\text{calculated}} = (80/120 - 0.72)/\sqrt{0.72 \times 0.28/120} \approx -1.301 < z_{\text{critical}}$.
So we reject the null hypothesis. This is sufficient to support the MoH's claim that the scheme has reduced the proportion of doctors leaving, although the margin is narrow.
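The critical values quoted in these examples come from the standard normal distribution. As a small sketch, assuming Python's standard-library `statistics.NormalDist` is used in place of a printed table, they can be looked up as follows; the helper name `upper_critical` is chosen here for illustration only.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1


def upper_critical(alpha):
    """Return the z value that leaves an area of alpha in the upper tail."""
    return std_normal.inv_cdf(1 - alpha)


for alpha in (0.10, 0.05):
    z = upper_critical(alpha)
    # Right-tail tests use +z, left-tail tests use -z,
    # and a two-tail test at level 2*alpha uses both.
    print(f"alpha = {alpha:.2f}: z = {z:.4f}")
```

For a left-tail test the critical value is simply the negative of the value returned.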
Example 3
An oil palm plantation has two types of plants, AA and BB, in the ratio 2:3. The plants are grown from seeds, and there is no way to tell from a seed which type will grow. As AA produces more fruit, steps are taken to increase the proportion of AA. A technician suggested using centrifuges to separate the seeds. This was adopted, and it was found that out of 50 plants, 26 turned out to be AA. At 95% confidence, test the hypothesis that the procedure has increased the proportion of AA.
Solution: Let a represent the proportion of AA plants; then $a \sim N(0.4,\ 0.4 \times 0.6/50)$.
Example 3 (Solution)
Null hypothesis: $a = 0.4$; Alternative hypothesis: $a > 0.4$.
Test statistic: $z = (a - 0.4)/\sqrt{0.4 \times 0.6/50}$.
This is a one-tail test on the right. At 95% confidence, $\alpha = 0.05$ and $z_{0.05} = 1.6449$. We shall accept the null hypothesis if $z_{\text{calculated}} \le 1.6449$, and reject it otherwise.
$z_{\text{calculated}} = (0.52 - 0.4)/\sqrt{0.4 \times 0.6/50} \approx 1.732 > z_{\text{critical}}$.
Based on this, we reject the null hypothesis and agree that the procedure has improved the proportion of AA.
Example 4
Climate experts believe that global warming has caused changes in rainfall patterns. One such change is in the frequency of rainy days. The record for a region shows that during the inter-monsoon season, the proportion of rainy days is 0.22. During a study conducted over 122 days of inter-monsoon, 30 are rainy days. At 90% confidence, does this show that the proportion of rainy days has changed?
Solution: Let r represent the proportion of rainy days; then $r \sim N(0.22,\ 0.22 \times 0.78/122)$. The way the question is asked, we only want to know whether the proportion has changed, which means we shall carry out a two-tail test.
Example 4 (Solution)
Null hypothesis: $r = 0.22$; Alternative hypothesis: $r \ne 0.22$.
Test statistic: $z = (r - 0.22)/\sqrt{0.22 \times 0.78/122}$.
This is a two-tail test. At 90% confidence, $\alpha = 0.1$, $\alpha/2 = 0.05$ and $z_{0.05} = 1.6449$. We shall accept the null hypothesis if $-1.6449 \le z_{\text{calculated}} \le 1.6449$, and reject it otherwise.
$z_{\text{calculated}} = (30/122 - 0.22)/\sqrt{0.22 \times 0.78/122} \approx 0.691$.
The obtained z falls within the critical range, and so we accept the null hypothesis. This indicates that the change is not significant: the sample does not show that the proportion of rainy days has changed.
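A two-tail test differs from the one-tail tests only in how the critical region is set: $\alpha$ is split between the two tails. The following Python sketch, with an illustrative function name `two_tail_proportion_test` that does not come from the notes, applies this to the rainy-day figures of Example 4.

```python
from math import sqrt
from statistics import NormalDist


def two_tail_proportion_test(count, n, pi0, confidence=0.90):
    """One-sample z-test of H0: p = pi0 against H1: p != pi0.

    Hypothetical helper for illustration only.
    """
    p_hat = count / n
    z_calc = (p_hat - pi0) / sqrt(pi0 * (1 - pi0) / n)
    alpha = 1 - confidence
    # Half of alpha goes into each tail, so look up the 1 - alpha/2 quantile.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    reject = abs(z_calc) > z_crit
    return z_calc, z_crit, reject


# Example 4: 30 rainy days out of 122, recorded proportion 0.22
z_calc, z_crit, reject = two_tail_proportion_test(30, 122, 0.22)
print(f"z calculated = {z_calc:.3f}, acceptance range = +/-{z_crit:.4f}, reject H0: {reject}")
```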
Comparing two Proportions
When the proportions of two different samples of sizes $n_1$ and $n_2$ from the same population with known proportion $\pi$ are compared, we model the difference using
$p_1 - p_2 \sim N\left(0,\ \pi(1-\pi)/n_1 + \pi(1-\pi)/n_2\right)$
When comparing samples of sizes $n_1$ and $n_2$ from two populations with known proportions $\pi_1$ and $\pi_2$, the model is
$p_1 - p_2 \sim N\left(\pi_1 - \pi_2,\ \pi_1(1-\pi_1)/n_1 + \pi_2(1-\pi_2)/n_2\right)$
Comparing two Proportions
However, if the proportions in the populations are not known, then we test whether the proportions are different using
$p_1 - p_2 \sim N\left(0,\ p_1(1-p_1)/n_1 + p_2(1-p_2)/n_2\right)$
to model the difference. In all cases, the test procedure is the same as for testing the proportion of a single sample. As can be expected, the test is more reliable if the sample sizes are large. For small samples, we have to be more cautious in interpreting the results.
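The three models differ only in which proportions enter the variance. As a rough sketch (the function names are invented for this illustration, and the numbers in the demonstration are hypothetical, not taken from the notes), the standard deviation of the difference in each case could be computed like this:

```python
from math import sqrt


def se_same_population(pi, n1, n2):
    """Both samples come from one population with known proportion pi."""
    return sqrt(pi * (1 - pi) / n1 + pi * (1 - pi) / n2)


def se_two_populations(pi1, pi2, n1, n2):
    """Samples come from two populations with known proportions pi1 and pi2."""
    return sqrt(pi1 * (1 - pi1) / n1 + pi2 * (1 - pi2) / n2)


def se_from_samples(p1, p2, n1, n2):
    """Population proportions unknown: use the sample proportions p1 and p2."""
    return sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)


def z_for_difference(p1, p2, expected_diff, std_error):
    """Test statistic for the difference p1 - p2 under the chosen model."""
    return (p1 - p2 - expected_diff) / std_error


# Hypothetical figures, purely to show the calls: two samples of 100,
# sample proportions 0.55 and 0.45, known common proportion 0.5.
se = se_same_population(0.5, 100, 100)
z = z_for_difference(0.55, 0.45, 0.0, se)
print(f"standard error = {se:.4f}, z = {z:.3f}")
```

Once z is obtained, it is compared with the critical value exactly as in the single-sample tests.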
Example 5
The proportion of families having below-poverty incomes at national level is 0.13. A survey on poverty is made in two villages, X and Y. Of 60 families in X, 8 are identified as having below-poverty income; of 72 families in Y, the figure is 11. At a confidence level of 90%, test the hypothesis that Y has a higher proportion of poor families.
Solution: Using x and y to represent the proportions of poor families, we model the difference as $y - x \sim N(0,\ 0.13 \times 0.87/60 + 0.13 \times 0.87/72)$.
Example 5 (Solution)
Null hypothesis: $y - x = 0$; Alternative hypothesis: $y - x > 0$.
Test statistic: $z = (y - x - 0)/\sqrt{0.13 \times 0.87/60 + 0.13 \times 0.87/72}$.
Because our question is whether y is more than x, we run a one-tail test on the right. At 90% confidence, $\alpha = 0.1$ and $z_{0.1} = 1.2816$. We shall accept the null hypothesis if $z_{\text{calculated}} \le 1.2816$, and reject it otherwise.
$z_{\text{calculated}} = (11/72 - 8/60)/\sqrt{0.13 \times 0.87/60 + 0.13 \times 0.87/72} \approx 0.331$.
The obtained z is less than the critical value, so we accept $H_0$. The survey does not show that the proportion of poor families in Y is higher than in X.
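The arithmetic of Example 5 can be checked with a short script. This is a sketch using only the figures stated in the example and the known national proportion 0.13 in the variance; the variable names are chosen here for readability.

```python
from math import sqrt
from statistics import NormalDist

pi = 0.13                 # national proportion of below-poverty families
n_x, poor_x = 60, 8       # village X
n_y, poor_y = 72, 11      # village Y

x = poor_x / n_x
y = poor_y / n_y

# Both villages are treated as samples from the national population,
# so the known proportion pi is used in both variance terms.
std_error = sqrt(pi * (1 - pi) / n_x + pi * (1 - pi) / n_y)
z_calc = (y - x - 0) / std_error

z_crit = NormalDist().inv_cdf(0.90)   # right-tail critical value at 90% confidence
reject = z_calc > z_crit

print(f"y - x = {y - x:.4f}, z calculated = {z_calc:.3f}, "
      f"z critical = {z_crit:.4f}, reject H0: {reject}")
```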
Example 6
A researcher is convinced that the proportion of male undergraduates who plagiarize in their theses is higher than that of female undergraduates. A thorough check was done on 35 theses by men and 32 by women. Among the men, 12 were confirmed to have copied substantial amounts from other sources, while only 8 of the women did so. At the 95% level of confidence, can you agree with the researcher?
Solution: Using m and w to represent the proportions of men and women who plagiarize, we model the result as follows: $m - w \sim N\left(0,\ \frac{(12/35)(23/35)}{35} + \frac{(8/32)(24/32)}{32}\right)$.
Example 6 (Solution)
Null hypothesis: $m - w = 0$; Alternative hypothesis: $m - w > 0$.
Test statistic: $z = (m - w - 0)\Big/\sqrt{\frac{(12/35)(23/35)}{35} + \frac{(8/32)(24/32)}{32}}$.
This is a one-tail test on the right. At 95% confidence, $\alpha = 0.05$ and $z_{0.05} = 1.6449$. We shall accept the null hypothesis if $z_{\text{calculated}} \le 1.6449$, and reject it otherwise.
$z_{\text{calculated}} = (12/35 - 8/32)\Big/\sqrt{\frac{(12/35)(23/35)}{35} + \frac{(8/32)(24/32)}{32}} \approx 0.837$.
The obtained z is less than the critical value, so we accept the null hypothesis. The samples do not show that the proportion of men who plagiarize is higher than that of women.
Example 7
In order to increase the proportion of business customers, who give more business to the bank, a bank manager introduced a special scheme for such customers. It was found that in the month after the scheme, 23 out of 80 new customers were business customers; during the previous month only 17 out of 90 were business customers. At the 95% confidence level, can we agree that his scheme has succeeded?
Solution: Let $p_1$ and $p_2$ represent the proportions before and after the scheme. Then $p_2 - p_1 \sim N\left(0,\ \frac{(23/80)(57/80)}{80} + \frac{(17/90)(73/90)}{90}\right)$.
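Example 7 (Solution)
Null hypothesis: $p_2 - p_1 = 0$; Alternative hypothesis: $p_2 - p_1 > 0$.
Test statistic: $z = (p_2 - p_1 - 0)\Big/\sqrt{\frac{(23/80)(57/80)}{80} + \frac{(17/90)(73/90)}{90}}$.
This is a one-tail test on the right. At 95% confidence, $\alpha = 0.05$ and $z_{0.05} = 1.6449$. We shall accept the null hypothesis if $z_{\text{calculated}} \le 1.6449$, and reject it otherwise.
$z_{\text{calculated}} = (23/80 - 17/90)\Big/\sqrt{\frac{(23/80)(57/80)}{80} + \frac{(17/90)(73/90)}{90}} \approx 1.510 < z_{\text{critical}}$.
Hence we accept the null hypothesis. At this confidence level, the data are not sufficient to conclude that the scheme has succeeded.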
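When the population proportions are unknown, the sample proportions themselves supply the variance. The sketch below wraps that right-tail test in a function and applies it to the bank figures of Example 7 and the thesis figures of Example 6. The name `right_tail_difference_test` is invented for this illustration and is not from the notes.

```python
from math import sqrt
from statistics import NormalDist


def right_tail_difference_test(count_a, n_a, count_b, n_b, confidence=0.95):
    """Test H0: p_a - p_b = 0 against H1: p_a - p_b > 0.

    Hypothetical helper. The variance is built from the sample proportions,
    following the model for populations whose proportions are not known.
    """
    p_a = count_a / n_a
    p_b = count_b / n_b
    std_error = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z_calc = (p_a - p_b) / std_error
    z_crit = NormalDist().inv_cdf(confidence)   # right-tail critical value
    return z_calc, z_crit, z_calc > z_crit      # True in last slot means reject H0


# Example 7: after the scheme 23 of 80, before the scheme 17 of 90
z7, crit7, reject7 = right_tail_difference_test(23, 80, 17, 90)
print(f"Example 7: z = {z7:.3f}, critical = {crit7:.4f}, reject H0: {reject7}")

# Example 6: men 12 of 35, women 8 of 32
z6, crit6, reject6 = right_tail_difference_test(12, 35, 8, 32)
print(f"Example 6: z = {z6:.3f}, critical = {crit6:.4f}, reject H0: {reject6}")
```

In both cases the calculated z stays below the critical value, so the null hypothesis is accepted.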