Hypothesis Testing: The Difference Between Two Population Means

Hypothesis testing involving the difference between two population means is most often used to determine whether it is reasonable to conclude that the two population means are unequal. In such cases, one of the following pairs of hypotheses may be formulated:
H0: μ1 – μ2 = 0, HA: μ1 – μ2 ≠ 0
H0: μ1 – μ2 ≥ 0, HA: μ1 – μ2 < 0
H0: μ1 – μ2 ≤ 0, HA: μ1 – μ2 > 0
Using the same methodology, it is possible to test the hypothesis that the difference is equal to, greater than or equal to, or less than or equal to some value other than zero.
When each of the two independent simple random samples has been drawn from a normally distributed population with a known variance, the test statistic for the null hypothesis of equal population means is:
$$ z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)_0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} $$
The subscript 0 indicates that the difference is a hypothesized parameter.
Researchers wish to know if the data they collected provide sufficient evidence to indicate a difference in the mean uric acid levels between normal individuals and individuals with Down’s syndrome. The data consist of uric acid readings on 12 individuals with Down’s syndrome and 15 normal individuals. The means are:
The data constitute two independent simple random samples, each drawn from a normally distributed population, with variance equal to 1 for the Down’s syndrome population and 1.5 for the normal population.
Hypotheses: H0: μ1 – μ2 = 0, HA: μ1 – μ2 ≠ 0
An alternative way of stating the hypotheses is: H0: μ1 = μ2, HA: μ1 ≠ μ2
The test statistic is:
$$ z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)_0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} $$
Decision rule: let α = 0.05. The critical values of z are 1.96 and -1.96. Reject H0 unless -1.96 < z < 1.96.
Calculation of the test statistic: with σ1² = 1, n1 = 12, σ2² = 1.5, and n2 = 15, the standard error of the difference is √(1/12 + 1.5/15) ≈ 0.428, and the computed value of the test statistic is z = 2.57.
Statistical decision: reject H0, since 2.57 > 1.96.
Conclusion: on the basis of the data, there is an indication that the two population means are not equal.
p value: 0.0102 (the area to the right of 2.57 plus the area to the left of -2.57).
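
The decision and p value for the uric acid example can be checked numerically. The following is a minimal Python sketch, not part of the original slides; the names `two_sample_z` and `two_sided_p_value` are illustrative. Because the sample means are not reproduced in this text, the sketch works from the reported z = 2.57 together with the stated variances and sample sizes.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(mean1, mean2, var1, var2, n1, n2, delta0=0.0):
    """z statistic for H0: mu1 - mu2 = delta0 when both population variances are known."""
    standard_error = sqrt(var1 / n1 + var2 / n2)
    return ((mean1 - mean2) - delta0) / standard_error


def two_sided_p_value(z):
    """Area in both tails of the standard normal distribution beyond |z|."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


# Quantities stated in the example: variances 1 and 1.5, sample sizes 12 and 15.
standard_error = sqrt(1.0 / 12 + 1.5 / 15)

# The sample means are not given in this text, so start from the reported statistic.
z_reported = 2.57

alpha = 0.05
critical_z = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
reject_h0 = abs(z_reported) >= critical_z
p_value = two_sided_p_value(z_reported)

# Difference in sample means that would be consistent with the reported z.
implied_difference = z_reported * standard_error

print(f"standard error = {standard_error:.4f}")
print(f"critical z     = {critical_z:.2f}")
print(f"reject H0      = {reject_h0}")
print(f"p value        = {p_value:.4f}")
print(f"implied mean difference = {implied_difference:.2f}")
```

When the two sample means are available, `two_sample_z` can be called directly with them; `delta0` allows a hypothesized difference other than zero.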
If the population variances are unknown, two possibilities exist:
They may be assumed equal.
They may be assumed unequal.
When the population variances are unknown but assumed to be equal, we may “pool” the sample variances using the following equation:
$$ s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} $$
When each of two independent simple random samples has been drawn from a normally distributed population and the two populations have equal but unknown variances, the test statistic for testing H0: μ1 = μ2 is given by:
$$ t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)_0}{\sqrt{\dfrac{s_p^2}{n_1} + \dfrac{s_p^2}{n_2}}} $$
which is distributed as Student’s t with n1 + n2 – 2 degrees of freedom.
In a study to investigate the nature of lung destruction in cigarette smokers before the development of marked emphysema, a lung destructive index was measured in a sample of lifelong nonsmokers and smokers who died suddenly outside the hospital of nonrespiratory causes. A larger score indicates greater lung damage.
The average score of the nonsmokers (9 subjects) was 12.4, with a standard deviation of 4.8492.
The average score of the smokers (16 subjects) was 17.5, with a standard deviation of 4.4711.
We wish to know whether we may conclude that smokers, in general, have greater lung damage, as measured by this destructive index, than do nonsmokers. The data constitute two independent simple random samples of lungs. The lung destructive index scores in both populations are approximately normally distributed. The population variances are unknown but are assumed to be equal.
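Hypotheses: H0: μS ≤ μNS, HA: μS > μNS
Test statistic:
$$ t = \frac{(\bar{x}_S - \bar{x}_{NS}) - (\mu_S - \mu_{NS})_0}{\sqrt{\dfrac{s_p^2}{n_S} + \dfrac{s_p^2}{n_{NS}}}} $$
Decision rule: let α = 0.05. Because the alternative hypothesis is one-sided, the rejection region lies in the right tail. With 16 + 9 – 2 = 23 degrees of freedom, the critical value is t(0.95) = 1.7139, so reject H0 if the computed t ≥ 1.7139. (The values 2.0687 and -2.0687 are the critical values for the corresponding two-sided test.)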
Calculation of the test statistic:
$$ s_p^2 = \frac{8(4.8492)^2 + 15(4.4711)^2}{9 + 16 - 2} = 21.2165 $$
$$ t = \frac{(17.5 - 12.4) - 0}{\sqrt{\dfrac{21.2165}{16} + \dfrac{21.2165}{9}}} = 2.6573 $$
Statistical decision: we reject H0 because the computed t = 2.6573 falls in the rejection region (it exceeds even the two-sided critical value 2.0687).
Conclusion: on the basis of these data, smokers in general have greater lung damage than nonsmokers, as measured by the index used in the study.
p value: 0.01 > p > 0.005, since 2.500 < 2.6573 < 2.8073.
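
The pooled-variance calculation for the lung destructive index example can be reproduced with a few lines of Python. This is a minimal sketch using only the summary statistics quoted in the example; the function names `pooled_variance` and `pooled_t` are illustrative, and the tabled t values used to bracket the p value are the ones quoted in the text rather than values computed by the code.

```python
from math import sqrt


def pooled_variance(s1, n1, s2, n2):
    """Pooled estimate of the common variance from two sample standard deviations."""
    return ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)


def pooled_t(mean1, s1, n1, mean2, s2, n2, delta0=0.0):
    """Return (t statistic, degrees of freedom) for H0: mu1 - mu2 = delta0."""
    sp2 = pooled_variance(s1, n1, s2, n2)
    standard_error = sqrt(sp2 / n1 + sp2 / n2)
    return ((mean1 - mean2) - delta0) / standard_error, n1 + n2 - 2


# Summary statistics from the example: smokers first, nonsmokers second.
sp2 = pooled_variance(4.4711, 16, 4.8492, 9)
t_stat, df = pooled_t(17.5, 4.4711, 16, 12.4, 4.8492, 9)

# Tabled t values for 23 degrees of freedom, as quoted in the text.
T_LOWER, T_UPPER = 2.500, 2.8073  # upper-tail areas 0.01 and 0.005
p_between_005_and_01 = T_LOWER < t_stat < T_UPPER

print(f"pooled variance = {sp2:.4f}")
print(f"t = {t_stat:.4f} with {df} degrees of freedom")
print(f"0.005 < p < 0.01: {p_between_005_and_01}")
```

Putting the smokers first makes a positive t correspond to the alternative HA: μS > μNS.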
When two independent simple random samples have been drawn from normally distributed populations with unknown and unequal variances, the test statistic for testing H0: μ1 = μ2 is:
$$ t' = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)_0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} $$
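The critical value of t' for an α level of significance and a two-sided test is approximately:
$$ t'_{1-\alpha/2} = \frac{w_1 t_1 + w_2 t_2}{w_1 + w_2} $$
where w1 = s1²/n1, w2 = s2²/n2, t1 = t(1-α/2) for n1 – 1 degrees of freedom, and t2 = t(1-α/2) for n2 – 1 degrees of freedom.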
For a two-sided test, reject H0 if the computed value of t' is either greater than or equal to the critical value t'(1-α/2) or less than or equal to the negative of that value.
The critical value of t' for a one-sided test is found by computing t'(1-α) with the same equation as for the two-sided test, using t1 = t(1-α) for n1 – 1 degrees of freedom and t2 = t(1-α) for n2 – 1 degrees of freedom.
For a one-sided test with the rejection region in the right tail of the sampling distribution, reject H0 if the computed t' is equal to or greater than the critical t'.
For a one-sided test with the rejection region in the left tail of the sampling distribution, reject H0 if the computed t' is equal to or smaller than the negative of the critical t'.
Researchers wish to know if two populations differ with respect to the mean value of the total serum complement activity (CH50). The data consist of CH50 determinations on n1 = 10 subjects with disease and n2 = 20 apparently normal subjects. The sample means and standard deviations are:
The data constitute two independent random samples, one from a population of apparently normal subjects and the other from a population of subjects with disease. We assume that the CH50 values are approximately normally distributed in both populations. The population variances are unknown and unequal.
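Hypotheses: H0: μ1 – μ2 = 0, HA: μ1 – μ2 ≠ 0
Test statistic:
$$ t' = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)_0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} $$
The statistic t' does not follow Student’s t distribution, so we obtain the critical value from the equation:
$$ t'_{1-\alpha/2} = \frac{w_1 t_1 + w_2 t_2}{w_1 + w_2} $$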
Decision rule: let α = 0.05. Before computing t', we calculate:
w1 = (33.8)²/10 = 114.244
w2 = (10.1)²/20 = 5.1005
In the t distribution table, we find t1 = 2.2622 (9 degrees of freedom) and t2 = 2.0930 (19 degrees of freedom), so that
$$ t'_{0.975} = \frac{114.244(2.2622) + 5.1005(2.0930)}{114.244 + 5.1005} = 2.255 $$
Our decision rule is to reject H0 if the computed t' is either ≥ 2.255 or ≤ -2.255.
Calculation of the test statistic:
$$ t' = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{\sqrt{114.244 + 5.1005}} = 1.41 $$
Statistical decision: since -2.255 < 1.41 < 2.255, we cannot reject H0.
Conclusion: on the basis of these results, we cannot conclude that the two population means are different.
p value: for this test, p > 0.05.
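
The approximate critical value for the CH50 example can be recomputed as a weighted average of the two tabled t values. The sketch below is illustrative Python, with hypothetical function names `cochran_critical_t` and `unequal_variance_t`. The tabled values t1 and t2 are taken from the text, and because the sample means are not reproduced here, the decision is checked against the reported t' = 1.41 rather than recomputed from raw data.

```python
from math import sqrt


def cochran_critical_t(s1, n1, t1, s2, n2, t2):
    """Approximate critical value of t': the tabled t values weighted by s^2/n."""
    w1 = s1 ** 2 / n1
    w2 = s2 ** 2 / n2
    return (w1 * t1 + w2 * t2) / (w1 + w2)


def unequal_variance_t(mean1, s1, n1, mean2, s2, n2, delta0=0.0):
    """t' statistic for H0: mu1 - mu2 = delta0 without pooling the variances."""
    return ((mean1 - mean2) - delta0) / sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)


# Standard deviations and sample sizes from the example (disease group first).
s1, n1 = 33.8, 10
s2, n2 = 10.1, 20

# Tabled t(0.975) values for 9 and 19 degrees of freedom, as quoted in the text.
t1, t2 = 2.2622, 2.0930

critical_t = cochran_critical_t(s1, n1, t1, s2, n2, t2)

# Reported value of the test statistic (the sample means are not given in this text).
t_reported = 1.41
reject_h0 = abs(t_reported) >= critical_t

print(f"critical t' = {critical_t:.3f}")
print(f"reject H0   = {reject_h0}")
```

For a one-sided test, the same function applies with t1 and t2 replaced by the tabled t(1-α) values.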
When sampling is from populations that are not normally distributed, the results of the central limit theorem may be employed if the sample sizes are large (≥ 30). When each of two large independent simple random samples has been drawn from a population that is not normally distributed, the test statistic for testing H0: μ1 = μ2 is:
$$ z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)_0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} $$
If the population variances are unknown, the sample variances s1² and s2² are used in their place.
A study was designed to test the effect of disability on the beneficiary effects of health promotion. The researchers developed a scale for testing this effect (BHADP). The scale was administered to a sample of 132 disabled (D) and 137 nondisabled (ND) subjects, with the following results:

Sample   Mean score   Standard deviation
D        31.83        7.93
ND       25.07        4.80

The authors wish to know if they may conclude, on the basis of these results, that disabled persons in general score higher, on average, on the BHADP scale.
The statistics were computed from two independent samples that behave as simple random samples from a population of disabled persons and a population of nondisabled persons. Since the population variances are unknown, we will use the sample variances in the calculation of the test statistic. Since we have large samples, the central limit theorem allows us to use z as the test statistic.
Hypotheses: H0: μD – μND ≤ 0, HA: μD – μND > 0
or, alternatively: H0: μD ≤ μND, HA: μD > μND
Decision rule: let α = 0.01. This is a one-sided test with a critical value of z equal to 2.33. Reject H0 if the computed z ≥ 2.33.
Calculation of the test statistic:
$$ z = \frac{(31.83 - 25.07) - 0}{\sqrt{\dfrac{(7.93)^2}{132} + \dfrac{(4.80)^2}{137}}} = 8.42 $$
Statistical decision: reject H0, since the computed z = 8.42 > 2.33.
Conclusion: the data indicate that, on average, disabled persons score higher on the BHADP scale than do nondisabled persons. For this test, p < 0.001, since 8.42 > 3.89.
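
The large-sample z test for the BHADP example can be reproduced directly from the summary table. The following Python sketch is illustrative only; the function names are assumptions, and the upper-tail area is computed with `math.erfc` because subtracting the normal CDF from 1 loses all precision this far out in the tail.

```python
from math import erfc, sqrt
from statistics import NormalDist


def large_sample_z(mean1, s1, n1, mean2, s2, n2, delta0=0.0):
    """z statistic for H0: mu1 - mu2 = delta0, with sample variances in place of sigma^2."""
    standard_error = sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    return ((mean1 - mean2) - delta0) / standard_error


def upper_tail_p(z):
    """Area to the right of z under the standard normal curve."""
    return 0.5 * erfc(z / sqrt(2.0))


# Summary statistics from the example: disabled (D) first, nondisabled (ND) second.
z = large_sample_z(31.83, 7.93, 132, 25.07, 4.80, 137)

alpha = 0.01
critical_z = NormalDist().inv_cdf(1 - alpha)  # one-sided, right-tail critical value
reject_h0 = z >= critical_z
p_value = upper_tail_p(z)

print(f"z = {z:.2f}")
print(f"critical z = {critical_z:.2f}")
print(f"reject H0 = {reject_h0}")
print(f"p value = {p_value:.3g}")
```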

Paired Comparisons

Previously we discussed the difference between two population means assuming that the samples were independent. Sometimes we may want to assess the effectiveness of a treatment or experimental procedure using observations that come from nonindependent samples. A hypothesis test based on this type of data is called a paired comparison test.
The objective in paired comparison tests is to eliminate a maximum number of sources of extraneous variation by making the pairs similar with respect to as many variables as possible. Related or paired observations may be obtained in a number of ways:
The same subjects may be measured before and after receiving some treatment.
In comparing two methods of analysis, the material to be analyzed may be divided equally so that one half is analyzed by one method and one half is analyzed by the other.
Instead of performing the analysis with individual observations, we use di, the difference between pairs of observations, as the variable of interest.
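When the n sample differences computed from the n pairs of measurements constitute a simple random sample from a normally distributed population of differences, the test statistic for testing hypotheses about the population mean difference μd is:
$$ t = \frac{\bar{d} - \mu_{d_0}}{s_{\bar{d}}}, \qquad s_{\bar{d}} = \frac{s_d}{\sqrt{n}} $$
where d̄ is the sample mean difference, μd0 is the hypothesized population mean difference, and sd is the standard deviation of the sample differences.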
The t statistic is distributed as Student’s t with n – 1 degrees of freedom. We do not have to worry about the equality of variances in paired comparisons, since our variable is the difference in the readings of the same subject or object.
In a study to evaluate the effect of a very low calorie diet (VLCD) on the weight of 9 subjects, the following data were collected (B = weight before the diet, A = weight after the diet):

Subject   1      2      3     4      5      6      7     8     9
B         117.3  111.4  98.6  104.3  105.4  100.4  81.7  89.5  78.2
A         83.3   85.9   75.8  82.9   82.3   77.7   62.7  69.0  63.9

The researchers wish to know if these data provide sufficient evidence to allow them to conclude that the treatment is effective in causing weight reduction in those individuals.
We may obtain the differences in one of two ways: by subtracting the before weights from the after weights (A – B) or by subtracting the after weights from the before weights (B – A). If we choose di = A – B, the differences are: -34, -25.5, -22.8, -21.4, -23.1, -22.7, -19, -20.5, -14.3.
Assumptions: the observed differences constitute a simple random sample from a normally distributed population of differences that could be generated.
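
The differences can be recomputed from the before and after weights as a quick check on the pairing. This is a small illustrative Python sketch; the list names are assumptions, and the weights are listed in the order that reproduces the differences quoted in the text.

```python
# Before (B) and after (A) weights, one entry per subject, in matching order.
before = [117.3, 111.4, 98.6, 104.3, 105.4, 100.4, 81.7, 89.5, 78.2]
after = [83.3, 85.9, 75.8, 82.9, 82.3, 77.7, 62.7, 69.0, 63.9]

# d_i = A - B: negative values indicate weight loss.
d_after_minus_before = [round(a - b, 1) for a, b in zip(after, before)]

# d_i = B - A: the same magnitudes with the opposite sign.
d_before_minus_after = [-d for d in d_after_minus_before]

print(d_after_minus_before)
print(d_before_minus_after)
```

Either sign convention leads to the same conclusion, provided the direction of the alternative hypothesis is chosen to match it.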
Hypotheses: H0: μd ≥ 0, HA: μd < 0
If we had obtained the differences by subtracting the after weights from the before weights (B – A), our hypotheses would have been: H0: μd ≤ 0, HA: μd > 0
If the question had been such that a two-sided test was indicated, the hypotheses would have been: H0: μd = 0, HA: μd ≠ 0
The test statistic:
$$ t = \frac{\bar{d} - \mu_{d_0}}{s_d / \sqrt{n}} $$
Decision rule: let α = 0.05. With n – 1 = 8 degrees of freedom, the critical value of t is -1.8595. Reject H0 if the computed t is less than or equal to the critical value.
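A 95% confidence interval for μd may be obtained as follows:
$$ \bar{d} \pm t_{1-\alpha/2}\, \frac{s_d}{\sqrt{n}} $$
where t(1-α/2) is the tabled value of t for n – 1 degrees of freedom.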
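
The paired test and the confidence interval for the VLCD example can be worked through numerically from the differences di = A – B listed earlier. The following is a minimal Python sketch with illustrative names. The one-sided critical value -1.8595 comes from the text, while the two-sided value 2.3060 for 8 degrees of freedom is an assumption taken from a standard t table, not from the slides.

```python
from math import sqrt
from statistics import mean, stdev

# Differences d_i = A - B for the 9 subjects, as listed in the text.
differences = [-34.0, -25.5, -22.8, -21.4, -23.1, -22.7, -19.0, -20.5, -14.3]


def paired_t(diffs, mu_d0=0.0):
    """Return (t, mean difference, standard error, degrees of freedom)."""
    n = len(diffs)
    d_bar = mean(diffs)
    standard_error = stdev(diffs) / sqrt(n)  # stdev uses the n - 1 divisor
    return (d_bar - mu_d0) / standard_error, d_bar, standard_error, n - 1


t_stat, d_bar, standard_error, df = paired_t(differences)

T_CRITICAL_LEFT = -1.8595  # one-sided critical value for alpha = 0.05, from the text
reject_h0 = t_stat <= T_CRITICAL_LEFT

T_975_DF8 = 2.3060  # assumption: tabled t(0.975) for 8 degrees of freedom
ci_lower = d_bar - T_975_DF8 * standard_error
ci_upper = d_bar + T_975_DF8 * standard_error

print(f"mean difference = {d_bar:.2f}, standard error = {standard_error:.4f}")
print(f"t = {t_stat:.2f} with {df} degrees of freedom, reject H0 = {reject_h0}")
print(f"95% CI for mu_d: ({ci_lower:.2f}, {ci_upper:.2f})")
```

An interval that lies entirely below zero is consistent with rejecting H0 in favor of a mean weight reduction.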