
1 Agresti/Franklin Statistics, 1 of 111 Chapter 9 Comparing Two Groups Learn …. How to Compare Two Groups On a Categorical or Quantitative Outcome Using Confidence Intervals and Significance Tests

2 Agresti/Franklin Statistics, 2 of 111 Bivariate Analyses The outcome variable is the response variable The binary variable that specifies the groups is the explanatory variable

3 Agresti/Franklin Statistics, 3 of 111 Bivariate Analyses Statistical methods analyze how the outcome on the response variable depends on or is explained by the value of the explanatory variable

4 Agresti/Franklin Statistics, 4 of 111 Independent Samples The observations in one sample are independent of those in the other sample Example: Randomized experiments that randomly allocate subjects to two treatments Example: An observational study that separates subjects into groups according to their value for an explanatory variable

5 Agresti/Franklin Statistics, 5 of 111 Dependent Samples Data are matched pairs – each subject in one sample is matched with a subject in the other sample Example: set of married couples, the men being in one sample and the women in the other. Example: Each subject is observed at two times, so the two samples have the same people

6 Agresti/Franklin Statistics, 6 of 111  Section 9.1 Categorical Response: How Can We Compare Two Proportions?

7 Agresti/Franklin Statistics, 7 of 111 Categorical Response Variable Inferences compare groups in terms of their population proportions in a particular category We can compare the groups by the difference in their population proportions: (p 1 – p 2 )

8 Agresti/Franklin Statistics, 8 of 111 Example: Aspirin, the Wonder Drug Recent Titles of Newspaper Articles: “Aspirin cuts deaths after heart attack” “Aspirin could lower risk of ovarian cancer” “New study finds a daily aspirin lowers the risk of colon cancer” “Aspirin may lower the risk of Hodgkin’s”

9 Agresti/Franklin Statistics, 9 of 111 Example: Aspirin, the Wonder Drug The Physicians Health Study Research Group at Harvard Medical School Five year randomized study Does regular aspirin intake reduce deaths from heart disease?

10 Agresti/Franklin Statistics, 10 of 111 Example: Aspirin, the Wonder Drug Experiment: Subjects were 22,071 male physicians Every other day, study participants took either an aspirin or a placebo The physicians were randomly assigned to the aspirin or to the placebo group The study was double-blind: the physicians did not know which pill they were taking, nor did those who evaluated the results

11 Agresti/Franklin Statistics, 11 of 111 Example: Aspirin, the Wonder Drug Results displayed in a contingency table:

12 Agresti/Franklin Statistics, 12 of 111 Example: Aspirin, the Wonder Drug What is the response variable? What are the groups to compare?

13 Agresti/Franklin Statistics, 13 of 111 Example: Aspirin, the Wonder Drug The response variable is whether the subject had a heart attack, with categories ‘yes’ or ‘no’ The groups to compare are: Group 1: Physicians who took a placebo Group 2: Physicians who took aspirin

14 Agresti/Franklin Statistics, 14 of 111 Example: Aspirin, the Wonder Drug Estimate the difference between the two population parameters of interest

15 Agresti/Franklin Statistics, 15 of 111 Example: Aspirin, the Wonder Drug p 1 : the proportion of the population who would have a heart attack if they participated in this experiment and took the placebo p 2 : the proportion of the population who would have a heart attack if they participated in this experiment and took the aspirin

16 Agresti/Franklin Statistics, 16 of 111 Example: Aspirin, the Wonder Drug Sample Statistics:

17 Agresti/Franklin Statistics, 17 of 111 Example: Aspirin, the Wonder Drug To make an inference about the difference of population proportions, (p 1 – p 2 ), we need to learn about the variability of the sampling distribution of the difference of sample proportions, $\hat{p}_1 - \hat{p}_2$

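18 Agresti/Franklin Statistics, 18 of 111 Standard Error for Comparing Two Proportions The difference, $\hat{p}_1 - \hat{p}_2$, is obtained from sample data. It will vary from sample to sample. This variation is measured by the standard error of the sampling distribution of $\hat{p}_1 - \hat{p}_2$: $se = \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$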

19 Agresti/Franklin Statistics, 19 of 111 Confidence Interval for the Difference between Two Population Proportions $(\hat{p}_1 - \hat{p}_2) \pm z(se)$, where $se = \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$. The z-score depends on the confidence level (z = 1.96 for 95% confidence). This method requires: Independent random samples for the two groups. Large enough sample sizes so that there are at least 10 “successes” and at least 10 “failures” in each group

20 Agresti/Franklin Statistics, 20 of 111 Confidence Interval Comparing Heart Attack Rates for Aspirin and Placebo 95% CI:

21 Agresti/Franklin Statistics, 21 of 111 Confidence Interval Comparing Heart Attack Rates for Aspirin and Placebo Since both endpoints of the confidence interval (0.005, 0.011) for (p 1 - p 2 ) are positive, we infer that (p 1 - p 2 ) is positive Conclusion: The population proportion of heart attacks is larger when subjects take the placebo than when they take aspirin
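The interval quoted above can be reproduced with a short Python sketch. The contingency table did not survive in this transcript, so the counts used here (189 heart attacks among 11,034 placebo takers, 104 among 11,037 aspirin takers) are an assumption: they are the figures commonly reported for the Physicians' Health Study, and they do sum to the 22,071 subjects mentioned earlier. The function name `two_proportion_ci` is illustrative only.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_ci(x1, n1, x2, n2, confidence=0.95):
    """Large-sample confidence interval for p1 - p2.

    x1, x2 are the numbers of "successes"; n1, n2 are the sample sizes.
    """
    # The method asks for at least 10 successes and 10 failures per group.
    for successes, n in ((x1, n1), (x2, n2)):
        if min(successes, n - successes) < 10:
            raise ValueError("need at least 10 successes and 10 failures per group")
    p1, p2 = x1 / n1, x2 / n2
    # Unpooled standard error of the difference of sample proportions.
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    # z-score for the chosen confidence level (about 1.96 for 95%).
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    diff = p1 - p2
    return diff - z * se, diff + z * se


# ASSUMED counts (not shown in the transcript): group 1 = placebo, group 2 = aspirin.
lower, upper = two_proportion_ci(189, 11034, 104, 11037)
print(f"95% CI for p1 - p2: ({lower:.3f}, {upper:.3f})")
```

Under these assumed counts the printed interval rounds to (0.005, 0.011), matching the slide.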

22 Agresti/Franklin Statistics, 22 of 111 Confidence Interval Comparing Heart Attack Rates for Aspirin and Placebo The plausible values for the population difference, (0.005, 0.011), are all small Even though it is a small difference, it may be important in public health terms For example, in a population of 200 million, a decrease of 0.01 over a 5 year period in the proportion of people suffering heart attacks would mean 2 million fewer people having heart attacks

23 Agresti/Franklin Statistics, 23 of 111 Confidence Interval Comparing Heart Attack Rates for Aspirin and Placebo The study used male doctors in the U.S The inference applies to the U.S. population of male doctors Before concluding that aspirin benefits a larger population, we’d want to see results of studies with more diverse groups

24 Agresti/Franklin Statistics, 24 of 111 Interpreting a Confidence Interval for a Difference of Proportions Check whether 0 falls in the CI If so, it is plausible that the population proportions are equal If all values in the CI for (p 1 - p 2 ) are positive, you can infer that (p 1 - p 2 ) >0 If all values in the CI for (p 1 - p 2 ) are negative, you can infer that (p 1 - p 2 ) <0 Which group is labeled ‘1’ and which is labeled ‘2’ is arbitrary

25 Agresti/Franklin Statistics, 25 of 111 Interpreting a Confidence Interval for a Difference of Proportions The magnitude of values in the confidence interval tells you how large any true difference is If all values in the confidence interval are near 0, the true difference may be relatively small in practical terms

26 Agresti/Franklin Statistics, 26 of 111 Significance Tests Comparing Population Proportions 1. Assumptions: Categorical response variable for two groups Independent random samples

27 Agresti/Franklin Statistics, 27 of 111 Significance Tests Comparing Population Proportions Assumptions (continued): Significance tests comparing proportions use the sample size guideline from confidence intervals: Each sample should have at least about 10 “successes” and 10 “failures” Two–sided tests are robust against violations of this condition At least 5 “successes” and 5 “failures” is adequate

28 Agresti/Franklin Statistics, 28 of 111 Significance Tests Comparing Population Proportions 2. Hypotheses: The null hypothesis is the hypothesis of no difference or no effect: H 0 : (p 1 - p 2 ) =0 Under the presumption that p 1 = p 2, we create a pooled estimate of the common value of p 1 and p 2 This pooled estimate is $\hat{p}$, the total number of successes in the two samples combined divided by the total sample size $n_1 + n_2$

29 Agresti/Franklin Statistics, 29 of 111 Significance Tests Comparing Population Proportions 2. Hypotheses (continued): H a : (p 1 - p 2 ) ≠ 0 (two-sided test) H a : (p 1 - p 2 ) < 0 (one-sided test) H a : (p 1 - p 2 ) > 0 (one-sided test)

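30 Agresti/Franklin Statistics, 30 of 111 Significance Tests Comparing Population Proportions 3. The test statistic is: $z = \frac{(\hat{p}_1 - \hat{p}_2) - 0}{se_0}$, where $se_0 = \sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$ and $\hat{p}$ is the pooled estimate of the common proportion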

31 Agresti/Franklin Statistics, 31 of 111 Significance Tests Comparing Population Proportions 4. P-value: Probability obtained from the standard normal table 5. Conclusion: Smaller P-values give stronger evidence against H 0 and supporting H a

32 Agresti/Franklin Statistics, 32 of 111 Example: Is TV Watching Associated with Aggressive Behavior? Various studies have examined a link between TV violence and aggressive behavior by those who watch a lot of TV A study sampled 707 families in two counties in New York state and made follow-up observations over 17 years The data shows levels of TV watching along with incidents of aggressive acts

33 Agresti/Franklin Statistics, 33 of 111 Example: Is TV Watching Associated with Aggressive Behavior?

34 Agresti/Franklin Statistics, 34 of 111 Example: Is TV Watching Associated with Aggressive Behavior? Test the Hypotheses: H 0 : (p 1 - p 2 ) = 0 H a : (p 1 - p 2 ) ≠ 0 Using a significance level of 0.05 Group 1: less than 1 hr. of TV per day Group 2: at least 1 hr. of TV per day

35 Agresti/Franklin Statistics, 35 of 111 Example: Is TV Watching Associated with Aggressive Behavior?

36 Agresti/Franklin Statistics, 36 of 111 Example: Is TV Watching Associated with Aggressive Behavior? Conclusion: Since the P-value is less than 0.05, we reject H 0 We conclude that the population proportions of aggressive acts differ for the two groups The sample values suggest that the population proportion is higher for the higher level of TV watching
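The five steps of the test can be sketched in Python as follows. The data table for this example is missing from the transcript, so the counts below (5 of 88 in the low-TV group and 154 of 619 in the high-TV group committing aggressive acts) are an assumption: they are the figures usually quoted for this study and they add up to the 707 families mentioned above. Note that group 1 has only 5 "successes", which the slides allow for a two-sided test. The function name is hypothetical.

```python
from math import erfc, sqrt


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z test of H0: p1 = p2, using the pooled estimate under H0."""
    p1, p2 = x1 / n1, x2 / n2
    # Pooled estimate of the common proportion when H0 is true.
    pooled = (x1 + x2) / (n1 + n2)
    # Standard error computed under H0.
    se0 = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se0
    # Two-tail standard normal probability: 2 * P(Z > |z|) = erfc(|z| / sqrt(2)).
    p_value = erfc(abs(z) / sqrt(2))
    return z, p_value


# ASSUMED counts (not shown in the transcript):
# group 1 = less than 1 hr of TV per day, group 2 = at least 1 hr per day.
z, p_value = two_proportion_z_test(5, 88, 154, 619)
print(f"z = {z:.2f}, two-sided P-value = {p_value:.5f}")
```

With these assumed counts the statistic is negative (the sample proportion is lower in group 1) and the P-value falls well below the 0.05 significance level, in line with the conclusion on the slide.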

37 Agresti/Franklin Statistics, 37 of 111  Section 9.2 Quantitative Response: How Can We Compare Two Means?

38 Agresti/Franklin Statistics, 38 of 111 Comparing Means We can compare two groups on a quantitative response variable by comparing their means

39 Agresti/Franklin Statistics, 39 of 111 Example: Teenagers Hooked on Nicotine A 30-month study: Evaluated the degree of addiction that teenagers form to nicotine 332 students who had used nicotine were evaluated The response variable was constructed using a questionnaire called the Hooked on Nicotine Checklist (HONC)

40 Agresti/Franklin Statistics, 40 of 111 Example: Teenagers Hooked on Nicotine The HONC score is the total number of questions to which a student answered “yes” during the study The higher the score, the more hooked on nicotine a student is judged to be

41 Agresti/Franklin Statistics, 41 of 111 Example: Teenagers Hooked on Nicotine The study considered explanatory variables, such as gender, that might be associated with the HONC score

42 Agresti/Franklin Statistics, 42 of 111 Example: Teenagers Hooked on Nicotine How can we compare the sample HONC scores for females and males? We estimate (µ 1 - µ 2 ) by $(\bar{x}_1 - \bar{x}_2)$: 2.8 – 1.6 = 1.2 On average, females answered “yes” to about one more question on the HONC scale than males did

43 Agresti/Franklin Statistics, 43 of 111 Example: Teenagers Hooked on Nicotine To make an inference about the difference between population means, (µ 1 – µ 2 ), we need to learn about the variability of the sampling distribution of the difference of sample means, $\bar{x}_1 - \bar{x}_2$

44 Agresti/Franklin Statistics, 44 of 111 Standard Error for Comparing Two Means The difference, $\bar{x}_1 - \bar{x}_2$, is obtained from sample data. It will vary from sample to sample. This variation is measured by the standard error of the sampling distribution of $\bar{x}_1 - \bar{x}_2$: $se = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$

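45 Agresti/Franklin Statistics, 45 of 111 Confidence Interval for the Difference between Two Population Means A 95% CI: $(\bar{x}_1 - \bar{x}_2) \pm t_{.025}(se)$, where $se = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$. Software provides the t-score with right-tail probability of 0.025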

46 Agresti/Franklin Statistics, 46 of 111 Confidence Interval for the Difference between Two Population Means This method assumes: Independent random samples from the two groups An approximately normal population distribution for each group this is mainly important for small sample sizes, and even then the method is robust to violations of this assumption

47 Agresti/Franklin Statistics, 47 of 111 Example: Nicotine – How Much More Addicted Are Smokers than Ex-Smokers? Data as summarized by HONC scores for the two groups: Smokers: $\bar{x}_1$ = 5.9, $s_1$ = 3.3, $n_1$ = 75 Ex-smokers: $\bar{x}_2$ = 1.0, $s_2$ = 2.3, $n_2$ = 257

48 Agresti/Franklin Statistics, 48 of 111 Example: Nicotine – How Much More Addicted Are Smokers than Ex-Smokers? Were the sample data for the two groups approximately normal? Most likely not for Group 2, based on the sample statistics ($\bar{x}_2$ = 1.0, $s_2$ = 2.3): HONC scores cannot fall below 0, yet the mean is less than half a standard deviation above 0, which points to a right-skewed distribution. Since the sample sizes are large, this lack of normality is not a problem

49 Agresti/Franklin Statistics, 49 of 111 Example: Nicotine – How Much More Addicted Are Smokers than Ex-Smokers? 95% CI for (µ 1 - µ 2 ): $(\bar{x}_1 - \bar{x}_2) \pm t_{.025}(se)$, with $\bar{x}_1 - \bar{x}_2$ = 5.9 – 1.0 = 4.9, gives (4.1, 5.7). We can infer that the population mean for the smokers is between 4.1 and 5.7 higher than for the ex-smokers
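As a rough check of this interval, the sketch below works from the summary statistics given earlier for smokers and ex-smokers. The slides say software supplies the t-score; since the Python standard library has no t distribution, this sketch assumes the standard normal quantile as a large-sample stand-in, which is reasonable here because both samples are large. A t-score from software can be passed in through the hypothetical `t_score` parameter instead.

```python
from math import sqrt
from statistics import NormalDist


def two_mean_ci(mean1, sd1, n1, mean2, sd2, n2, t_score=None):
    """Approximate 95% CI for mu1 - mu2 from summary statistics."""
    # Standard error of the difference of sample means (unequal variances).
    se = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    if t_score is None:
        # ASSUMPTION: normal quantile (about 1.96) used in place of the
        # t-score with right-tail probability 0.025.
        t_score = NormalDist().inv_cdf(0.975)
    diff = mean1 - mean2
    return diff - t_score * se, diff + t_score * se, se


# Summary statistics from the slides: smokers (group 1), ex-smokers (group 2).
lower, upper, se = two_mean_ci(5.9, 3.3, 75, 1.0, 2.3, 257)
print(f"se = {se:.3f}, 95% CI: ({lower:.1f}, {upper:.1f})")
```

Even with the normal stand-in, the endpoints round to 4.1 and 5.7, the values reported on the slide.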

50 Agresti/Franklin Statistics, 50 of 111 How Can We Interpret a Confidence Interval for a Difference of Means? Check whether 0 falls in the interval When it does, 0 is a plausible value for (µ 1 – µ 2 ), meaning that it is possible that µ 1 = µ 2 A confidence interval for (µ 1 – µ 2 ) that contains only positive numbers suggests that (µ 1 – µ 2 ) is positive We then infer that µ 1 is larger than µ 2

51 Agresti/Franklin Statistics, 51 of 111 How Can We Interpret a Confidence Interval for a Difference of Means? A confidence interval for (µ 1 – µ 2 ) that contains only negative numbers suggests that (µ 1 – µ 2 ) is negative We then infer that µ 1 is smaller than µ 2 Which group is labeled ‘1’ and which is labeled ‘2’ is arbitrary

52 Agresti/Franklin Statistics, 52 of 111 Significance Tests Comparing Population Means 1. Assumptions: Quantitative response variable for two groups Independent random samples

53 Agresti/Franklin Statistics, 53 of 111 Significance Tests Comparing Population Means Assumptions (continued): Approximately normal population distributions for each group This is mainly important for small sample sizes, and even then the two-sided test is robust to violations of this assumption

54 Agresti/Franklin Statistics, 54 of 111 Significance Tests Comparing Population Means 2. Hypotheses: The null hypothesis is the hypothesis of no difference or no effect: H 0 : (µ 1 - µ 2 ) =0

55 Agresti/Franklin Statistics, 55 of 111 Significance Tests Comparing Population Means 2. Hypotheses (continued): The alternative hypothesis: H a : (µ 1 - µ 2 ) ≠ 0 (two-sided test) H a : (µ 1 - µ 2 ) < 0 (one-sided test) H a : (µ 1 - µ 2 ) > 0 (one-sided test)

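56 Agresti/Franklin Statistics, 56 of 111 Significance Tests Comparing Population Means 3. The test statistic is: $t = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{se}$, where $se = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$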

57 Agresti/Franklin Statistics, 57 of 111 Significance Tests Comparing Population Means 4. P-value: Probability obtained from the t distribution (software supplies the df) 5. Conclusion: Smaller P-values give stronger evidence against H 0 and supporting H a
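A minimal Python sketch of steps 3 and 4, using the smokers and ex-smokers summary statistics from the earlier nicotine example. Two parts go beyond what the slides state and should be read as assumptions: the degrees of freedom use the Welch-Satterthwaite approximation that software typically applies for this test, and the P-value uses a normal approximation, which is only reasonable when the df is large.

```python
from math import erfc, sqrt


def two_mean_t_test(mean1, sd1, n1, mean2, sd2, n2):
    """Test statistic for H0: mu1 = mu2 from summary statistics."""
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2
    se = sqrt(v1 + v2)
    t = (mean1 - mean2) / se
    # ASSUMPTION: Welch-Satterthwaite df (the slides leave df to software).
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    # ASSUMPTION: two-sided P-value from the normal approximation to the
    # t distribution; exact software output will differ slightly.
    p_approx = erfc(abs(t) / sqrt(2))
    return t, df, p_approx


# Smokers (group 1) versus ex-smokers (group 2), from the slides.
t, df, p_approx = two_mean_t_test(5.9, 3.3, 75, 1.0, 2.3, 257)
print(f"t = {t:.2f}, df = {df:.1f}, approximate P-value = {p_approx:.2g}")
```

A test statistic this far from 0 gives a P-value that is essentially 0, consistent with a confidence interval that lies entirely above 0.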

58 Agresti/Franklin Statistics, 58 of 111 Example: Does Cell Phone Use While Driving Impair Reaction Times? Experiment: 64 college students 32 were randomly assigned to the cell phone group 32 to the control group

59 Agresti/Franklin Statistics, 59 of 111 Example: Does Cell Phone Use While Driving Impair Reaction Times? Experiment (continued): Students used a machine that simulated driving situations At irregular periods a target flashed red or green Participants were instructed to press a “brake button” as soon as possible when they detected a red light

60 Agresti/Franklin Statistics, 60 of 111 Example: Does Cell Phone Use While Driving Impair Reaction Times? For each subject, the experiment analyzed their mean response time over all the trials Averaged over all trials and subjects, the mean response time for the cell-phone group was 585.2 milliseconds The mean response time for the control group was 533.7 milliseconds

61 Agresti/Franklin Statistics, 61 of 111 Example: Does Cell Phone Use While Driving Impair Reaction Times? Data:

62 Agresti/Franklin Statistics, 62 of 111 Example: Does Cell Phone Use While Driving Impair Reaction Times? Test the hypotheses: H 0 : (µ 1 - µ 2 ) =0 vs. H a : (µ 1 - µ 2 ) ≠ 0 using a significance level of 0.05

63 Agresti/Franklin Statistics, 63 of 111 Example: Does Cell Phone Use While Driving Impair Reaction Times?

64 Agresti/Franklin Statistics, 64 of 111 Example: Does Cell Phone Use While Driving Impair Reaction Times? Conclusion: The P-value is less than 0.05, so we can reject H 0 There is enough evidence to conclude that the population mean response times differ between the cell phone and control groups The sample means suggest that the population mean is higher for the cell phone group

65 Agresti/Franklin Statistics, 65 of 111 Example: Does Cell Phone Use While Driving Impair Reaction Times? What do the box plots tell us? There is an extreme outlier for the cell phone group It is a good idea to make sure the results of the analysis aren’t affected too strongly by that single observation Delete the extreme outlier and redo the analysis In this example, the t-statistic changes only slightly

66 Agresti/Franklin Statistics, 66 of 111 Example: Does Cell Phone Use While Driving Impair Reaction Times? Insight: In practice, you should not delete outliers from a data set without sufficient cause (i.e., if it seems the observation was incorrectly recorded) It is however, a good idea to check for sensitivity of an analysis to an outlier If the results change much, it means that the inference including the outlier is on shaky ground
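The sensitivity check described here can be sketched as: compute the t statistic with all observations, drop the most extreme one, and compute it again. The reaction times below are hypothetical values invented for illustration (the study's raw data are not in the transcript), and `welch_t` is an illustrative helper name.

```python
from math import sqrt
from statistics import mean, stdev


def welch_t(sample1, sample2):
    """t statistic for H0: mu1 = mu2 using the unpooled standard error."""
    se = sqrt(stdev(sample1) ** 2 / len(sample1) + stdev(sample2) ** 2 / len(sample2))
    return (mean(sample1) - mean(sample2)) / se


# HYPOTHETICAL reaction times in milliseconds, not the study's data.
phone = [560, 575, 590, 601, 612, 580, 595, 960]  # 960 plays the role of the outlier
control = [520, 531, 540, 548, 525, 536, 529, 544]

# Remove the single largest observation from the cell phone group.
largest = max(phone)
phone_trimmed = [x for x in phone if x != largest]

t_all = welch_t(phone, control)
t_trimmed = welch_t(phone_trimmed, control)
print(f"t with outlier: {t_all:.2f}, t without outlier: {t_trimmed:.2f}")
```

If the two statistics lead to the same conclusion, the inference does not hinge on the outlier. If they differ sharply, the result that includes the outlier deserves caution.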

67 Agresti/Franklin Statistics, 67 of 111 How much more time do women spend on housework than men? Data is Hours per Week.
Gender   Sample Size   Mean   St. Dev.
Women    6764          32.6   18.2
Men      4252          18.1   12.9
What is a point estimate of µ 1 - µ 2 ? a. 18.2 – 12.9 b. 32.6 – 18.1 c. 6764 - 4252 d. 32.6/18.2 – 18.1/12.9

68 Agresti/Franklin Statistics, 68 of 111 How much more time do women spend on housework than men? Data is Hours per Week.
Gender   Sample Size   Mean   St. Dev.
Women    6764          32.6   18.2
Men      4252          18.1   12.9
What is the standard error for comparing the means? a. 5.3 b. .076 c. .297 d. .088

69 Agresti/Franklin Statistics, 69 of 111 How much more time do women spend on housework than men? Data is Hours per Week.
Gender   Sample Size   Mean   St. Dev.
Women    6764          32.6   18.2
Men      4252          18.1   12.9
What factor causes the standard error to be small compared to the sample standard deviations for the two groups? a. sample means b. sample standard deviations c. sample sizes d. genders
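The arithmetic behind these three questions can be checked with a few lines of Python, using only the housework figures given in the table. The dictionary layout is just one convenient way to hold the summary statistics.

```python
from math import sqrt

# Hours per week of housework, from the table on the slides.
women = {"n": 6764, "mean": 32.6, "sd": 18.2}
men = {"n": 4252, "mean": 18.1, "sd": 12.9}

# Point estimate of mu1 - mu2: the difference of the sample means.
point_estimate = women["mean"] - men["mean"]

# Standard error for comparing the means: each variance is divided by its
# sample size, so large samples shrink the standard error well below the
# sample standard deviations.
se = sqrt(women["sd"] ** 2 / women["n"] + men["sd"] ** 2 / men["n"])

print(f"point estimate = {point_estimate:.1f} hours, se = {se:.3f}")
```

The point estimate is 32.6 – 18.1 = 14.5 hours and the standard error rounds to .297, small because both sample sizes are in the thousands.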

70 Agresti/Franklin Statistics, 70 of 111  Section 9.3 Other Ways of Comparing Means and Comparing Proportions

71 Agresti/Franklin Statistics, 71 of 111 Alternative Method for Comparing Means An alternative t-method can be used when, under the null hypothesis, it is reasonable to expect the variability as well as the mean to be the same This method requires the assumption that the population standard deviations be equal

72 Agresti/Franklin Statistics, 72 of 111 The Pooled Standard Deviation This alternative method estimates the common value σ of σ 1 and σ 2 by: $s = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$

73 Agresti/Franklin Statistics, 73 of 111 Comparing Population Means, Assuming Equal Population Standard Deviations Using the pooled standard deviation estimate s, a 95% CI for (µ 1 - µ 2 ) is: $(\bar{x}_1 - \bar{x}_2) \pm t_{.025}\, s\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$ This method has df = n 1 + n 2 - 2

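74 Agresti/Franklin Statistics, 74 of 111 Comparing Population Means, Assuming Equal Population Standard Deviations The test statistic for H 0 : µ 1 =µ 2 is: $t = \frac{\bar{x}_1 - \bar{x}_2}{se}$, where $se = s\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$ and s is the pooled standard deviation This method has df = n 1 + n 2 - 2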

75 Agresti/Franklin Statistics, 75 of 111 Comparing Population Means, Assuming Equal Population Standard Deviations These methods assume: Independent random samples from the two groups An approximately normal population distribution for each group This is mainly important for small sample sizes, and even then, the CI and the two-sided test are usually robust to violations of this assumption σ 1 =σ 2
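The pooled method can be sketched in Python from summary statistics. For illustration only, the example reuses the smokers and ex-smokers figures from the nicotine example. Their sample standard deviations (3.3 and 2.3) are not especially close, so the equal-σ assumption is questionable for those data, and the output should be read as a demonstration of the arithmetic rather than a recommended analysis. The function name is hypothetical.

```python
from math import sqrt


def pooled_two_mean(mean1, sd1, n1, mean2, sd2, n2):
    """Pooled standard deviation, standard error, t statistic and df."""
    df = n1 + n2 - 2
    # Pooled estimate of the common standard deviation.
    s = sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df)
    # Standard error of the difference of sample means under equal sigmas.
    se = s * sqrt(1 / n1 + 1 / n2)
    t = (mean1 - mean2) / se
    return s, se, t, df


# Smokers (group 1) versus ex-smokers (group 2), from the slides.
s, se, t, df = pooled_two_mean(5.9, 3.3, 75, 1.0, 2.3, 257)
print(f"pooled s = {s:.2f}, se = {se:.3f}, t = {t:.2f}, df = {df}")
```

The pooled s always falls between the two sample standard deviations, and it sits closer to the one from the larger sample because that sample gets more weight.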

