SADC Course in Statistics Comparing two proportions (Session 14)


Learning Objectives
By the end of this session, you will be able to:
- explain how two sample proportions can be compared using either a normal approximation or a chi-square test
- understand the link between the normal approximation and the chi-square test

Dealing with categorical data
In most of the previous sessions, the focus has been on quantitative measurements. Many variables collected in practice are, however, categorical in nature, especially those emerging from surveys, e.g.
- gender of the household (HH) head (male/female)
- level of education (none, primary, secondary, tertiary)
- whether or not the HH has access to clean water (yes/no)
- outcome of a crop (success/failure), etc.

Some typical questions
- Are animals vaccinated for a specific disease less likely to fall sick compared to unvaccinated animals?
- Is there an association between the level of poverty and the educational level of the HH head?
- Does the proportion of children who have had prescribed inoculations differ according to whether or not their HH had access to a health centre within 5 km of their homestead?

An example comparing proportions
In a long-term study on the relationship between smoking and mortality amongst males with cardiovascular problems, such individuals aged over 60 years were monitored. After 6 years, it was found that 117 of the 1067 non-smokers had died, compared with 54 of the 356 smokers. Is there evidence of a difference in death rates between smokers and non-smokers?

Comparing two proportions
Let $\pi_1$ and $\pi_2$ be the population proportions dying in the smokers and non-smokers groups, respectively. The hypotheses to be tested are:
$H_0: \pi_1 = \pi_2$ versus $H_1: \pi_1 \neq \pi_2$
Since the sample sizes are large, we assume the normal approximation to the distributions of the sample proportions $p_1$ and $p_2$ (using the Central Limit Theorem), and carry out a test based on the normal distribution.

Expectation and variance of $p_1$, $p_2$
From results for the binomial distribution of the number of deaths ($r$) in a sample of size $n$, we have $E(r) = n\pi$ and $\mathrm{Var}(r) = n\pi(1-\pi)$.
Hence $E(p) = E(r/n) = n\pi/n = \pi$, while $\mathrm{Var}(p) = (1/n^2)\,n\pi(1-\pi) = \pi(1-\pi)/n$, where $p = r/n$ is the observed sample proportion.
This allows the standard error of $p_1 - p_2$, for two sample proportions from populations with true proportions $\pi_1$ and $\pi_2$, to be computed.
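As an illustrative check that is not part of the original slides, the expectation and variance of a sample proportion can be verified numerically by summing over the exact binomial distribution. The function names and the values n = 40 and 0.15 are arbitrary choices made for this sketch.

```python
from math import comb

def binomial_pmf(r, n, pi):
    """Probability of r events in n independent trials with event probability pi."""
    return comb(n, r) * pi**r * (1 - pi) ** (n - r)

def proportion_moments(n, pi):
    """Exact mean and variance of p = r/n when r is binomial(n, pi)."""
    mean = sum((r / n) * binomial_pmf(r, n, pi) for r in range(n + 1))
    var = sum((r / n - mean) ** 2 * binomial_pmf(r, n, pi) for r in range(n + 1))
    return mean, var

n, pi = 40, 0.15  # hypothetical values, chosen only for illustration
mean, var = proportion_moments(n, pi)
print("E(p)   =", mean, " theory:", pi)
print("Var(p) =", var, " theory:", pi * (1 - pi) / n)
```

The two printed pairs should agree up to floating-point rounding.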

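Standard error of $p_1 - p_2$
For two independent samples of sizes $n_1$ and $n_2$, the standard error of $p_1 - p_2$ is given by:
$$\mathrm{s.e.}(p_1 - p_2) = \sqrt{\frac{\pi_1(1-\pi_1)}{n_1} + \frac{\pi_2(1-\pi_2)}{n_2}}$$
Since $\pi_1$ and $\pi_2$ are unknown, we can use the estimate:
$$\sqrt{\frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}}$$
However, under the null hypothesis, an estimate $p$ of the common $\pi = \pi_1 = \pi_2$ can be used, as is done in most software packages:
$$\sqrt{p(1-p)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}, \quad p = \frac{r_1 + r_2}{n_1 + n_2}$$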
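The two ways of estimating the standard error can be compared on the smoking data with a short sketch. The function names `se_unpooled` and `se_pooled` are hypothetical, introduced here for illustration only; the counts are those given in the example (54 of 356 smokers, 117 of 1067 non-smokers).

```python
from math import sqrt

def se_unpooled(r1, n1, r2, n2):
    """Standard error of p1 - p2 using a separate estimate for each group."""
    p1, p2 = r1 / n1, r2 / n2
    return sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

def se_pooled(r1, n1, r2, n2):
    """Standard error of p1 - p2 using one common proportion, as under H0."""
    p = (r1 + r2) / (n1 + n2)
    return sqrt(p * (1 - p) * (1 / n1 + 1 / n2))

# Smokers: 54 deaths out of 356; non-smokers: 117 deaths out of 1067
print("unpooled:", round(se_unpooled(54, 356, 117, 1067), 4))
print("pooled:  ", round(se_pooled(54, 356, 117, 1067), 4))
```

The pooled version is the one appropriate for the hypothesis test, since it uses the assumption that the two population proportions are equal.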

Test procedure and results
Returning to our example, we can now calculate the z statistic for testing $H_0$ as:
$$z = \frac{p_1 - p_2}{\mathrm{s.e.}(p_1 - p_2)} = \frac{0.042}{\sqrt{0.12 \times 0.88 \times \left(\frac{1}{1067} + \frac{1}{356}\right)}} = 2.11$$
This is significant at the 5% level. The p-value is 0.035.
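The calculation can be sketched in a few lines using only the standard library. The function name `two_proportion_z_test` is an assumption of this sketch, and the two-sided p-value is obtained from the normal distribution through `math.erfc`.

```python
from math import sqrt, erfc

def two_proportion_z_test(r1, n1, r2, n2):
    """z statistic and two-sided p-value for H0: equal population proportions."""
    p1, p2 = r1 / n1, r2 / n2
    p = (r1 + r2) / (n1 + n2)                  # pooled estimate under H0
    se = sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = erfc(abs(z) / sqrt(2))           # equals 2 * (1 - Phi(|z|))
    return z, p_value

# Smokers: 54 deaths out of 356; non-smokers: 117 deaths out of 1067
z, p_value = two_proportion_z_test(54, 356, 117, 1067)
print(f"z = {z:.2f}, p = {p_value:.3f}")
```

With the smoking data this should reproduce the values quoted in the slides (z of about 2.11 and a p-value of about 0.035).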

Conclusions
There is some evidence (p = 0.035) to indicate that mortality rates differ between smokers and non-smokers. The corresponding proportions of deaths are 11% in the non-smokers group and 15% in the smokers group.

A second example
In a study of the effectiveness of using mosquito nets, results from a household survey were used to address the following objective: Is there evidence, amongst children in the sample, of a relationship between the use of a mosquito net and the occurrence of malaria?
This is equivalent to the question: Are the proportions of children with malaria different between HHs using mosquito nets and those that do not?

Survey results
Results from the survey gave the following:
- Of 1039 children using mosquito nets, 649 had malaria
- Of 6904 children not using mosquito nets, 3849 had malaria
Can you write out this information in the form of a two-way table, with rows representing whether or not malaria was suffered, and columns representing the use of a net?

Two-way table: observed values
                      Usually sleep under a mosquito net?
Suffered malaria?     Yes             No              Total
Yes                   649 (62.5%)     3849 (55.8%)    4498 (56.6%)
No                    390 (37.5%)     3055 (44.2%)    3445 (43.4%)
Total                 1039 (100%)     6904 (100%)     7943 (100%)
Percentages are of column totals.
Which two proportions (or percentages) are we interested in comparing?
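A minimal sketch of how the table can be built from the two survey statements (649 of 1039 children under nets and 3849 of 6904 children not under nets had malaria). The function name `two_way_table` and the dictionary layout are choices made for this illustration only.

```python
def two_way_table(with_net, without_net):
    """Build table columns from (children with malaria, children surveyed) pairs."""
    table = {}
    for label, (cases, n) in (("net", with_net), ("no net", without_net)):
        table[label] = {"malaria": cases, "no malaria": n - cases, "total": n}
    return table

table = two_way_table((649, 1039), (3849, 6904))
for label, col in table.items():
    share = col["malaria"] / col["total"]
    print(f"{label}: {col['malaria']} of {col['total']} had malaria ({share:.1%})")
```

The two printed percentages are the column proportions that the test compares.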

Null and alternative hypotheses
As before, we can compare the two sample proportions. However, the null and alternative hypotheses are often expressed as:
$H_0$: occurrence of malaria is independent of use of a mosquito net
$H_1$: malaria and use of a net are not independent, i.e. they are associated
If $H_0$ is true, then use of a mosquito net is not associated with the occurrence of malaria. What values would you then expect in each cell of the table?

Computation of expected values
Expected values in the first row:
Expected value in cell 1 = (4498 / 7943) × 1039 = (4498 × 1039) / 7943 = 588.4
Expected value in cell 2 = (4498 / 7943) × 6904 = (4498 × 6904) / 7943 = 3909.6
Can you calculate expected values in the next row? Check that your two numbers add to 3445.

Table of expected values
                      Usually sleep under a mosquito net?
Suffered malaria?     Yes        No         Total
Yes                   588.4      3909.6     4498
No                    450.6      2994.4     3445
Total                 1039       6904       7943
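The rule used for each cell (row total × column total / grand total) can be sketched for a table of any size. The function name `expected_values` is hypothetical; the observed counts are derived from the survey results, with rows for malaria yes/no and columns for net yes/no.

```python
def expected_values(observed):
    """Expected cell counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

# Rows: malaria yes / no; columns: sleeps under a net yes / no
observed = [[649, 3849], [390, 3055]]
expected = expected_values(observed)
for row in expected:
    print([round(e, 1) for e in row])
```

Each row and column of the expected table sums to the same total as in the observed table, which is a useful check on hand calculations.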

The chi-square test statistic
Here we test the null hypothesis using a chi-square test. The first step is to compute the chi-square ($\chi^2$) test statistic. The formula is:
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
where $O$ and $E$ are the observed and expected values, and the sum is over all cells of the table.
Comparing this value with values of the $\chi^2$ distribution with 1 d.f. shows that the result is significant at the 1% level. We conclude there is strong evidence to reject the null hypothesis.
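A sketch of the computation for the malaria table, using only the standard library. The function names are assumptions of this illustration, and the p-value helper is valid only for 1 degree of freedom, where the chi-square tail probability can be written in terms of the normal distribution.

```python
from math import sqrt, erfc

def chi_square_statistic(observed):
    """Sum of (O - E)^2 / E over all cells, with E computed under independence."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / grand_total
            chi2 += (o - e) ** 2 / e
    return chi2

def p_value_1df(chi2):
    """Upper-tail probability of a chi-square variable with 1 d.f. only."""
    return erfc(sqrt(chi2 / 2))

# Rows: malaria yes / no; columns: sleeps under a net yes / no
observed = [[649, 3849], [390, 3055]]
chi2 = chi_square_statistic(observed)
print(f"chi-square = {chi2:.2f}, p = {p_value_1df(chi2):.6f}")
```

For tables with more than 1 degree of freedom, a statistical package or chi-square tables would be needed for the p-value.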

Comparison with z-test
What would have happened if we had done a z-test to compare the proportions of children with malaria among those who use, and those who do not use, a mosquito net?
The result would be a z-statistic of 4.07. This again leads to a highly significant p-value (p < 0.001).
Note that the square of z above is approximately 16.6. This is identical to the chi-square statistic. This is expected since, theoretically, it is known that $z^2 = \chi^2$ with 1 d.f. So the two tests are equivalent!
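The equivalence can be checked numerically. The sketch below computes z with the pooled standard error and, as a swapped-in shortcut not shown in the slides, computes chi-square from the standard closed-form expression for a 2 × 2 table with cells a, b, c, d. Both function names are hypothetical.

```python
from math import sqrt

def z_statistic(r1, n1, r2, n2):
    """z statistic for comparing two proportions, using the pooled standard error."""
    p = (r1 + r2) / (n1 + n2)
    se = sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    return (r1 / n1 - r2 / n2) / se

def chi_square_2x2(a, b, c, d):
    """Chi-square for the 2x2 table [[a, b], [c, d]], without continuity correction."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Net users: 649 of 1039 had malaria; non-users: 3849 of 6904 had malaria
z = z_statistic(649, 1039, 3849, 6904)
chi2 = chi_square_2x2(649, 3849, 390, 3055)
print(f"z = {z:.2f}, z squared = {z**2:.4f}, chi-square = {chi2:.4f}")
```

The agreement holds only when the z-test uses the pooled standard error and the chi-square statistic is computed without a continuity correction.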

Some final remarks
We have not yet dealt with how best to present the results of a chi-square test, or with further interpretation of the results of this last example. We also have not discussed the assumptions underlying the chi-square test and the actions to take if those assumptions fail. These issues will be dealt with in the next two sessions.

Some practical work follows…