G89.2228 Lecture 7a: Comparing proportions from independent samples; analysis of matched samples; small samples and 2 × 2 tables; strength of association (odds ratios)


1. G89.2228 Lecture 7a
– Comparing proportions from independent samples
– Analysis of matched samples
– Small samples and 2 × 2 tables
– Strength of association: odds ratios

2. Difference in proportions from independent samples
Henderson-King & Nisbett had a binary outcome that is most appropriately analyzed with methods for categorical data. Consider their question: is the choice to sit next to a Black person different in groups exposed to a disruptive Black vs. a disruptive White person? In their study p1 = 11/37 = .297 and p2 = 16/35 = .457. Are these numbers consistent with the same population proportion? (Is their difference zero?) Consider the general large-sample test statistic Z = (p2 − p1)/SE(p2 − p1), which has an approximate N(0,1) distribution when the sample sizes are large.

3. Differences in proportions
Under the null hypothesis the standard errors of the sample proportions are sqrt[p(1 − p)/n1] and sqrt[p(1 − p)/n2], where p is the common population proportion. Under the null hypothesis the common proportion is estimated by pooling the data: p = (11 + 16)/(37 + 35) = .375. The estimated variance of the difference is p(1 − p)(1/n1 + 1/n2), so the standard error is .114. The Z statistic is then Z = (.457 − .297)/.114 = 1.40. The two-tailed p-value is .16. A 95% CI bound on the difference is .160 ± (1.96)(.114) = (−.06, .38). It includes the H0 value of zero.
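The slide's arithmetic can be checked with a short script. This is a minimal sketch using only the standard library (the two-tailed normal p-value comes from math.erfc rather than a statistics package):

```python
import math

def two_prop_z(x1, n1, x2, n2):
    """Large-sample z test for the difference of two independent proportions."""
    p1, p2 = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)            # pooled proportion under H0
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (p2 - p1) / se
    p_two = math.erfc(abs(z) / math.sqrt(2))  # two-tailed N(0,1) p-value
    return z, se, p_two

z, se, p = two_prop_z(11, 37, 16, 35)         # counts from the example
print(round(z, 2), round(se, 3), round(p, 2))  # 1.4 0.114 0.16
```

The 95% CI on the slide follows directly as (p2 − p1) ± 1.96·se.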

4. Pearson chi-square for 2 × 2 tables
The z test statistic has a standard normal N(0,1) distribution for large samples, so z² is distributed as χ² with 1 degree of freedom for large samples. From the example, z² = (1.40)² = 1.96. Howell's table for chi-square shows Pr(χ² > 1.96) to be in the range .1 to .25. Pearson's calculation for this test statistic is χ² = Σ (Oi − Ei)²/Ei, where Oi is an observed frequency and Ei is the expected frequency given the null hypothesis of equal proportions.

5. Expected values for no association
From the example, p1 = 11/37 = .297 and p2 = 16/35 = .457. The expected frequencies are based on the pooled p = (11 + 16)/(37 + 35) = .375.

6. Chi-square test of association, continued
Marginal probabilities = pooled probabilities. Expected joint probabilities under H0 = products of the marginals (e.g., .193 = .375 × .514). Ei = expected joint probability × n (e.g., 13.9 = .193 × 72 = 27 × 37/72). We use these values in Pearson's formula.
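The computations on slides 4–6 can be verified numerically. The sketch below builds the expected counts from the table's margins (observed counts taken from the example: 11 of 37 and 16 of 35 respond positively) and confirms that Pearson's χ² equals z²:

```python
def pearson_chi2(table):
    """Pearson chi-square for a table of observed counts."""
    row = [sum(r) for r in table]
    col = [sum(c) for c in zip(*table)]
    n = sum(row)
    chi2 = 0.0
    for i, r in enumerate(table):
        for j, o in enumerate(r):
            e = row[i] * col[j] / n        # expected count under independence
            chi2 += (o - e) ** 2 / e
    return chi2

# rows = groups (n=37 and n=35); columns = (positive response, other)
observed = [[11, 26], [16, 19]]
print(round(pearson_chi2(observed), 2))    # 1.96, matching z**2
```

Note that the first expected count is 37 × 27/72 = 13.875 ≈ 13.9, as on slide 6.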

7. Analysis of 2 × 2 tables for small samples
The z and χ² tests are justified on the basis of the central limit theorem and will be approximately correct for fairly small n's. What if the sample is ridiculously small?
– Rule of thumb: if an expected frequency is less than 2.5, the sample is small.
For small n's, Fisher recommended a randomization test:
– Suppose we have N subjects, g1 of whom are in group 1, and r1 overall respond positively.
– Under H0, response and group are independent.
– Consider this thought experiment: put all N subjects in an urn, randomly draw r1 subjects, and pretend that they are the positive responders. How often would the original pattern of data emerge from such a random process?

8. Fisher's exact test
Suppose we have the following table. Pearson chi-square would be 3.6, and the two-tailed p is .058. The hypergeometric probability of getting 1 or fewer Group 2 responses (given that 5 people responded) is, in the notation of the previous slide, Σ over k = 0, 1 of C(g2, k)·C(N − g2, r1 − k)/C(N, r1).
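Since the slide's table was not preserved in the transcript, here is a generic sketch of the hypergeometric tail probability behind Fisher's exact test, demonstrated on a hypothetical table (20 subjects, 10 per group, 5 responders overall; these numbers are illustrative assumptions, not the slide's data). math.comb does the counting:

```python
from math import comb

def hypergeom_tail(N, g2, r1, k_max):
    """P(k_max or fewer of the r1 responders fall in group 2 of size g2)."""
    total = comb(N, r1)
    return sum(comb(g2, k) * comb(N - g2, r1 - k)
               for k in range(k_max + 1)) / total

# Hypothetical example: probability that 1 or fewer of 5 responders
# land in a group of 10 out of 20 subjects.
p = hypergeom_tail(N=20, g2=10, r1=5, k_max=1)
print(round(p, 3))   # 0.152
```

Under H0 this is exactly the urn experiment of slide 7: every way of assigning the r1 responders to subjects is equally likely.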

9. Analysis of matched samples
Many research questions involve comparing proportions computed from related observations (the analogue of the paired t-test):
– Analysis of change
– Within-subjects designs
– Analysis of siblings, spouses, supervisor–employee pairs, …
– Samples constructed by matching on confounding variables
When the outcome is binary, display the data showing the numbers of pairs (the joint distribution).

10. Example (Howell Ex.)
Is the proportion pro the same at the two time points? Note that the marginals (30/40 and 15/40) are not independent. Instead of comparing those proportions, examine those whose opinions changed. Compare the (5, 20) split of changers to the expected (12.5, 12.5) with a chi-square test.

11. McNemar's test
McNemar showed that this test [whether (5, 20) is significantly different from (12.5, 12.5)] may be computed, with the Yates correction for continuity, as χ² = (|b − c| − 1)²/(b + c), where b and c count the pairs changing in each direction. For the example, (20 − 5 − 1)²/25 = 7.84, which is unusual for a χ² with 1 d.f., yielding p = .005.
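A minimal sketch of the Yates-corrected McNemar statistic for the example's 25 opinion-changers (b = 20 changed one way, c = 5 the other); the 1-d.f. χ² tail probability is again obtained from math.erfc:

```python
import math

def mcnemar_yates(b, c):
    """McNemar test with Yates continuity correction for paired binary data."""
    chi2 = (abs(b - c) - 1) ** 2 / (b + c)
    # For 1 d.f., Pr(chi-square > x) equals the two-tailed normal tail at sqrt(x).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

chi2, p = mcnemar_yates(20, 5)
print(chi2, round(p, 3))   # 7.84 0.005
```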

12. Confidence interval for a matched proportion difference
Fleiss (1981) recommends using the general form of the symmetric CI for the difference between p1 and p2, where the standard error is estimated as SE = sqrt[(b + c) − (b − c)²/n]/n, with b and c the off-diagonal pair counts. E.g., the 95% CI for the difference (30/40) − (15/40) = .375 requires SE = sqrt[25 − 15²/40]/40 = .110, giving .375 ± (1.96)(.110) = (.159, .591).
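The CI arithmetic can be sketched as follows, using the matched-pairs SE attributed to Fleiss on the slide (b and c are the two kinds of changers, n the number of pairs):

```python
import math

def matched_diff_ci(b, c, n, z=1.96):
    """Symmetric CI for the difference of two matched proportions."""
    diff = (b - c) / n                                 # (20 - 5)/40 = .375
    se = math.sqrt((b + c) - (b - c) ** 2 / n) / n     # Fleiss-style SE
    return diff - z * se, diff + z * se

lo, hi = matched_diff_ci(b=20, c=5, n=40)
print(round(lo, 3), round(hi, 3))   # 0.159 0.591
```

Unlike the McNemar test itself, this interval uses only the changers' counts and the total n, so matched data must be displayed as pair counts, as slide 9 advises.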

13. Measures of association
Consider two tables. The proportions with D in groups A and B are .90 vs. .50 in the first table (.90 − .50 = .40) and .82 vs. .33 in the second (.82 − .33 = .49). Is the association stronger in the second table?

14. Odds ratios as an alternative to differences in proportions
The proportions in group A at levels D and ~D do not differ across the tables, yet which way we look at the table gives different answers. The odds of D (vs. ~D) are 9 to 1 in group A and 1 to 1 in group B in the first table. The odds ratio is 9: the odds are 9 times greater for group A than for group B. The odds ratio is also 9 in the second table.

15. Properties of odds ratios
– Invariant to multiplying rows or columns by a constant
– Equal to one for equal odds
– Approaches infinity when the off-diagonal cells approach zero
– Approaches zero when the diagonal cells approach zero
– Easily computed as ω = ad/bc
– log(ω) (a "logit") has a less obvious interpretation but nicer scale features: the equal-odds point is ln(1) = 0

16. Confidence interval on the odds ratio
Like other bounded parameters, ω has confidence intervals that are difficult to construct directly (a symmetric bound does not work well). An approximate but improved CI on ln(ω) = ln(ad/bc) uses SE[ln(ω)] = sqrt(1/a + 1/b + 1/c + 1/d). Compute the CI on ln(ω), then take the antilog (i.e., e^x) of each bound.
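A sketch of the log-scale CI (often called Woolf's method), assuming hypothetical cell counts a = 9, b = 1, c = 5, d = 5, chosen to reproduce the first table's 9:1 vs. 1:1 odds; the slide's actual counts are not shown in the transcript:

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Approximate CI for the odds ratio via the log scale."""
    or_ = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lo = math.exp(math.log(or_) - z * se_log)   # antilog of each bound
    hi = math.exp(math.log(or_) + z * se_log)
    return or_, lo, hi

or_, lo, hi = odds_ratio_ci(9, 1, 5, 5)
print(or_, round(lo, 2), round(hi, 1))   # 9.0 0.81 100.1
```

The interval is wildly asymmetric around 9 on the odds-ratio scale but symmetric on the log scale, which is why the computation is done on ln(ω).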

17. Example
From the first table above, the 95% CI on ln(ω) is ln(ω) ± (1.96)·sqrt(1/a + 1/b + 1/c + 1/d). The 95% CI on ω, obtained by taking antilogs of the bounds, is thus an asymmetric confidence interval.