Practical Solutions: Comparing Proportions & Analysing Categorical Data



1. Treatment group (GROUP) and the status at time 2 (CAT2) are nominal categorical variables, and as they are not repeated measures of the same outcome we can use the Chi-square test or Fisher’s exact test (Analyze > Descriptive Statistics > Crosstabs…). The step-by-step instructions for doing this are contained in the notes, but the syntax is included below:

* Chi-square / Fisher’s test.
CROSSTABS
  /TABLES=GROUP BY CAT2
  /FORMAT= AVALUE TABLES
  /STATISTIC=CHISQ
  /CELLS= COUNT ROW
  /COUNT ROUND CELL
  /METHOD=EXACT TIMER(5).

1. The SPSS output is included below (the key sections are highlighted):

[SPSS output not reproduced in this transcript]

1. The presentation for this type of analysis was mentioned in the notes, but you should include the cross-tabulation alongside only the most appropriate p value. As there are no cells with an expected count below 5 here (as shown by the highlighted footer in the SPSS output), we should use the Pearson Chi-square p value.

There was found to be no significant association (Pearson Chi-square: p = 0.147) between HbA1c status at follow-up and treatment group.

Difference in proportions and association are the same test, and the names are often used interchangeably. Therefore, as there was no significant association, there is no significant difference in proportions.
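The decision rule above (check the expected counts, then read the Pearson Chi-square) can be sketched by hand. This is a minimal illustration in Python rather than SPSS, and the counts are HYPOTHETICAL, not the practical's data; the function name `pearson_chi_square` is made up for the example. The closed-form p value used here only holds for 2 degrees of freedom, which is what a 3-group by 2-category table gives.

```python
import math

def pearson_chi_square(table):
    """Return (statistic, df, expected counts) for a table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count for each cell = row total * column total / grand total.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    statistic = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, df, expected

# HYPOTHETICAL counts: 3 treatment groups (rows) by 2 status categories (columns).
observed = [[30, 20], [25, 25], [20, 30]]
statistic, df, expected = pearson_chi_square(observed)

# The Pearson test is usually trusted only if no expected count is below 5.
cells_below_5 = sum(e < 5 for row in expected for e in row)

# For df == 2 only, the chi-square upper tail probability is exactly exp(-x / 2).
p_value = math.exp(-statistic / 2) if df == 2 else None

print(f"chi-square = {statistic:.3f}, df = {df}, p = {p_value}")
print(f"cells with expected count below 5: {cells_below_5}")
```

If any expected count were below 5, the Fisher’s exact p value from SPSS would be the one to report instead; that calculation is not attempted in this sketch.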

2. The selection of specific cases was covered earlier (Data > Select Cases…). One way of writing the syntax to select just 2 of the 3 treatment groups is given below:

* Filtering the data to select only 2 treatment groups.
USE ALL.
COMPUTE filter_$=(GROUP=1 OR GROUP=3).
VARIABLE LABEL filter_$ 'GROUP=1 OR GROUP=3 (FILTER)'.
VALUE LABELS filter_$ 0 'Not Selected' 1 'Selected'.
FORMAT filter_$ (f1.0).
FILTER BY filter_$.
EXECUTE.

* Re-running the Chi-square analysis.
CROSSTABS
  /TABLES=GROUP BY CAT2
  /FORMAT= AVALUE TABLES
  /STATISTIC=CHISQ
  /CELLS= COUNT ROW
  /COUNT ROUND CELL.

* Turning off the filter.
FILTER OFF.
USE ALL.
EXECUTE.

2. The SPSS output is included below (the key sections are highlighted):

[SPSS output not reproduced in this transcript]

The Chi-square test is still the most appropriate and, although the p value is now very close to 0.05, we still reach the same conclusion.

2. When we produce the 95% confidence interval for the difference in proportions of patients with HbA1c >= 7, you can see that the CI just includes zero at the lower end. This agrees with our borderline (just non-significant) p value from SPSS earlier.

The CI shows that differences in either direction are just about possible (hence so is no difference), but that the difference could be as large as almost 27%.

(You should present these results in the same way as shown earlier.)
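To see why a CI that just includes zero goes with a p value just above 0.05, a rough sketch of a 95% interval for a difference between two independent proportions follows. It uses the simple normal-approximation (Wald) formula, which is an assumption on my part: the method used by the CI software in the practical is not stated here and may give slightly different limits. The counts and the function name `wald_ci_difference` are HYPOTHETICAL.

```python
import math

def wald_ci_difference(x1, n1, x2, n2, z=1.959964):
    """Wald 95% CI for p1 - p2, where x/n are event counts and group sizes."""
    p1, p2 = x1 / n1, x2 / n2
    diff = p1 - p2
    # Standard error of the difference between two independent proportions.
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return diff, diff - z * se, diff + z * se

# HYPOTHETICAL counts: 29 of 50 versus 20 of 50 patients with HbA1c >= 7.
diff, lower, upper = wald_ci_difference(29, 50, 20, 50)

print(f"difference = {diff:.3f}, 95% CI: {lower:.3f} to {upper:.3f}")
print("CI includes zero:", lower < 0 < upper)
```

With these made-up counts the lower limit falls just below zero, which mirrors the borderline situation described above.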

3. We use the same analysis as was used for question 1. The syntax for this is included below:

* Computing the Chi-square / Fisher’s exact values.
CROSSTABS
  /TABLES=GROUP BY SevCAT2
  /FORMAT= AVALUE TABLES
  /STATISTIC=CHISQ
  /CELLS= COUNT ROW
  /COUNT ROUND CELL
  /METHOD=EXACT TIMER(5).

3. The SPSS output is included below (the key sections are highlighted):

[SPSS output not reproduced in this transcript]

3. Again, the presentation for this type of analysis was mentioned in the notes, but you should include the cross-tabulation alongside only the most appropriate p value. This time there are 3 cells with an expected count below 5 (as shown by the highlighted footer in the SPSS output). In this case we should use the Fisher’s exact test p value.

There was found to be no significant association (Fisher’s exact test: p = 0.641) between severe HbA1c status at follow-up and treatment group.

Both tests indicate no significant difference, but we should report the Fisher’s exact test result here. This is because at least one expected count is below 5, and hence the Pearson Chi-square assumption is not met.

4. This time we are testing one variable against an expected set of proportions. To do this we need to use the one-variable Chi-square (Analyze > Nonparametric tests > Chi-square…). The key element here is realising that category 0 is <7 and is expected to be 60%, while category 1 is >=7 and is expected to be 40%; hence we need to enter 0.6 first and then 0.4. Again, a step-by-step guide for this is included in the notes, but the syntax is included below:

* Calculating the one variable Chi-square.
NPAR TEST
  /CHISQUARE=CAT1
  /EXPECTED=0.6 0.4
  /MISSING ANALYSIS
  /METHOD=EXACT TIMER(5).

4. The SPSS output is included below (the key sections are highlighted):

[SPSS output not reproduced in this transcript]

There is highly significant evidence that the sample does not have 60% of patients with a HbA1c <7 and 40% with a HbA1c >=7. We can use the Chi-square (Asymp. Sig.) value here as the expected count assumption is met.
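The one-variable Chi-square compares observed counts with the counts implied by the hypothesised proportions (0.6 then 0.4, in category order). A minimal Python sketch of that arithmetic is given below, using HYPOTHETICAL counts and a made-up function name `goodness_of_fit`. The p value shortcut via `math.erfc` is only valid for 1 degree of freedom, which is the case with two categories.

```python
import math

def goodness_of_fit(observed, expected_props):
    """Chi-square goodness-of-fit statistic and the expected counts."""
    n = sum(observed)
    expected = [n * p for p in expected_props]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return statistic, expected

# HYPOTHETICAL counts: category 0 (HbA1c < 7) first, then category 1 (HbA1c >= 7).
observed = [90, 110]
statistic, expected = goodness_of_fit(observed, [0.6, 0.4])

# With df == 1, P(chi-square > x) equals P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(statistic / 2))

print(f"expected counts: {expected}")
print(f"chi-square = {statistic:.2f}, p = {p_value:.6f}")
```

The order of the proportions matters: swapping 0.6 and 0.4 would test a different hypothesis, which is why the text stresses entering 0.6 first.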

4. From looking at the CIA confidence interval we can see that (with >=7 as the ‘feature’) the confidence interval for the ‘feature’ excludes the test value of 0.4, hence agreeing with our SPSS finding.

The CI also indicates that the proportion is higher than 0.4, with the minimum likely value being 48.8% (0.488).
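Checking whether a test value lies inside a confidence interval for a single proportion can be sketched as follows. The Wilson score interval is used here as one common choice; whether the CI software in the practical uses this exact method is an assumption. The counts and the function name `wilson_ci` are HYPOTHETICAL, so the limits will not match the 0.488 quoted above.

```python
import math

def wilson_ci(x, n, z=1.959964):
    """Wilson score 95% CI for a single proportion x / n."""
    p = x / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return p, centre - half_width, centre + half_width

# HYPOTHETICAL counts: 110 of 200 patients with HbA1c >= 7 (the 'feature').
p, lower, upper = wilson_ci(110, 200)
test_value = 0.4

print(f"proportion = {p:.3f}, 95% CI: {lower:.3f} to {upper:.3f}")
print("CI excludes the test value:", not (lower <= test_value <= upper))
```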

5. The status variables at time 1 and time 2 (CAT1 & CAT2) are nominal categorical variables that are repeated measures of the same outcome. Because of this we need to use McNemar’s test to assess whether there was a significant change in response (Analyze > Descriptive Statistics > Crosstabs…). The step-by-step instructions for doing this are contained in the notes, but the syntax is included below:

* McNemar test.
CROSSTABS
  /TABLES=CAT1 BY CAT2
  /FORMAT= AVALUE TABLES
  /STATISTIC=MCNEMAR
  /CELLS= COUNT TOTAL
  /COUNT ROUND CELL.

5. The SPSS output is included below (the key sections are highlighted):

[SPSS output not reproduced in this transcript]

It can be seen that the percentages that are changing in each direction are quite different (27.3% and 0.4%), so it is no surprise to see a highly significant McNemar p value (p < 0.001).

5. When we produce the 95% confidence interval for the difference in proportions of patients changing HbA1c status in each direction, it can be seen that the CI is quite a long way from zero. This agrees with our highly significant p value from SPSS earlier.

The CI shows that there are quite large differences in favour of a reduction in HbA1c levels, with at least 21.1% more patients improving than getting worse.

(You should present these results in the same way as shown earlier.)

5. Yet again, the presentation for this type of analysis was mentioned in the notes, but it should include the cross-tabulation alongside the McNemar p value and a confidence interval for the difference in the proportions changing in each direction.

There was found to be a highly significant change in HbA1c control status (McNemar test: p < 0.001) between the two measurements, in favour of improving control or a lowering of HbA1c levels (Difference 26.9%, 95% CI: 21.1% to 32.5%).
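McNemar’s test depends only on the discordant pairs, that is, the patients whose status changed between the two time points. The sketch below computes an exact two-sided p value from the binomial distribution and a simple Wald interval for the paired difference. The counts and function names are HYPOTHETICAL, and the Wald interval is an assumption: it is not necessarily the method behind the CI quoted above.

```python
import math

def mcnemar_exact(b, c):
    """Exact two-sided McNemar p value from the two discordant counts."""
    n_disc = b + c
    # Under H0 each discordant pair is equally likely to fall in either direction.
    tail = sum(math.comb(n_disc, k) for k in range(min(b, c) + 1)) / 2 ** n_disc
    return min(1.0, 2 * tail)

def paired_difference_ci(b, c, n, z=1.959964):
    """Wald 95% CI for the difference in paired proportions, (b - c) / n."""
    diff = (b - c) / n
    se = math.sqrt(b + c - (b - c) ** 2 / n) / n
    return diff, diff - z * se, diff + z * se

# HYPOTHETICAL counts: of n = 200 patients, b = 50 improved and c = 4 got worse.
b, c, n = 50, 4, 200
p_value = mcnemar_exact(b, c)
diff, lower, upper = paired_difference_ci(b, c, n)

print(f"McNemar exact p = {p_value:.2e}")
print(f"difference = {diff:.3f}, 95% CI: {lower:.3f} to {upper:.3f}")
```

Patients whose status did not change contribute to n but not to the test itself, which is why very unequal discordant percentages lead to a small p value.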

6. We need to enter the data as a summary table in SPSS in the following fashion:

[Data table not reproduced in this transcript]

6. Remember that we also need to weight the cases by the count variable. Having applied the weights, we can move on to assess the agreement between the raters. We should use the Kappa technique to do this, and step-by-step instructions were included in the session notes. Syntax for both of these steps is included below:

* Weighting the cases.
WEIGHT BY Count.

* Producing the Kappa.
CROSSTABS
  /TABLES=Rater1 BY Rater2
  /FORMAT= AVALUE TABLES
  /STATISTIC=KAPPA
  /CELLS= COUNT TOTAL
  /COUNT ROUND CELL.

6. The SPSS output is included below (the key sections are highlighted):

[SPSS output not reproduced in this transcript]

The outlined cells indicate agreement between the two raters. The absolute agreement is 78% (… = 78). The Kappa statistic of … indicates that there is a good level of agreement.
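Kappa corrects the absolute agreement for the agreement expected by chance from the row and column totals. A minimal sketch of the calculation on a summary table follows; the counts are HYPOTHETICAL (they are not the raters' data from this question) and the function name `cohen_kappa` is made up for the example.

```python
def cohen_kappa(table):
    """Return (observed agreement, kappa) for a square rater-by-rater table."""
    n = sum(sum(row) for row in table)
    # Observed agreement: proportion of cases on the diagonal.
    observed = sum(table[i][i] for i in range(len(table))) / n
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    # Chance agreement: sum over categories of (row share * column share).
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    return observed, (observed - expected) / (1 - expected)

# HYPOTHETICAL summary table: rows are Rater1 categories, columns are Rater2.
table = [[45, 5],
         [10, 40]]
observed_agreement, kappa = cohen_kappa(table)

print(f"absolute agreement = {observed_agreement:.2f}, kappa = {kappa:.2f}")
```

Kappa is always lower than the absolute agreement when some agreement is expected by chance, so the two figures should be reported together.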

6. Using CIA we can get a 95% CI for the Kappa statistic:

[CIA output not reproduced in this transcript]

The CI shows that in the worst case the agreement between the raters may only be … (or of moderate level).