INFERENTIAL STATISTICS: DIFFERENCE TESTS


INFERENTIAL STATISTICS: DIFFERENCE TESTS © LOUIS COHEN, LAWRENCE MANION AND KEITH MORRISON © 2018 Louis Cohen, Lawrence Manion and Keith Morrison; individual chapters, the contributors

STRUCTURE OF THE CHAPTER
- Measures of difference between groups
- The t-test (a difference test for parametric data)
- Analysis of variance (a difference test for parametric data)
- The chi-square test (a difference test and a test of goodness of fit for non-parametric data)
- Degrees of freedom (a statistic used in calculating statistical significance for difference tests)
- The Mann-Whitney and Wilcoxon tests (difference tests for non-parametric data)
- The Kruskal-Wallis and Friedman tests (difference tests for non-parametric data)

MEASURES OF DIFFERENCE BETWEEN GROUPS
Are there differences between two or more groups or sub-samples? For example:
- Is there a statistically significant difference between the amount of homework done by boys and girls?
- Is there a statistically significant difference between test scores from four similarly mixed-ability classes studying the same syllabus?
- Does school A differ statistically significantly from school B in the stress level of its sixth-form students?

DIFFERENCE TESTS VARY ACCORDING TO . . .
- . . . the kind of data with which one is working (parametric or non-parametric);
- . . . the number of groups being compared;
- . . . whether the groups are related or independent.
Independent groups are entirely unrelated to each other, e.g. males and females; related groups might be the same group voting on two or more variables, or the same group voting at two different points in time (e.g. a pre-test and a post-test).

MEASURES OF DIFFERENCE BETWEEN GROUPS
- The t-test (for two groups): parametric data.
- Analysis of variance (ANOVA) (for three or more groups): parametric data.
- The chi-square test: for categorical data.
- The Mann-Whitney and Wilcoxon tests (for two groups): non-parametric data.
- The Kruskal-Wallis and Friedman tests (for three or more groups): non-parametric data.

T-TEST
Devised by William Sealy Gosset (writing as 'Student') in 1908. Used when we have two conditions; the t-test assesses whether there is a statistically significant difference between the means of two independent groups, or of the same group under two conditions.
- The independent t-test is used when the participants in the two conditions are independent of each other.
- The related (paired) t-test is used when the same participants perform in both conditions, e.g. at two time points or on two variables.

SAFETY CHECKS FOR THE T-TEST
- Parametric continuous data, with the dependent variable at interval or ratio level.
- Random sampling and random allocation.
- Normal distribution of the data (though large samples often overcome this).
- Equality of variance in each group ('homogeneity of variance'), though the Levene test can identify problems here.
If these safety requirements are not met, then the researcher should use a non-parametric difference test (e.g. the Mann-Whitney U test or the Wilcoxon test), even if the data are interval or ratio.

T-TEST FOR PARAMETRIC DATA
t-tests (parametric; interval and ratio data) are used to find whether there are differences between two groups. Decide whether the samples are independent or related:
- Independent sample: two different groups on one occasion.
- Related sample: one group on two occasions.

T-TEST FOR PARAMETRIC DATA
Formula for computing the t-test:
$$ t = \frac{\bar{X}_1 - \bar{X}_2}{SE_{\bar{X}_1 - \bar{X}_2}} $$
i.e. the difference between the two sample means, divided by the standard error of the difference in means.

Formula for calculating t:
$$ t = \frac{M_1 - M_2}{SE_d} $$
where:
M = mean
d = the difference between the means (M1 − M2)
N = number of cases (used in computing the standard error of the difference, SE_d)
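To make the formula concrete, here is a minimal Python sketch (using scipy and two small made-up samples; the numbers are illustrative, not data from the slides) that computes t both by hand and with a library call:

```python
import numpy as np
from scipy import stats

a = np.array([12, 14, 11, 15, 13, 12])  # hypothetical scores, group one
b = np.array([9, 10, 8, 11, 10, 9])     # hypothetical scores, group two

# Standard error of the difference in means, using the pooled variance.
n1, n2 = len(a), len(b)
pooled_var = ((n1 - 1) * a.var(ddof=1) + (n2 - 1) * b.var(ddof=1)) / (n1 + n2 - 2)
se_diff = np.sqrt(pooled_var * (1 / n1 + 1 / n2))

t_by_hand = (a.mean() - b.mean()) / se_diff  # difference in means / SE of difference
t_scipy, p = stats.ttest_ind(a, b)           # scipy's independent t-test agrees
print(t_by_hand, t_scipy, p)
```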

INDEPENDENT AND RELATED SAMPLES IN A T-TEST: EXAMPLES
Independent sample (two groups): A group of scientists wants to study the effects of a new drug for insomnia. They administer the drug to a random group of people (control group) and to a group of people suffering from insomnia (experimental group).
Related sample (same group in two conditions): A group of therapists wants to study whether there is any difference in doing relaxation techniques on the beach or in an apartment. A group of people is asked first to do relaxation on the beach and later in an apartment.

INDEPENDENT AND RELATED SAMPLES IN A T-TEST: AN EXAMPLE
24 people were involved in an experiment to determine whether background noise affects short-term memory (recall of words).
- If half of the sample were allocated to the NOISE condition and the other half to the NO NOISE condition (independent samples), we use the independent t-test.
- If everyone in the sample performed in both conditions (related samples), we use the paired or related t-test.

AN EXAMPLE OF A T-TEST
Participants were asked to memorize a list of 20 words in two minutes. Half of the sample performs in a noisy environment and the other half in a quiet environment.
Independent variable: two types of environment:
- Quiet environment (NO NOISE condition)
- Noisy environment (NOISE condition)
Dependent variable: the number of words each participant can recall.

Summary of the recall scores (the raw-score columns did not survive the transcript):

Condition | Σ | Mean | SD | Range of scores
NOISE | 87 | 7.3 | 2.5 | 3 to 11
NO NOISE | 166 | 13.8 | 2.8 | 9 to 18

NOTE: participants vary within conditions: in the NOISE condition, the scores range from 3 to 11; in the NO NOISE condition, they range from 9 to 18. The participants differ between the conditions too: the scores in the NO NOISE condition are, in general, higher than those in the NOISE condition; the means confirm it. Are the differences between the means of our groups large enough for us to conclude that the differences are due to our independent variable, the NOISE/NO NOISE manipulation?

T-TEST FOR INDEPENDENT SAMPLES
Group Statistics

In which condition are you? | N | Mean | Std. Deviation | Std. Error Mean
NOISE | 12 | 7.2500 | 2.49 | .71906
NO NOISE | 12 | 13.8333 | 2.76 | .79614

This shows: the names of the two conditions; the number of cases in each condition; the mean of each condition; and the standard deviation and standard error of the mean for the two conditions.

T-TEST FOR INDEPENDENT SAMPLES (SPSS)
Independent Samples Test

How many words can you recall? | Levene's F | Sig. | t | df | Sig. (2-tailed) | Mean Difference | Std. Error Difference | 95% CI Lower | 95% CI Upper
Equal variances assumed | .177 | .676 | -6.137 | 22 | .000 | -6.5833 | 1.07279 | -8.808 | -4.359
Equal variances not assumed | | | -6.137 | 21.78 | .000 | -6.5833 | 1.07279 | -8.809 | -4.357

The Levene test is for 'homogeneity of variance', and its result indicates whether you should use the upper or the lower row. 'Mean Difference' is the difference between the means of the two groups.

REPORTING FINDINGS FROM THE EXAMPLE
Participants in the NOISE condition recalled fewer words (M = 7.25, SD = 2.49) than those in the NO NOISE condition (M = 13.83, SD = 2.76). The mean difference between conditions was 6.58; the 95 per cent confidence interval for the estimated population mean difference is between 4.36 and 8.81. An independent t-test revealed that, if the null hypothesis were true, such a result would be highly unlikely to have arisen (t(22) = 6.14; p < 0.001). It is therefore concluded that listening to noise affects short-term memory, at least in respect of word recall.
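These reported values can be checked from the summary statistics alone; a small sketch using scipy's ttest_ind_from_stats, with the means, SDs and group sizes given above:

```python
from scipy import stats

# Summary statistics from the word-recall example above (12 cases per condition).
t, p = stats.ttest_ind_from_stats(
    mean1=7.25, std1=2.49, nobs1=12,    # NOISE condition
    mean2=13.83, std2=2.76, nobs2=12,   # NO NOISE condition
)
print(t, p)  # |t| is approximately 6.14; p < 0.001
```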

T-TEST FOR INDEPENDENT SAMPLES WITH SPSS
Group Statistics

Which group are you? | N | Mean | Std. Deviation | Std. Error Mean
Control group | 166 | 8.69 | 1.220 | .095
Experimental Group One | 166 | 9.45 | .891 | .069

Independent Samples Test

Mathematics post-test score | Levene's F | Sig. | t | df | Sig. (2-tailed) | Mean Difference | Std. Error Difference | 95% CI Lower | 95% CI Upper
Equal variances assumed | 28.856 | .000 | -6.523 | 330 | .000 | -.765 | .117 | -.996 | -.534
Equal variances not assumed | | | | 302.064 | | | | |

Read the line 'Levene's Test for Equality of Variances'. If its probability value is statistically significant, then your variances are unequal; otherwise they are regarded as equal. If the Levene probability value is not statistically significant, you need the row 'Equal variances assumed'; if it is statistically significant, you need the row 'Equal variances not assumed'. Then look at the column 'Sig. (2-tailed)' in the appropriate row to see whether the result is statistically significant.
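The decision rule just described can be automated; a sketch (the helper name and data are hypothetical) that runs Levene's test first and then picks the matching form of the t-test:

```python
from scipy import stats

def independent_t_with_levene(a, b, alpha=0.05):
    """Hypothetical helper: choose the t-test row the way the SPSS output does.

    If Levene's test is significant, variances are treated as unequal and
    Welch's t-test is used ('equal variances not assumed'); otherwise the
    ordinary pooled t-test is used ('equal variances assumed').
    """
    _, levene_p = stats.levene(a, b)
    equal_var = levene_p >= alpha
    t, p = stats.ttest_ind(a, b, equal_var=equal_var)
    return t, p, equal_var

t, p, equal_var = independent_t_with_levene([8, 9, 7, 10, 9], [9, 11, 10, 12, 11])
print(t, p, "equal variances assumed" if equal_var else "equal variances not assumed")
```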

PAIRED SAMPLE T-TEST (SAME GROUP UNDER TWO CONDITIONS) WITH SPSS
Paired Samples Statistics

Pair 1 | Mean | N | Std. Deviation | Std. Error Mean
Mathematics pre-test score | 6.95 | 252 | 1.066 | .067
Mathematics post-test score | 8.94 | 252 | 1.169 | .074

This indicates: the two conditions; the mean of each condition; the number of cases in each condition; and the standard deviation and standard error of the mean for the two conditions.

PAIRED SAMPLE T-TEST (SAME GROUP UNDER TWO CONDITIONS) WITH SPSS
Paired Samples Correlations

Pair 1 | N | Correlation | Sig.
Mathematics pre-test score & Mathematics post-test score | 252 | .020 | .749

This shows that there is no association between the scores on the pre-test and the scores on the post-test for the group in question (r = .02, p = .749).

PAIRED SAMPLE T-TEST (SAME GROUP UNDER TWO CONDITIONS) WITH SPSS
Paired Samples Test

Pair 1: Mathematics pre-test score − Mathematics post-test score
Mean | Std. Deviation | Std. Error Mean | 95% CI Lower | 95% CI Upper | t | df | Sig. (2-tailed)
-1.992 | 1.567 | .099 | -2.186 | -1.798 | -20.186 | 251 | .000

This shows that:
- The difference between the means of the two conditions (6.95 and 8.94) is 1.992.
- The confidence interval shows that we can be 95 per cent certain that the population difference lies somewhere between -2.186 and -1.798.
- There is a statistically significant difference between the two sets of scores.

RESULT
It can be seen from the paired t-test that the null hypothesis is not supported (t(251) = 20.186; p < .001).
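A minimal sketch of the paired (related) t-test in scipy, with hypothetical pre-test and post-test scores for ten people (not the 252-case data set above):

```python
from scipy import stats

pre = [6, 7, 7, 8, 6, 7, 5, 8, 7, 6]     # hypothetical pre-test scores
post = [8, 9, 8, 10, 9, 9, 7, 10, 9, 8]  # hypothetical post-test scores (same people)

t, p = stats.ttest_rel(pre, post)  # paired/related t-test
print(f"t({len(pre) - 1}) = {t:.3f}, p = {p:.4f}")
```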

DEGREES OF FREEDOM
The number of individual scores that can vary without changing the sample mean; the number of scores one needs to know before one can calculate the others. For example:
- If you are asked to choose two numbers that must add up to 100, and the first is 89, then the other has to be 11: there is 1 degree of freedom (89 + x = 100).
- If you are asked to choose three numbers that must add up to 100, and the first of these is 20, then you have 2 degrees of freedom (20 + x + y = 100).

DEGREES OF FREEDOM (WITH SPSS)
Which group are you? * Who are you? Crosstabulation

Which group are you? | Chinese | Non-Chinese | Total
Control group | 156 (94.0%) | 10 (6.0%) | 166 (100.0%)
Experimental Group One | 166 (100.0%) | 0 (.0%) | 166 (100.0%)
Experimental Group Two | 143 (85.1%) | 25 (14.9%) | 168 (100.0%)
Total | 465 (93.0%) | 35 (7.0%) | 500 (100.0%)

Degrees of freedom = 2 (1 degree of freedom in each of two rows, which fixes what must be in the third row).

ANALYSIS OF VARIANCE (ANOVA)
Parametric; interval and ratio data. Used to see if there are any statistically significant differences between the means of two or more groups. It calculates the grand mean (i.e. the mean of the means of each condition) and sees how different each of the individual group means is from the grand mean. Premised on the same assumptions as t-tests: random sampling, a normal distribution of scores, categorical independent variable(s) (e.g. teachers, students) and a continuous dependent variable (e.g. marks on a test).

ANOVA AND MANOVA
- One-way analysis of variance: one categorical independent variable and one continuous dependent variable.
- Two-way analysis of variance: two categorical independent variables and one continuous dependent variable.
- Multivariate analysis of variance (MANOVA): one categorical independent variable and two or more continuous dependent variables.
- Post-hoc tests (e.g. the Tukey hsd test, the Scheffé test) locate where the differences between means lie (i.e. in which group(s)).

SAFETY CHECKS FOR ANOVA
- Continuous parametric data.
- Random sampling and random allocation.
- Normal distribution of the data (though large samples often overcome this).
- Homogeneity (equality) of variances (the Levene test can identify problems here, and the Brown-Forsythe and Welch tests can overcome them).

SAFETY CHECKS FOR MANOVA
- Continuous parametric data for the dependent variables.
- Categorical independent variables, with two or more values.
- Groups are independent of each other.
- Random sampling and random allocation.
- Adequate sample size (more cases in each cell than the number of dependent variables being studied).
- Normal distribution of the data (though large samples can overcome this).
- No outliers.
- A linear relationship between each pair of dependent variables.
- No multicollinearity (the dependent variables should be moderately correlated with each other, but not so highly correlated that they are redundant).
- Homogeneity (equality) of variances (the Levene test can identify problems here, and the Brown-Forsythe and Welch tests can overcome them).

FORMULA FOR ANOVA
$$ F = \frac{MS_{between}}{MS_{within}} = \frac{SS_{between}/df_{between}}{SS_{within}/df_{within}} $$
where SS = sum of squares, MS = mean square and df = degrees of freedom.

Between-groups and within-groups variance:

Group 1 | Group 2 | Group 3
9 | 15 | 21
9 | 15 | 25
9 | 16 | 17
9 | 15 | 22
9 | 16 | 26
Mean = 9 | Mean = 15.4 | Mean = 22.2

- Variation between the groups (group means from 9 to 22.2);
- Variation within the first group (no variation, since all participants scored the same);
- Variation within the second group (from 15 to 16);
- Variation within the third group (from 17 to 26).

ANOVA
First, ANOVA calculates the mean for each of the three groups. Then it calculates the grand mean (the three means added together, then divided by three). For each group separately, the total deviation of each individual's score from the mean of the group is calculated (within-groups variation). Then the deviation of each group mean from the grand mean is calculated (between-groups variation).
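The steps just described can be followed in code; a sketch using the three groups from the table above (as reconstructed there) that computes the between-groups and within-groups sums of squares by hand and checks the resulting F ratio against scipy:

```python
import numpy as np
from scipy import stats

# The three illustrative groups from the slide above.
g1 = [9, 9, 9, 9, 9]
g2 = [15, 15, 16, 15, 16]
g3 = [21, 25, 17, 22, 26]

groups = [np.asarray(g) for g in (g1, g2, g3)]
grand_mean = np.mean(np.concatenate(groups))  # mean of all scores

# Between-groups and within-groups sums of squares, as described above.
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

df_between = len(groups) - 1
df_within = sum(len(g) for g in groups) - len(groups)
f_by_hand = (ss_between / df_between) / (ss_within / df_within)

f_scipy, p = stats.f_oneway(g1, g2, g3)  # scipy's one-way ANOVA gives the same F
print(f_by_hand, f_scipy, p)
```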

F RATIO
When we conduct our experiment, we hope that the between-groups variance will be very much larger than the within-groups variance, giving a large F ratio. A large F ratio shows us that one (or more) of the individual group means differs significantly from the grand mean. However, it does not tell us which means are statistically significantly different.

F(3, 13) = .420, p = .742
(The first number in parentheses, 3, is the between-groups degrees of freedom; the second, 13, is the within-groups degrees of freedom.)

RESULTS
An F ratio of .420 has been obtained, with a probability of p = .742. This tells us that there is no statistically significant difference between any of the groups.

EFFECT SIZE: PARTIAL ETA SQUARED
$$ \eta_p^2 = \frac{SS_{effect}}{SS_{effect} + SS_{error}} $$
where:
SS_effect = the sum of squares for whatever effect is of interest;
SS_error = the sum of squares for whatever error term is associated with that effect.

EFFECT SIZE: PARTIAL ETA SQUARED IN SPSS
Between-Subjects Factors

Which group are you? | Value Label | N
1 | Control group | 166
2 | Experimental Group One | 166
3 | Experimental Group Two | 168

EFFECT SIZE: PARTIAL ETA SQUARED IN SPSS
Tests of Between-Subjects Effects
Dependent Variable: Mathematics post-test score

Source | Type III Sum of Squares | df | Mean Square | F | Sig. | Partial Eta Squared
Corrected Model | 48.583(a) | 2 | 24.291 | 23.168 | .000 | .085
Intercept | 41113.093 | 1 | 41113.093 | 39211.301 | .000 | .987
group | 48.583 | 2 | 24.291 | 23.168 | .000 | .085
Error | 521.105 | 497 | 1.049 | | |
Total | 41684.000 | 500 | | | |
Corrected Total | 569.688 | 499 | | | |

a. R Squared = .085 (Adjusted R Squared = .082)
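The partial eta squared of .085 reported above can be reproduced directly from the sums of squares in the table; a one-line check in Python:

```python
# Partial eta squared = SS_effect / (SS_effect + SS_error), using the table above.
ss_effect = 48.583  # Type III sum of squares for 'group'
ss_error = 521.105  # error sum of squares
print(round(ss_effect / (ss_effect + ss_error), 3))  # 0.085, matching the SPSS output
```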

THE POST HOC TUKEY TEST
The null hypothesis for the ANOVA F-test is always that the samples come from populations with the same mean (i.e. no statistically significant differences): H0: μ1 = μ2 = μ3 = …
If the p-value is so low that we reject the null hypothesis, we have decided that at least one of these populations has a mean that is not equal to the others. The F-test itself only tells us that there are differences between at least one pair of means, not where these differences lie.

POST HOC TESTS
To determine which samples are statistically significantly different, having performed the F-test and rejected the null hypothesis, we turn to post hoc comparisons. The purpose of a post hoc analysis is to find out exactly where those differences are. Post hoc tests allow us to make multiple pairwise comparisons and determine which pairs are statistically significantly different from each other and which are not.

THE POST HOC TUKEY TEST
Tukey's Honestly Significant Difference (hsd) test is used to test the hypothesis that all possible pairs of means are equal. It compares the mean difference for each pair of means to a critical value: if the mean difference for a pair exceeds the critical value, we conclude that there is a statistically significant difference between that pair.
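A sketch of Tukey's hsd test in Python using statsmodels' pairwise_tukeyhsd (the scores and group labels here are hypothetical, not the data from the slides):

```python
import numpy as np
from statsmodels.stats.multicomp import pairwise_tukeyhsd

scores = np.array([7, 8, 6, 9, 8,     # hypothetical control-group scores
                   9, 10, 9, 11, 10,  # hypothetical experimental group one
                   8, 8, 7, 9, 8])    # hypothetical experimental group two
groups = np.array(["control"] * 5 + ["exp1"] * 5 + ["exp2"] * 5)

# One row per pair of groups: mean difference, adjusted p-value, CI, reject?
print(pairwise_tukeyhsd(endog=scores, groups=groups, alpha=0.05))
```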

THE SCHEFFÉ TEST
The Scheffé test is very similar to the Tukey hsd test, but it is more stringent than the Tukey test in respect of reducing the risk of a Type I error. This comes with some loss of power: one may be less likely to find a difference between groups with the Scheffé test.

THE GAMES-HOWELL TEST
The Games-Howell test is similar to the Tukey hsd test; it is used when the variances and/or the sub-sample sizes are unequal.

FINDING PARTIAL ETA SQUARED IN SPSS
Multivariate Tests(b)

Effect | Value | F | Hypothesis df | Error df | Sig.
scores: Pillai's Trace | .675 | 1033.477(a) | 1.000 | 497.000 | .000
scores: Wilks' Lambda | .325 | | | |
scores: Hotelling's Trace | 2.079 | | | |
scores: Roy's Largest Root | 2.079 | | | |
scores * group: Pillai's Trace | .040 | 10.366(a) | 2.000 | |
scores * group: Wilks' Lambda | .960 | | | |
scores * group: Hotelling's Trace | .042 | | | |

(The Partial Eta Squared column and the remaining cells did not survive the transcript.)
a. Exact statistic
b. Design: Intercept + group; Within Subjects Design: scores

USING TUKEY TO LOCATE DIFFERENCE IN SPSS
Multiple Comparisons
MEASURE_1, Tukey HSD

(I) Which group are you? | (J) Which group are you? | Mean Difference (I−J) | Std. Error | Sig. | 95% CI Lower Bound | 95% CI Upper Bound
Control group | Experimental Group One | -.42* | .084 | .000 | -.61 | -.22
Control group | Experimental Group Two | -.15 | .084 | .173 | -.35 | .05
Experimental Group One | Control group | .42* | .084 | .000 | .22 | .61
Experimental Group One | Experimental Group Two | .27* | .084 | .005 | .07 | .46
Experimental Group Two | Control group | .15 | .084 | .173 | -.05 | .35
Experimental Group Two | Experimental Group One | -.27* | .084 | .005 | -.46 | -.07

Based on observed means. The error term is Mean Square (Error) = .588.
* The mean difference is significant at the .05 level.

USING TUKEY TO LOCATE DIFFERENCE IN SPSS
MEASURE_1, Tukey HSD(a,b,c): homogeneous subsets

Which group are you? | N | Subset 1 | Subset 2
Control group | 166 | 7.86 |
Experimental Group Two | 168 | 8.01 |
Experimental Group One | 166 | | 8.27
Sig. | | .174 | 1.000

Means for groups in homogeneous subsets are displayed. Based on observed means. The error term is Mean Square (Error) = .588.
a. Uses Harmonic Mean Sample Size = 166.661.
b. The group sizes are unequal; the harmonic mean of the group sizes is used. Type I error levels are not guaranteed.
c. Alpha = .05.

CHI-SQUARE
A measure of a relationship or an association, developed by Karl Pearson in 1900. The chi-square test:
- Measures the association between two categorical variables;
- Compares the observed frequencies with the expected frequencies;
- Determines whether two variables are independent;
- Allows us to find out whether various sub-groups are homogeneous.

TYPES OF CHI-SQUARE
- One-variable chi-square (goodness-of-fit test): used when we have one variable.
- Chi-square test for independence, 2 x 2: used when we are looking for an association between two variables, each with two levels, e.g. the association between drinking (drinks alcohol/does not drink alcohol) and smoking (smokes/does not smoke).
- Chi-square test for independence, r x c: used when we are looking for an association between two variables where at least one has more than two levels, e.g. smoking (heavy smoker, moderate smoker, does not smoke) and drinking (heavy drinker, moderate drinker, does not drink).

FORMULA FOR CHI-SQUARE
$$ \chi^2 = \sum \frac{(O - E)^2}{E} $$
where:
O = observed frequencies
E = expected frequencies
Σ = the sum of

ONE-VARIABLE CHI-SQUARE OR GOODNESS-OF-FIT TEST
- Enables us to discover whether a set of obtained frequencies differs from an expected set of frequencies.
- One variable only.
- The numbers that we find in the various categories are called the observed frequencies.
- The numbers that we expect to find in the categories, if the null hypothesis is true, are the expected frequencies.
- Chi-square compares the observed and the expected frequencies.

EXAMPLE: PREFERENCE FOR CHOCOLATE BARS
A sample of 120 people was asked which of four chocolate bars they preferred.
- We want to find out whether some brands (or one brand) are preferred over others: the research hypothesis.
- If no brand is preferred over the others, then all brands should be equally represented: the null hypothesis.
- If the null hypothesis is true, then we expect 30 (120/4) people in each category.

ONE-VARIABLE CHI-SQUARE OR GOODNESS-OF-FIT TEST
Frequencies

| Chocolate A | Chocolate B | Chocolate C | Chocolate D
Observed | 20 | 70 | 10 | 20
Expected | 30 | 30 | 30 | 30

If all brands of chocolate are equally popular, the observed frequencies will not differ much from the expected frequencies. If, however, the observed frequencies differ greatly from the expected frequencies, then it is likely that the brands are not all equally popular.

ONE-VARIABLE CHI-SQUARE/GOODNESS-OF-FIT TEST

| Observed N | Expected N | Residual (difference between observed and expected frequencies)
Brand A | 20 | 30 | -10.0
Brand B | 70 | 30 | 40.0
Brand C | 10 | 30 | -20.0
Brand D | 20 | 30 | -10.0
Total | 120 | |

Test statistics: chi-square = 73.333, df = 3, Asymp. Sig. = .000.
A chi-square value of 73.3 (df = 3) was found to have an associated probability level of .000, i.e. p < .001. A statistically significant difference was found between the observed and the expected frequencies: the brands of chocolate are not all equally popular. More people prefer chocolate B (70) than the other bars of chocolate.
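The chi-square value of 73.3 can be reproduced with scipy's goodness-of-fit test, using the observed frequencies from the table above (with no expected frequencies supplied, scipy assumes they are equal, i.e. 30 each):

```python
from scipy import stats

observed = [20, 70, 10, 20]          # Brands A-D, from the table above
chi2, p = stats.chisquare(observed)  # expected defaults to equal frequencies (30 each)
print(chi2, p)                       # chi-square is approximately 73.333; p < .001
```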

CHI-SQUARE TEST FOR INDEPENDENCE (BIVARIATE): 2 X 2
Enables us to discover whether there is a relationship or association between two categorical variables, each with two levels. If there is no association between the two variables, then we conclude that the variables are independent of each other.

A WORKED EXAMPLE
Imagine that we have asked 110 students the following:
- Do you smoke and drink?
- Do you smoke but not drink?
- Do you drink but not smoke?
- Do you abstain from both?
Each student can fall into only one group; the four groups are mutually exclusive.

CHI-SQUARE TEST FOR INDEPENDENCE: 2 X 2 (WITH SPSS)
Do you drink? * Do you smoke? Crosstabulation

Do you drink? | Do you smoke? Yes | Do you smoke? No | Total
Yes: Count | 50 | 15 | 65
Yes: Expected Count | 41.4 | 23.6 | 65.0
No: Count | 20 | 25 | 45
No: Expected Count | 28.6 | 16.4 | 45.0
Total: Count | 70 | 40 | 110
Total: Expected Count | 70.0 | 40.0 | 110.0

CHI-SQUARE TEST FOR INDEPENDENCE: 2 X 2 (WITH SPSS)

Test | Value | df | Asymp. Sig. (2-sided)
Pearson Chi-Square | 12.12 | 1 | .000
Continuity Correction | 10.759 | 1 | .001
Likelihood Ratio | 12.153 | 1 |
Fisher's Exact Test | | |
Linear-by-Linear Association | 12.011 | 1 |
N of Valid Cases | 110 | |

(Fisher's Exact Test reports exact one-sided and two-sided significance values; these cells did not survive the transcript.)
Chi-square = 12.12
df (degrees of freedom) = (columns − 1) x (rows − 1) = (2 − 1) x (2 − 1) = 1
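A sketch reproducing this output with scipy's chi2_contingency, using the crosstabulation counts given above:

```python
from scipy import stats

table = [[50, 15],  # drink: smokes / does not smoke
         [20, 25]]  # does not drink: smokes / does not smoke

# correction=False gives the Pearson chi-square (about 12.12);
# correction=True would give the continuity-corrected value (about 10.76).
chi2, p, dof, expected = stats.chi2_contingency(table, correction=False)
print(chi2, dof, p)  # chi-square around 12.12, df = 1, p < .001
print(expected)      # expected counts: 41.4, 23.6, 28.6, 16.4
```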

RESULTS
A 2 x 2 chi-square was carried out to discover whether there was a statistically significant relationship between smoking and drinking. The chi-square value of 12.12 has an associated probability value of p < 0.001 (df = 1), showing that such an association is extremely unlikely to have arisen as a result of sampling error. It can therefore be concluded that there is a statistically significant association between smoking and drinking.

MANN-WHITNEY U-TEST FOR INDEPENDENT SAMPLES
- The Mann-Whitney test (non-parametric; a nominal grouping variable and an ordinal dependent variable) is for two groups under one condition.
- It tests for difference between two independent groups (independent samples), based on ranks.
- It is the non-parametric equivalent of the t-test for independent samples.
- Find the statistically significant differences and then run a crosstabulation to look at where the differences lie.
- Note where there are NO statistically significant differences as well as where there are statistically significant differences.

MANN-WHITNEY U-TEST (SPSS)
[SPSS output screenshots from the original slides; not reproduced in the transcript.]
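In place of the SPSS output, a minimal Mann-Whitney sketch in scipy (the two groups' ratings are hypothetical):

```python
from scipy import stats

group_a = [3, 4, 2, 5, 4, 3, 4]  # hypothetical ratings, group one
group_b = [2, 3, 1, 2, 3, 2, 1]  # hypothetical ratings, group two

u, p = stats.mannwhitneyu(group_a, group_b, alternative="two-sided")
print(f"U = {u}, p = {p:.4f}")
```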

THE WILCOXON TEST FOR RELATED SAMPLES
- This is the non-parametric equivalent of the t-test for related samples.
- For paired (related) samples in a non-parametric test, e.g. the same group under two conditions.
For example, here is the result for one group of females who have rated (a) their own ability in mathematics and (b) their enjoyment of mathematics, both variables using a five-point scale ('not at all' to 'a very great deal').

THE WILCOXON TEST FOR RELATED SAMPLES (SPSS)
Ranks: How good at mathematics do you think you are? − How much do you enjoy mathematics?

| N | Mean Rank | Sum of Ranks
Negative Ranks | 118 | 94.08 | 11101.00
Positive Ranks | 73 | 99.11 | 7235.00
Ties | 57 | |
Total | 248 | |

Test Statistics(b): Z = -2.631(a); Asymp. Sig. (2-tailed) = .009
a. Based on positive ranks.
b. Wilcoxon Signed Ranks Test
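A minimal Wilcoxon signed-ranks sketch in scipy, with hypothetical paired ratings on a five-point scale (not the 248-case data set above):

```python
from scipy import stats

ability = [3, 4, 2, 5, 4, 3, 4, 2, 3, 4]    # hypothetical self-rated ability
enjoyment = [2, 4, 3, 4, 3, 3, 5, 2, 2, 3]  # hypothetical enjoyment, same people

w, p = stats.wilcoxon(ability, enjoyment)   # pairs with zero difference are dropped
print(f"W = {w}, p = {p:.4f}")
```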

KRUSKAL-WALLIS TEST FOR INDEPENDENT SAMPLES
- The Kruskal-Wallis test (non-parametric; a nominal grouping variable and an ordinal dependent variable) is for three or more independent groups under one condition.
- It tests for difference between more than two independent groups (independent samples), based on ranks.
- It is the non-parametric equivalent of ANOVA for independent samples.
- Find the statistically significant differences and then run a crosstabulation to look at where the differences lie.
- Note where there are NO statistically significant differences as well as where there are statistically significant differences.

KRUSKAL-WALLIS TEST (SPSS)
[SPSS output screenshots from the original slides; not reproduced in the transcript.]
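In place of the SPSS output, a minimal Kruskal-Wallis sketch in scipy with three hypothetical independent groups:

```python
from scipy import stats

g1 = [3, 4, 2, 5, 4]  # hypothetical ratings, group one
g2 = [2, 3, 1, 2, 3]  # hypothetical ratings, group two
g3 = [5, 4, 5, 3, 4]  # hypothetical ratings, group three

h, p = stats.kruskal(g1, g2, g3)
print(f"H = {h:.3f}, p = {p:.4f}")
```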

THE FRIEDMAN TEST FOR THREE OR MORE RELATED GROUPS
- This is the non-parametric equivalent of ANOVA for related samples.
- For three or more related samples in a non-parametric test, e.g. the same group measured under repeated conditions.
For example, the result for four groups of students, grouped according to their IQ (Group 1 = IQ up to 90; Group 2 = IQ from 91 to 110; Group 3 = IQ from 111 to 125; Group 4 = IQ over 125), who have rated (a) their own ability in mathematics and (b) their enjoyment of mathematics, both variables using a five-point scale ('not at all' to 'a very great deal').
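A minimal Friedman sketch in scipy: the same hypothetical group of eight people measured under three related conditions (made-up ratings, not the IQ-group data described above):

```python
from scipy import stats

cond1 = [3, 4, 2, 5, 4, 3, 4, 2]  # hypothetical ratings, condition one
cond2 = [2, 3, 1, 4, 3, 2, 3, 2]  # hypothetical ratings, condition two
cond3 = [4, 5, 3, 5, 5, 4, 4, 3]  # hypothetical ratings, condition three

chi2, p = stats.friedmanchisquare(cond1, cond2, cond3)
print(f"chi-square = {chi2:.3f}, p = {p:.4f}")
```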