Chapter 17: Statistical Analysis
CONTENTS
The statistics approach
Statistical tests
– Types of data and appropriate tests
– Chi-square
– Comparing two means: the t-test
– A number of means: one-way analysis of variance
– A table of means: factorial analysis of variance
– Correlation
– Linear regression
– Multiple regression
– Factor and cluster analysis
The statistics approach
– Probabilistic statements
– The normal distribution
– Probabilistic statement formats
– Significance
– The null hypothesis
– Dependent and independent variables
A. J. Veal & S. Darcy (2014) Research Methods for Sport Studies and Sport Management: A practical guide. London: Routledge
Probabilistic statements
– Descriptive: e.g. 10% of adults play tennis
– Comparative: e.g. 10% play tennis, but 12% play golf
– Relational: e.g. 15% of people with high incomes play tennis but only 7% of people with low incomes do so: there is a positive relationship between tennis-playing and income
However: when based on a sample, the above statements must be made using a probabilistic format
Probabilistic statements contd
– We can be 95% confident that the proportion of adults that plays tennis is between 9% and 11%
– The proportion of golf players is significantly higher than the proportion of tennis players (at the 95% level of probability)
– There is a positive relationship between level of income and level of tennis playing (at the 95% level)
(See discussion of confidence intervals: Chapter 13.)
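As an illustration of the first statement, the sketch below computes an approximate 95% confidence interval for a sample proportion using the normal approximation. The function name `proportion_ci` and the sample size of 3,500 are assumptions chosen for the example; the slides do not state the sample size behind the 9% to 11% interval.

```python
import math

def proportion_ci(p, n, z=1.96):
    """Approximate confidence interval for a sample proportion.

    p: observed proportion (e.g. 0.10 for 10%)
    n: sample size (hypothetical here)
    z: 1.96 corresponds to the 95% level under the normal curve
    """
    standard_error = math.sqrt(p * (1 - p) / n)
    margin = z * standard_error
    return p - margin, p + margin

# Hypothetical sample: 10% of 3,500 respondents play tennis
low, high = proportion_ci(0.10, 3500)
print(f"95% confidence interval: {low:.3f} to {high:.3f}")
```

With these assumed figures the interval is roughly 9% to 11%. A smaller sample with the same 10% finding would give a wider interval.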
Probabilistic statement formats
– 95% probability: sometimes expressed as 5%, sometimes as 0.05
– 99% probability is also used: also expressed as 1% or 0.01
Normal distribution (Fig. 17.1): a. Drawing repeated samples (theory)
Normal distribution contd: b. Normal distribution/curve
Normal curve (Fig. 13.1)
Significance
– Statistically significant: unlikely to have happened by chance, so highly probable to reflect a real difference or relationship
– The level of significance is affected by sample size (not by population size)
– The probability of a finding happening by chance is related to the normal curve and similar theoretical distributions
– But NB: small differences or weak relationships may not be socially or managerially significant, even when they are statistically significant
Null hypothesis
– H0 – Null hypothesis: there is no significant difference or relationship
– H1 – Alternative hypothesis: there is a significant difference or relationship
e.g.
– H0: tennis and golf participation levels are the same
– H1: tennis and golf participation levels are significantly different
Dependent and independent variables
[Diagram: Independent variable 1, Independent variable 2 and Independent variable 3, each linked to a single Dependent variable]
Statistical tests
Task | Format of data | No. of variables | Types of variable | Test
Relationship between 2 variables | Crosstabulation of frequencies | 2 | Nominal | Chi-square
Difference between 2 means – paired | Means: for a whole sample | 2 | Two scale/ordinal | t-test – paired
Difference between 2 means – independent samples | Means: for 2 sub-groups | 2 | 1. scale/ordinal (means); 2. nominal (2 groups only) | t-test – independent samples
Statistical tests contd
Task | Format of data | No. of variables | Types of variable | Test
Relationship between 2 variables | Means: for 3+ sub-groups | 2 | 1. scale/ordinal (means); 2. nominal (3+ groups) | One-way analysis of variance
Relationship between 3 or more variables | Means: crosstabulated | 3+ | 1. scale/ordinal (means); 2. two or more nominal | Factorial analysis of variance
Statistical tests contd
Task | Format of data | No. of variables | Types of variable | Test
Relationship between 2 variables | Individual measures | 2 | Two scale/ordinal | Correlation
Linear relationship between 2 variables | Individual measures | 2 | Two scale/ordinal | Linear regression
Linear relationship between 3+ variables | Individual measures | 3+ | Three or more scale/ordinal | Multiple regression
Relationships between large numbers of variables | Individual measures | Many | Large numbers of scale/ordinal | Factor analysis; Cluster analysis
Data
– Extended version of the Campus Sporting Life survey with:
  – additional variables
  – additional cases
– See Appendix 17.2
– SPSS used, as in Chapter 16
Chi-square (χ²)
Testing the relationship between two variables presented in a frequency crosstabulation.
Null/alternative hypotheses:
– H0: there is no relationship between student status and gender in the population
– H1: there is a relationship between status and gender in the population
Findings (Fig. 17.5):
– Value of Chi-square and its significance: see Fig. 17.5
– Significance is less than 0.05 (5%)
– Conclusion: H0 rejected, H1 accepted: there is a relationship
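To make the mechanics of the test concrete, here is a minimal sketch that computes the Chi-square statistic for a status-by-gender crosstabulation. The counts are invented for illustration and are not the Fig. 17.5 data; the function name `chi_square` is hypothetical, and the decision uses standard 5% critical values rather than the exact significance that SPSS reports.

```python
def chi_square(observed):
    """Return the Chi-square statistic and degrees of freedom for a crosstabulation."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            # Expected count if the two variables were unrelated (H0)
            expected = row_totals[i] * col_totals[j] / total
            statistic += (count - expected) ** 2 / expected
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, df

# Standard critical values of Chi-square at the 0.05 level, keyed by degrees of freedom
CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815}

# Hypothetical counts: rows = four student-status groups, columns = male, female
table = [[10, 5], [6, 9], [2, 13], [6, 9]]
statistic, df = chi_square(table)
reject_h0 = statistic > CRITICAL_05[df]
print(f"Chi-square = {statistic:.2f}, df = {df}, reject H0: {reject_h0}")
```

For these made-up counts the statistic exceeds the critical value, so H0 would be rejected, which mirrors the form of the conclusion on the slide.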
Comparing two means: t-test
– Paired samples: whole sample: comparing means for two variables
– Independent samples: sample divided into two groups (e.g. males and females) and comparing means for one variable
Comparing 2 means: t-test: Paired samples (Fig. 17.9)
Example 1: Compare average times played sport in last 3 months (12.2) with average times visited national parks (9.8)
– Difference is 2.4
– Value of t: see Fig. 17.9
– Significance is 0.219, which is larger than 0.05
– Null hypothesis is accepted: difference is not significant
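The paired-samples calculation can be sketched as follows: each respondent contributes two scores, and t is the mean of the paired differences divided by its standard error. The six pairs of values are invented for illustration (they are not the survey data), and `paired_t` is a hypothetical helper name.

```python
import math
import statistics

def paired_t(first, second):
    """Return t and degrees of freedom for a paired-samples t-test."""
    differences = [a - b for a, b in zip(first, second)]
    n = len(differences)
    mean_difference = statistics.mean(differences)
    # Standard error of the mean difference (sample standard deviation / sqrt(n))
    standard_error = statistics.stdev(differences) / math.sqrt(n)
    return mean_difference / standard_error, n - 1

# Hypothetical data: times played sport and times visited parks, same six people
played_sport = [12, 8, 15, 10, 20, 9]
visited_parks = [10, 9, 11, 10, 14, 6]
t, df = paired_t(played_sport, visited_parks)
print(f"t = {t:.3f} with {df} degrees of freedom")
```

The resulting t would then be compared with the t distribution for the given degrees of freedom; statistical packages such as SPSS report the corresponding significance directly.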
Comparing 2 means: t-test: Independent samples (Fig )
Compare course costs for males ($ pa) and females ($136.60)
– Difference is $28.60
– Value of t and significance: as reported in the figure
– Null hypothesis is accepted: difference is not significant
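For two separate groups, such as males and females, the calculation differs from the paired case because the scores cannot be matched up. A minimal sketch, assuming equal variances in the two groups (the pooled-variance form of the test), is given below. The cost figures are invented and `independent_t` is a hypothetical helper name.

```python
import math
import statistics

def independent_t(group_a, group_b):
    """Return t and degrees of freedom for an independent-samples t-test.

    Assumes equal variances in the two groups (pooled-variance form).
    """
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_variance = (
        (n_a - 1) * statistics.variance(group_a)
        + (n_b - 1) * statistics.variance(group_b)
    ) / df
    standard_error = math.sqrt(pooled_variance * (1 / n_a + 1 / n_b))
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / standard_error
    return t, df

# Hypothetical annual course costs in dollars
males = [100, 120, 140, 160]
females = [110, 130, 150, 170, 190]
t, df = independent_t(males, females)
print(f"t = {t:.3f} with {df} degrees of freedom")
```

A negative t here simply means the first group's mean is lower than the second's; it is the size of t relative to the t distribution that determines significance.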
One-way analysis of variance (ANOVA) (Fig , 13)
– Compares means of one variable for groups defined by another variable
– Uses the F-test rather than the t-test
– e.g. mean times played sport by student status:
  – F/T student/no paid work: 9.7 times in 3 months
  – F/T student/paid work: 9.6 times
  – P/T student – F/T job: 19.1 times
  – P/T student – Other: 12.2 times
– Value of F: 2.485; significance: 0.072, which is greater than 0.05
– Null hypothesis accepted: no significant relationship between status and sport
– But for ‘going out for a meal’: F = 6.64 and Sig. = 0.001, which is less than 0.05, so null hypothesis rejected: there is a significant relationship
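The F statistic compares variation between the group means with variation within the groups. The sketch below shows the arithmetic for three small groups; the data are invented for illustration (the slide's example has four status groups) and `one_way_anova` is a hypothetical helper name.

```python
import statistics

def one_way_anova(groups):
    """Return F, between-groups df and within-groups df for a one-way ANOVA."""
    all_values = [value for group in groups for value in group]
    grand_mean = statistics.mean(all_values)
    # Variation of the group means around the grand mean
    ss_between = sum(
        len(group) * (statistics.mean(group) - grand_mean) ** 2 for group in groups
    )
    # Variation of individual scores around their own group mean
    ss_within = sum(
        (value - statistics.mean(group)) ** 2 for group in groups for value in group
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

# Hypothetical times played sport in 3 months, for three groups of students
groups = [[8, 10, 12], [9, 10, 11], [17, 19, 21]]
f, df_between, df_within = one_way_anova(groups)
print(f"F = {f:.2f} with ({df_between}, {df_within}) degrees of freedom")
```

A large F, as in this invented example where one group's mean is far from the others, points towards rejecting the null hypothesis; a value close to 1 suggests the group means differ no more than chance would produce.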
Factorial analysis of variance (Fig , 15)
– A table of means: two variables and means of a third
– e.g. mean visits to theatre by gender by student status
– Table: mean number of visits in three months, with columns Male and Female and rows for each status group: F/T student/no paid work; F/T student/paid work; P/T student – F/T job; P/T student/Other
– Status not significant and gender not significant
– But for status x gender: F = 3.681, Sig. = 0.019, which is < 0.05, so null hypothesis rejected: there is a significant relationship
Correlation (Fig ) Watched sport by income: weak positive: r = 0.46
Correlation (Fig. 7.16) Played sport by income: weak negative: r = -0.44
Correlation (Fig. 7.16) Sport exp. by income: strong positive: r = 0.91
Correlation (Fig )
– The correlation coefficient (r) expresses the relationship numerically
– No relationship: r = 0
– Exact relationship: r = 1 (positive) or r = -1 (negative)
– A correlation matrix shows correlations between a number of variables
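A short sketch of how r is calculated, using Pearson's formula, may help here. The three small data sets are invented so that they produce the three reference values of r listed on the slide; `pearson_r` is a hypothetical helper name.

```python
import math

def pearson_r(x, y):
    """Return Pearson's correlation coefficient for two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    variation_x = sum((a - mean_x) ** 2 for a in x)
    variation_y = sum((b - mean_y) ** 2 for b in y)
    return covariation / math.sqrt(variation_x * variation_y)

# Hypothetical scores for five people
income = [1, 2, 3, 4, 5]
rises_with_income = [2, 4, 6, 8, 10]   # exact positive relationship
falls_with_income = [10, 8, 6, 4, 2]   # exact negative relationship
unrelated = [3, 1, 4, 1, 3]            # no linear relationship

r_positive = pearson_r(income, rises_with_income)
r_negative = pearson_r(income, falls_with_income)
r_none = pearson_r(income, unrelated)
print(r_positive, r_negative, r_none)
```

Real survey data fall between these extremes, as with the values of 0.46, -0.44 and 0.91 reported for the income relationships.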
Correlation matrix (simplified Fig )
            | Income | Sport  | Watch sport | Visit park | Meal | Sport exp.
Income      | 1.00   |        |             |            |      |
Sport       | -.44** | 1.00   |             |            |      |
Watch sport | .46**  | -.68** | 1.00        |            |      |
Visit park  |        |        | *           | 1.00       |      |
Meal        | .08    | .45**  | -.29*       |            |      |
Sport exp.  | .91**  | -.37** |             |            |      |
* = significant at the 0.05 level
** = significant at the 0.01 level
Regression: fits a line of best fit (‘regression line’) to a scatterplot (Fig )
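The regression line is usually found by the method of least squares, which chooses the slope and intercept that minimise the squared vertical distances between the points and the line. The sketch below assumes that method; the income and expenditure figures are invented and `fit_line` is a hypothetical helper name.

```python
def fit_line(x, y):
    """Return slope and intercept of the least-squares regression line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sum(
        (a - mean_x) ** 2 for a in x
    )
    # The least-squares line always passes through the point (mean_x, mean_y)
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: income (independent) and sport expenditure (dependent)
income = [10, 20, 30, 40, 50]
sport_expenditure = [3, 5, 4, 8, 10]
slope, intercept = fit_line(income, sport_expenditure)
print(f"expenditure = {intercept:.2f} + {slope:.2f} x income")
```

Once fitted, the line can be used to estimate the dependent variable for a given value of the independent variable, within the range of the observed data.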
Regression: best fit may be a curve (Fig )
Multi-variate analysis
– Multiple regression has one dependent variable and a number of independent, influencing variables
– One development: Structural Equation Modelling explores inter-relationships between a number of variables
– Cluster and factor analysis: combine large numbers of variables into groups, e.g. lifestyle or personality groups