Choosing the right test Mathematics & Statistics Help University of Sheffield

Learning outcomes
By the end of this session you should know about:
  Some useful approaches to analysing data
By the end of this session you should be able to:
  Recognise different data types
  Use a flowchart to decide which analysis method to use
  Undertake some basic analyses and construct appropriate charts for your data

Some initial thoughts

Planning a study What do you want to investigate and why? What are your aims? How are you going to investigate it? How will you collect your data? Who/what is in the sample? How will you summarise your data? How will you analyse your data?

Steps for choosing the right test (1) Clearly define your research question What is your main outcome of interest? There may be more than one. What data type is it? The data type will determine the type of analysis Are the observations paired? Can it be characterised using a known distribution (i.e. parametric vs non-parametric test)? What may affect the outcome of interest? What data type is it/are they? How will your results be summarised? What charts can you use to display your results?

Data types: recap
What types of data are there? Within the data structure there are observations (individuals), and for each observation there are data variables. Data variables can be continuous, nominal or ordinal. More generally, variables fall into two main categories: numerical and categorical.
Categorical variables indicate categories, for example gender (Male or Female) and marital status (Single, Married, Divorced or Widowed). They are sometimes coded as numbers, e.g. 1 = male. Categorical variables are either ordinal or nominal. If the categories are meaningfully ordered, the variable is ordinal; if the order of the categories does not matter, the variable is nominal. For example, satisfaction level (dissatisfied, satisfied, highly satisfied) and education level (secondary, sixth form, undergraduate, postgraduate) are ordinal variables; a student’s religion (Christian, Muslim, Hindu, etc.) and gender (Male, Female) are nominal variables.
Numerical variables take meaningful, comparable numbers, such as blood pressure, height, weight, income, age and probability of illness. They have two subtypes: continuous and discrete. Continuous variables can take any value within a range and are the most common, e.g. body weight, height and income. Discrete variables can only take whole numbers, such as the number of students in a class or the number of new patients each day, but they are treated as continuous for statistical analysis if they take a large range of values.
There is another variable type, the ‘label’ variable, which identifies observations uniquely, such as a student ID or a subject’s name.

Summary measures: recap
Data type | Summary statistics
Nominal | Mode, %’s
Ordinal | Mode, median, %’s
Discrete (count) | %’s; means and medians can also be calculated as for continuous data, depending on how many separate counts there are
Continuous, normally distributed | Mean, standard deviation
Continuous, skewed | Median, interquartile range
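The summary-measures table can be sketched in code. This is a minimal illustration using only Python's standard library; the function name `summarise` and the data-type labels "nominal", "normal" and "skewed" are assumptions made for this example and do not come from the session materials.

```python
import statistics


def summarise(values, data_type):
    """Return summary statistics suited to the data type (hypothetical helper)."""
    if data_type == "nominal":
        n = len(values)
        return {
            "mode": statistics.mode(values),
            # Percentage of observations falling in each category.
            "percentages": {v: 100 * values.count(v) / n for v in sorted(set(values))},
        }
    if data_type == "normal":
        # Roughly normal continuous data: mean and standard deviation.
        return {"mean": statistics.mean(values), "sd": statistics.stdev(values)}
    if data_type == "skewed":
        # Skewed continuous data: median and interquartile range.
        q1, _, q3 = statistics.quantiles(values, n=4)
        return {"median": statistics.median(values), "iqr": q3 - q1}
    raise ValueError(f"unknown data type: {data_type}")


# Made-up values, for illustration only.
print(summarise(["A", "B", "A", "A"], "nominal"))
print(summarise([1, 2, 3, 4, 100], "skewed"))
```

Note that quartiles can be defined in several ways, so the interquartile range from `statistics.quantiles` may differ slightly from the value reported by SPSS.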

Chart types: recap
One variable
  Categorical: pie chart, bar chart
  Numerical discrete: bar chart
  Numerical continuous: histogram, boxplot
Two variables
  Both categorical: stacked bar chart, clustered bar chart, multiple pie charts
  One categorical / one numerical discrete: boxplots (sometimes!), multiple bar charts
  One categorical / one numerical continuous: boxplots, multiple histograms
  Both numerical: scatterplot

Steps for choosing the right test (2)
Are you interested in:
  Testing differences between groups? How many groups are there?
  Assessing/modelling the relationship between variables?
Are the observations paired? Is the pairing due to having repeated measurements of the same variable for each subject?
Does the test you have chosen make any assumptions? Are the assumptions met? e.g. the assumption of normality for the t-test

Test assumptions
Parametric tests: generally assume that the data, or some function of the data, follow a known distribution, e.g. normal.
Non-parametric tests: usually based on ranks/signs rather than the actual data.
All tests have assumptions, and one of the main assumptions for many of the most common tests is that the data are normally distributed. Tests with this assumption are called parametric tests. Non-parametric tests do not require this assumption, as they are based on the ranks of the data rather than the actual data.
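To see what "based on ranks" means in practice, the sketch below converts data to ranks, giving tied values the average of the ranks they share. The function name `average_ranks` and the values are made up for illustration.

```python
def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Find the end of the run of values tied with the value at position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


print(average_ranks([10, 20, 20, 1000]))  # [1.0, 2.5, 2.5, 4.0]
print(average_ranks([10, 20, 20, 21]))    # [1.0, 2.5, 2.5, 4.0]
```

Both lists produce the same ranks, so the extreme value 1000 has no more influence than 21 would. This is why rank-based tests are less sensitive to outliers and skewness.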

Non-parametric methods are used when: Dependent variable is ordinal A plot of the data appears to be very skewed or the data do not seem to follow any particular shape or distribution (e.g. Normal) Assumptions underlying parametric test not met There are potentially influential outliers in the dataset Sample size is small

Comparing averages (1)
Comparing BETWEEN groups
Number of groups | Normally distributed | Skewed or ordinal
2 | Independent samples t-test | Mann-Whitney
3+ | One-way ANOVA | Kruskal-Wallis
For comparing means ask two questions: Are there repeated measurements of the same variable for each participant? How many means are being compared? 2 = t-test, 3+ = ANOVA

Paired data (1)
Most commonly, measurements from the same individuals collected on more than one occasion.
Can be used to look at differences in mean score across:
  2 or more time points, e.g. before/after a diet
  2 or more conditions, e.g. a hearing test at different frequencies
Example: each person listened to a sound at three different frequencies until they could no longer hear it. A repeated measures ANOVA would be used to test for a difference between the frequencies.

Comparing averages (1)
Number of means | Normally distributed | Skewed or ordinal
Comparing BETWEEN groups
2 | Independent samples t-test | Mann-Whitney
3+ | One-way ANOVA | Kruskal-Wallis
Comparing measurements WITHIN the same subject
2 | Paired t-test | Wilcoxon signed rank test
3+ | Repeated measures ANOVA | Friedman
For comparing means ask two questions: Are there repeated measurements of the same variable for each participant? How many means are being compared? 2 = t-test, 3+ = ANOVA

Comparing averages (2)
Comparing | Dependent (outcome) variable | Independent (explanatory) variable | Parametric test (data are normally distributed) | Non-parametric test (ordinal/skewed data)
Two INDEPENDENT groups | Continuous | Nominal (binary) | Independent t-test | Mann-Whitney test / Wilcoxon rank sum
3+ INDEPENDENT groups | Continuous | Nominal | One-way ANOVA | Kruskal-Wallis test
2 measurements on the same subject, e.g. weight before and after a diet | Continuous | Time/condition variable | Paired t-test | Wilcoxon signed rank test
3+ measurements on the same subject | Continuous | Time/condition variable | Repeated measures ANOVA | Friedman test
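The table can also be read as a simple lookup keyed by the two questions (are there repeated measurements, and how many means are compared) plus whether the data are roughly normal. The sketch below is one way to encode it; the names `TEST_TABLE` and `choose_test` are hypothetical, and the lookup is no substitute for checking each test's assumptions.

```python
# Keys: (design, number of means capped at 3 for "3+", data roughly normal?)
TEST_TABLE = {
    ("between", 2, True): "Independent samples t-test",
    ("between", 2, False): "Mann-Whitney test",
    ("between", 3, True): "One-way ANOVA",
    ("between", 3, False): "Kruskal-Wallis test",
    ("within", 2, True): "Paired t-test",
    ("within", 2, False): "Wilcoxon signed rank test",
    ("within", 3, True): "Repeated measures ANOVA",
    ("within", 3, False): "Friedman test",
}


def choose_test(n_means, repeated_measures, normal):
    """Suggest a test for comparing averages, following the table above."""
    if n_means < 2:
        raise ValueError("need at least two means to compare")
    design = "within" if repeated_measures else "between"
    return TEST_TABLE[(design, min(n_means, 3), normal)]


# Weight before and after a diet, differences roughly normal:
print(choose_test(2, repeated_measures=True, normal=True))
# Four independent groups, skewed outcome:
print(choose_test(4, repeated_measures=False, normal=False))
```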

Examples?

What to check for normality
Parametric test | What to check for normality | Non-parametric test for ORDINAL variable or skewed data
Independent samples t-test | Dependent variable by group | Mann-Whitney U test
ANOVA | Residuals (differences between each individual and their group mean) | Kruskal-Wallis test
Paired t-test | Paired differences | Wilcoxon signed rank test
Repeated measures ANOVA | Residuals by time point (differences between each individual and time point mean) | Friedman test
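The quantities to check for normality can be computed directly before plotting them (for example as a histogram). A minimal sketch, with hypothetical function names and made-up numbers:

```python
import statistics


def paired_differences(before, after):
    """Paired t-test: check the per-subject differences for normality."""
    return [a - b for b, a in zip(before, after)]


def group_residuals(groups):
    """ANOVA: check each observation minus its own group mean for normality."""
    residuals = {}
    for name, values in groups.items():
        group_mean = statistics.mean(values)
        residuals[name] = [x - group_mean for x in values]
    return residuals


# Made-up weights (kg) before and after a diet for three people.
print(paired_differences([80.0, 75.0, 92.0], [78.0, 70.0, 91.0]))
# Made-up scores in two groups.
print(group_residuals({"control": [4.0, 5.0, 6.0], "treatment": [7.0, 9.0, 11.0]}))
```

For a repeated measures design, the same `group_residuals` helper could be applied with time points in place of groups.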

Example 1: Did gender affect ticket price paid on the Titanic?
Steps:
  What is the outcome variable?
  What is the grouping / explanatory variable?
  What methods are available to analyse these data?
  Check the assumptions
  Conduct the appropriate analysis and report the results
What test do you think would be appropriate?

Example 1: Did gender affect ticket price paid on the Titanic?
Steps:
  What is the outcome variable? Ticket price
  What is the grouping / explanatory variable? Gender
  What methods are available to analyse these data? We are comparing ticket price between two groups (male and female), so the most appropriate method is the independent samples t-test.
  Check the assumptions. The t-test assumes that the groups are independent, the data in the two groups are normally distributed and the variability in the two groups is similar.
  Conduct the appropriate analysis and report the results. If the assumptions for the t-test are not met, use the Mann-Whitney U test.

Example 1: Did gender affect ticket price paid on the Titanic?
The data were positively skewed, so a Mann-Whitney U test was carried out to compare the ticket price for men and women.
There was highly significant evidence (U=5.5, p < 0.001) to suggest a difference in the distributions of ticket price for males and females.
What else would be useful to know when interpreting these results? Medians: women £23 vs men £12
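For intuition about what the U statistic measures, the sketch below computes it by counting pairs: for every (x, y) pair across the two groups, x scores 1 if it is larger and 0.5 if the two are tied. The prices are invented for illustration and are not the Titanic data, so the result will not match the U reported above; the function name is hypothetical, and no p-value is computed.

```python
def mann_whitney_u(x, y):
    """Return (U_x, U_y), where U_x counts pairs in which x exceeds y (ties count 0.5)."""
    u_x = sum(
        1.0 if xi > yj else 0.5 if xi == yj else 0.0
        for xi in x
        for yj in y
    )
    u_y = len(x) * len(y) - u_x  # the two statistics always sum to n_x * n_y
    return u_x, u_y


# Made-up ticket prices in pounds, for illustration only.
women = [23, 30, 26]
men = [8, 12, 25]
print(mann_whitney_u(women, men))  # (8.0, 1.0)
```

Software often reports the smaller of the two U values. A U close to 0 or close to n_x * n_y indicates that one group tends to have larger values than the other.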

Investigating relationships
Assessing the association between two continuous variables
  Dependent (outcome) variable: Continuous. Independent (explanatory) variable: Continuous.
  Parametric test (data are normally distributed): Pearson’s correlation. Non-parametric test (ordinal/skewed data): Spearman’s correlation.
Predicting the value of one variable from the value of a predictor variable, or looking for significant relationships
  Dependent: Scale. Independent: Any. Parametric: Simple linear regression. If assumptions are not met: transform the data.
  Dependent: Nominal (binary). Independent: Any. Test: Logistic regression.
Assessing the relationship between two categorical variables
  Dependent: Categorical. Independent: Categorical. Test: Chi-squared test.
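The difference between the two correlation coefficients can be sketched in a few lines: Spearman's correlation is Pearson's correlation applied to the ranks of the data. The helper names and the data are made up for illustration.

```python
import math


def pearson(x, y):
    """Pearson's correlation: strength of the linear relationship."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """Ranks from 1 to n; tied values share the mean of their ranks."""
    ordered = sorted(values)
    n = len(values)
    # Mean of the first and last 1-based positions of v in the sorted data.
    return [(ordered.index(v) + 1 + n - ordered[::-1].index(v)) / 2 for v in values]


def spearman(x, y):
    """Spearman's correlation: Pearson's correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# A curved but perfectly increasing relationship (made-up data).
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
print(pearson(x, y))   # slightly below 1: the relationship is not a straight line
print(spearman(x, y))  # 1.0: y always increases with x
```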

Examples?

Example 2: two categorical variables Survival of the pushiest?

Example 2: Survival of the pushiest
Research question: Was survival on the Titanic linked to nationality?
Dependent: Survival
Independent: Nationality
What test do you think you should use? Chi-squared test
http://www.independent.co.uk/news/world/australasia/more-britons-than-americans-died-on-titanic-because-they-queued-1452299.html

Example 2: Survival of the pushiest
The data suggest that Americans were more likely to survive: 56% survived, compared to 32% of the British and 35% of those from other countries.
Results from the χ2 test suggest that there is evidence of a significant relationship between nationality and survival (p < 0.001).
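The χ2 statistic compares the observed counts in each cell of the cross-tabulation with the counts expected if the two variables were unrelated. A minimal sketch follows; the counts are invented for illustration and are not the Titanic data, and the function name is hypothetical. A p-value would still need to be obtained from the chi-squared distribution with the returned degrees of freedom (SPSS reports this directly).

```python
def chi_squared(table):
    """Return (statistic, degrees of freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# Made-up counts: rows are two groups, columns are survived / died.
observed = [
    [10, 20],
    [20, 10],
]
print(chi_squared(observed))
```

The usual rule of thumb that expected counts should not be too small still applies, and this sketch does not check it.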

Example 2: Further thoughts
Class was one of the most important predictors of survival on the Titanic.
70% of Americans were travelling in 1st class.
A more detailed analysis using logistic regression showed that nationality was NOT a significant predictor of survival after controlling for class.
In looking at these data, is there any other information that would be useful? The numbers for each nationality

Learning outcomes
You should now know about:
  Some useful approaches to analysing data
You should now be able to:
  Recognise different data types
  Use a flowchart to decide which analysis method to use
  Undertake some basic analyses and construct appropriate charts for your data

Exercises
Attempt the 4 exercises in SPSS
In each case you need to identify an appropriate analysis based on the dataset provided
Remember to check the assumptions for any analysis you conduct
Add value labels to the data if required
Use the flow charts & table to assist you

Download the data In your web browser, type in the following address and save the files to your computer: http://www.sheffield.ac.uk/mash/workshop_materials

Maths And Statistics Help Statistics appointments: Mon-Fri (10am-1pm) Statistics drop-in: Mon-Fri (10am-1pm), Weds (4-7pm) http://www.sheffield.ac.uk/mash

Resources: All resources are available in paper form at MASH or on the MASH website

Contacts
Follow MASH on twitter: @mash_uos
Staff (stats): Jenny Freeman (j.v.freeman@sheffield.ac.uk), Basile Marquier (b.marquier@sheffield.ac.uk), Marta Emmett (m.emmett@sheffield.ac.uk)
Website: http://www.sheffield.ac.uk/mash