Statistics

What is the point? Statistics are used to help make sense of data: can I trust this result? Is it a real effect, or could it be due to chance? Unfortunately, hypothesis testing and P<0.05 have come to be viewed as the holy grail of scientific research. But why are results such as P=0.051 and P=0.049 treated so differently?

The null hypothesis The null hypothesis is a general statement that there is no relationship between two measured phenomena, or no difference among groups: "There is no difference in the levels of proliferation between the compound-X-treated and untreated cells." Once the data have been obtained, statistical testing aids in the rejection of the null hypothesis through the production of a P-value.

What is a P-value? Three common interpretations, only the first of which is correct: (1) P<0.05 means that there is a less than 5% chance of observing a difference as large as the one you observed even if the two population means were identical. The others are misconceptions: (2) P<0.05 does not mean that there is a greater than 95% chance that the difference is real and a less than 5% chance that it is due to chance; (3) P<0.05 does not mean the experiment has "worked" or that the results matter.
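As an illustration of the correct definition, here is a minimal simulation (all parameters invented) showing that when two population means really are identical, a t-test still returns P<0.05 about 5% of the time:

```python
# Simulate many experiments where the null hypothesis is TRUE:
# both groups are drawn from the same population.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n_experiments, n_per_group = 10_000, 20
false_positives = 0

for _ in range(n_experiments):
    a = rng.normal(loc=100, scale=15, size=n_per_group)
    b = rng.normal(loc=100, scale=15, size=n_per_group)
    _, p = stats.ttest_ind(a, b)
    if p < 0.05:
        false_positives += 1

# Expected output: roughly 0.05, the Type I error rate
print(f"Proportion of P < 0.05 under a true null: {false_positives / n_experiments:.3f}")
```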

What does a P-value let you do? The widely accepted P-value threshold (known as α) is 0.05, although many statisticians would argue that 0.01 is a better threshold. When a P-value <0.05 is obtained, the null hypothesis can be rejected. However, if P>0.05, it does not mean the null hypothesis can be accepted; there is simply insufficient evidence to reject it. There is always a chance of an error being made in the decision:
Type I error – a false positive: incorrectly rejecting a true null hypothesis
Type II error – a false negative: failing to reject a false null hypothesis
Familywise error – the chance of making at least one Type I error when performing multiple comparisons
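The Type I error rate is simulated above; to complete the picture, a hedged sketch (the effect size and sample size are illustrative assumptions) estimating the Type II error rate, i.e. how often a t-test misses a difference that really exists:

```python
# Estimate the Type II error rate by simulating experiments where
# the null hypothesis is FALSE (a real difference exists).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, alpha, n_sims = 20, 0.05, 5_000
true_difference = 0.5  # a real effect of 0.5 standard deviations

misses = 0
for _ in range(n_sims):
    a = rng.normal(0.0, 1.0, n)
    b = rng.normal(true_difference, 1.0, n)
    if stats.ttest_ind(a, b).pvalue >= alpha:  # failed to reject a false H0
        misses += 1

print(f"Estimated Type II error rate: {misses / n_sims:.2f}")
print(f"Estimated power: {1 - misses / n_sims:.2f}")
```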

Parametric vs. nonparametric tests Misconception: the decision to use a nonparametric test over a parametric test is based solely on whether the data follow a normal (Gaussian) distribution. Considerations when deciding:
Is the distribution only approximately Gaussian?
Will transforming the data make it Gaussian?
Is the data set too small to detect non-Gaussian distributions? Or is it so large that normality tests become overly sensitive?
Is the data non-continuous?
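A minimal sketch of this (imperfect) decision workflow, assuming two independent groups; the data are invented, and the Shapiro-Wilk test is one common choice of normality test rather than the only one:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
treated = rng.normal(10, 2, 15)
untreated = rng.lognormal(2.3, 0.3, 15)  # deliberately skewed

# Shapiro-Wilk tests the null hypothesis that the data are normal.
# Caveat from the slide: with small n it rarely rejects; with large n
# it rejects for trivial departures from normality.
for name, sample in [("treated", treated), ("untreated", untreated)]:
    stat, p = stats.shapiro(sample)
    print(f"{name}: Shapiro-Wilk P = {p:.3f}")

# One common (not universal) rule: parametric test if both groups look
# Gaussian, otherwise a rank-based alternative.
if all(stats.shapiro(s).pvalue > 0.05 for s in (treated, untreated)):
    print(stats.ttest_ind(treated, untreated))
else:
    print(stats.mannwhitneyu(treated, untreated))
```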

Types of variable An independent variable is experimentally manipulated in order to observe an effect on a dependent variable.
Categorical variables are discrete or qualitative:
Nominal – two or more categories with no intrinsic order
Dichotomous – nominal variables with only two categories or levels, e.g. male or female
Ordinal – two or more categories that can be ordered or ranked
Continuous variables are quantitative:
Interval – measured along a continuum, with a numerical value, e.g. temperature in °C or °F
Ratio – like an interval variable, but with 0 meaning there is none of that variable, e.g. height or weight

Test selection

Comparison of means (parametric tests compare means; non-parametric tests compare medians):
Differences between the means of two independent groups: unpaired Student's t-test / Mann-Whitney U test
Differences between paired (matched) samples: paired Student's t-test / Wilcoxon signed-rank test
Differences among the means of three or more independent groups for one variable: one-way ANOVA (+ multiple comparisons) / Kruskal-Wallis test (+ multiple comparisons)
Differences among three or more measurements on the same subjects: repeated measures ANOVA / Friedman test

Relationships between variables (parametric / non-parametric):
Strength of a relationship between two continuous variables: Pearson's correlation coefficient / Spearman's correlation coefficient
Predicting the value of one variable given the value of a predictor variable: linear regression
Relationship between two categorical variables: chi-squared test

Assessing survival:
Comparing the survival of two groups: Kaplan-Meier curves with the log-rank (Mantel-Cox) test or the Gehan-Breslow-Wilcoxon test
Effect of several risk factors on survival: proportional hazards regression (Cox regression)
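Most of the tests in this table are available in scipy.stats; a brief sketch (data invented) mapping table rows to function calls, with the non-parametric counterpart noted alongside each parametric call:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
g1, g2, g3 = (rng.normal(m, 1, 12) for m in (5.0, 5.5, 6.0))

print(stats.ttest_ind(g1, g2))     # unpaired t-test | stats.mannwhitneyu(g1, g2)
print(stats.ttest_rel(g1, g2))     # paired t-test   | stats.wilcoxon(g1, g2)
print(stats.f_oneway(g1, g2, g3))  # one-way ANOVA   | stats.kruskal(g1, g2, g3)
print(stats.pearsonr(g1, g2))      # Pearson r       | stats.spearmanr(g1, g2)

# Chi-squared test on a 2x2 contingency table of counts
table = np.array([[30, 10],
                  [20, 25]])
chi2, p, dof, expected = stats.chi2_contingency(table)
print(f"chi-squared P = {p:.3f}")
```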

Kaplan-Meier curve and log-rank test
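A minimal sketch of this analysis using the third-party lifelines package (an assumption; the slides do not name a tool). The survival times, censoring flags and group sizes are invented:

```python
import numpy as np
from lifelines import KaplanMeierFitter
from lifelines.statistics import logrank_test

rng = np.random.default_rng(3)
time_a = rng.exponential(12, 50)  # survival times, group A
time_b = rng.exponential(8, 50)   # survival times, group B
event_a = rng.random(50) < 0.8    # True = event observed, False = censored
event_b = rng.random(50) < 0.8

# Plot the two Kaplan-Meier survival curves on shared axes
kmf = KaplanMeierFitter()
kmf.fit(time_a, event_observed=event_a, label="Group A")
ax = kmf.plot_survival_function()
kmf.fit(time_b, event_observed=event_b, label="Group B")
kmf.plot_survival_function(ax=ax)

# Log-rank (Mantel-Cox) test for a difference between the curves
result = logrank_test(time_a, time_b,
                      event_observed_A=event_a, event_observed_B=event_b)
print(f"log-rank P = {result.p_value:.3f}")
```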

Problem: multiple t-tests One of the most frequent errors I've come across is authors using t-tests when there are more than two groups. This raises the familywise error rate. For example, with 4 groups there are 6 pairwise comparisons in total, so the familywise error rate = 1 − (1 − 0.05)^6 ≈ 0.265, i.e. a 26.5% chance of identifying at least one "significant" result by chance alone! Therefore, tests should be used which adjust for the familywise error rate, or a correction should be applied to the P-values. Additionally, authors often state that they performed an ANOVA but do not mention a multiple comparisons test. The ANOVA itself only reports that there is a significant effect somewhere; it does not indicate which groups differ, so the multiple comparisons test used should also be stated. (Incidentally, "Student" was the pen name of William Sealy Gosset, who devised the t-test.)
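Reproducing the arithmetic above, plus one common correction step using statsmodels; the six raw P-values are invented for illustration, and Holm is one of several valid correction methods:

```python
from statsmodels.stats.multitest import multipletests

k = 6  # 4 groups -> 6 pairwise comparisons
fwer = 1 - (1 - 0.05) ** k
print(f"Familywise error rate for {k} uncorrected tests: {fwer:.3f}")  # ~0.265

raw_p = [0.003, 0.012, 0.021, 0.040, 0.130, 0.480]
reject, adj_p, _, _ = multipletests(raw_p, alpha=0.05, method="holm")
for p, q, r in zip(raw_p, adj_p, reject):
    print(f"raw P = {p:.3f} -> Holm-adjusted P = {q:.3f}, reject H0: {r}")
```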

Problem: univariate then multivariate analysis Frequently in clinical studies, authors perform univariate analysis and then perform multivariate analysis only on the variables that showed a significant effect. This is inappropriate: the results of a univariate analysis can be misleading, reporting a significant effect where none exists (or where only a weak relationship exists), so univariate analysis should not be used as a method of selecting variables for multivariate analysis. Instead, multivariate analysis should be performed on all the variables the authors measured; there was a reason each was selected as a measurement in the first place, so each may be contributing to the effect being measured and should not be excluded.
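A hedged sketch of the recommended approach: fit the multivariable model on all measured covariates rather than pre-screening with univariate tests. The variable names, the data and the choice of statsmodels are all assumptions for illustration:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(4)
n = 200
df = pd.DataFrame({
    "age": rng.normal(60, 10, n),
    "bmi": rng.normal(27, 4, n),
    "biomarker": rng.normal(1.0, 0.3, n),
})
# Invented binary outcome depending weakly on all three covariates
logit = -8 + 0.10 * df["age"] + 0.05 * df["bmi"] + 1.2 * df["biomarker"]
df["outcome"] = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

# All measured covariates enter the model; none are dropped on the basis
# of a prior univariate screen.
X = sm.add_constant(df[["age", "bmi", "biomarker"]])
model = sm.Logit(df["outcome"], X).fit(disp=False)
print(model.summary())
```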

Interpreting error bars Do overlapping error bars mean that the difference between the groups is not significant? Not necessarily. Whether error bars overlap is not a foolproof way to judge significant differences, and should only be used as a rule of thumb, as it depends on:
whether the sample sizes are equal
whether the error bars show standard deviation (SD), standard error (SEM) or 95% confidence intervals (CI): no conclusion can be drawn from SD bars; overlapping SEM bars indicate P>0.05; and non-overlapping 95% CI bars indicate P<0.05
whether multiple comparisons are being performed: with multiple comparisons following an ANOVA, the per-comparison significance threshold is adjusted for the familywise error rate, but the error bars are graphed individually for each group, so nothing can be concluded from the error bars in that case
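A small sketch (data invented) computing the three common error bars for the same sample; the large differences between them are why overlap alone is not conclusive:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
sample = rng.normal(50, 10, 25)
n = len(sample)

sd = sample.std(ddof=1)                               # standard deviation
sem = sd / np.sqrt(n)                                 # standard error of the mean
ci_half_width = stats.t.ppf(0.975, df=n - 1) * sem    # 95% CI half-width

print(f"mean = {sample.mean():.1f}")
print(f"SD bar:     ±{sd:.2f}")
print(f"SEM bar:    ±{sem:.2f}")
print(f"95% CI bar: ±{ci_half_width:.2f}")
```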

Guidance to authors When statistical analyses have been performed, the following information should be provided: the name of the statistical test used (with a statement on the normality of the data where the test is only appropriate for normally distributed data), the n number for each analysis, the comparisons of interest, the alpha level, and the actual P-value for each test (not merely P<0.05). It should be clear which statistical test was used to generate every P-value. Error bars on graphs should be clearly labeled, and it should be stated whether the number following the ± sign is a standard deviation or a standard error. The word "significant" should only be used when referring to statistically significant results, and should be accompanied by the relevant P-value. Significance indicators should be used on graphs and tables, and should be described in the figure or table legend, making clear which groups are being compared.

What to be looking out for: Authors should state the following:
The software used for analysis – name, version and supplier
Whether standard deviation or standard error of the mean is being presented
Which statistical tests were performed: Are all the tests used stated? Are the tests appropriate? When parametric tests (e.g. t-test, ANOVA) have been used, has normality been tested (bearing in mind sample size)? In many papers different tests are needed for different data sets, so which test was used for what should be clearly stated (and clear in the results/figure legends)
What significance threshold was used?
What was the sample size?
Are actual P-values reported (not merely P<0.05)?