Marshall University School of Medicine Department of Biochemistry and Microbiology BMS 617 Lecture 14: Non-parametric tests Marshall University Genomics.

Marshall University School of Medicine Department of Biochemistry and Microbiology BMS 617 Lecture 14: Non-parametric tests Marshall University Genomics Core Facility

Parametric Tests
The following tests all assume that the data are sampled from populations in which the values are normally distributed:
  – Unpaired t-test
  – Paired t-test
      Assumes that the differences within pairs are samples of a normally distributed population
  – ANOVA
Data which are normally distributed can be completely summarized by the mean and standard deviation
  – These two values completely determine the distribution
      These are the parameters for the distribution
These tests work by estimating one or both of these parameters
  – Known as parametric tests

Non-parametric tests
Tests which do not assume that the data follow a particular distribution (such as the normal distribution) are known as non-parametric tests
The most commonly used forms are based on ranking (ordering) the values in the data set and analyzing only the ranks
  – This form of test is extremely robust to outliers
The following are non-parametric versions of parametric tests
  – They are used in similar situations to their parametric versions, but make no assumptions about normality

Non-parametric analogs of parametric tests

Scenario                                        | Parametric test                                             | Non-parametric test
Comparing two groups, with no pairing           | Unpaired t-test                                             | Mann-Whitney test (a.k.a. Wilcoxon rank-sum test)
Comparing two paired groups                     | Paired t-test                                               | Wilcoxon matched-pairs signed-rank test
Comparing two ordinal variables for correlation | Pearson correlation (variables must be interval variables)  | Spearman’s rank correlation
Comparing more than two groups                  | One-way ANOVA                                               | Kruskal-Wallis test

The Mann-Whitney Test
The Mann-Whitney test is the non-parametric equivalent of the unpaired t-test
Use it when you want to compare a variable between two groups, but you have reason to believe the data are not sampled from a normally distributed population

How the Mann-Whitney Test works
The Mann-Whitney test works as follows:
Compute the rank for all values, regardless of which group they come from
  – The smallest value has a rank of 1, the next smallest has a rank of 2, etc.
Choose one group: for each data point in that group, count the number of data points in the other group which are smaller
  – Sum these counts, and call the sum U1
Similarly compute U2, or use the fact that U1 + U2 = n1 × n2, where n1 and n2 are the sizes of the two groups
Let U = min(U1, U2)
The distribution of U under the null hypothesis is known, so software can compute a p-value
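
The counting procedure above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function names and the example data are invented for this sketch, ties are counted as one half (a common convention, assumed here), and the exact p-value is obtained by enumerating every possible relabelling, which is only feasible for small samples.

```python
from itertools import combinations


def u_statistic(group_a, group_b):
    """For each value in group_a, count the values in group_b that are smaller.

    Ties count as one half (an assumed convention, not stated in the slides).
    """
    u = 0.0
    for a in group_a:
        for b in group_b:
            if b < a:
                u += 1.0
            elif b == a:
                u += 0.5
    return u


def mann_whitney_exact(group_a, group_b):
    """Return (U1, U2, U, exact two-sided p-value) for two small samples."""
    n1, n2 = len(group_a), len(group_b)
    u1 = u_statistic(group_a, group_b)
    u2 = n1 * n2 - u1  # uses U1 + U2 = n1 * n2
    u = min(u1, u2)

    # Null distribution: every way of labelling n1 of the pooled values
    # as "group a" is equally likely if the groups do not differ.
    pooled = list(group_a) + list(group_b)
    indices = range(n1 + n2)
    extreme = 0
    total = 0
    for chosen in combinations(indices, n1):
        chosen_set = set(chosen)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in indices if i not in chosen_set]
        ua = u_statistic(a, b)
        if min(ua, n1 * n2 - ua) <= u:
            extreme += 1
        total += 1
    return u1, u2, u, extreme / total


# Hypothetical data: every treated value exceeds every control value.
control = [1.1, 2.3, 2.9, 3.4]
treated = [3.8, 4.5, 5.2, 6.0]
u1, u2, u, p_value = mann_whitney_exact(control, treated)
print(u1, u2, u, p_value)
```

With two groups of four and complete separation, only 2 of the 70 possible labellings are as extreme as the observed one, so the exact two-sided p-value is 2/70. In practice a statistics package would be used instead of enumeration.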

The Wilcoxon matched-pairs signed-rank test
The Wilcoxon matched-pairs signed-rank test is used for paired data
  – Before and after treatment, etc.
Unlike the paired t-test, it does not assume the differences are samples from a normally distributed population
Basic procedure:
  – Compute all signed differences
  – Rank the differences by their absolute value
  – Sum the ranks for the positive differences, and sum the ranks for the negative differences
  – Compute the difference between these two sums of ranks
  – The distribution of this value under the null hypothesis is known
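
The basic procedure above might look as follows in Python. This is a sketch under stated assumptions: the helper names and data are hypothetical, zero differences are dropped and tied absolute differences share an average rank (both common conventions that the slides do not specify), and the null distribution is built by trying every assignment of signs to the ranks, which is practical only for small numbers of pairs.

```python
from itertools import product


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def wilcoxon_signed_rank(before, after):
    """Return (sum of positive ranks, sum of negative ranks, exact two-sided p)."""
    # Signed differences; zero differences are dropped (assumed convention).
    diffs = [a - b for b, a in zip(before, after) if a != b]
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    observed = abs(w_plus - w_minus)

    # Under the null hypothesis each rank is equally likely to carry
    # a positive or a negative sign, so enumerate all sign patterns.
    total = sum(ranks)
    extreme = 0
    for signs in product((1, -1), repeat=len(ranks)):
        plus = sum(r for s, r in zip(signs, ranks) if s > 0)
        if abs(plus - (total - plus)) >= observed:
            extreme += 1
    return w_plus, w_minus, extreme / 2 ** len(ranks)


# Hypothetical before/after measurements for six subjects.
before = [12, 15, 11, 18, 14, 16]
after = [14, 18, 12, 23, 18, 22]
w_plus, w_minus, p_value = wilcoxon_signed_rank(before, after)
print(w_plus, w_minus, p_value)
```

Here all six differences are positive, so the positive ranks sum to 21 and the negative ranks to 0. Only 2 of the 64 sign patterns are that extreme, giving an exact two-sided p-value of 2/64.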

Spearman’s Rank Correlation
Spearman’s rank correlation is used to test for dependence in the ordering of two variables
  – The variables need only be ordinal
  – They do not need to be interval variables (no scale)
Works by computing the ranks of each variable
  – Then just compute the Pearson correlation coefficient of the ranks
Does not assume normally distributed populations
Does not test for a linear relationship
  – Just a monotonic (increasing/decreasing) one
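
To illustrate "rank each variable, then compute the Pearson correlation of the ranks", here is a small Python sketch. The function names and the dose/response numbers are made up for the example, and tied values are given average ranks (an assumption; the slides do not discuss ties).

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: response grows as the cube of dose, so the
# relationship is perfectly monotonic but not linear.
dose = [1, 2, 3, 4, 5]
response = [1, 8, 27, 64, 125]
print("Pearson: ", pearson(dose, response))
print("Spearman:", spearman(dose, response))
```

Because the ordering of the two variables agrees exactly, the Spearman coefficient is 1 (up to floating-point rounding), while the Pearson coefficient on the raw values is below 1 because the relationship is curved.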

Pros and cons of non-parametric tests
Pros of non-parametric tests:
  – Since non-parametric tests do not rely on the assumption of normally distributed populations, they can be used when that assumption fails or cannot be verified
Cons of non-parametric tests:
  – If the data really do come from normally distributed populations, the non-parametric tests are less powerful than their parametric counterparts
      i.e. they will tend to give higher p-values
  – For small sample sizes, they are much less powerful
      Two-sided Mann-Whitney p-values are always greater than 0.05 if the total sample size (both groups combined) is 7 or fewer
  – Non-parametric tests typically do not compute confidence intervals
      These can sometimes be computed, but often require additional assumptions
  – Non-parametric tests are not related to regression models
      They cannot be extended to account for confounding variables using multiple regression techniques
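
The small-sample limit of the Mann-Whitney test can be checked with a short calculation. The sketch below assumes an exact two-sided test with no tied values. In that case the most extreme outcome (one group entirely below the other) and its mirror image are just 2 of the C(n1 + n2, n1) equally likely labellings, so no p-value can be smaller than 2 / C(n1 + n2, n1). The function name is hypothetical.

```python
from math import comb


def smallest_two_sided_p(n1, n2):
    """Smallest achievable exact two-sided Mann-Whitney p-value.

    Assumes no ties: only the two completely separated labellings
    are as extreme as a completely separated sample.
    """
    return min(1.0, 2 / comb(n1 + n2, n1))


def best_case_for_total(total):
    """Smallest achievable p-value over all ways to split `total` into two groups."""
    return min(smallest_two_sided_p(n1, total - n1) for n1 in range(1, total))


for total in range(2, 9):
    print(total, round(best_case_for_total(total), 4))
```

For a total of seven observations the best case is groups of 3 and 4, giving 2/35 (about 0.057), which is still above 0.05. With eight observations split 4 and 4 the minimum drops to 2/70 (about 0.029).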

Choosing between parametric and non-parametric tests
The choice between parametric and non-parametric tests is not straightforward
A common, but invalid, approach is to use normality tests to automate the choice
  – The choice is most important for small data sets, for which normality tests are of limited use
  – Using the data set to determine the statistical analysis will tend to underestimate p-values
  – If data fail normality tests, a transformation may be appropriate
The most "honest" approach is to perform an independent experiment with a large sample to test for normality, and then design the experiment in hand based on the results
  – This is almost always impractical
  – For well-used experimental designs, an almost-equivalent approach is to follow customary procedure
      This essentially assumes that such a check has already been carried out in some way

How much difference does it make?
The central limit theorem ensures that parametric tests work well with non-normal distributions if the sample is large enough
  – How large is large enough?
  – Depends on the distribution!
  – For most distributions, sample sizes in the range of dozens will remove any issues with normality
      You will still increase your statistical power by using a transformation if appropriate
Conversely, if the data really come from a normally distributed population and you choose a non-parametric test, you will lose statistical power
  – For large samples, however, the difference is minimal
Small samples present problems:
  – Non-parametric tests have very little power for small samples
  – Parametric tests can give misleading results for small samples if the population data are non-normal
  – Tests for normality are not helpful for small samples

Conclusions
The bottom-line conclusion is that large samples are better than small samples
  – In general, the larger the better
  – Of course, it can be prohibitively time-consuming and/or expensive to analyze large samples
If your experimental design is going to use a small sample, you need to be able to justify the assumption that the data come from a normally distributed population
  – If this is a common experimental design that is conventionally analyzed this way, that may be good enough
  – For a new methodology, you should really perform an independent experiment with a large sample to test for normality first
      Use the results of this to guide the data analysis for future experiments

Computationally-intensive non-parametric methods
The non-parametric methods we examined worked by analyzing the ranks of the data
Another class of non-parametric tests is the class of computationally-intensive methods
There are two subclasses:
  – Permutation or randomization tests:
      Simulate the null distribution by repeatedly randomly reassigning group labels
      Compare the "real" data to the generated null distribution
  – Bootstrapping techniques:
      Effectively generate many samples from the population by resampling from the original sample
      Look at the distribution of summary data from the generated samples
These techniques still require a reasonable sample size to begin with
  – Big enough to generate enough distinct permutations or bootstraps
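
Both subclasses can be sketched with the Python standard library. The code below is illustrative only. The function names, the choice of the difference in means as the test statistic, the percentile method for the bootstrap interval, the number of resamples, and the data are all assumptions made for this example rather than details taken from the slides.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_permutations=5000, seed=0):
    """Two-sided permutation test on the difference in group means."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        # Randomly reassign group labels by shuffling the pooled values.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Add one to numerator and denominator so the estimate is never zero.
    return (extreme + 1) / (n_permutations + 1)


def bootstrap_ci(sample, statistic=mean, n_resamples=5000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for a summary statistic."""
    rng = random.Random(seed)
    # Resample with replacement, keeping the original sample size.
    stats = sorted(
        statistic(rng.choices(sample, k=len(sample)))
        for _ in range(n_resamples)
    )
    lower = stats[int((1 - level) / 2 * n_resamples)]
    upper = stats[int((1 + level) / 2 * n_resamples) - 1]
    return lower, upper


# Hypothetical measurements for two groups of eight.
group_a = [12, 15, 11, 18, 14, 16, 13, 17]
group_b = [22, 25, 21, 28, 24, 26, 23, 27]

p_value = permutation_test(group_a, group_b)
ci_low, ci_high = bootstrap_ci(group_a)
print("permutation p-value:", p_value)
print("95% bootstrap CI for mean of group_a:", (ci_low, ci_high))
```

The permutation p-value is the fraction of random relabellings whose difference in means is at least as large as the observed one. The bootstrap interval summarizes how much the sample mean varies across resamples. Fixing the seed only makes the illustration repeatable.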

Summary
Rank-based non-parametric tests are available as replacements for parametric tests
  – Less powerful than their parametric counterparts, but work when the data are not sampled from a normal distribution
Choice of test should not be automated
  – It should be part of the experimental design and not depend on the data
Choice is less important for large data sets
  – But non-parametric tests lose the most power for small data sets
Permutation and bootstrapping techniques also provide alternatives to parametric tests