Nonparametric Statistics Timothy C. Bates


Parametric Statistics 1
Assume the data are drawn from populations with a known distribution (usually normal).
Compute the probability that groups are related/unrelated or the same/different, given that underlying model.
Examples: t-test, Pearson’s correlation, ANOVA…

Parametric Statistics 2
Assumptions of parametric statistics:
1. Observations are independent
2. Your data are normally distributed
3. Variances are equal across groups (tests can be modified to cope with unequal σ²)
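The assumptions listed above can be checked before choosing a test. The following is a minimal sketch in R, assuming two hypothetical groups of simulated scores (the data, group names, and seed are illustrative and not from the slides). It uses the Shapiro-Wilk test for normality, the F test for equal variances, and compares Student's t with Welch's t, which is one common way to cope with unequal variances.

```r
# Hypothetical data: two groups of 30 simulated scores (not from the slides).
set.seed(1)
g1 <- rnorm(30, mean = 100, sd = 15)
g2 <- rnorm(30, mean = 108, sd = 25)

# Assumption 2: Shapiro-Wilk test of normality within each group.
normal_p <- c(g1 = shapiro.test(g1)$p.value,
              g2 = shapiro.test(g2)$p.value)

# Assumption 3: F test for equality of two variances.
var_p <- var.test(g1, g2)$p.value

# Student's t assumes equal variances; Welch's t (R's default) does not.
student <- t.test(g1, g2, var.equal = TRUE)
welch   <- t.test(g1, g2)

print(normal_p)
print(var_p)
print(c(student_df = unname(student$parameter),
        welch_df   = unname(welch$parameter)))
```

Welch's test pays for dropping the equal-variance assumption with fewer degrees of freedom than Student's test.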

Non-parametric Statistics?
Non-parametric statistics do not assume any underlying distribution.
They estimate the distribution AND compute the probability that your groups are related/the same or unrelated/different.

Nonparametric ≠ No parameters
Model structure is not specified a priori but is instead determined from the data.
The data are parameterised by the analysis.
AKA: “distribution free”

Non-parametric Statistics Assumptions of non-parametric statistics 1. Observations are independent

Non-parametric Statistics?
Non-parametric statistics do not assume any underlying distribution.
Estimating or modeling this distribution reduces their power to detect effects…
So never use them unless you have to.

Why use a Non-parametric Statistic?
Very small samples (<20 replicates)
High probability of violating the assumption of normality
Violations lead to spurious Type-I (false alarm) errors

Why use a Non-parametric Statistic?
Outliers more often lead to spurious Type-I (false alarm) errors in parametric statistics.
Nonparametric statistics reduce data to an ordinal rank, which reduces the impact or leverage of outliers.
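To see why ranking blunts an outlier, here is a small sketch in R using hypothetical scores (not from the slides): the same six values, once as recorded and once with the largest value mistyped as an extreme outlier.

```r
x     <- c(12, 15, 11, 14, 13, 16)    # hypothetical scores
x_out <- c(12, 15, 11, 14, 13, 160)   # same scores, largest value now an outlier

# The mean is dragged far upwards by the single outlier...
print(c(mean_x = mean(x), mean_x_out = mean(x_out)))

# ...but the ranks do not change at all.
print(rank(x))
print(rank(x_out))
ranks_unchanged <- identical(rank(x), rank(x_out))
print(ranks_unchanged)
```

Any statistic computed on the ranks is therefore identical for both versions of the data.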

Error
Type-I error: False alarm for a bogus effect. We reject the null hypothesis when it is really true.
Type-II error: Miss a real effect. We fail to reject the null hypothesis when it is really false.
Type-III error: :-) lazy, incompetent, or willful ignorance of the truth
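A Type-I error rate can be illustrated by simulation. The sketch below is an assumption-laden illustration, not part of the slides: it draws both groups from the same normal population, so every significant result is a false alarm, and counts how often a t-test rejects at the 0.05 level. The sample size, number of simulations, and seed are arbitrary choices.

```r
set.seed(42)
n_sims <- 2000

# Both groups come from the same normal population, so the null is true
# and every p < 0.05 is a Type-I error (false alarm).
p_values <- replicate(n_sims, t.test(rnorm(15), rnorm(15))$p.value)

false_alarm_rate <- mean(p_values < 0.05)
print(false_alarm_rate)   # should land close to the nominal 0.05
```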

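Power
Power = 1 − β, the probability of rejecting the null hypothesis when it is really false (β is the Type-II error rate).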
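Power is conventionally defined as one minus the Type-II error rate (β). As a rough illustration, R's built-in `power.t.test` can compute it for a two-sample t-test. The effect size, standard deviation, and sample size below are hypothetical values chosen for the example.

```r
# Hypothetical design: 20 per group, true difference of 1 SD, alpha = 0.05.
pw <- power.t.test(n = 20, delta = 1, sd = 1, sig.level = 0.05)

power <- pw$power    # probability of detecting the effect
beta  <- 1 - power   # Type-II error rate (probability of a miss)
print(c(power = power, beta = beta))
```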

Non-parametric Choices
Data type?
  discrete: χ²
  continuous: Question?
    association: Spearman’s Rank
    different central value: Number of groups?
      two groups: Mann-Whitney U / Wilcoxon’s Rank Sums
      more than 2: Kruskal-Wallis test
    difference in σ²: Brown-Forsythe

Non-parametric Choices
Data type?
  discrete: χ² (no parametric alternative)
  continuous: Question?
    association: Spearman’s Rank (like a Pearson’s r)
    different central value: Number of groups?
      two groups: Mann-Whitney U / Wilcoxon’s Rank Sums (like Student’s t)
      more than 2: Kruskal-Wallis test (like ANOVA)
    difference in σ²: Brown-Forsythe (like F-test)
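The decision chart can be written down as a small lookup. The function below is a hypothetical helper invented for this illustration (its name, arguments, and return strings are not from the slides); it simply returns the name of the test the chart points to.

```r
# Hypothetical helper: maps the chart's questions to a test name.
choose_test <- function(data_type = c("discrete", "continuous"),
                        question  = c("association", "central value", "variance"),
                        n_groups  = 2) {
  data_type <- match.arg(data_type)
  question  <- match.arg(question)
  if (data_type == "discrete") {
    return("chi-squared test")
  }
  switch(question,
         "association"   = "Spearman's rank correlation",
         "variance"      = "Brown-Forsythe test",
         "central value" = if (n_groups == 2) {
           "Mann-Whitney U (Wilcoxon rank-sum)"
         } else {
           "Kruskal-Wallis test"
         })
}

print(choose_test("continuous", "central value", n_groups = 3))
```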

Chi-Squared (χ²)
χ² tests the null hypothesis that observed events occur with an expected frequency.
In large samples the test statistic is approximately distributed as χ².
e.g. H₀: “This six-sided die is fair”. Expect all 6 outcomes to occur equally often.
Assumptions:
Observations are independent
Outcomes are mutually exclusive
Sample is not small (small samples require an exact test, e.g. the binomial test)
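The fair-die hypothesis can be tested with `chisq.test`. The counts below are hypothetical (60 invented rolls, not data from the slides). The last lines sketch the exact alternative for a small two-outcome sample using `binom.test`, again with made-up numbers.

```r
# Hypothetical counts for faces 1..6 over 60 rolls.
rolls <- c(8, 12, 9, 11, 10, 10)

# H0: every face has probability 1/6.
die_test <- chisq.test(rolls, p = rep(1/6, 6))
print(die_test$expected)    # expected count per face under H0
print(die_test$statistic)   # chi-squared statistic
print(die_test$parameter)   # degrees of freedom = number of faces - 1

# Small two-outcome sample: use an exact binomial test instead.
# Hypothetical: 7 heads in 10 tosses of a coin assumed fair under H0.
exact <- binom.test(7, 10, p = 0.5)
print(exact$p.value)
```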

Chi-Squared χ² formula
χ² = the sum, over all categories, of the squared difference between the observed and expected frequency, divided by the expected frequency:
$\chi^2 = \sum_i (O_i - E_i)^2 / E_i$

χ² and contingency tables
χ² essentially tests whether each cell in a contingency table has its expected value.
In a 2-way table, the expected value of a cell under independence is (row total × column total) / grand total.
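For a 2-way table, the expected count of a cell under independence of rows and columns is its row total times its column total, divided by the grand total. The sketch below computes that by hand for a hypothetical 2 × 2 table (invented counts and labels) and compares it with what `chisq.test` reports.

```r
# Hypothetical 2 x 2 table of counts.
tab <- matrix(c(30, 20, 10, 40), nrow = 2,
              dimnames = list(group = c("A", "B"), outcome = c("yes", "no")))

# Expected counts under independence: row total * column total / grand total.
expected_by_hand <- outer(rowSums(tab), colSums(tab)) / sum(tab)

# chisq.test computes the same expected counts internally.
# correct = FALSE turns off the Yates continuity correction.
ct <- chisq.test(tab, correct = FALSE)

print(expected_by_hand)
print(ct$expected)
print(ct$statistic)
```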

Example: coin toss
Random sample of 100 tosses of a coin believed to be fair.
We observed 45 heads and 55 tails.
Is the coin fair?

Coin toss
If H₀ is true, our test statistic is drawn from a χ² distribution with df = 1.
$\chi^2 = \frac{(45-50)^2}{50} + \frac{(55-50)^2}{50} = 0.5 + 0.5 = 1$
χ²(1) = 1, p > 0.3

Coin toss χ² in R
chisq.test(c(45,55), p=c(.5,.5))
Chi-squared test for given probabilities
χ² = 1, df = 1, p = 0.3173
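The coin-toss result can be reproduced step by step. This sketch computes the statistic by hand from the 45/55 counts in the example and checks it against `chisq.test`; the variable names are illustrative.

```r
observed <- c(heads = 45, tails = 55)
expected <- sum(observed) * c(0.5, 0.5)   # 50 heads, 50 tails under H0

# Sum of (observed - expected)^2 / expected over both outcomes.
chi_sq <- sum((observed - expected)^2 / expected)

# Upper-tail probability of a chi-squared distribution with df = 1.
p_value <- pchisq(chi_sq, df = length(observed) - 1, lower.tail = FALSE)

builtin <- chisq.test(observed, p = c(0.5, 0.5))
print(c(chi_sq = chi_sq, p_value = p_value))
print(builtin)
```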

Spearman Rank test (ρ (rho))
Named after Charles Spearman.
Non-parametric measure of correlation.
Assesses how well an arbitrary monotonic function describes the relationship between two variables.
Does not require the relationship to be linear.
Does not require interval measurement.

Spearman Rank test (ρ (rho))
Mathematically, it is simply a Pearson’s r computed on ranked data. When there are no tied ranks:
$\rho = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}$
d = difference in rank of a given pair
n = number of pairs
Alternative test = Kendall's Tau (Kendall's τ)
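The claim that Spearman's ρ is a Pearson's r on ranks can be checked directly. The sketch below uses hypothetical paired measurements with no ties (invented for the example) and computes ρ three ways: R's built-in method, Pearson on the ranks, and the shortcut based on rank differences d and number of pairs n, which holds only when there are no ties.

```r
# Hypothetical paired measurements, no tied values.
x <- c(2.1, 3.4, 1.9, 5.6, 4.2, 6.8, 3.9)
y <- c(1.0, 2.7, 1.5, 9.1, 3.3, 20.4, 4.0)

rho_builtin <- cor(x, y, method = "spearman")
rho_ranks   <- cor(rank(x), rank(y))        # Pearson's r on the ranks

d <- rank(x) - rank(y)                      # difference in rank per pair
n <- length(x)                              # number of pairs
rho_formula <- 1 - 6 * sum(d^2) / (n * (n^2 - 1))

print(c(builtin = rho_builtin, on_ranks = rho_ranks, formula = rho_formula))
```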

Mann-Whitney U
AKA: “Wilcoxon rank-sum test” (Mann & Whitney, 1947; Wilcoxon, 1945)
Non-parametric test for a difference in the medians of two independent samples
Assumptions:
Samples are independent
Observations can be ranked (ordinal or better)

Mann-Whitney U
U tests the difference in the medians of two independent samples.
n₁ = number of obs in sample 1
n₂ = number of obs in sample 2
R = sum of ranks of the lower-ranked sample
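The formula on this slide did not survive in the transcript. For orientation, the standard textbook forms are given below, which may differ in labelling from the one the slide showed. Here $R_1$ is the sum of ranks of sample 1 after ranking all $n_1 + n_2$ observations together; $U_1$ and $U_2$ are the two complementary versions of the statistic.

```latex
\begin{aligned}
U_1 &= R_1 - \frac{n_1 (n_1 + 1)}{2} \\
U_2 &= n_1 n_2 + \frac{n_1 (n_1 + 1)}{2} - R_1 \\
U_1 + U_2 &= n_1 n_2
\end{aligned}
```

Tables of critical values are usually entered with the smaller of $U_1$ and $U_2$.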

Mann-Whitney U or t-test?
Should you use it over the t-test?
Yes, if you have a very small sample (<20) (central limit assumptions not met).
Possibly, if your data are inherently ordinal.
Otherwise, probably not.
It is less prone to Type-I error (spurious significance) due to outliers.
But it does not in fact handle comparisons of samples whose variances differ very well (use an unequal-variance t-test with rank data).
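The suggestion to use an unequal-variance t-test on rank data can be sketched as follows. The two samples are simulated, hypothetical data with deliberately different spreads. The observations are ranked jointly, then Welch's t-test (R's default `t.test`) is run on the ranks, with the ordinary Mann-Whitney test shown alongside for comparison.

```r
set.seed(7)
a <- rnorm(25, mean = 0,   sd = 1)   # hypothetical sample, small variance
b <- rnorm(25, mean = 0.8, sd = 4)   # hypothetical sample, large variance

# Rank all observations together, then split the ranks back by group.
r  <- rank(c(a, b))
ra <- r[seq_along(a)]
rb <- r[-seq_along(a)]

# Welch's t-test on the ranks (var.equal = FALSE is the default).
welch_on_ranks <- t.test(ra, rb)

# Ordinary Mann-Whitney U / Wilcoxon rank-sum test for comparison.
mw <- wilcox.test(a, b)

print(c(welch_on_ranks = welch_on_ranks$p.value, mann_whitney = mw$p.value))
```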

Aesop: Mann-Whitney U Example Suppose that Aesop is dissatisfied with his classic experiment in which one tortoise was found to beat one hare in a race. He decides to carry out a significance test to discover whether the results could be extended to tortoises and hares in general…

Aesop 2: Mann-Whitney U
He collects a sample of 6 tortoises and 6 hares, and makes them all run his race. The order in which they reach the finishing post (their rank order) is as follows:
tort = c(1, 7, 8, 9, 10, 11)
hare = c(2, 3, 4, 5, 6, 12)
The original tortoise still goes at warp speed and the original hare is still lazy, but the others run truer to stereotype.

Aesop 3: Mann-Whitney U
wilcox.test(tort, hare)
Wilcoxon: W = 25, p-value = 0.31
Tortoises are not faster (but neither are hares).
tort = c(1, 7, 8, 9, 10, 11) (n₂ = 6)
hare = c(2, 3, 4, 5, 6, 12) (n₁ = 6, R₁ = 32)
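The W reported by R can be reproduced by hand from the finishing orders. The sketch below uses the tortoise and hare data from the example and the standard rank-sum definition of U (rank sum minus n(n+1)/2). R's `wilcox.test` reports this statistic for its first argument; the variable names are illustrative.

```r
tort <- c(1, 7, 8, 9, 10, 11)
hare <- c(2, 3, 4, 5, 6, 12)

# Rank all 12 animals together and sum the ranks per group.
ranks  <- rank(c(tort, hare))
R_tort <- sum(ranks[1:6])
R_hare <- sum(ranks[7:12])

# U for each group: rank sum minus n(n + 1) / 2, with n = 6.
U_tort <- R_tort - 6 * 7 / 2
U_hare <- R_hare - 6 * 7 / 2

res <- wilcox.test(tort, hare)   # W is the U of the first sample (tort)
print(c(R_tort = R_tort, R_hare = R_hare, U_tort = U_tort, U_hare = U_hare))
print(res)
```

The two U values always sum to n₁ × n₂ (here 36), so either one determines the other.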

Aesop 4: Mann-Whitney U
Wilcoxon: W = 25, p-value = 0.31
Tortoises are not faster (but neither are hares).
Welch Two Sample t-test: t = 1.14, df = 10, p-value = 0.28
Alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval: -2.25 to 6.91
Sample estimates: mean of x = 7.67, mean of y = 5.33
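The t-test comparison can be rerun on the same finishing orders. This sketch applies R's default (Welch) two-sample t-test to the tortoise and hare data from the example and prints the pieces quoted on the slide.

```r
tort <- c(1, 7, 8, 9, 10, 11)
hare <- c(2, 3, 4, 5, 6, 12)

tt <- t.test(tort, hare)   # Welch two-sample t-test (R's default)

print(tt$statistic)   # t
print(tt$parameter)   # degrees of freedom
print(tt$p.value)     # two-sided p-value
print(tt$conf.int)    # 95 percent confidence interval for the mean difference
print(tt$estimate)    # group means
```

Both samples happen to have the same variance and size here, which is why the Welch degrees of freedom come out at the pooled value of 10.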

Power comparison with continuous normal data
tort = hare =
Wilcoxon: W = 25, p = 0.31
t-test: t.test(tort, hare, var.equal = TRUE) gives t(10) = 1.5, p = 0.16

Wilcoxon signed-rank test (related samples)
Same idea as the Mann-Whitney U, generalized to matched samples.
Non-parametric counterpart of the paired-samples (non-independent) t-test.
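In R the signed-rank test is run with `wilcox.test(..., paired = TRUE)`. The sketch below uses hypothetical before/after measurements on 8 matched cases (invented values, chosen so that no two differences have the same size) and shows the paired t-test next to it.

```r
# Hypothetical matched measurements on the same 8 cases.
before <- c(125, 130, 118, 140, 136, 128, 133, 122)
after  <- c(120, 126, 119, 131, 130, 121, 125, 120)

# V = sum of the ranks of the positive differences (before - after).
sr <- wilcox.test(before, after, paired = TRUE)

# Parametric counterpart on the same pairs.
pt <- t.test(before, after, paired = TRUE)

print(sr)
print(pt$p.value)
```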

Kruskal-Wallis
Non-parametric one-way analysis of variance by ranks (named after William Kruskal and W. Allen Wallis).
Tests equality of medians across groups.
It is an extension of the Mann-Whitney U test to 3 or more groups.
Does not assume a normal population.
Assumes population variances among groups are equal.
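A minimal Kruskal-Wallis sketch in R, using three hypothetical groups of four scores each (invented data with completely separated groups, so the effect is deliberately extreme). `kruskal.test` accepts a list of numeric vectors, one per group.

```r
# Hypothetical scores for three independent groups.
scores <- list(a = c(1, 2, 3, 4),
               b = c(5, 6, 7, 8),
               c = c(9, 10, 11, 12))

kw <- kruskal.test(scores)

print(kw$statistic)   # Kruskal-Wallis H
print(kw$parameter)   # degrees of freedom = number of groups - 1
print(kw$p.value)
```

With only two groups this test addresses the same question as the Mann-Whitney U.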