Nonparametric Inference


Why Nonparametric Tests?
We have been primarily discussing parametric tests, i.e., tests that are valid only when certain assumptions hold. For example, t-tests and ANOVA both make assumptions about the shape of the distribution (normality) and about the groups having similar spread (homogeneity of variance). When these assumptions hold we can use standard sampling distributions (e.g. t-distribution, F-distribution) to find p-values.

Why Nonparametric Tests?
When these assumptions are violated it is necessary to turn to tests that do not have such stringent assumptions: nonparametric or "distribution-free" tests. Specifically, there are three cases which necessitate the use of nonparametric tests:
1) The data for the response is not at least interval scale, i.e. measurements. For example, the response might be ordinal.
2) The distribution of the data for the response is not normal. Recall that a relatively normal distribution is assumed for parametric tests.
3) There are severely unequal variances between groups, i.e. there is an obvious violation of the homogeneity of variance assumption required for parametric tests.
In the last two cases, we have interval level data, but it violates our parametric assumptions. Therefore, we no longer treat this data as interval, but as ordinal. In a sense, we demote it because it fails to meet specific assumptions.

Table of Parametric & Nonparametric Tests
Parametric Test                  | Nonparametric Test                      | Purpose of Test
Two-Sample t-Test (either case)  | Mann-Whitney/Wilcoxon Rank Sum Test     | Compare two independent samples
Paired t-Test                    | Sign Test or Wilcoxon Signed-Rank Test  | Compare dependent samples
Oneway ANOVA                     | Kruskal-Wallis Test                     | Compare k independent samples

Independent Samples For two populations we use… Mann-Whitney/Wilcoxon Rank Sum Test For three or more populations we use… Kruskal-Wallis Test (at the end)

Mann-Whitney/Wilcoxon Rank Sum Test
Alternative to the two-sample t-Test. Use when…
- populations being sampled are not normally distributed.
- sample sizes are small so assessing normality is not possible (n_i < 20).
- response is ordinal.

Mann-Whitney/Wilcoxon Rank Sum Test
General Hypotheses
Ho: distributions of pop. A and pop. B are the same, i.e. A = B
HA: distributions of pop. A and pop. B are NOT the same, i.e. A ≠ B
HA: distribution of pop. A is shifted to the right of pop. B, i.e. A > B
HA: distribution of pop. A is shifted to the left of pop. B, i.e. A < B

Mann-Whitney/Wilcoxon Rank Sum Test Ho: A = B vs. HA: A > B Q: Is there evidence that the values in population A are generally larger than those in population B?

Mann-Whitney/Wilcoxon Rank Sum Test (Test Procedure)
1) Rank all N = nA + nB observations in the combined sample from both populations in ascending order. Assign the average rank to tied observations.
2) Sum the ranks of the observations from populations A and B separately and denote the sums wA and wB.
3) For HA: A < B reject Ho if wA is “small” or wB is “big”. For HA: A > B reject Ho if wA is “big” or wB is “small”.
4) Use tables to determine how “big” or “small” the rank sums must be in order to reject Ho, or use software to conduct the test.

Mann-Whitney/Wilcoxon Rank Sum Test (Critical Value Table)
This table contains the value the smaller rank sum must be less than in order to reject Ho in a one-tailed test, for two significance levels (α = .05 & .01). Tables exist for two-tailed tests as well. n is the sample size of the group with the smaller rank sum.

Example: Huntington’s Disease and Fasting Glucose Levels
Davidson et al. studied the responses to oral glucose in patients with Huntington’s disease and in a group of control subjects. The five-hour responses are shown below. Is there evidence to suggest the five-hour glucose (mg present) is greater for patients with Huntington’s disease?
Ho: Control = Huntington’s, i.e. C = H
HA: Control < Huntington’s, i.e. C < H

Example: Observations & Ranks
Control Group (nA = 10)     Huntington’s Disease (nB = 11)
Glucose   Rank              Glucose   Rank
83        9                 85        10.5
73        3                 89        15
65        1.5               86        13
65        1.5               91        17
90        16                77        5.5
77        5.5               93        19
78        7                 100       21
97        20                82        8
85        10.5              92        18
75        4                 86        13
                            86        13
wA = 78                     wB = 153
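The ranking step can be checked with a few lines of code. The following is a minimal sketch, not the software used in the slides: the helper names `average_ranks` and `rank_sums` are made up for illustration, and the assignment of the 21 observations to the two groups is an assumption read off the table (it is the assignment that reproduces the stated rank sums wA = 78 and wB = 153).

```python
def average_ranks(values):
    """Return 1-based ranks for values, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block [pos, end] over all observations tied with order[pos].
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end + 2) / 2  # mean of the 1-based positions pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = avg
        pos = end + 1
    return ranks


def rank_sums(sample_a, sample_b):
    """Rank the combined sample and return the rank sum of each group."""
    ranks = average_ranks(list(sample_a) + list(sample_b))
    return sum(ranks[:len(sample_a)]), sum(ranks[len(sample_a):])


# Assumed group membership (see the note above).
control = [83, 73, 65, 65, 90, 77, 78, 97, 85, 75]
huntington = [85, 89, 86, 91, 77, 93, 100, 82, 92, 86, 86]

w_control, w_huntington = rank_sums(control, huntington)
print(w_control, w_huntington)  # 78.0 153.0
```

As a sanity check, the two rank sums always add up to N(N + 1)/2, here 21(22)/2 = 231.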

Example: Critical Value Table
Here, nC = 10 (control) and nH = 11 (Huntington’s). We will reject Ho: C = H in favor of HA: C < H if the rank sum for the control group is less than 86 at the α = .05 level and less than 77 at the α = .01 level.

Example: Decision/Conclusion
Using the Wilcoxon Rank Sum Test we have evidence to suggest that the five-hour glucose level for individuals with Huntington’s disease is greater than that for healthy controls (p < .05). Note: p < .05 because the observed rank sum for the control group (78) is less than 86, which is the critical value for α = .05.

Rank Sum Test in JMP
The p-values reported are based upon large sample approximations, which generally should not be used when sample sizes are small. Here the conclusion reached is the same, but in general we should use tables if they are available.
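For reference, a large-sample approximation of this kind can be sketched as follows. This is an illustration under stated assumptions, not JMP's actual computation: it uses the textbook null mean nA(N + 1)/2 and variance nA·nB(N + 1)/12 of the rank sum with no tie or continuity correction, so software output may differ slightly. The function name `rank_sum_normal_approx` is hypothetical.

```python
import math


def rank_sum_normal_approx(w_a, n_a, n_b):
    """Normal approximation to the rank sum of group A under Ho.

    Returns the z-score and the lower-tail p-value P(W_A <= w_a).
    Assumes no correction for ties and no continuity correction.
    """
    n_total = n_a + n_b
    mean = n_a * (n_total + 1) / 2
    sd = math.sqrt(n_a * n_b * (n_total + 1) / 12)
    z = (w_a - mean) / sd
    p_lower = 0.5 * (1 + math.erf(z / math.sqrt(2)))  # standard normal CDF at z
    return z, p_lower


# Control group in the Huntington's example: rank sum 78, nC = 10, nH = 11.
z, p_lower = rank_sum_normal_approx(78, 10, 11)
print(round(z, 2), round(p_lower, 3))
```

The lower tail is used because HA: C < H is supported by a small rank sum for the control group.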

Rank Sum Test in SPSS
Exact one-tailed p-value = .024/2 = .012

Dependent Samples
- Sign Test
- Wilcoxon Signed-Rank Test

Sign Test The sign test can be used in place of the paired t-test when we have evidence that the paired differences are NOT normally distributed. It can be used when the response is ordinal. Best used when the response is difficult to quantify and only improvement can be measured, i.e. subject got better, got worse, or no change. Magnitude of the paired difference is lost when using this test.

Sign Test The sign test looks at the number of (+) and (-) differences amongst the nonzero paired differences. A preponderance of +’s or –’s can indicate that some type of change has occurred. If the null hypothesis of no change is true we expect +’s and –’s to be equally likely to occur, i.e. P(+) = P(-) = .50 and the number of each observed follows a binomial distribution.

Example: Sign Test
A study evaluated hepatic arterial infusion of floxuridine and cisplatin for the treatment of liver metastases of colorectal cancer. Performance scores for 29 patients were recorded before and after infusion. Is there evidence that patients had a better performance score after infusion?

Example: Sign Test
[Data table for the 29 patients. Columns: Patient, Before (B) Infusion, After (A) Infusion, Difference (A – B)]

Example: Sign Test Ho: No change in performance score following infusion, or more specifically median change in performance score is 0. HA: Performance scores improve following infusion, or more specifically median change in performance score > 0. Intuitively we will reject Ho if there is a “large” number of +’s.

Example: Sign Test
[Data table for the 29 patients with the sign of each nonzero difference marked. Columns: Patient, Before (B) Infusion, After (A) Infusion, Difference (A – B)]
17 nonzero differences: 11 +’s, 6 –’s

Example: Sign Test
If Ho is true, X = the number of +’s has a binomial distribution with n = 17 and p = P(+) = .50. Therefore the p-value is simply P(X ≥ 11 | n = 17, p = .50) = .166 > α. We fail to reject Ho; there is insufficient evidence to conclude the performance score improves following infusion (p = .166).
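The binomial tail probability above is easy to reproduce. A minimal sketch using only the standard library (the function name `sign_test_upper_p` is made up for illustration):

```python
from math import comb


def sign_test_upper_p(n_plus, n_nonzero):
    """One-tailed sign test p-value: P(X >= n_plus) for X ~ Binomial(n_nonzero, 0.5)."""
    favorable = sum(comb(n_nonzero, k) for k in range(n_plus, n_nonzero + 1))
    return favorable / 2 ** n_nonzero


# 11 plus signs among 17 nonzero differences.
p_value = sign_test_upper_p(11, 17)
print(round(p_value, 3))  # 0.166
```

Note that the tail includes X = 11 itself; the tail starting at 12 would give a smaller value than the .166 reported.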

Wilcoxon Signed-Rank Test
The problem with the sign test is that the magnitude or size of the paired differences is lost. The Wilcoxon Signed-Rank Test uses ranks of the paired differences to retain some sense of their size. Use when the distribution of the paired differences is NOT normal or when the sample size is small. Can be used with an ordinal response.

Wilcoxon Signed Rank Test (Test Procedure)
1) Exclude any differences which are zero.
2) Put the rest of the differences in ascending order, ignoring their signs.
3) Assign them ranks. If any differences are equal, average their ranks.

Example: Wilcoxon Signed Rank Test
Resting Energy Expenditure (REE) for Patients with Cystic Fibrosis
A researcher believes that patients with cystic fibrosis (CF) expend more energy at rest than those without CF. To obtain a fair comparison she matches 13 patients with CF to 13 patients without CF on the basis of age, sex, height, and weight.

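Example: Wilcoxon Signed Rank Test
Pair | CF (C) | Healthy (H) | Difference d = C - H | Sign of Difference | Abs. Diff. |d| | Rank |d| | Signed Rank
1    | 1153   | 996         | 157                  | +                  | 157            | 6        | 6
2    | 1132   | 1080        | 52                   | +                  | 52             | 3        | 3
3    | 1165   | 1182        | -17                  | -                  | 17             | 2        | -2
4    | 1460   | 1452        | 8                    | +                  | 8              | 1        | 1
5    | 1634   | 1162        | 472                  | +                  | 472            | 13       | 13
6    | 1493   | 1619        | -126                 | -                  | 126            | 5        | -5
7    | 1358   | 1140        | 218                  | +                  | 218            | 8        | 8
8    | 1453   | 1123        | 330                  | +                  | 330            | 11       | 11
9    | 1185   | 1113        | 72                   | +                  | 72             | 4        | 4
10   | 1824   | 1463        | 361                  | +                  | 361            | 12       | 12
11   | 1793   | 1632        | 161                  | +                  | 161            | 7        | 7
12   | 1930   | 1614        | 316                  | +                  | 316            | 10       | 10
13   | 2075   | 1836        | 239                  | +                  | 239            | 9        | 9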

Example: Wilcoxon Signed Rank Test
Pair | CF (C) | Healthy (H) | Difference d = C - H | Signed Rank
1    | 1153   | 996         | 157                  | 6
2    | 1132   | 1080        | 52                   | 3
3    | 1165   | 1182        | -17                  | -2
4    | 1460   | 1452        | 8                    | 1
5    | 1634   | 1162        | 472                  | 13
6    | 1493   | 1619        | -126                 | -5
7    | 1358   | 1140        | 218                  | 8
8    | 1453   | 1123        | 330                  | 11
9    | 1185   | 1113        | 72                   | 4
10   | 1824   | 1463        | 361                  | 12
11   | 1793   | 1632        | 161                  | 7
12   | 1930   | 1614        | 316                  | 10
13   | 2075   | 1836        | 239                  | 9
We then calculate the sum of the positive ranks (T+) and the sum of the negative ranks (T-). Here we have T+ = 6 + 3 + 1 + 13 + 8 + 11 + 4 + 12 + 7 + 10 + 9 = 84 and T- = 2 + 5 = 7.
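The signed-rank bookkeeping can be sketched in code as a check on the hand calculation. This is an illustrative sketch, not the slides' software; `signed_rank_sums` is a hypothetical helper, and the CF and Healthy values are taken from the table columns. Ranking the absolute differences this way gives |d| = 218, 239 and 316 the ranks 8, 9 and 10.

```python
def signed_rank_sums(first, second):
    """Return (T+, T-) for paired samples, using average ranks for tied |d|."""
    # Step 1: paired differences, excluding zeros.
    diffs = [a - b for a, b in zip(first, second) if a != b]
    # Step 2: order the differences by absolute value.
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    # Step 3: assign ranks, averaging over ties.
    ranks = [0.0] * len(diffs)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and abs(diffs[order[end + 1]]) == abs(diffs[order[pos]]):
            end += 1
        avg = (pos + end + 2) / 2  # mean of the 1-based positions pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = avg
        pos = end + 1
    t_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    t_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return t_plus, t_minus


cf = [1153, 1132, 1165, 1460, 1634, 1493, 1358, 1453, 1185, 1824, 1793, 1930, 2075]
healthy = [996, 1080, 1182, 1452, 1162, 1619, 1140, 1123, 1113, 1463, 1632, 1614, 1836]

t_plus, t_minus = signed_rank_sums(cf, healthy)
print(t_plus, t_minus)  # 84.0 7.0
```

The two sums always add up to n(n + 1)/2 for the n nonzero differences, here 13(14)/2 = 91.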

Wilcoxon Signed Rank Test (Test Statistic)
Intuitively we will reject Ho, which states that there is no difference between the populations, if one of these rank sums is “large” and the other is “small”. The Wilcoxon Signed Rank Test uses the smaller rank sum, T = min(T+, T-), as the test statistic.

Example: Wilcoxon Signed Rank Test
For the cystic fibrosis example we have the following hypotheses:
Ho: there is no difference in the resting energy expenditure of individuals with CF and healthy controls who are the same gender, age, height, and weight (MEDIAN PAIRED DIFFERENCE = 0).
HA: the resting energy expenditure of individuals with CF is greater than that of healthy individuals who are the same gender, age, height, and weight (MEDIAN PAIRED DIFFERENCE > 0).

Example: Wilcoxon Signed Rank Test
HA: the resting energy expenditure of individuals with CF is greater than that of healthy individuals who are the same gender, age, height, and weight. The alternative is clearly supported if T+ is “large” or T- is “small”. The test statistic is T = min(T+, T-) = 7. Is T = 7 considered small, i.e. what is the corresponding p-value? To answer this question we need a Wilcoxon Signed Rank Test table or statistical software.

Example: Wilcoxon Signed Rank Test
This table gives the value of T = min(T+, T-) that our observed value must be less than in order to reject Ho for both two- and one-tailed tests. Here we have n = 13 & T = 7. We can see that our test statistic is less than 21 (α = .05) and 12 (α = .01), so we will reject Ho and we also estimate that our p-value < .01.

Example: Wilcoxon Signed Rank Test
We conclude that individuals with cystic fibrosis (CF) have a larger resting energy expenditure when compared to healthy individuals who are the same gender, age, height, and weight (p < .01).

Analysis in JMP Select Test Mean from Difference pull-down menu, 0 for null value, and check Wilcoxon option. The test statistic is reported as (T+ - T-)/2 = (84 – 7)/2 = 38.50 but we only need p-value = .0023.
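When no table or software is at hand, the exact one-tailed p-value can be obtained by counting. Under Ho each of the n ranks is equally likely to carry a + or a - sign, so P(T- ≤ t) is the number of subsets of {1, …, n} with sum at most t, divided by 2^n. The sketch below assumes no ties and no zero differences (true for the CF data); `signed_rank_exact_p` is a hypothetical name, and whether JMP computes its p-value this way is not stated in the slides.

```python
def signed_rank_exact_p(t_small, n):
    """Exact P(T <= t_small) for a rank sum of n untied signed ranks under Ho."""
    max_sum = n * (n + 1) // 2
    # counts[s] = number of subsets of {1, ..., n} whose elements sum to s.
    counts = [0] * (max_sum + 1)
    counts[0] = 1
    for rank in range(1, n + 1):
        # Go downwards so each rank is used at most once per subset.
        for s in range(max_sum, rank - 1, -1):
            counts[s] += counts[s - rank]
    return sum(counts[: t_small + 1]) / 2 ** n


# CF example: T- = 7 with n = 13 nonzero differences.
p_exact = signed_rank_exact_p(7, 13)
print(round(p_exact, 4))  # 0.0023
```

For the CF example this gives 19/8192 ≈ .0023, in line with the p-value shown in the JMP output.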

Analysis in SPSS Click on CF first and then Healthy to specify that the paired difference will be defined as CF – Healthy & specify which tests to conduct. Note: the Difference column is not actually used in the SPSS analysis.

Analysis in SPSS For one-tailed Wilcoxon Signed Rank Test our p-value = .007/2 = .0035 (not exact!) For the Sign Test we have a one-tailed p-value = .022/2 = .011

Independent Samples If we have three or more populations to compare we use… Kruskal – Wallis Test

Kruskal-Wallis Test One-way ANOVA for a completely randomized design is based on the assumption of normality and equality of variance. The nonparametric alternative not relying on these assumptions is called the Kruskal-Wallis Test. Like the Mann-Whitney/Wilcoxon Rank Sum Test we use the sum of the ranks assigned to each group when considering the combined sample as the basis for our test statistic.

Kruskal-Wallis Test
Basic Idea:
1) Looking at all observations together, rank them.
2) Let R1, R2, …, Rk be the sum of the ranks of each group.
3) If some Ri’s are much larger than others, it indicates the response values in different groups come from different populations.

Kruskal-Wallis Test
The test statistic is
$H = \frac{12}{N(N+1)} \sum_{i=1}^{k} \frac{R_i^2}{n_i} - 3(N+1)$
where N = total sample size = n1 + n2 + ... + nk. Under the null hypothesis, H has an approximate chi-square distribution with df = k - 1, i.e. $H \sim \chi^2_{k-1}$. The approximation is OK when each group contains at least 5 observations.

Chi-squared Distribution and p-value Area = p-value

Example: Kruskal-Wallis Test
A clinical trial evaluating the fever reducing effects of aspirin, ibuprofen, and acetaminophen was conducted. Study subjects were adults seen in an ER with diagnoses of flu with body temperatures between 100°F and 100.9°F. Subjects were randomly assigned to treatment. Changes in body temperature were recorded 2 hrs. after administration of treatments.

Example: Kruskal-Wallis Test
Resulting Data: Temperature Decrease (deg. F)
Aspirin   Rank    Ibuprofen   Rank    Acetaminophen                 Rank
.95       8       .39         5       .19                           4
1.48      14      .44         6       1.02                          9
1.33      12      1.31        11      .07                           3
1.28      10      2.48        15      .01                           2
                  1.39        13      .62                           7
                                      -.39 (i.e. temp increase)     1
R1 = 44           R2 = 50             R3 = 26
n1 = 4            n2 = 5              n3 = 6
N = 15

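Example: Kruskal-Wallis Test
N = 15; R1 = 44, R2 = 50, R3 = 26; n1 = 4, n2 = 5, n3 = 6
$H = \frac{12}{15(16)} \left( \frac{44^2}{4} + \frac{50^2}{5} + \frac{26^2}{6} \right) - 3(16) = 6.833$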
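The rank sums, H and the chi-square p-value for this example can be reproduced with a short script. This is a sketch under stated assumptions: it uses the usual Kruskal-Wallis statistic H = 12/(N(N+1)) · Σ Ri²/ni - 3(N+1), it assumes there are no tied observations (true for these data), and it uses the closed form exp(-H/2) for the chi-square upper tail, which holds only for df = 2. The function name `kruskal_wallis_h` is made up for illustration.

```python
import math


def kruskal_wallis_h(groups):
    """Return (rank sums, H) for a list of samples with no tied values."""
    pooled = sorted(x for group in groups for x in group)
    # With no ties, the rank of a value is its 1-based position in the pooled sample.
    rank_of = {x: i + 1 for i, x in enumerate(pooled)}
    n_total = len(pooled)
    sums = [sum(rank_of[x] for x in group) for group in groups]
    h = (12 / (n_total * (n_total + 1))
         * sum(r * r / len(group) for r, group in zip(sums, groups))
         - 3 * (n_total + 1))
    return sums, h


aspirin = [0.95, 1.48, 1.33, 1.28]
ibuprofen = [0.39, 0.44, 1.31, 2.48, 1.39]
acetaminophen = [0.19, 1.02, 0.07, 0.01, 0.62, -0.39]

rank_totals, h_stat = kruskal_wallis_h([aspirin, ibuprofen, acetaminophen])
p_value = math.exp(-h_stat / 2)  # chi-square upper tail, valid for df = 2 only

print(rank_totals, round(h_stat, 3), round(p_value, 3))  # [44, 50, 26] 6.833 0.033
```

For other numbers of groups the chi-square tail has no such simple form, so a table or statistical software would be needed for the p-value.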

Chi-squared Distribution and p-value Area = .033

Kruskal-Wallis in JMP (Demo) Analyze > Fit Y by X RESULTS R1 = 44 n1 = 4 R2 = 50 n2 = 5 R3 = 26 n3 = 6 H = 6.833 df = 2 p = .033

Kruskal-Wallis in SPSS (Demo) RESULTS R1 /n1 = 11.00 R2 /n2 = 10.00 R3 /n3 = 4.33 H = 6.833 df = 2 p = .033

Decision/Conclusion
Using the Kruskal-Wallis test we have evidence to suggest that the temperature changes after taking the different drugs are not the same (p = .033). Now we might like to know which drugs significantly differ from one another.

Multiple Comparisons for Kruskal – Wallis Test
If we decide at least two populations differ in terms of what is typical of their values, we can use multiple comparisons to determine which populations differ. To do this we calculate an approximate p-value for each pair-wise comparison and then compare that p-value to a Bonferroni corrected significance level (α).

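Multiple Comparisons for Kruskal – Wallis Test
To determine if group i significantly differs from group j we compute
$z_{ij} = \frac{R_i/n_i - R_j/n_j}{\sqrt{\frac{N(N+1)}{12}\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}}$
and then compute p-value = $P(Z > |z_{ij}|)$ and compare it to α/(2m), where m is the number of possible pair-wise comparisons, m = k(k - 1)/2.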

Multiple Comparisons for Kruskal – Wallis Test
Comparing Aspirin to Acetaminophen (N = 15)
Aspirin: R1 = 44, n1 = 4
Acetaminophen: R3 = 26, n3 = 6
Computing the Bonferroni corrected significance level we have .05/(2·3) = .00833
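The Aspirin versus Acetaminophen comparison can be sketched numerically. The statistic used here is an assumption: it is the common large-sample form for pairwise comparisons after a Kruskal-Wallis test (difference in mean ranks divided by sqrt(N(N+1)/12 · (1/ni + 1/nj))), with no tie correction, and the function names are hypothetical. The p-value is the one-tailed normal probability, compared to α/(2m).

```python
import math


def pairwise_z(r_i, n_i, r_j, n_j, n_total):
    """Approximate z for comparing the mean ranks of groups i and j (assumed form)."""
    diff = r_i / n_i - r_j / n_j
    se = math.sqrt(n_total * (n_total + 1) / 12 * (1 / n_i + 1 / n_j))
    return diff / se


def normal_upper_tail(z):
    """P(Z > z) for a standard normal Z."""
    return 0.5 * math.erfc(z / math.sqrt(2))


k = 3                   # number of groups
m = k * (k - 1) // 2    # number of pair-wise comparisons
alpha_corrected = 0.05 / (2 * m)

z_stat = pairwise_z(44, 4, 26, 6, 15)      # Aspirin (R1, n1) vs Acetaminophen (R3, n3)
p_value = normal_upper_tail(abs(z_stat))

print(round(z_stat, 3), round(alpha_corrected, 5), p_value < alpha_corrected)
```

Under these assumptions the p-value comes out slightly above the corrected level of .00833, so the comparison is not declared significant.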

Multiple Comparisons for Kruskal – Wallis Test
As this comparison is not significant, none of the others will be either, so how can this be? The problem is that the Bonferroni correction is too conservative, and the approximate normality of the multiple comparison statistic is valid only when sample sizes are “large”, whereas the sample sizes here are quite small. Thus the comparison shown is fine for a demonstration of the procedure, but the results cannot be trusted.

Nonparametric Multiple Comparisons in JMP
