Nonparametric Statistics


Nonparametric Statistics: SPSS syntax examples
* binomial test.
npar tests /binomial(.50) = write(50).
* sign test.
npar tests /sign = read with write (paired).
* Wilcoxon-Mann-Whitney rank sum test.
npar tests /m-w = write by female(1 0).
* Kruskal-Wallis test.
npar tests /k-w = write by prog(1 3).


Learning Objectives
1. Distinguish parametric and nonparametric test procedures
2. Explain a variety of nonparametric test procedures
3. Solve hypothesis-testing problems using nonparametric tests
4. Compute Spearman’s rank correlation

Hypothesis Testing Procedures (many more tests exist than are shown here)

Parametric Test Procedures
1. Require interval-scale or ratio-scale data (whole numbers or fractions); example: height in inches (72, 60.5, 54.7)
2. Have stringent assumptions; example: normal distribution
3. Examples: t-test, ANOVA

Advantages of Nonparametric Tests
1. Easy to understand and use
2. Usable with nominal data; fewer assumptions
3. Results may be as exact as parametric procedures
4. Appropriate for non-normal population distributions

Disadvantages of Nonparametric Tests
1. May waste information if the data permit using parametric procedures; example: converting data from a ratio to a nominal scale
2. Difficult to compute by hand for large samples
3. Tables not widely available

Recommended Statistical Techniques (summary table shown on slide)

Frequently Used Nonparametric Tests
1. Sign Test
2. Wilcoxon Rank Sum Test
3. Wilcoxon Signed Rank Test
4. Kruskal-Wallis H-Test
5. Friedman’s Fr-Test
6. Spearman’s Rank Correlation Coefficient

Single-Variable Chi-Square Test

Single-Variable Chi-Square Test
1. Compares the observed frequencies of categories to the frequencies that would be expected if the null hypothesis were true.
2. Assumptions: the data are a random sample; each subject contributes only one entry to the chi-square table; the expected frequency for each category should be at least 5.

Single-Variable Chi-Square Test
Example: a researcher randomly samples 42 18-year-olds and lets them taste three different brands.
Open single variable chi.sav in SPSS. Click Analyze  Nonparametric Tests  Chi-Square. Move pbrand into the Test Variable List. Click Options and select Descriptive under Statistics. Click OK.
Equivalent syntax (equal expected frequencies):
NPAR TESTS CHISQUARE=pbrand /EXPECTED=EQUAL.
Syntax for testing against unequal expected frequencies:
NPAR TESTS CHISQUARE=COLA /EXPECTED=12 13 17.

Single-Variable Chi-Square Test
Ho: There is no preference among the three brands.

Preferred Brand   Observed N   Expected N   Residual
Brand A           10           14.0         -4.0
Brand B           10           14.0         -4.0
Brand C           22           14.0          8.0
Total             42

Test Statistics: Chi-Square = 6.857, df = 2, Asymp. Sig. = .032
(0 cells have expected frequencies less than 5; the minimum expected cell frequency is 14.0.)

Since the p-value = 0.032 < 0.05, reject Ho. Brand C is the favored brand.
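The SPSS output above can be cross-checked outside SPSS. Below is a minimal sketch using Python's scipy (an addition, not part of the original slides); the counts come from the observed-frequency table, with Brand B's count of 10 implied by the total of 42.

```python
from scipy.stats import chisquare

# Observed counts for Brands A, B, C (Brand B = 42 - 10 - 22 = 10)
observed = [10, 10, 22]

# Expected frequencies default to equal (42 / 3 = 14 per brand)
stat, p = chisquare(observed)
print(round(stat, 3), round(p, 3))  # 6.857 0.032
```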

Sign Test

Sign Test
1. Tests the population median, η (eta)
2. Corresponds to the one-sample t-test for a mean
3. The number of sample values above (or below) the hypothesized median is used to test hypotheses
4. For small samples, the test statistic is binomial; the normal approximation can be used if n ≥ 20

Sign Test Uses the P-Value to Make the Decision
The test statistic follows a binomial distribution; here, n = 8 and p = 0.5. The p-value is the probability of getting an observation at least as extreme as the one obtained. If 7 of 8 observations favor Ha, then p-value = P(x ≥ 7) = .031 + .004 = .035. If α = .05, reject H0, since p-value ≤ α.

Sign Test Example
Seven people rated a new product on a 5-point Likert scale (1 = poor to 5 = excellent). The ratings are: 4 5 1 4 4 4 5. Is there evidence that the product has good market potential? (No assumption about the shape of the population distribution is required.)

R Sign Test
If there is no difference in preferences, then a rating > 3 is as likely as a rating < 3. Hence, under Ho the number of ratings above 3 follows a binomial distribution with p = 0.5 and n = 7. Test statistic: S = 6 (six of the seven ratings exceed 3).

Sign Test Solution
H0: η = 3 vs. Ha: η > 3; α = .05
Test statistic: S = 6 (only the third rating is less than 3: 4, 5, 1, 4, 4, 4, 5)
The p-value is P(X ≥ 6):
> sign = 1 - pbinom(5, size = 7, prob = .5)
> sign
[1] 0.0625
Since p-value = 0.0625 > 0.05, do not reject Ho: the median is not significantly larger than 3. abs(qnorm(sign))/sqrt(7) gives the approximate effect size.

R Sign Test
Twenty patients are given two treatments each (blindly and in randomized order) and then asked whether treatment A or B worked better. It turned out that 16 patients liked A better. If there were no difference between the two treatments, we would expect the number of people favouring treatment A to be binomially distributed with p = 0.5 and n = 20. How (im)probable would it then be to obtain what we have observed? We need the probability of the observed 16 or anything more extreme, so we use "15 or less":
> 1-pbinom(15,size=20,prob=.5)
[1] 0.005908966
If you want a two-tailed test because you have no prior idea about which treatment is better, then you have to add the probability of obtaining equally extreme results in the opposite direction:
> 1-pbinom(15,20,.5)+pbinom(4,20,.5)
[1] 0.01181793

Cumulative Binomial Distribution

SPSS Sign Test
Open rating.sav in SPSS, which contains 7 people's ratings of the new product on a 5-point Likert scale (1 = poor to 5 = excellent).
Ho: η ≤ 3 vs. Ha: η > 3
In SPSS, click Analyze  Nonparametric Tests  Legacy Dialogs  Binomial. Move "ratings" into the Test Variable List, and set Cut point = 3. Click OK.

SPSS Sign Test
Ours is a one-tailed test, so the two-tailed Sig. value has to be halved. Since p-value = 0.125/2 = 0.0625 > 0.05, do not reject Ho. The new product does not show sufficiently good market potential to justify investment.
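The same one-tailed sign test can be reproduced in Python. A sketch using scipy's binomtest (assumed available in scipy 1.7 or later), counting the S = 6 ratings above the hypothesized median of 3:

```python
from scipy.stats import binomtest

ratings = [4, 5, 1, 4, 4, 4, 5]
n = sum(r != 3 for r in ratings)   # non-tied ratings: 7
s = sum(r > 3 for r in ratings)    # ratings above the median: 6

# One-tailed sign test: P(X >= 6) under Binomial(7, 0.5)
result = binomtest(s, n, p=0.5, alternative='greater')
print(result.pvalue)  # 0.0625, matching 0.125 / 2 from the SPSS output
```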

Mann-Whitney Test (Wilcoxon Rank Sum Test)


Mann-Whitney Test
Use when:
1. You have independent random samples
2. The normality assumption of a t-test has been violated (especially if the sample size is small)
3. The homogeneity-of-variance assumption of a t-test has been violated

Mann-Whitney Test Example
Corresponds to the independent-samples t-test. You want to see whether the buying intentions for two product designs are the same. For design 1, the ratings are 71, 82, 77, 92, 88. For design 2, the ratings are 86, 82, 94, 97. Do the ratings have the same probability distributions at the 5% level?
Ho: Identical distributions vs. Ha: Distribution shifted left or right

Mann-Whitney Test Computation Table

Design 1          Design 2
Rate   Rank       Rate   Rank
71     1          86     5
82     3.5        82     3.5
77     2          94     8
92     7          97     9
88     6
Rank Sum: 19.5    Rank Sum: 25.5

(The two ratings of 82 occupy ranks 3 and 4 and are each assigned the average rank 3.5.)

Mann-Whitney Test Procedure
Corresponds to the independent-samples t-test.
1. Assign ranks, Ri, to the n1 + n2 combined sample observations (smallest value = 1); average the ranks of ties
2. Sum the ranks, Ti, for each sample; if the sample sizes are unequal, let n1 refer to the smaller sample
3. The rank sum of the smaller sample is used to test hypotheses

Mann-Whitney Test Table (Portion)
α = .05 one-tailed; α = .10 two-tailed

Mann-Whitney Test Solution
H0: Identical distributions vs. Ha: Distribution shifted left or right; α = .10; n1 = 4, n2 = 5
Test statistic: T2 = 5 + 3.5 + 8 + 9 = 25.5 (rank sum of the smaller sample)
Critical values (from the table): 13 and 27
Decision: Do not reject at α = .10 (25.5 lies between the critical values)
Conclusion: There is no evidence that the distributions differ
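As a cross-check on the hand computation, a scipy sketch (an addition to the slides); with ties present, scipy's default asymptotic method with continuity correction matches R's wilcox.test output.

```python
from scipy.stats import mannwhitneyu

design1 = [71, 82, 77, 92, 88]
design2 = [86, 82, 94, 97]

# Two-sided test; U = 4.5 is the rank sum of design1 (19.5) minus n1(n1+1)/2
u, p = mannwhitneyu(design1, design2, alternative='two-sided')
print(u, round(p, 4))  # 4.5 0.2187
```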

R Wilcoxon Rank Sum Test
Ho: The two product designs do not give rise to different buying intentions.
> ratings1 <- c(71,82,77,92,88)
> ratings2 <- c(86,82,94,97)
> mann = wilcox.test(ratings1, ratings2, paired = F)
> mann
	Wilcoxon rank sum test with continuity correction
data: ratings1 and ratings2
W = 4.5, p-value = 0.2187
alternative hypothesis: true location shift is not equal to 0
> abs(qnorm(mann$p.value))/sqrt(5 + 4) # Gives the approximate effect size.
[1] 0.2588163

SPSS Mann-Whitney Test
Read "product design.sav" into SPSS, which contains data on 9 respondents' intention to buy a product of two different designs. Under Analyze  Nonparametric Tests  Legacy Dialogs  Two-Independent-Samples Tests, move "Probability of Buying" into the Test Variable List and design into the Grouping Variable box. Click Define Groups, enter "1" for Group 1 and "2" for Group 2. Click Continue and then OK to get your output. (Mann and Whitney developed an equivalent but slightly different nonparametric procedure to Wilcoxon's rank sum test.)

SPSS Mann-Whitney Test
Since Sig. = 0.190 > 0.05, do not reject Ho. The two product designs do not give rise to different buying intentions.

Wilcoxon Signed Rank Test


Wilcoxon Signed Rank Test
1. For repeated measurements taken from the same subjects
2. Corresponds to the paired-samples t-test; tests the probability distributions of two related populations
3. n' is the number of non-zero difference scores
4. Assumptions: random samples; both populations are continuous
5. Can use the normal approximation if n' ≥ 25

Signed Rank Test Example
Is the new loyalty program better at boosting consumption (.05 level)?

Buyer      Old    New
Donna      9.98   9.88
Santosha   9.88   9.86
Sam        9.90   9.83
Tamika     9.99   9.80
Brian      9.94   9.87
Jorge      9.84   9.84

Signed Rank Test Computation Table

Buyer      Old    New    Di     |Di|   Rank   Signed Rank
Donna      9.98   9.88   0.10   0.10   4      +4
Santosha   9.88   9.86   0.02   0.02   1      +1
Sam        9.90   9.83   0.07   0.07   2.5    +2.5
Tamika     9.99   9.80   0.19   0.19   5      +5
Brian      9.94   9.87   0.07   0.07   2.5    +2.5
Jorge      9.84   9.84   0.00   (zero difference discarded)
T+ = 15, T- = 0

Signed Rank Test Procedure
1. Obtain difference scores, Di = Old - New
2. Take the absolute value of the differences, |Di|
3. Delete differences with 0 value
4. Assign ranks, Ri, starting with 1; average the ranks of ties
5. Give the ranks the same signs as Di
6. Sum the '+' ranks (T+) and the '-' ranks (T-)
7. Use T- for a one-tailed test and the smaller of T- or T+ for a two-tailed test


Wilcoxon Signed Rank Table (Portion)

Signed Rank Test Solution
H0: Identical distributions vs. Ha: Current (old) distribution shifted right; α = .05
n' = 5, not 6: the one zero difference is eliminated, so look up n' = 5 in the table, not 6
Test statistic: for this one-tailed test with the current distribution shifted right, use T-: T- = 0
Critical value: T0 = 1
Decision: Reject at α = .05 (T- = 0 falls in the rejection region)

R Signed Rank Test
> old <- c(9.98, 9.88, 9.9, 9.99, 9.94, 9.84)
> new <- c(9.88, 9.86, 9.83, 9.8, 9.87, 9.84)
> wil = wilcox.test(old, new, paired = TRUE, alternative = "greater")
> wil
	Wilcoxon signed rank test with continuity correction
data: old and new
V = 15, p-value = 0.02895
alternative hypothesis: true location shift is greater than 0
> abs(qnorm(wil$p.value))/sqrt(6 + 6) # Gives the approximate effect size.
[1] 0.5474433
R warns that it cannot compute an exact p-value with ties or with zeroes, so it falls back on the normal approximation with continuity correction. The components of the result can be accessed individually, e.g. wil$statistic (V = 15), wil$p.value (0.02895363), wil$alternative ("greater"), and wil$method ("Wilcoxon signed rank test with continuity correction").
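The same paired test can be sketched in Python with scipy (an addition to the slides). Requesting correction=True asks for the same continuity correction R applies; with a zero difference present, scipy, like R, discards it and uses the normal approximation.

```python
from scipy.stats import wilcoxon

old = [9.98, 9.88, 9.90, 9.99, 9.94, 9.84]
new = [9.88, 9.86, 9.83, 9.80, 9.87, 9.84]

# One-tailed paired test; the zero difference (Jorge) is discarded, as in R
stat, p = wilcoxon(old, new, alternative='greater', correction=True)
print(stat, round(p, 5))  # 15.0 0.02895
```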

SPSS Signed Rank Test
Read "loyalty program.sav" into SPSS. Under Analyze  Nonparametric Tests  Legacy Dialogs  Two-Related-Samples Tests, set "old consumption level" as Variable 1 and "consumption level under new loyalty program" as Variable 2. Click Options and select Descriptive. Click Continue and then OK to get your output.

SPSS Signed Rank Test
Since Sig. = 0.042 < 0.05, reject Ho. The consumption level under the new loyalty program is significantly worse than the old level.

Wilcoxon test
The procedure is to subtract the theoretical mu, rank the differences according to their numerical value, ignoring the sign, and then calculate the sum of the positive or negative ranks. Assuming only that the distribution is symmetric around mu, the test statistic corresponds to selecting each number from 1 to n with probability 1/2 and calculating the sum. The distribution of the test statistic can be calculated exactly, but this becomes computationally excessive in large samples; the distribution is then very well approximated by a normal distribution.
> wilcox.test(daily.intake, mu=7725)
The test statistic V is the sum of the positive ranks. Note that unless the sample size is 6 or above, the signed-rank test simply cannot become significant at the 5% level. The Wilcoxon tests are susceptible to the problem of ties, where several observations share the same value. In such cases, you simply use the average of the tied ranks; for example, if there are four identical values corresponding to places 6 to 9, they will all be assigned the value 7.5.

Kruskal-Wallis H-Test


Kruskal-Wallis H-Test
1. Tests the equality of more than two (p) population probability distributions
2. Used to analyze completely randomized experimental designs
3. Uses the χ² distribution with p - 1 df if at least one sample size nj > 5

Kruskal-Wallis H-Test Assumptions
1. Corresponds to ANOVA for more than two populations
2. Independent, random samples
3. At least 5 observations per sample

Kruskal-Wallis H-Test Procedure
1. Assign ranks, Ri, to the n combined observations (smallest value = 1, largest value = n; average ties)
2. Sum the ranks, Tj, for each group
3. Compute the test statistic H = [12 / (n(n+1))] Σ(Tj²/nj) - 3(n+1), where Tj² is the squared rank total of each group

Kruskal-Wallis H-Test Example
As a marketing manager, you want to see how three different price levels affect sales. You assign 15 branches, 5 per price level, to the three price levels. At the .05 level, is there a difference in sales under the three price levels?

Price1   Price2   Price3
25.40    23.40    20.00
26.31    21.80    22.20
24.10    23.50    19.75
23.74    22.75    20.60
25.10    21.60    20.40

Kruskal-Wallis H-Test Solution

Raw Data                        Ranks
Price1   Price2   Price3        Price1   Price2   Price3
25.40    23.40    20.00         14       9        2
26.31    21.80    22.20         15       6        7
24.10    23.50    19.75         12       10       1
23.74    22.75    20.60         11       8        4
25.10    21.60    20.40         13       5        3
                        Totals: 65       38       17

Kruskal-Wallis H-Test Solution
H = [12 / (15·16)] · (65²/5 + 38²/5 + 17²/5) - 3(16)
  = 0.05 · (845 + 288.8 + 57.8) - 48
  = 59.58 - 48 = 11.58

Kruskal-Wallis H-Test Solution
H0: Identical distributions vs. Ha: At least two differ; α = .05; df = p - 1 = 3 - 1 = 2
Critical value: χ² = 5.991
Test statistic: H = 11.58
Decision: Reject at α = .05 (11.58 > 5.991)
Conclusion: There is evidence the population distributions are different
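The hand computation can be verified with scipy (an addition to the slides), which computes H from the pooled ranks exactly as above:

```python
from scipy.stats import kruskal

price1 = [25.40, 26.31, 24.10, 23.74, 25.10]
price2 = [23.40, 21.80, 23.50, 22.75, 21.60]
price3 = [20.00, 22.20, 19.75, 20.60, 20.40]

# H is based on the rank sums 65, 38, 17 of the pooled ranking
h, p = kruskal(price1, price2, price3)
print(round(h, 2), round(p, 4))  # 11.58 0.0031
```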

Kruskal-Wallis H-Test
> mydata <- read.table('sales.csv', header = T, sep=',')
> krus = kruskal.test(sales ~ price, data = mydata)
> krus
	Kruskal-Wallis rank sum test
data: sales by price
Kruskal-Wallis chi-squared = 11.58, df = 2, p-value = 0.003058
> abs(qnorm(krus$p.value))/sqrt(nrow(mydata)) # Gives the approximate effect size.
[1] 0.7078519
For sample sizes of five or greater, when H0 is true the test statistic H has an approximate chi-square distribution with df = k - 1.

Multiple Comparison
> library(pgirmess) # For multiple comparisons; install.packages("pgirmess") if necessary.
> kruskalmc(mydata$sales ~ mydata$price)
Multiple comparison test after Kruskal-Wallis
p.value: 0.05
Comparisons
              obs.dif critical.dif difference
Price1-Price2     5.4     6.771197      FALSE
Price1-Price3     9.6     6.771197       TRUE
Price2-Price3     4.2     6.771197      FALSE

SPSS Kruskal-Wallis H-Test
Read "sales.sav" into SPSS. Under Analyze  Nonparametric Tests  Legacy Dialogs  K Independent Samples, move "sales" into the Test Variable List and "price" into the Grouping Variable box. Click Define Range, enter '1' as Minimum and '3' as Maximum. Click Options and select Descriptive. Click Continue and then OK to get your output.

SPSS Kruskal-Wallis H-Test
Since Sig. = 0.003 < 0.05, reject Ho. There is a significant difference in sales under the three price levels. Price1 gives rise to the highest sales.

Friedman Fr-Test


Friedman Fr-Test
1. Tests the equality of more than two (p) population probability distributions
2. Corresponds to ANOVA for more than two means
3. Used to analyze randomized block experimental designs
4. Uses the χ² distribution with p - 1 df if either p (the number of treatments) or b (the number of blocks) exceeds 5

Friedman Fr-Test Assumptions
1. The p treatments are randomly assigned to experimental units within the b blocks
2. The measurements can be ranked within the blocks
3. Continuous population probability distributions

Friedman Fr-Test Example
For dependent samples: three price levels were tested one after another in each of five branches. At the .05 level, is there a difference in sales under the three price levels?

Price1   Price2   Price3
3        5        0
23       17       15
11       5        7
8        4        2
19       11       5

Friedman Fr-Test Procedure
1. Assign ranks, Ri = 1 to p, to the p treatments within each of the b blocks (smallest value = 1, largest value = p; average ties)
2. Sum the ranks, Rj, for each treatment
3. Compute the test statistic Fr = [12 / (b·p(p+1))] ΣRj² - 3b(p+1), where Rj² is the squared rank total of each treatment

Friedman Fr-Test Solution

Raw Data                     Ranks (within each block)
Price1   Price2   Price3     Price1   Price2   Price3
3        5        0          2        3        1
23       17       15         3        2        1
11       5        7          3        1        2
8        4        2          3        2        1
19       11       5          3        2        1
                     Totals: 14       10       6

Friedman Fr-Test Solution
Fr = [12 / (5·3·4)] · (14² + 10² + 6²) - 3·5·4
   = 0.2 · 332 - 60
   = 66.4 - 60 = 6.4

Friedman Fr-Test Solution
H0: Identical distributions vs. Ha: At least two differ; α = .05; df = p - 1 = 3 - 1 = 2
Critical value: χ² = 5.991
Test statistic: Fr = 6.4
Decision: Reject at α = .05 (6.4 > 5.991)
Conclusion: There is evidence the population distributions are different
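A scipy cross-check of the hand computation (an addition to the slides), using the raw sales table printed in the example; the statistic follows directly from the rank totals 14, 10, and 6. (The R session that follows reads its own CSV file, so its numbers reflect that file's contents.)

```python
from scipy.stats import friedmanchisquare

# Each list is one price level's sales across the five branches (blocks)
price1 = [3, 23, 11, 8, 19]
price2 = [5, 17, 5, 4, 11]
price3 = [0, 15, 7, 2, 5]

fr, p = friedmanchisquare(price1, price2, price3)
print(round(fr, 1), round(p, 3))  # 6.4 0.041
```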

Friedman Fr-Test > mydata <- read.table('related sample sales.csv', header = T, sep=',') > temp = as.matrix(mydata) > fri = friedman.test(temp) > fri Friedman rank sum test data: temp Friedman chi-squared = 8.4, df = 2, p-value = 0.015 > abs(qnorm(fri$p.value))/sqrt(nrow(temp)*ncol(temp)) [1] 0.5603451

Friedman Fr-Test
> friedmanmc(temp)
Multiple comparisons between groups after Friedman test
p.value: 0.05
Comparisons
    obs.dif critical.dif difference
1-2       6     7.570429      FALSE
1-3       9     7.570429       TRUE
2-3       3     7.570429      FALSE
The critical difference is z(α/[k(k-1)]) · sqrt(N·k(k+1)/6), where k is the number of groups and N is the number of blocks.

SPSS Friedman Fr-Test
Read "related samples sales.sav" into SPSS. Under Analyze  Nonparametric Tests  Legacy Dialogs  K Related Samples, move price1 to price3 into the Test Variables box. Click Statistics and select Descriptive. Click Continue and then OK to get your output.

SPSS Friedman Fr-Test
Since Sig. = 0.015 < 0.05, reject Ho. There is a significant difference in sales under the three price levels. Price1 gives rise to the highest sales.


Nonparametric Correlation
1. Spearman’s rho: Pearson’s correlation applied to the ranked data
2. Kendall’s tau: better than Spearman’s for small samples; Kendall’s tau, τ, should be used rather than Spearman’s coefficient when you have a small data set with a large number of tied ranks

Spearman’s Rank Correlation Coefficient
1. Measures the correlation between ranks
2. Corresponds to the Pearson product-moment correlation coefficient; it is Pearson’s correlation computed on the ranked data
3. Values range from -1 to +1
4. Shortcut equation: rs = 1 - 6Σdi² / [n(n² - 1)]

Spearman’s Rank Correlation Procedure
1. Assign ranks, Ri, to the observations of each variable separately
2. Calculate the differences, di, between each pair of ranks
3. Square the differences, di²
4. Sum the squared differences, Σdi²
5. Use the shortcut approximation formula

Spearman’s Rank Correlation Example
You’re a research assistant for the FBI, investigating the relationship between a person’s attempts at deception and the % change in their pupil size. You ask subjects a series of questions, some of which they must answer dishonestly. At the .05 level, what is the rank correlation coefficient?

Subj.   Deception   Pupil
1       87          10
2       63          6
3       95          11
4       50          7
5       43          0

Spearman’s Rank Correlation Table

Subj.   Deception   Rank   Pupil   Rank   di    di²
1       87          4      10      4      0     0
2       63          3      6       2      1     1
3       95          5      11      5      0     0
4       50          2      7       3      -1    1
5       43          1      0       1      0     0
                                        Σdi² = 2

Spearman’s Rank Correlation Solution
rs = 1 - 6Σdi² / [n(n² - 1)] = 1 - 6(2) / [5(24)] = 1 - 12/120 = 0.9
There is a strong positive rank correlation between deception attempts and pupil-size change.
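The shortcut computation can be confirmed with scipy (an addition to the slides), which computes Spearman's rho as Pearson's r on the ranks:

```python
from scipy.stats import spearmanr

deception = [87, 63, 95, 50, 43]
pupil = [10, 6, 11, 7, 0]

# rho is Pearson's correlation of the two rank vectors
rho, p = spearmanr(deception, pupil)
print(round(rho, 1))  # 0.9
```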

Doing a Rank Correlation
To conduct a bivariate correlation, find the Correlate option of the Analyze menu and select Bivariate to open the dialog box. In the dialog box, select and move environ1 and environ4 into the Variables box, deselect Pearson, select Kendall’s tau-b and Spearman, and then click OK to get the output.

Correlation Output
The resulting Spearman’s rho² needs to be interpreted slightly differently: it is the proportion of variance in the ranks that the two variables share. Having said this, rho² is usually a good approximation of R² (especially in conditions of near-normal distributions). Kendall’s τ, however, is not numerically similar to either r or rho, so τ² does not tell us about the proportion of variance shared by two variables (or by the ranks of those two variables). Moreover, Kendall’s τ is 66–75% smaller than both Spearman’s rho and Pearson’s r (Strahan, 1982), so if τ is used as an effect size it should be borne in mind that it is not comparable to r and rs, and it should not be squared.

Related-Samples Nonparametric Tests: McNemar Test

                        After: Do Not Favor   After: Favor
Before: Favor                   A                  B
Before: Do Not Favor            C                  D

The McNemar test may be used with either nominal or ordinal data and is especially useful with before-after measurement of the same subjects. One can test the significance of any observed change by setting up a fourfold table of frequencies to represent the first and second sets of responses. Since A + D represents the total number of people who changed (B and C are "no change" responses), the null hypothesis is that ½(A + D) cases change in one direction and the same proportion change in the other direction. The McNemar test uses a transformation of the chi-square test: χ² = (|A - D| - 1)² / (A + D). The minus 1 in the equation is a correction for continuity, since the chi-square is a continuous distribution while the observed frequencies represent a discrete distribution. An example is provided on the next slide.

An Example of the McNemar Test

                        After: Do Not Favor   After: Favor
Before: Favor                   A = 10             B = 90
Before: Do Not Favor            C = 60             D = 40
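Plugging the example's cell counts into the continuity-corrected McNemar formula gives the result directly. A sketch (an addition to the slides), with the p-value taken from the χ² distribution with 1 df via scipy:

```python
from scipy.stats import chi2

# Cell counts from the fourfold table; only the "changers" A and D matter
A, B, C, D = 10, 90, 60, 40

# McNemar chi-square with continuity correction
stat = (abs(A - D) - 1) ** 2 / (A + D)
p = chi2.sf(stat, df=1)
print(stat)  # 16.82; the p-value is well below .001, so the change is significant
```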

Chi-square Test
The Chi-Square Test procedure tabulates a variable into categories and tests the hypothesis that the observed frequencies do not differ from their expected values. From the menus choose: Analyze  Nonparametric Tests  Chi-Square...

Chi-square Test
Select Day of the Week as the test variable. Click OK.

Chi-square Test

Average Daily Sales
Observed N   Expected N   Residual
44           84.1         -40.1
78           84.1         -6.1
84           84.1         -.1
89           84.1         4.9
90           84.1         5.9
94           84.1         9.9
110          84.1         25.9
Total: 589

Test Statistics: Chi-Square = 29.389, df = 6, Asymp. Sig. = .000
(0 cells have expected frequencies less than 5; the minimum expected cell frequency is 84.1.)

The expected value for each row is equal to the sum of the observed frequencies divided by the number of rows in the table. In this example, there were 589 observed sales per week, resulting in about 84 sales per day. The residual is equal to the observed frequency minus the expected value. The table shows that Sunday has many fewer, and Friday many more, sales than an "every day is equal" assumption would expect. The obtained chi-square statistic equals 29.389, computed by squaring the residual for each day, dividing by its expected value, and summing across all days. The term df represents degrees of freedom: in a chi-square test, df is the number of expected values that can vary before the rest are completely determined; for a one-sample chi-square test, df equals the number of rows minus 1. Asymp. Sig. is the estimated probability of obtaining a chi-square value greater than or equal to 29.389 if daily sales are the same across the week. The low significance value suggests that daily sales really do differ by day of the week.
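The output above can be reproduced from the observed counts alone. A scipy sketch (an addition to the slides); with no expected frequencies supplied, chisquare defaults to the uniform expectation of 589/7 ≈ 84.1 per day:

```python
from scipy.stats import chisquare

# Observed daily sales counts from the SPSS output (total 589)
observed = [44, 78, 84, 89, 90, 94, 110]

# Expected frequencies default to equal across the 7 days
stat, p = chisquare(observed)
print(round(stat, 3))  # 29.389; the p-value is well below .001
```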

Chi-square Test
Suppose we want to study Monday to Friday only. First, weight dow (day of week) by daily sales: under Data, choose Weight Cases, click "Weight cases by", and move Average Daily Sales into the text box.

Chi-square Test
When you restrict the range of the test to weekdays, the sales appear to be more uniform. You may be able to correct staff shortages by adopting separate weekday and weekend staff schedules.

End of Chapter