Non-parametric: Analysis of Ranked Data

Non-parametric: Analysis of Ranked Data Chapter 18

GOALS Conduct the sign test for dependent samples using the binomial and standard normal distributions as the test statistics. Conduct a test of hypothesis for dependent samples using the Wilcoxon signed-rank test. Conduct and interpret the Wilcoxon rank-sum test for independent samples. Conduct and interpret the Kruskal-Wallis test for several independent samples. Compute and interpret Spearman’s coefficient of rank correlation. Conduct a test of hypothesis to determine whether the correlation among the ranks in the population is different from zero.

The Sign Test The Sign Test is based on the sign of a difference between two related observations. No assumption is necessary regarding the shape of the population of differences. The binomial distribution is the test statistic for small samples and the standard normal (z) for large samples. The test requires dependent (related) samples.

The Sign Test continued Procedure to conduct the test: Determine the sign (+ or -) of the difference between related pairs. Determine the number of usable pairs. Compare the number of positive (or negative) differences to the critical value. Here n is the number of usable pairs (without ties), X is the number of pluses or minuses, and the binomial probability is π = .5.

The Sign Test - Example The director of information systems at Samuelson Chemicals recommended that an in-plant training program be instituted for managers. The objective is to improve the knowledge of database usage in accounting, procurement, production, and so on. A sample of 15 managers was selected at random. A panel of database experts determined the general level of competence of each manager with respect to using the database. Their competence and understanding were rated as being either outstanding, excellent, good, fair, or poor. After the three-month training program, the same panel of information systems experts rated each manager again. The two ratings (before and after) are shown along with the sign of the difference. A “+” sign indicates improvement, and a “-” sign indicates that the manager’s competence using databases had declined after the training program. Did the in-plant training program effectively increase the competence of the managers using the company’s database?

Step 1: State the null and alternative hypotheses. H0: π ≤ .5 (There is no increase in competence as a result of the in-plant training program.) H1: π > .5 (There is an increase in competence as a result of the in-plant training program.) Step 2: Select a level of significance. We chose the .10 level. Step 3: Decide on the test statistic. It is the number of plus signs resulting from the experiment. Step 4: Formulate a decision rule. Reject H0 if the number of plus signs is 10 or more.

For comparison, consider how the decision rule would change if the test were two-tailed, still with α = .10. The probability of 3 or fewer successes is .029, found by .000 + .001 + .006 + .022. The probability of 11 or more successes is also .029. Adding the two probabilities gives .058. This is the closest we can come to .10 without exceeding it. Hence, the decision rule for a two-tailed test would be to reject the null hypothesis if there are 3 or fewer plus signs, or 11 or more plus signs.

Step 5: Make a decision regarding the null hypothesis. Eleven out of the 14 managers in the training course increased their database competency. The number 11 is in the rejection region, which starts at 10, so H0 is rejected. We conclude that the three-month training course was effective. It increased the database competency of the managers.
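The binomial probabilities quoted in this example can be checked with a short script. This is a minimal sketch using only the Python standard library; the helper names (`binom_pmf`, `upper_tail`, `lower_tail`) are illustrative and do not come from the slides. It assumes n = 14 usable pairs and π = .5, as stated above.

```python
from math import comb

def binom_pmf(k, n, p=0.5):
    """Probability of exactly k successes in n trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def upper_tail(c, n, p=0.5):
    """P(X >= c)."""
    return sum(binom_pmf(k, n, p) for k in range(c, n + 1))

def lower_tail(c, n, p=0.5):
    """P(X <= c)."""
    return sum(binom_pmf(k, n, p) for k in range(0, c + 1))

n = 14        # usable pairs in the training example
alpha = 0.10  # significance level chosen in Step 2

# One-tailed rule: smallest count c whose upper-tail probability is at most alpha.
critical = next(c for c in range(n + 1) if upper_tail(c, n) <= alpha)

# Tail areas behind the two-tailed rule (3 or fewer, 11 or more).
p_low = lower_tail(3, n)
p_high = upper_tail(11, n)

print("one-tailed critical count:", critical)
print("P(X <= 3):", round(p_low, 3), " P(X >= 11):", round(p_high, 3))
```

Each tail comes out at about .029. Their unrounded sum is slightly below the .058 quoted above, which is obtained by adding the two rounded values.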

Normal Approximation If the number of observations in the sample is larger than 10, the normal distribution can be used to approximate the binomial.

Normal Approximation - Example The market research department of Cola, Inc., has been given the assignment of testing a new soft drink. Two versions of the drink are considered—a rather sweet drink and a somewhat bitter one. A preference test is to be conducted consisting of a sample of 64 consumers. Each consumer will taste both the sweet cola (labeled A) and the bitter one (labeled B) and indicate a preference. Conduct a test of hypothesis to determine if there is a difference in the preference for the sweet and bitter tastes. Use the .05 significance level.
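The large-sample sign test can be sketched as below. The sketch assumes the usual normal approximation to the binomial with π = .5, that is μ = .5n and σ = .5√n, with a continuity correction of .5 toward the mean. The transcript does not report how the 64 consumers voted, so the count of 40 used here is a hypothetical value for illustration only, and `sign_test_z` is an illustrative name.

```python
from math import sqrt

def sign_test_z(x, n):
    """z statistic for the sign test using the normal approximation.

    x is the number of plus signs and n the number of usable pairs.
    The continuity correction moves x half a unit toward the mean.
    """
    mu = 0.5 * n
    sigma = 0.5 * sqrt(n)
    if x > mu:
        corrected = x - 0.5
    elif x < mu:
        corrected = x + 0.5
    else:
        corrected = x
    return (corrected - mu) / sigma

n = 64                 # sample size from the cola example
hypothetical_x = 40    # NOT from the source; illustrative count preferring cola A
z = sign_test_z(hypothetical_x, n)

# Two-tailed test at the .05 level: reject H0 when |z| exceeds 1.96.
reject = abs(z) > 1.96
print("z =", round(z, 3), "reject H0:", reject)
```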

Wilcoxon Signed-Rank Test for Dependent Samples If the assumption of normality is violated for the paired-t test, use the Wilcoxon Signed-rank test. The test requires the ordinal scale of measurement. The observations must be related or dependent.

Wilcoxon Signed-Rank Test The steps for the test are: Compute the differences between related observations. Rank the absolute differences from low to high. Return the signs to the ranks and sum positive and negative ranks. Compare the smaller of the two rank sums with the T value, obtained from Appendix B.7.
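The steps above can be sketched in a few lines of Python. This is an illustrative sketch, not the textbook's procedure verbatim: it assumes zero differences are dropped and tied absolute differences share the average of their ranks. The function names and the before/after scores are hypothetical.

```python
def average_ranks(values):
    """Rank values from low to high; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + 1 + j + 1) / 2  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def signed_rank_sums(before, after):
    """Return (R+, R-, T), where T is the smaller of the two rank sums."""
    diffs = [a - b for b, a in zip(before, after)]
    diffs = [d for d in diffs if d != 0]          # drop ties (zero differences)
    ranks = average_ranks([abs(d) for d in diffs])
    r_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    r_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return r_plus, r_minus, min(r_plus, r_minus)

# Hypothetical paired scores, for illustration only.
before = [10, 12, 9, 14, 11, 8]
after = [13, 11, 11, 14, 9, 13]
r_plus, r_minus, t_stat = signed_rank_sums(before, after)
print(r_plus, r_minus, t_stat)
```

The computed T is then compared with the tabled critical value for the number of non-zero differences.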

Wilcoxon Signed-Rank Test for Dependent Samples - Example Fricker’s is a family restaurant chain located primarily in the southeastern part of the United States. It offers a full dinner menu, but its specialty is chicken. Recently, Fricker, the owner and founder, developed a new spicy flavor for the batter in which the chicken is cooked. Before replacing the current flavor, he wants to conduct some tests to be sure that patrons will like the spicy flavor better. To begin, Bernie selects a random sample of 15 customers. Each sampled customer is given a small piece of the current chicken and asked to rate its overall taste on a scale of 1 to 20. A value near 20 indicates the participant liked the flavor, whereas a score near 0 indicates they did not like the flavor. Next, the same 15 participants are given a sample of the new chicken with the spicier flavor and again asked to rate its taste on a scale of 1 to 20. The results are reported in the table on the right. Is it reasonable to conclude that the spicy flavor is preferred? Use the .05 significance level.

Each assigned rank in column 6 is then given the same sign as the original difference, and the results are reported in column 7. For example, the second participant has a difference score of 8 and a rank of 6. This value is located in the R+ section of column 7. The R+ and R- columns are totaled. The sum of the positive ranks is 75 and the sum of the negative ranks is 30. The smaller of the two rank sums is used as the test statistic and referred to as T.

The critical values for the Wilcoxon signed-rank test are located in Appendix B.7. A portion of that table is shown below.

The value at the intersection is 25, so the critical value is 25. The value obtained from Appendix B.7 is the largest value in the rejection region, so the decision rule is to reject the null hypothesis if the smaller of the two rank sums is 25 or less. In this case the smaller rank sum is 30, so the decision is not to reject the null hypothesis. We cannot conclude there is a difference in the flavor ratings between the current and the spicy flavors.

Wilcoxon Rank-Sum Test The Wilcoxon Rank-Sum Test is used to determine if two independent samples came from the same or equal populations. No assumption about the shape of the population is required. The data must be at least ordinal scale. Each sample must contain at least eight observations.

Wilcoxon Rank-Sum Test To determine the value of the test statistic W, all data values are ranked from low to high as if they were from a single population. The sum of ranks for each of the two samples is determined.

Wilcoxon Rank-Sum Test for Independent Samples The Wilcoxon rank-sum test is based on the sum of ranks. The data are ranked as if the observations were part of a single sample. The sum of ranks for each of the two samples is determined. If the null hypothesis is true, then the ranks will be about evenly distributed between the two samples, and the sum of the ranks for the two samples will be about the same.
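The pooled ranking described above can be sketched as follows. The helper names and the two small samples are hypothetical and only show the mechanics of ranking. They are far smaller than the eight observations per sample the test calls for. Tied values are assumed to share the average of their ranks.

```python
def average_ranks(values):
    """Rank values from low to high; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + 1 + j + 1) / 2
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def rank_sums(sample_a, sample_b):
    """Rank both samples as one group and return the rank sum of each sample."""
    pooled = list(sample_a) + list(sample_b)
    ranks = average_ranks(pooled)
    w_a = sum(ranks[:len(sample_a)])
    w_b = sum(ranks[len(sample_a):])
    return w_a, w_b

# Hypothetical data, for illustration only.
sample_a = [12, 15, 20]
sample_b = [8, 12, 9]
w_a, w_b = rank_sums(sample_a, sample_b)
print(w_a, w_b)
```

The two rank sums always add up to n(n + 1)/2 for the pooled sample size n, which is a quick check on the ranking.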

Wilcoxon Rank-Sum Test for Independent Samples - Example Dan Thompson, the president of CEO Airlines, recently noted an increase in the number of no-shows for flights out of Atlanta. He is particularly interested in determining whether there are more no-shows for flights that originate from Atlanta compared with flights leaving Chicago. A sample of nine flights from Atlanta and eight from Chicago are reported in Table 18–4. At the .05 significance level, can we conclude that there are more no-shows for the flights originating in Atlanta?

Mr. Thompson believes there are more no-shows for Atlanta flights. Thus, a one-tailed test is appropriate, with the rejection region located in the upper tail. The null and alternate hypotheses are: H0: The population distribution of no-shows is the same or less for Atlanta than for Chicago. H1: The population distribution of no-shows is larger for Atlanta than for Chicago. The test statistic follows the standard normal distribution. At the .05 significance level, we find from Appendix B.1 the critical value of z is 1.65. The null hypothesis is rejected if the computed value of z is greater than 1.65.

We rank the observations from both samples as if they were a single group. The Chicago flight with only 8 no-shows had the fewest, so it is assigned a rank of 1. The Chicago flight with 9 no-shows is ranked 2, and so on.

The value of W is calculated for the Atlanta group and is found to be 96.5, which is the sum of the ranks for the no-shows for the Atlanta flights.
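The transcript stops at W, so the remaining step is sketched here. The sketch assumes the standard large-sample formula for the rank-sum statistic, z = (W - n1(n1 + n2 + 1)/2) / sqrt(n1 n2 (n1 + n2 + 1)/12), with n1 = 9 Atlanta flights and n2 = 8 Chicago flights. The function name is illustrative.

```python
from math import sqrt

def rank_sum_z(w, n1, n2):
    """z statistic for the Wilcoxon rank-sum test.

    w is the rank sum of the first sample, n1 and n2 the sample sizes.
    """
    mean_w = n1 * (n1 + n2 + 1) / 2
    sd_w = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (w - mean_w) / sd_w

z = rank_sum_z(96.5, 9, 8)   # W for Atlanta, 9 Atlanta and 8 Chicago flights
reject = z > 1.65            # upper-tail rule at the .05 level
print("z =", round(z, 2), "reject H0:", reject)
```

Under this formula z is about 1.49, which is below 1.65, so the null hypothesis would not be rejected: the sample does not show that Atlanta has more no-shows.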

Kruskal-Wallis Test: Analysis of Variance by Ranks This is used to compare three or more samples to determine if they came from equal populations. The ordinal scale of measurement is required. It is an alternative to the one-way ANOVA. The chi-square distribution is the test statistic. Each sample should have at least five observations. The sample data is ranked from low to high as if it were a single group.

Kruskal-Wallis Test: Analysis of Variance by Ranks - Example A management seminar consists of executives from manufacturing, finance, and engineering. Before scheduling the seminar sessions, the seminar leader is interested in whether the three groups are equally knowledgeable about management principles. Plans are to take samples of the executives in manufacturing, in finance, and in engineering and to administer a test to each executive. If there is no difference in the scores for the three distributions, the seminar leader will conduct just one session. However, if there is a difference in the scores, separate sessions will be given. We will use the Kruskal-Wallis test instead of ANOVA because the seminar leader is unwilling to assume that (1) the populations of management scores follow the normal distribution or (2) the population standard deviations are the same.

Kruskal-Wallis Test: Analysis of Variance by Ranks - Example Step 1: H0: The population distributions of the management scores for the populations of executives in manufacturing, finance, and engineering are the same. H1: The population distributions of the management scores for the populations of executives in manufacturing, finance, and engineering are NOT the same. Step 2: H0 is rejected if χ² is greater than 5.991. There are k - 1 = 3 - 1 = 2 degrees of freedom at the .05 significance level.

Kruskal-Wallis Test: Analysis of Variance by Ranks - Example The next step is to select random samples from the three populations. A sample of seven manufacturing, eight finance, and six engineering executives was selected. Their scores on the test are recorded below.

Kruskal-Wallis Test: Analysis of Variance by Ranks - Example Considering the scores as a single population, the engineering executive with a score of 35 is the lowest, so it is ranked 1. There are two scores of 38. To resolve this tie, each score is given a rank of 2.5, found by (2+3)/2. This process is continued for all scores. The highest score is 107, and that finance executive is given a rank of 21. The scores, the ranks, and the sum of the ranks for each of the three samples are given in the table below.

Kruskal-Wallis Test: Analysis of Variance by Ranks - Example Because the computed value of H (5.736) is less than the critical value of 5.991, the null hypothesis is not rejected. There is not enough evidence to conclude there is a difference among the executives from manufacturing, finance, and engineering with respect to their typical knowledge of management principles. From a practical standpoint, the seminar leader should consider offering only one session including executives from all areas.
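The test statistic can be sketched as below, assuming the usual form H = 12/(n(n + 1)) Σ(R_j²/n_j) - 3(n + 1) with no correction for ties. The executives' scores are not reproduced in this transcript, so the three groups here are hypothetical and the result will not match 5.736. The critical value uses the fact that a chi-square distribution with 2 degrees of freedom has upper-tail area exp(-x/2).

```python
from math import log

def average_ranks(values):
    """Rank values from low to high; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + 1 + j + 1) / 2
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def kruskal_wallis_h(groups):
    """H statistic (no tie correction) for a list of samples."""
    pooled = [x for g in groups for x in g]
    ranks = average_ranks(pooled)
    n = len(pooled)
    total, start = 0.0, 0
    for g in groups:
        r_j = sum(ranks[start:start + len(g)])   # rank sum of this sample
        total += r_j ** 2 / len(g)
        start += len(g)
    return 12 / (n * (n + 1)) * total - 3 * (n + 1)

# Hypothetical scores, for illustration only.
groups = [
    [14, 18, 22, 23, 24],
    [25, 30, 33, 35, 36],
    [40, 41, 45, 47, 50],
]
h = kruskal_wallis_h(groups)

# Closed form valid only for 2 degrees of freedom (three samples).
critical = -2 * log(0.05)
print("H =", round(h, 3), "critical =", round(critical, 3))
```

The closed-form critical value rounds to 5.991, the value used in the decision above.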

Rank-Order Correlation Spearman’s coefficient of rank correlation reports the association between two sets of ranked observations. The features are: It can range from –1.00 up to 1.00. It is similar to Pearson’s coefficient of correlation, but is based on ranked data. It is computed using the formula: $r_s = 1 - \frac{6\sum d^2}{n(n^2 - 1)}$, where d is the difference between the ranks for each pair and n is the number of paired observations.

Rank-Order Correlation - Example Lorrenger Plastics, Inc., recruits management trainees at colleges and universities throughout the United States. Each trainee is given a rating by the recruiter during the on-campus interview. This rating is an expression of future potential and may range from 0 to 15, with the higher score indicating more potential. The recent college graduate then enters an in-plant training program and is given another composite rating based on tests, opinions of group leaders, training officers, and so on. The on-campus rating and the in-plant training ratings are given in the table on the right.
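A minimal sketch of the rank correlation follows, assuming the standard formula r_s = 1 - 6Σd²/(n(n² - 1)) and no tied ratings (the simple formula is exact only without ties). The ratings table is not reproduced in this transcript, so the two lists are hypothetical values on the 0 to 15 scale, and the function names are illustrative.

```python
def simple_ranks(values):
    """Rank distinct values from low (1) to high (n); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks

def spearman_rs(x, y):
    """Spearman's coefficient of rank correlation for untied data."""
    n = len(x)
    rank_x, rank_y = simple_ranks(x), simple_ranks(y)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

# Hypothetical ratings, for illustration only.
on_campus = [8, 10, 9, 4, 12]
in_plant = [4, 6, 7, 2, 5]
rs = spearman_rs(on_campus, in_plant)
print("r_s =", round(rs, 2))
```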

Testing the Significance of rs State the null hypothesis: Rank correlation in population is 0. State the alternate hypothesis: Rank correlation in population is not 0. For a sample of 10 or more, the significance of rs is determined by computing t using the following formula: $t = r_s\sqrt{\frac{n-2}{1-r_s^2}}$. The sampling distribution of rs follows the t distribution with n - 2 degrees of freedom.
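The test statistic can be sketched as below, assuming the usual form t = r_s · sqrt((n - 2)/(1 - r_s²)). The values r_s = 0.6 and n = 12 are hypothetical, chosen only to show the arithmetic, and `spearman_t` is an illustrative name.

```python
from math import sqrt

def spearman_t(rs, n):
    """t statistic for testing whether the population rank correlation is 0."""
    return rs * sqrt((n - 2) / (1 - rs ** 2))

hypothetical_rs = 0.6   # NOT from the source
hypothetical_n = 12     # NOT from the source; must be 10 or more for this test
t_stat = spearman_t(hypothetical_rs, hypothetical_n)
df = hypothetical_n - 2
print("t =", round(t_stat, 3), "with", df, "degrees of freedom")
```

The computed t is compared with the critical value from a t table for n - 2 degrees of freedom at the chosen significance level.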

End of Chapter 18