Correlation – Spearman’s. What does it do? Measures rank correlation – whether the highest value in the 1st data set corresponds to the highest in the 2nd set.


Correlation – Spearman’s

What does it do?
Measures rank correlation – whether the highest value in the 1st data set corresponds to the highest in the 2nd set, and so on.
Eg do higher nitrogen levels give greater plant growth?
Takes values between –1 and 1:
–1: perfect negative rank correlation
0: no correlation
+1: perfect positive rank correlation
Note: the data do not have to be in a perfect straight line to have perfect rank correlation – just in the same order.
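A small sketch of the note above, using made-up numbers (not data from the presentation): the points lie on a curve, not a straight line, yet both variables put the observations in the same order. The helper name `ranks_desc` is hypothetical and assumes there are no ties.

```python
def ranks_desc(values):
    """Rank 1 = largest value. Assumes no tied values (illustration only)."""
    order = sorted(values, reverse=True)
    return [order.index(v) + 1 for v in values]

# Made-up illustrative values: y rises with x, but along a curve.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]

# Identical rank lists mean perfect positive rank correlation.
same_order = ranks_desc(x) == ranks_desc(y)
print(same_order)  # True
```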

Planning to use it? Make sure that…
- You have at least 5 data pairs (more is better)
- You want to use rank correlation rather than straight-line correlation – if your data are close to a straight line, Pearson’s may be better
- You do not have too many ties

How does it work?
You assume (null hypothesis) that there is no correlation.
The test involves ranking the data (rank 1 for the highest value, rank 2 for the 2nd highest etc) and looking at the differences between ranks.
- If the two sets of ranks tend to agree (eg the highest nitrate levels being associated with the greatest plant growth) – it’s positive correlation
- If the two sets of ranks tend to disagree (eg the smallest soil salinity levels being associated with the greatest plant growth) – it’s negative correlation

Doing the test
These are the stages in doing the test:
1. Write down your hypotheses
2. Work out the ranks
3. Do the calculations to get a value for the correlation
4. Look at the tables
5. Make a decision

Hypotheses
H0: ρ = 0 (there is no correlation)
For H1, you have a choice, depending on what alternative you were looking for:
H1: ρ > 0 (positive correlation), or
H1: ρ < 0 (negative correlation), or
H1: ρ ≠ 0 (some correlation)
If you have a good scientific reason for expecting a particular kind of correlation, use one of the first two. If not, use ρ ≠ 0.

Ranks
You’ll have two sets of data, eg nitrate concentrations and mean seedling height.
You rank each set of data separately, giving 1 to the largest value, 2 to the 2nd largest, 3 to the 3rd largest etc.
Eg you’d rank all the concentration data, giving rank 1 to the largest. Then you’d rank all the mean seedling heights, giving rank 1 to the largest.
If you have any ties, you give them the average of the ranks they would have had otherwise.
Eg if two concentrations tied for 3rd place, they would otherwise have used up ranks 3 and 4. So you give them the average of 3 and 4 = 3.5.
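The ranking rule, including the averaging of ties, can be sketched in a few lines of Python. This is only an illustration: the function name `average_ranks` and the concentration values are made up, chosen so that two values tie for 3rd place as in the example above.

```python
def average_ranks(values):
    """Rank 1 = largest value; tied values share the average of the
    ranks they would otherwise have used up."""
    ordered = sorted(values, reverse=True)
    rank_of = {}
    for v in set(values):
        first = ordered.index(v) + 1          # first rank position taken by v
        count = ordered.count(v)              # how many values tie at v
        rank_of[v] = first + (count - 1) / 2  # average of first..first+count-1
    return [rank_of[v] for v in values]

# Made-up nitrate concentrations; the two 6.0 values tie for 3rd place.
concentrations = [9.1, 7.4, 6.0, 6.0, 3.2]
print(average_ranks(concentrations))  # [1.0, 2.0, 3.5, 3.5, 5.0]
```

Each data set is ranked separately, so the same function would be called once for the concentrations and once for the seedling heights.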

Calculations
Work out the difference between the ranks for each point. These are called d-values.
Eg the difference between the rank for nitrate concentration and the rank for mean seedling height.
Square all your d-values and add up the answers. This gives you Σd².
Substitute into the formula ρ = 1 - 6Σd² / (n(n² - 1)), where n is the number of samples (data pairs).
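A minimal sketch of the calculation step, assuming the formula meant here is the usual Spearman one, ρ = 1 - 6Σd² / (n(n² - 1)). The function name `spearman_from_ranks` and the two rank lists are made up for illustration and are not the presentation's data.

```python
def spearman_from_ranks(rank_x, rank_y):
    """Return (sum of squared d-values, Spearman coefficient) for two rank lists."""
    n = len(rank_x)                                   # number of data pairs
    d = [rx - ry for rx, ry in zip(rank_x, rank_y)]   # d-values
    sum_d2 = sum(di ** 2 for di in d)                 # sum of d squared
    rho = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))         # assumed standard formula
    return sum_d2, rho

# Made-up ranks for five samples.
rank_nitrate = [1, 2, 3, 4, 5]
rank_height = [2, 1, 3, 5, 4]
sum_d2, rho = spearman_from_ranks(rank_nitrate, rank_height)
print(sum_d2, round(rho, 3))  # 4 0.8
```

Identical rank lists give Σd² = 0 and a coefficient of 1; exactly reversed rank lists give –1.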

Tables
The Spearman’s correlation coefficient table is indexed by the number of pairs and by the significance level (eg 0.05 = 5%). Note the different 1-tail and 2-tail values.

Make a decision
If your value is bigger than the tables value (ignoring signs), then you can reject the null hypothesis. Otherwise you must accept it.
Make sure you choose the right tables value – it depends on whether your test is 1- or 2-tailed:
- If you are using H1: ρ > 0 or H1: ρ < 0, you are doing a 1-tailed test
- If you are using H1: ρ ≠ 0, you are doing a 2-tailed test
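The decision rule can be written as a one-line comparison. In this sketch the function name `reject_null` is hypothetical, and the critical value 0.65 is a placeholder, not a real tables value: the correct figure has to be read from a Spearman’s table for your number of pairs, significance level and 1- or 2-tailed test.

```python
def reject_null(rho, tables_value):
    """True if H0 can be rejected: the coefficient, ignoring its sign,
    is bigger than the tables value."""
    return abs(rho) > tables_value

# 0.65 is a made-up placeholder, NOT taken from a real table.
print(reject_null(-0.8, 0.65))  # True  -> reject H0
print(reject_null(0.3, 0.65))   # False -> H0 is not rejected
```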

Example: Soil Salinity & Plant Height
The data below were collected on soil salinity and plant height.
Hypotheses:
H0: ρ = 0 (no correlation)
H1: ρ ≠ 0 (some correlation)

Ranks
Columns: Soil salinity | Plant height | Rank (salinity) | Rank (height)
Two values tied for 4th place, so they are both given the rank 4.5 (the average of 4 and 5). Since those two “used up” the ranks of 4 and 5, the next value has rank 6.

Calculations
Columns: Soil salinity | Plant height | Rank (salinity) | Rank (height) | d | d²
So Σd² = 56.5

Test
We have used H1: ρ ≠ 0 – so it is a 2-tailed test.
ρ =
Tables value (5% level):
So we must accept H0 – there is no significant correlation.