Correlation – Pearson’s

What does it do?
- Measures straight-line correlation: how close plotted points are to a straight line
- Takes values between -1 and +1
  -1: perfect negative correlation
  0: no correlation
  +1: perfect positive correlation

Planning to use it? Make sure that…
- You have continuous data (eg lengths, weights…); it isn’t valid otherwise
- You have at least 5 data pairs (more is better)
- You want to use Pearson’s rather than rank correlation: does the scatter diagram look close to a straight line?

How does it work?
- You assume (null hypothesis) there is no correlation
- The test involves calculating totals from your data and substituting into a formula. This works out how far off a straight line your points are
- The calculation can be done automatically on a spreadsheet, and on many graphic calculators

Doing the test
These are the stages in doing the test:
1. Write down your hypotheses
2. Work out the totals needed for the formula
3. Use the formula to get a value for the correlation
4. Look at the tables
5. Make a decision

Hypotheses
H0: r = 0 (there is no correlation)
For H1, you have a choice, depending on what alternative you were looking for:
H1: r > 0 (positive correlation), or
H1: r < 0 (negative correlation), or
H1: r ≠ 0 (some correlation)
If you have a good scientific reason for expecting a particular kind of correlation, use one of the first two. If not, use r ≠ 0.

Totals
Get your data in table form like this, and complete the extra columns shown:
x | y | x² | y² | xy
Total each column. This gives you Σx, Σy, Σx², Σy² and Σxy.
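The column totals described above can be sketched in a few lines of Python. The function name `column_totals` and the three data pairs are hypothetical, chosen only to illustrate the table layout; they are not the data from this presentation.

```python
def column_totals(pairs):
    """Return n and the five column totals for a list of (x, y) pairs."""
    return {
        "n": len(pairs),
        "sum_x": sum(x for x, _ in pairs),
        "sum_y": sum(y for _, y in pairs),
        "sum_x2": sum(x * x for x, _ in pairs),  # square each x, then add
        "sum_y2": sum(y * y for _, y in pairs),  # square each y, then add
        "sum_xy": sum(x * y for x, y in pairs),
    }


# Hypothetical data pairs, for illustration only
example_pairs = [(1, 2), (2, 4), (3, 5)]
totals = column_totals(example_pairs)
print(totals)
```

Note that `sum_y2` is built by squaring each value before adding, which is not the same as squaring the total of y.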

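Formula
\[ r = \frac{\sum xy - \frac{1}{n}\sum x \sum y}{\sqrt{\left(\sum x^2 - \frac{1}{n}\left(\sum x\right)^2\right)\left(\sum y^2 - \frac{1}{n}\left(\sum y\right)^2\right)}} \]
n = number of data pairs
Σx = sum of x-values, Σy = sum of y-values etc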
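As a rough sketch, the standard Pearson formula written in terms of the column totals could be coded as below. The function name `pearson_r` and the check values are assumptions made for illustration; the check uses three hypothetical points lying exactly on the line y = 2x, which should give r = 1.

```python
import math


def pearson_r(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy):
    """Pearson's r from the column totals (standard textbook formula)."""
    top = sum_xy - sum_x * sum_y / n      # top of the fraction
    part_x = sum_x2 - sum_x ** 2 / n      # x part of the bottom
    part_y = sum_y2 - sum_y ** 2 / n      # y part of the bottom
    return top / math.sqrt(part_x * part_y)


# Hypothetical points (1, 2), (2, 4), (3, 6): all on a straight line
r_line = pearson_r(n=3, sum_x=6, sum_y=12, sum_x2=14, sum_y2=56, sum_xy=28)
print(r_line)
```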

Tables
A Pearson’s correlation coefficient table gives a tables value for your number of pairs at each significance level (eg 0.05 = 5%).

Make a decision
If your value is bigger than the tables value (ignoring signs), then you can reject the null hypothesis. Otherwise you cannot reject it.
Make sure you choose the right tables value; it depends whether your test is 1- or 2-tailed:
- If you are using H1: r > 0 or H1: r < 0, you are doing a 1-tailed test (the sign of your value must also match H1)
- If you are using H1: r ≠ 0, you are doing a 2-tailed test
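The decision rule can be sketched as a small function. The name `reject_null`, the `alternative` labels, and the r values below are hypothetical; the tables values 0.811 (2-tailed) and 0.729 (1-tailed) are the usual 5% critical values for 6 pairs and are used here only as illustrative inputs. You still need to look up the value that matches your own number of pairs and tail.

```python
def reject_null(r, tables_value, alternative="two-sided"):
    """Return True if H0 (no correlation) can be rejected.

    alternative: "two-sided" (H1: r != 0), "greater" (H1: r > 0)
    or "less" (H1: r < 0). tables_value must be looked up for the
    matching number of tails.
    """
    if alternative == "greater":
        return r > tables_value
    if alternative == "less":
        return r < -tables_value
    return abs(r) > tables_value  # two-sided: ignore the sign


# Hypothetical r values, for illustration only
print(reject_null(-0.90, 0.811))             # 2-tailed test
print(reject_null(-0.90, 0.729, "greater"))  # 1-tailed, wrong direction
```

In the second call the value is large but negative, so it gives no support for H1: r > 0.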

Soil Salinity & Plant Height
The data below were collected on soil salinity and plant height.
Hypotheses:
H0: r = 0 (no correlation)
H1: r ≠ 0 (some correlation)

Totals
Soil Salinity (x) | Plant Height (y) | x² | y² | xy
Σx = 78
Σy = 265
Σx² = 1438
Σy² =
Σxy = 2582
NB: You HAVE to work out Σy² by squaring all the values and adding up. You CAN’T work out the sum of y, then square.

Formula
We now put all the totals into the formula:
\[ r = \frac{2582 - \frac{1}{6}(78)(265)}{\sqrt{\left(1438 - \frac{1}{6}(78)^2\right)\left(\sum y^2 - \frac{1}{6}(265)^2\right)}} \]

Pearson’s on the Calculator
First check if the calculator is “scientific”, that is, it automatically does multiplication before addition. Try 2 + 4 × 3. If you get 14, it does multiplication 1st. If you get 18, it doesn’t.
1. Work out the top of the fraction (-863).
   - For a scientific calculator, put it in exactly as shown ((78)(265) means 78 × 265)
   - For a non-scientific calculator, put in brackets: 2582 – (1/6 × 78 × 265)
2. Work out each part of the bottom of the fraction (the x part is 424).
   - Non-scientific calculator: 1438 – (1/6 × (78²))
3. Multiply the two parts from the bottom together.
4. Take the square root of the previous answer and keep the answer in memory.
5. Divide the top of the fraction by the previous answer.
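The intermediate values quoted in these hints can be checked with a short Python sketch. It uses only the totals given for the soil salinity example (Σx = 78, Σy = 265, Σx² = 1438, Σxy = 2582, with 6 data pairs); the variable names are hypothetical. The y part of the bottom is left out because it needs Σy².

```python
# Order-of-operations check: Python multiplies before adding,
# like a "scientific" calculator.
print(2 + 4 * 3)    # multiplication first
print((2 + 4) * 3)  # left-to-right, as a non-scientific calculator would

# Totals from the soil salinity example
n, sum_x, sum_y, sum_x2, sum_xy = 6, 78, 265, 1438, 2582

top = sum_xy - sum_x * sum_y / n    # top of the fraction
part_x = sum_x2 - sum_x ** 2 / n    # x part of the bottom
print(top, part_x)
```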

The test
We have used H1: r ≠ 0, so it is a 2-tailed test.
Tables value (5% level):
Our value:
So we can reject H0; there is some correlation.

Calculating a Best-Fit Line
- If Pearson’s is significant, then it’s valid to calculate a best-fit (regression) line
- The line has equation y = a + bx, where a and b can be calculated
- This lets you make predictions of the height of a plant given the soil salinity, by putting values of x into the equation

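Finding the Line
The line has equation y = a + bx, where
\[ b = \frac{\sum xy - \frac{1}{n}\sum x \sum y}{\sum x^2 - \frac{1}{n}\left(\sum x\right)^2}, \qquad a = \frac{\sum y - b\sum x}{n} \]
So for the soil salinity, the line is:
b = -863 / 424 ≈ -2.035
a = (265 + 2.035 × 78) / 6 ≈ 70.63
So the equation is: y = 70.63 – 2.035x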
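A minimal sketch of the least-squares line calculation, assuming the standard formulas b = Sxy / Sxx and a = mean(y) - b × mean(x) and the totals from the soil salinity example (6 data pairs). The names `best_fit_line` and `predict_height` are hypothetical helpers, not part of the original slides.

```python
def best_fit_line(n, sum_x, sum_y, sum_x2, sum_xy):
    """Return (a, b) for the least-squares line y = a + b*x."""
    sxy = sum_xy - sum_x * sum_y / n
    sxx = sum_x2 - sum_x ** 2 / n
    b = sxy / sxx                   # gradient
    a = sum_y / n - b * sum_x / n   # intercept: mean(y) - b * mean(x)
    return a, b


# Totals from the soil salinity example
a, b = best_fit_line(n=6, sum_x=78, sum_y=265, sum_x2=1438, sum_xy=2582)
print(round(a, 2), round(b, 3))


def predict_height(salinity):
    """Predicted plant height for a given soil salinity."""
    return a + b * salinity
```

Predictions are only reasonable for salinity values within the range of the data that were collected.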