Correlation  We can often see the strength of the relationship between two quantitative variables in a scatterplot, but be careful. The two figures here.

Correlation  We can often see the strength of the relationship between two quantitative variables in a scatterplot, but be careful. The two figures here are both scatterplots of the same data, on different scales. The second seems to be a stronger association…  So we need a measure of association independent of the graphics…

Use the correlation coefficient, r
• The correlation coefficient is a measure of the direction and strength of a linear relationship.
• It is calculated using the mean and the standard deviation of both the x and y variables.
• Correlation can only be used to describe quantitative variables. Categorical variables don't have means and standard deviations.

The correlation coefficient r
Time to swim: $\bar{x} = 35$, $s_x = 0.7$
Pulse rate: $\bar{y} = 140$, $s_y = 9.5$
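For reference, the standard textbook definition of the sample correlation is sketched below. It assumes $n$ paired observations $(x_i, y_i)$, with $\bar{x}$, $\bar{y}$ the two means and $s_x$, $s_y$ the two sample standard deviations; the index $i$ and the count $n$ are notation introduced here, not taken from the slide.

```latex
r = \frac{1}{n-1} \sum_{i=1}^{n}
    \left( \frac{x_i - \bar{x}}{s_x} \right)
    \left( \frac{y_i - \bar{y}}{s_y} \right)
```

Each factor in parentheses is a standardized score (z-score), so r is essentially an average of products of z-scores.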

Part of the calculation involves finding z, the standardized score, similar to the one we used when working with the normal distribution.
Standardization allows us to compare correlations between data sets where the variables are measured in different units or where the variables themselves are different. For instance, we might want to compare the correlation between [swim time and pulse] with the correlation between [swim time and breathing rate].
You DON'T want to do this by hand. Make sure you learn how to use your calculator or the computer to find r.
[Figure: scatterplot with axes "z for time" and "z for pulse"]
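As a rough illustration of letting the computer do this, the sketch below standardizes each variable and then averages the products of the z-scores. The helper names `z_scores` and `correlation` and the data values are made up for illustration; they are not the data set behind the slides.

```python
from statistics import mean, stdev

def z_scores(values):
    """Standardize: subtract the mean, divide by the sample standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def correlation(x, y):
    """Sample correlation r: sum of z-score products divided by n - 1."""
    zx, zy = z_scores(x), z_scores(y)
    return sum(a * b for a, b in zip(zx, zy)) / (len(x) - 1)

# Hypothetical swim times and pulse rates, invented for this example.
swim_time = [34.1, 34.6, 35.0, 35.3, 35.9, 36.2]
pulse = [152, 148, 139, 142, 131, 128]

r = correlation(swim_time, pulse)
print(round(r, 3))  # negative: slower swims go with lower pulse in this toy data
```

In practice a calculator or statistics package gives the same number directly; the point here is only to show where the z-scores enter.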

r does not distinguish between x and y
The correlation coefficient, r, treats x and y symmetrically. "Time to swim" is the explanatory variable here and belongs on the x axis. However, r is the same in either plot (r = -0.75).

r has no unit of measure (unlike x and y)
Changing the units of measure of the variables does not change the correlation coefficient r, because we "standardize out" the units when getting z-scores. The z-score plot is the same for both plots.
[Figure: scatterplot with axes "z for time" and "z for pulse"]
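Both properties, symmetry in x and y and indifference to the units of measure, can be checked numerically. The sketch below uses made-up data and a hypothetical helper `correlation`; multiplying the times by 60 stands in for any change of unit.

```python
from statistics import mean, stdev

def correlation(x, y):
    """Sample correlation r computed from z-scores."""
    mx, my, sx, sy = mean(x), mean(y), stdev(x), stdev(y)
    products = ((a - mx) / sx * (b - my) / sy for a, b in zip(x, y))
    return sum(products) / (len(x) - 1)

# Hypothetical data, invented for this example.
swim_time = [34.1, 34.6, 35.0, 35.3, 35.9, 36.2]
pulse = [152, 148, 139, 142, 131, 128]

r_xy = correlation(swim_time, pulse)   # time on the x axis
r_yx = correlation(pulse, swim_time)   # axes swapped

# The same measurements expressed in a unit 60 times smaller.
swim_time_rescaled = [60 * t for t in swim_time]
r_rescaled = correlation(swim_time_rescaled, pulse)

print(r_xy, r_yx, r_rescaled)  # all three agree up to rounding error
```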

r ranges from -1 to +1
r quantifies the strength and direction of a linear relationship between two quantitative variables.
Strength: how closely the points follow a straight line.
Direction: r is positive when individuals with higher x values tend to have higher values of y, and negative when they tend to have lower values of y.
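The two ends of the range can be seen with points that lie exactly on a line. This is a small sketch with invented data and a hypothetical helper `correlation`.

```python
from statistics import mean, stdev

def correlation(x, y):
    """Sample correlation r computed from z-scores."""
    mx, my, sx, sy = mean(x), mean(y), stdev(x), stdev(y)
    products = ((a - mx) / sx * (b - my) / sy for a, b in zip(x, y))
    return sum(products) / (len(x) - 1)

x = [1, 2, 3, 4, 5]
r_up = correlation(x, [2 * v + 1 for v in x])     # points exactly on a rising line
r_down = correlation(x, [10 - 3 * v for v in x])  # points exactly on a falling line

print(r_up, r_down)  # +1 and -1, up to floating-point rounding
```

Any scatter away from the line pulls r toward 0, but it can never push r outside the interval from -1 to +1.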

When the variability (scatter) around the straight-line pattern decreases in one or both variables, the correlation gets stronger (closer to +1 or -1).
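A minimal sketch of this effect, assuming the slide is about scatter around a linear trend: the same x values are paired with y values that wobble around the line y = x by a fixed up-and-down pattern, and the size of the wobble is changed. The data, the wobble pattern and the helper name `correlation` are all made up for illustration.

```python
from statistics import mean, stdev

def correlation(x, y):
    """Sample correlation r computed from z-scores."""
    mx, my, sx, sy = mean(x), mean(y), stdev(x), stdev(y)
    products = ((a - mx) / sx * (b - my) / sy for a, b in zip(x, y))
    return sum(products) / (len(x) - 1)

x = list(range(10))
wobble = [1, -1] * 5  # fixed alternating deviations from the line y = x

def r_for_scatter(size):
    """Correlation when the deviations from the line are scaled by `size`."""
    y = [xi + size * w for xi, w in zip(x, wobble)]
    return correlation(x, y)

r_small = r_for_scatter(0.5)   # points hug the line
r_medium = r_for_scatter(2.0)
r_large = r_for_scatter(5.0)   # points are widely scattered

print(r_small, r_medium, r_large)  # r moves toward 0 as the scatter grows
```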

Correlation coefficient r describes linear relationships
No matter how strong the association, r should not be used to describe non-linear relationships; we have other methods for those.
Note: You can sometimes transform a non-linear association to a linear form, for instance by taking the logarithm. You can then calculate a correlation using the transformed data.
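The logarithm trick can be sketched with data that follow an exact exponential curve. Here y is determined perfectly by x, yet r on the raw data is below 1 because the pattern is curved; after taking logs the points fall on a straight line. The data and the helper name `correlation` are invented for this example.

```python
import math
from statistics import mean, stdev

def correlation(x, y):
    """Sample correlation r computed from z-scores."""
    mx, my, sx, sy = mean(x), mean(y), stdev(x), stdev(y)
    products = ((a - mx) / sx * (b - my) / sy for a, b in zip(x, y))
    return sum(products) / (len(x) - 1)

x = list(range(1, 11))
y = [math.exp(0.5 * v) for v in x]   # exact exponential growth, no noise

r_raw = correlation(x, y)                         # curved pattern
r_log = correlation(x, [math.log(v) for v in y])  # straight line after the log

print(r_raw, r_log)
```

Whether a transformation is appropriate depends on the data; the logarithm is only one common choice.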

Influential points
Correlations are calculated using means and standard deviations, and thus are NOT resistant to outliers. Try the Statistical Applet under Resources in the eBook on the Stats Portal.
Just moving one point away from the general trend here decreases the correlation from [value missing] to -0.75.

In this example, adding two outliers decreases r from 0.95 to [value missing].
Go to the Stats Portal and, under Resources, try the Statistical Applets and choose the Correlation and Regression one. Put some points in the scatterplot, watch the value of r, and see what happens when you put in an outlier or two.
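If the applet is not at hand, the same experiment can be run in a few lines of code. This sketch uses invented data (not the example from the slide) and a hypothetical helper `correlation`: ten points close to a line, then the same points plus one outlier far from the trend.

```python
from statistics import mean, stdev

def correlation(x, y):
    """Sample correlation r computed from z-scores."""
    mx, my, sx, sy = mean(x), mean(y), stdev(x), stdev(y)
    products = ((a - mx) / sx * (b - my) / sy for a, b in zip(x, y))
    return sum(products) / (len(x) - 1)

# Hypothetical points lying close to a rising straight line.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1, 18.0, 19.9]
r_clean = correlation(x, y)

# Add a single outlier: a large x paired with a very small y.
r_with_outlier = correlation(x + [12], y + [1.0])

print(r_clean, r_with_outlier)  # one point is enough to weaken r substantially
```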

Homework:
Read section 2.2, paying careful attention to the properties of the correlation coefficient, r.
To explore how extreme outlying observations influence r, play around with the Statistical Applet on Correlation and Regression under Resources in the eBook on the Stats Portal.
Then, using the computer to draw the scatterplots and do the computations as needed, do problems # , 2.47, 2.53, 2.55, 2.56, 2.60.