Correlation

We can often see the strength of the relationship between two quantitative variables in a scatterplot, but be careful. The two figures here both show the same data, drawn on different scales. The second seems to show a stronger association…
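The effect of axis scale can be reproduced with a short R sketch. The data below are simulated for illustration (they are not the data behind the slide's figures), and the variable names and axis limits are assumptions. Both panels plot the same points; only the axis limits differ, and the correlation is a single number that does not depend on how the plot is drawn.

```r
# Hypothetical simulated data, not the data from the slide's figures
set.seed(1)
height <- rnorm(50, mean = 68, sd = 3)              # inches
weight <- 5 * height - 190 + rnorm(50, sd = 15)     # pounds

par(mfrow = c(1, 2))                                # two panels side by side
plot(height, weight, main = "Default axes")
plot(height, weight, xlim = c(40, 100), ylim = c(0, 400),
     main = "Wide axes")                            # same points, more white space

r <- cor(height, weight)                            # computed from the data, not from the picture
r
```

In the wide-axis panel the points are squeezed into a small cloud, which tends to look like a tighter relationship even though r has not changed.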

Here's a formula for Pearson's correlation coefficient:

$$r = \frac{1}{n-1} \sum_{i=1}^{n} \left(\frac{x_i - \bar{x}}{s_x}\right)\left(\frac{y_i - \bar{y}}{s_y}\right)$$

where $\bar{x}$ and $\bar{y}$ are the sample means, $s_x$ and $s_y$ are the sample standard deviations, and $n$ is the number of observations.

This formula is not for computing r but for understanding r. Notice that the first step in this formula is to standardize each x and y value (that is, to find how many standard deviations above or below its mean each value lies) and then multiply the two standardized values together. When two variables x and y are positively associated, their standardized values tend to be both positive or both negative (think of height and weight), so the product is positive. When two variables are negatively associated, an x above its mean tends to go with a y below its mean (and vice versa), so the product is negative.
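The "standardize, multiply, average" reading of r can be checked numerically. The sketch below uses a small made-up data set (the heights and weights are hypothetical) and compares the hand-computed value with R's built-in cor().

```r
x <- c(61, 64, 66, 68, 70, 73)          # hypothetical heights (inches)
y <- c(110, 125, 140, 150, 165, 190)    # hypothetical weights (pounds)
n <- length(x)

zx <- (x - mean(x)) / sd(x)             # standardized x values
zy <- (y - mean(y)) / sd(y)             # standardized y values

products  <- zx * zy                    # positive when x and y are on the same side of their means
r_formula <- sum(products) / (n - 1)    # sum of the products divided by n - 1
r_builtin <- cor(x, y)                  # Pearson correlation from R

all.equal(r_formula, r_builtin)         # the two should agree
```

Printing `products` shows which observations contribute most to r: points far from both means, on the same side, contribute large positive terms.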

The correlation coefficient, r, is a numerical measure of the strength and direction of the linear relationship between two quantitative variables. It is always a number between -1 and +1.
- Positive r indicates a positive association.
- Negative r indicates a negative association.
- r = +1 implies a perfect positive relationship: the points fall exactly on a straight line with positive slope.
- r = -1 implies a perfect negative relationship: the points fall exactly on a straight line with negative slope.
- r ≈ 0 implies a very weak linear relationship.
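A quick way to see the extreme values of r is to feed cor() points that lie exactly on a line, and then two unrelated samples. This is only an illustrative sketch; the particular lines and the simulated noise are arbitrary choices.

```r
x <- 1:10

r_pos <- cor(x, 2 * x + 1)      # points exactly on a line with positive slope
r_neg <- cor(x, -3 * x + 5)     # points exactly on a line with negative slope

set.seed(2)
r_none <- cor(rnorm(1000), rnorm(1000))   # two independent samples: r should be close to 0

c(r_pos = r_pos, r_neg = r_neg, r_none = r_none)
```

Note that the slope's size (2 versus -3) does not matter for r; only whether the points fall on the line, and the sign of the slope.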

- Correlation makes no distinction between explanatory and response variables; it doesn't matter which is which.
- Both variables must be quantitative.
- r uses standardized values of the observations, so changing the scale (units) of one or both variables doesn't affect the value of r.
- r measures the strength of the linear relationship between the two variables. It does not measure the strength of non-linear (curvilinear) relationships, no matter how strong they are.
- r is not resistant to outliers, so be careful about using r in the presence of outliers on either variable.
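Each of these properties can be checked in a few lines of R. The sketch below uses simulated data and an artificially planted outlier; all values are hypothetical and chosen only to make the effects visible.

```r
set.seed(3)
x <- rnorm(40, mean = 68, sd = 3)             # hypothetical heights (inches)
y <- 5 * x - 190 + rnorm(40, sd = 10)         # hypothetical weights (pounds)

# 1. No distinction between explanatory and response: swapping x and y changes nothing
symmetric <- all.equal(cor(x, y), cor(y, x))

# 2. Changing units (inches -> cm, pounds -> kg) leaves r unchanged
unit_free <- all.equal(cor(x, y), cor(2.54 * x, y / 2.2046))

# 3. A perfect curved relationship can still give r = 0
u <- -5:5
r_curve <- cor(u, u^2)                        # y is completely determined by x, yet no linear trend

# 4. A single outlier can change r drastically
r_clean   <- cor(x, y)
r_outlier <- cor(c(x, 100), c(y, 50))         # one planted point far from the rest

c(r_curve = r_curve, r_clean = r_clean, r_outlier = r_outlier)
```

The unit changes in step 2 multiply by positive constants; multiplying one variable by a negative constant would flip the sign of r but not its size.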

To explore how extreme outlying observations influence r, see the applet on Correlation and Regression at whfreeman.com/ips6e.

Homework: Reading 2.1. Use R to make a scatterplot, use different plotting characters for a "lurking variable", compute the correlation coefficient, compute the slope and intercept of the regression line, and plot the regression line on the scatterplot (see next page for some code to do all this…).

HW: On page 16 of Reading & Problems 2.1, do problems #4.3, 4.7, and 4.9 using R. Also, look at the UN data on GDP and CO2 emissions: plot, correlate, regress… DESCRIBE/EXPLAIN WHAT YOU FIND!

plot(x, y)                       # gives a scatterplot of y (vertical) on x (horizontal)
# To add a different plotting character, use the pch= option, as in
plot(x, y, pch=15)               # (or try different numbers)
# or
plot(x, y, pch="x")
# or
plot(x, y, pch=as.numeric(sex))
plot(x, y, pch=15, cex=1.5)      # cex=1.5 makes the plotting characters 1.5 times as big as the default characters
cor(x, y)                        # gives the Pearson correlation coefficient, denoted by r, between x and y
lm(y ~ x)                        # gives the least squares linear regression of y on x
abline(lm(y ~ x))                # draws the regression line on a scatterplot (that's already drawn)
summary(lm(y ~ x))               # shows more detail about the slope and intercept
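The commands above assume that x, y, and sex already exist. A self-contained run might look like the sketch below, which simulates its own data; the variable `sex` as the lurking variable, the sample size, and the coefficients are all hypothetical. It also checks a standard identity: the least squares slope equals r times sd(y)/sd(x).

```r
set.seed(4)
n   <- 60
sex <- factor(sample(c("F", "M"), n, replace = TRUE))   # hypothetical lurking variable
x   <- rnorm(n, mean = 68, sd = 3)
y   <- 5 * x - 190 + 10 * (sex == "M") + rnorm(n, sd = 12)

plot(x, y, pch = as.numeric(sex))       # factor levels F, M become plotting characters 1, 2
legend("topleft", legend = levels(sex), pch = seq_along(levels(sex)))

r   <- cor(x, y)                        # Pearson correlation
fit <- lm(y ~ x)                        # least squares regression of y on x
b   <- coef(fit)                        # named vector: "(Intercept)" and "x" (the slope)
abline(fit)                             # add the fitted line to the existing plot

all.equal(unname(b["x"]), r * sd(y) / sd(x))   # slope = r * sd(y) / sd(x)
```

Note that pch = as.numeric(sex) only works as intended when sex is a factor; a character vector would need to be converted with factor() first.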