Scatterplots, Association, and Correlation 60 min.

Slides:



Advertisements
Similar presentations
Chapter 3: Describing Relationships
Advertisements

Correlation and Linear Regression
 Objective: To look for relationships between two quantitative variables.
Scatterplots, Association, and Correlation
Correlation Data collected from students in Statistics classes included their heights (in inches) and weights (in pounds): Here we see a positive association.
Section 3.1 Scatterplots. Two-Variable Quantitative Data  Most statistical studies involve more than one variable.  We may believe that some of the.
Scatterplots, Association,and Correlation
Scatter Diagrams and Correlation
Scatterplots, Association, and Correlation Copyright © 2010, 2007, 2004 Pearson Education, Inc.
Copyright © 2010, 2007, 2004 Pearson Education, Inc. Chapter 7 Scatterplots, Association, and Correlation.
1-1 Copyright © 2015, 2010, 2007 Pearson Education, Inc. Chapter 6, Slide 1 Chapter 6 Scatterplots, Association and Correlation.
Relationships Scatterplots and correlation BPS chapter 4 © 2006 W.H. Freeman and Company.
Correlation with a Non - Linear Emphasis Day 2.  Correlation measures the strength of the linear association between 2 quantitative variables.  Before.
Copyright © 2010, 2007, 2004 Pearson Education, Inc. Chapter 7 Scatterplots, Association, and Correlation.
Copyright © 2010 Pearson Education, Inc. Unit 2: Chapter 7 Scatterplots, Association, and Correlation.
Examining Relationships
Scatterplots, Association,
Examining Relationships Prob. And Stat. 2.2 Correlation.
Scatterplots, Associations, and Correlation
 Chapter 7 Scatterplots, Association, and Correlation.
Slide 7-1 Copyright © 2004 Pearson Education, Inc.
1 Chapter 7 Scatterplots, Association, and Correlation.
4.1 Scatter Diagrams and Correlation. 2 Variables ● In many studies, we measure more than one variable for each individual ● Some examples are  Rainfall.
Copyright © 2010 Pearson Education, Inc. Slide Lauren is enrolled in a very large college calculus class. On the first exam, the class mean was a.
Notes Bivariate Data Chapters Bivariate Data Explores relationships between two quantitative variables.
Correlation.  It should come as no great surprise that there is an association between height and weight  Yes, as you would expect, taller students.
Relationships If we are doing a study which involves more than one variable, how can we tell if there is a relationship between two (or more) of the.
Chapter 7 Scatterplots, Association, and Correlation
Scatter Diagrams and Correlation Variables ● In many studies, we measure more than one variable for each individual ● Some examples are  Rainfall.
Relationships Scatterplots and correlation BPS chapter 4 © 2006 W.H. Freeman and Company.
4.2 Correlation The Correlation Coefficient r Properties of r 1.
Chapter 2 Examining Relationships.  Response variable measures outcome of a study (dependent variable)  Explanatory variable explains or influences.
Copyright © 2008 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 7 Scatterplots, Association, and Correlation.
Copyright © 2010 Pearson Education, Inc. Chapter 7 Scatterplots, Association, and Correlation.
DO NOW 1.Given the scatterplot below, what are the key elements that make up a scatterplot? 2.List as many facts / conclusions that you can draw from this.
CP Prob & Stats Unit 4 – Chapter 7 A Tale of Two Variables.
UNIT 4 Bivariate Data Scatter Plots and Regression.
Chapter 7 Scatterplots, Association, and Correlation.
Scatterplots Association and Correlation Chapter 7.
Module 11 Scatterplots, Association, and Correlation.
Correlation  We can often see the strength of the relationship between two quantitative variables in a scatterplot, but be careful. The two figures here.
Copyright © 2010 Pearson Education, Inc. Chapter 3 Scatterplots, Correlation and Least Squares Regression Slide
Copyright © 2010, 2007, 2004 Pearson Education, Inc. Chapter 7 Scatterplots, Association, and Correlation.
Chapter 5 Lesson 5.1a Summarizing Bivariate Data 5.1a: Correlation.
Copyright © 2007 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 7 Scatterplots, Association, and Correlation.
Honors Statistics Chapter 7 Scatterplots, Association, and Correlation.
Statistics 7 Scatterplots, Association, and Correlation.
Scatterplots, Association, and Correlation. Scatterplots are the best way to start observing the relationship and picturing the association between two.
+ The Practice of Statistics, 4 th edition – For AP* STARNES, YATES, MOORE Chapter 3: Describing Relationships Section 3.1 Scatterplots and Correlation.
Copyright © 2007 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Slide 7- 1.
Part II Exploring Relationships Between Variables.
Scatterplots, Association, and Correlation
Chapter 3: Describing Relationships
Chapter 7: Scatterplots, Association, and Correlation
Chapter 7 Scatterplots, Association, and Correlation
Scatterplots, Association and Correlation
Chapter 7: Scatterplots, Association, and Correlation
Chapter 7 Scatterplots, Association, and Correlation
Chapter 7 Scatterplots, Association, and Correlation
Chapter 7 Part 1 Scatterplots, Association, and Correlation
Chapter 7 Part 2 Scatterplots, Association, and Correlation
Looking at Scatterplots
Scatterplots, Association, and Correlation
Scatterplots, Association, and Correlation
Scatterplots, Association, and Correlation
Chi-square Test for Goodness of Fit (GOF)
Scatterplots, Association and Correlation
Scatterplots Scatterplots may be the most common and most effective display for data. In a scatterplot, you can see patterns, trends, relationships, and.
Correlation r Explained
Scatterplots, Association, and Correlation
Presentation transcript:

Scatterplots, Association, and Correlation 60 min

 Associations between two quantitative variables can be explored using scatterplots.  Ex: Study time v.s test scores Study time (in hours) Test score (%) Study time (in hours) Test score (%)

 Scatterplots may be the most common and most effective display for data.  Scatterplots are the best way to start observing the relationship and the ideal way to picture associations between two quantitative variables.  When looking at scatterplots, we will look for direction, form, strength, and unusual features.

 A pattern that runs from the upper left to the lower right is said to have a negative direction.  A trend running the other way has a positive direction.  The figure shows a negative direction between the year since 1970 and the and the prediction errors made by NOAA.  As the years have passed, the predictions have improved (errors have decreased).

 The example in the text shows a negative association between central pressure and maximum wind speed  As the central pressure increases, the maximum wind speed decreases.

 If there is a straight line (linear) relationship, it will appear as a cloud or swarm of points stretched out in a generally consistent, straight form.

Example  What is the coldest temperature?  ( 0 in Kelvin scale, the so-called absolute zero)  How do scientists determine that it cannot be colder than 0 K? The volume of hydrogen gas depends on its temperature: Temp V(mole)

 The relationship is straight? curved gently? Increasing then decreasing? No pattern?

 How strong is the relationship?  Note: we will quantify the amount of scatter soon.

 Look for the unexpected: Oftentimes the most interesting thing to see in a scatterplot is the thing you never thought to look for.  One example of such a surprise is an outlier standing away from the overall pattern of the scatterplot.  Clusters or subgroups should also raise questions.

 It is important to determine which of the two quantitative variables goes on the x-axis and which on the y-axis. This determination is made based on the roles played by the variables.  When the roles are clear, the explanatory or predictor variable goes on the x-axis, and the response variable goes on the y-axis. ◦ Example 1: Price versus Demand of a product ◦ Example 2: A study confirm that people who use artificial sweeteners in place of sugar tend to be heavier than people who use sugar. Does this mean that artificial sweeteners cause weight gain?

 The roles that we choose for variables are more about how we think about them rather than about the variables themselves.  Just placing a variable on the x-axis doesn’t necessarily mean that it explains or predicts anything. And the variable on the y-axis may not respond to it in any way.

 Data collected from students in Statistics classes included their heights (in inches) and weights (in pounds):  Here we see a positive association and a fairly straight form, although there seems to be a high outlier.

 How strong is the association between weight and height of Statistics students?  If we had to put a number on the strength, we would not want it to depend on the units we used.  A scatterplot of heights (in centimeters) and weights (in kilograms) doesn’t change the shape of the pattern:

 Since the units don’t matter, why not remove them altogether?  We could standardize both variables and write the coordinates of a point as (z x, z y ).  Here is a scatterplot of the standardized weights and heights:

 Some points (those in green) strengthen the impression of a positive association between height and weight.  Other points (those in red) tend to weaken the positive association.  Points with z-scores of zero (those in blue) don’t vote either way.

 The correlation coefficient (r) gives us a numerical measurement of the strength of the linear relationship between the explanatory and response variables.  For the students’ heights and weights, the correlation is  What does this mean in terms of strength? We’ll address this shortly.

 Correlation measures the strength of the linear association between the two variables. ◦ Variables can have a strong association but still have a small correlation if the association isn’t linear.  The sign of a correlation coefficient r gives the direction of the association.  Correlation coefficient r is always between -1 and +1. ◦ A correlation near 1 or -1 corresponds to a very strong linear association. ◦ A correlation near zero corresponds to a weak (or almost no) linear association.

 Correlation treats x and y symmetrically: ◦ The correlation of x with y is the same as the correlation of y with x.  Correlation has no units.  Correlation is not affected by changes in the center or scale of either variable. ◦ Correlation depends only on the z-scores, and they are unaffected by changes in center or scale.  Correlation is sensitive to outliers. A single outlying value can make a small correlation large or make a large one small.

◦ Researchers have recently found that kids who lie may have more success ahead. “Don’t tell a lie” is one of the first lessons parents try to teach their kids, but a new study finds children who lie may end up doing better as adults. A study at Toronto University of more than 1000 children ages 2 to 17 found the ability….(May 21, 2010)  Should we encourage our kids to lie?

 Whenever we have a strong correlation, it is tempting to explain it by imagining that the predictor variable has caused the response variable to help.  Scatterplots and correlation coefficients never prove causation.  A hidden variable that stands behind a relationship and determines it by simultaneously affecting the other two variables is called a lurking variable.

 Don’t confuse “correlation” with “causation.” ◦ Scatterplots and correlations never demonstrate causation. These statistical tools can only demonstrate an association between variables.  Don’t correlate categorical variables.  Be sure the association is linear. ◦ There may be a strong association between two variables that have a nonlinear association.

 We examine scatterplots for direction, form, strength, and unusual features.  Although not every relationship is linear, when the scatterplot is straight enough, the correlation coefficient is a useful numerical summary. ◦ The sign of the correlation tells us the direction of the association. ◦ The magnitude of the correlation tells us the strength of a linear association. ◦ Correlation has no units, so shifting or scaling the data, standardizing, or swapping the variables has no effect on the numerical value.

Page 186 – 193 Problem # 3, 5, 7, 11, 13, 17, 25, 29, 33, 35.