Describing Relationships: Scatterplots and Correlation

Presentation transcript:

Statistical Thinking, Chapter 14
Describing Relationships: Scatterplots and Correlation

Correlation
Objective: Analyze a collection of paired data (sometimes called bivariate data).
A correlation exists between two variables when there is a relationship (or an association) between them.
We will consider only linear relationships: when graphed, the points approximate a straight-line pattern.

Scatterplot
A scatterplot is a graph in which paired (x, y) data (usually collected on the same individuals) are plotted with one variable on the horizontal (x) axis and the other variable on the vertical (y) axis.
Each individual pair (x, y) is plotted as a single point.
Example: [scatterplot figure not included in the transcript]

Examining a Scatterplot
You can describe the overall pattern of a scatterplot by its:
Form: linear or non-linear (quadratic, exponential, no correlation, etc.)
Direction: positive or negative
Strength: very strong, strong, moderately strong, weak, etc.
Look for outliers and how they affect the correlation.
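To make the remark about outliers concrete, here is a minimal sketch showing how a single added point can push the correlation up or down. The data values and the helper name `correlation` are hypothetical, chosen only for illustration; the helper uses the standard sample correlation formula.

```python
from math import sqrt
from statistics import mean

def correlation(xs, ys):
    """Sample correlation coefficient r of two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data with no linear pattern at all.
x_base = [1, 2, 3, 4, 5]
y_base = [2, 1, 3, 1, 2]
r_base = correlation(x_base, y_base)

# One point far from the cloud, lying along a rising diagonal, inflates r.
r_inflated = correlation(x_base + [20], y_base + [20])

# Hypothetical data on a perfect line.
x_line = [1, 2, 3, 4, 5]
y_line = [2, 4, 6, 8, 10]
r_line = correlation(x_line, y_line)

# One point far off the line deflates r.
r_deflated = correlation(x_line + [6], y_line + [0])

print(f"no pattern:            r = {r_base:.3f}")
print(f"no pattern + outlier:  r = {r_inflated:.3f}")
print(f"perfect line:          r = {r_line:.3f}")
print(f"perfect line + outlier: r = {r_deflated:.3f}")
```

In both cases the scatterplot would reveal the outlier immediately, which is why the plot should be examined alongside any summary number.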

Scatterplot Example
Draw a scatterplot for the data below. What is the nature of the relationship between x and y?
x: 1, 2, 3, 4, 5
y: -4, -2 (the remaining y values are missing from the transcript)
[Scatterplot figure not included in the transcript]
Answer: strong, positive, and linear.

Examining a Scatterplot
Two variables are positively correlated when high values of the variables tend to occur together and low values of the variables tend to occur together. The scatterplot slopes upward from left to right.
Two variables are negatively correlated when high values of one variable tend to occur with low values of the other, and vice versa. The scatterplot slopes downward from left to right.

Types of Correlation
[Four scatterplot panels, each with axes x and y; figures not included in the transcript]
Negative linear correlation: as x increases, y tends to decrease.
Positive linear correlation: as x increases, y tends to increase.
No correlation.
Non-linear correlation.

Examples of Relationships
[Figures not included in the transcript]

Thought Question 1
What type of association would the following pairs of variables have: positive, negative, or none?
Temperature during the summer and electricity bills
Temperature during the winter and heating costs
Number of years of education and height
Frequency of brushing and number of cavities
Number of churches and number of bars in cities
Height of husband and height of wife

Thought Question 2
Consider the two scatterplots below. How does the outlier impact the correlation for each plot? Does the outlier increase the correlation, decrease the correlation, or have no impact?
[Scatterplots not included in the transcript]

Measuring Strength & Direction of a Linear Relationship
How closely does a non-horizontal straight line fit the points of a scatterplot?
The correlation coefficient r (often referred to as just the correlation) is:
a measure of the strength of the relationship: the stronger the relationship, the larger the magnitude of r;
a measure of the direction of the relationship: a positive r indicates a positive relationship, and a negative r indicates a negative relationship.

Correlation Coefficient
[Formula image not included in the transcript]
The Greek capital letter sigma (Σ) denotes summation or addition.
The <Plot> link on this slide is to the Correlation & Regression applet found on the VCU Stat 208 website. The address is http://www.people.vcu.edu/~jemays/regression/.
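A minimal sketch of how r can be computed, assuming the standard z-score form: standardize each x and each y, multiply the paired z-scores, sum the products, and divide by n - 1. The transcript does not preserve the slide's own formula, so this form is an assumption (it is consistent with the later remark that r uses z-scores). The function name and the data are hypothetical.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Sample correlation r: sum of paired z-score products divided by n - 1."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mx, my = mean(xs), mean(ys)
    sx, sy = stdev(xs), stdev(ys)  # sample standard deviations (n - 1 divisor)
    zx = [(x - mx) / sx for x in xs]
    zy = [(y - my) / sy for y in ys]
    return sum(a * b for a, b in zip(zx, zy)) / (len(xs) - 1)

# Hypothetical data for illustration only.
x = [1, 2, 3, 4, 5]
y_up = [2, 4, 6, 8, 10]    # rises exactly with x
y_down = [10, 8, 6, 4, 2]  # falls exactly with x

r_up = correlation(x, y_up)
r_down = correlation(x, y_down)
print(f"perfect positive line: r = {r_up:.3f}")
print(f"perfect negative line: r = {r_down:.3f}")
```

Note that the sketch fails with a division by zero if either variable is constant, since a constant variable has standard deviation 0 and r is undefined in that case.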

Correlation Coefficient
The range of the correlation coefficient is -1 to 1.
If r = -1, there is a perfect negative correlation.
If r is close to 0, there is no linear correlation.
If r = 1, there is a perfect positive correlation.

Linear Correlation
[Four scatterplot panels, each with axes x and y; figures not included in the transcript]
r = -0.91: strong negative correlation
r = 0.88: strong positive correlation
r = 0.42: weak positive correlation
r = 0.07: non-linear correlation

Correlation Coefficient
Special values for r:
A perfect positive linear relationship would have r = +1.
A perfect negative linear relationship would have r = -1.
If there is no linear relationship, or if the scatterplot points are best fit by a horizontal line, then r = 0.
Note: r must be between -1 and +1, inclusive.
r > 0: as one variable changes, the other variable tends to change in the same direction.
r < 0: as one variable changes, the other variable tends to change in the opposite direction.

Examples of Correlations
Husband’s versus wife’s ages: r = 0.94
Husband’s versus wife’s heights: r = 0.36
Professional golfers’ putting success (distance of putt in feet versus percent success): r = -0.94

Correlation Coefficient
Because r uses the z-scores for the observations, it does not change when we change the units of measurement of x, y, or both.
Correlation ignores the distinction between explanatory and response variables.
r measures the strength of only linear association between variables.
A large value of r does not necessarily mean that there is a strong linear relationship between the variables; the relationship might not be linear. Always look at the scatterplot.
When r is close to 0, it does not mean that there is no relationship between the variables; it means there is no linear relationship.
Outliers can inflate or deflate correlations.
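The first two properties can be checked numerically. The sketch below uses hypothetical age and income values and a hypothetical helper `correlation` built on the standard sample correlation formula. It rescales both variables and swaps their roles, then compares the results.

```python
from math import isclose, sqrt
from statistics import mean

def correlation(xs, ys):
    """Sample correlation coefficient r of two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: age in years, income in thousands of dollars.
age_years = [25, 32, 41, 48, 56, 63]
income_thousands = [31, 40, 52, 50, 68, 61]

# Change of units: years -> months, thousands of dollars -> dollars.
age_months = [12 * a for a in age_years]
income_dollars = [1000 * i for i in income_thousands]

r_original = correlation(age_years, income_thousands)
r_rescaled = correlation(age_months, income_dollars)
r_swapped = correlation(income_thousands, age_years)

print(f"years vs thousands: r = {r_original:.4f}")
print(f"months vs dollars:  r = {r_rescaled:.4f}")
print(f"roles swapped:      r = {r_swapped:.4f}")
```

All three printed values agree, because multiplying a variable by a positive constant leaves its z-scores unchanged and the formula treats x and y symmetrically.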

Not All Relationships Are Linear
Miles per gallon versus speed
Curved relationship (r is misleading): r = -0.06
Speed chosen for each subject varies from 20 mph to 60 mph.
MPG varies from trial to trial, even at the same speed.
Statistical relationship
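A small sketch of the same idea with hypothetical data (not the MPG data from the slide): y is determined exactly by x through a curve, yet the linear correlation is zero. The helper name `correlation` is illustrative and uses the standard sample correlation formula.

```python
from math import sqrt
from statistics import mean

def correlation(xs, ys):
    """Sample correlation coefficient r of two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: y depends on x exactly, but along a parabola.
x = [-3, -2, -1, 0, 1, 2, 3]
y = [value ** 2 for value in x]

r_curve = correlation(x, y)
print(f"y = x^2 on symmetric x values: r = {r_curve:.3f}")
```

The rising half of the curve cancels the falling half, so r reports no linear association even though the relationship is perfectly predictable.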

Common Errors Involving Correlation
1. Causation: It is wrong to conclude that correlation implies causality.
2. Averages: Averages suppress individual variation and may inflate the correlation coefficient.
3. Linearity: There may be some relationship between x and y even when there is no linear correlation.
Source: page 525 of Elementary Statistics, 10th Edition
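The point about averages can be illustrated with a short sketch. The data are hypothetical: five groups, three individuals per group, with individual values scattered evenly around each group's mean. The helper name `correlation` is illustrative and uses the standard sample correlation formula.

```python
from math import sqrt
from statistics import mean

def correlation(xs, ys):
    """Sample correlation coefficient r of two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical individuals: at each x, three y values spread around 2 * x.
group_x = [1, 2, 3, 4, 5]
offsets = [-4, 0, 4]

individual_x = [x for x in group_x for _ in offsets]
individual_y = [2 * x + off for x in group_x for off in offsets]

# Replace each group of individuals by its average y value.
average_y = [mean(2 * x + off for off in offsets) for x in group_x]

r_individuals = correlation(individual_x, individual_y)
r_averages = correlation(group_x, average_y)

print(f"individual data: r = {r_individuals:.3f}")
print(f"group averages:  r = {r_averages:.3f}")
```

Averaging removes the scatter within each group, so the averaged data look far more tightly linear than the individuals they summarize.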

Correlation and Causation
The fact that two variables are strongly correlated does not in itself imply a cause-and-effect relationship between the variables.
If there is a significant correlation between two variables, you should consider the following possibilities.
Is there a direct cause-and-effect relationship between the variables? Does x cause y?

Correlation and Causation (continued)
Is there a reverse cause-and-effect relationship between the variables? Does y cause x?
Is it possible that the relationship between the variables is caused by a third variable or by a combination of several other variables?
Is it possible that the relationship between the two variables is a coincidence?

Example
A survey of the world’s nations in 2004 shows a strong positive correlation between the percentage of each country’s population using cell phones and life expectancy at birth, in years.
Does this mean that cell phones are good for your health?
No. It simply means that in countries where cell phone use is high, life expectancy tends to be high as well.
What might explain the strong correlation?
The economy could be a lurking variable. Richer countries generally have more cell phone use and better health care.
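A toy sketch of how a lurking variable can produce a strong correlation. All numbers are invented for illustration and are not the 2004 survey data. Both cell phone use and life expectancy are generated from a hypothetical wealth index plus small deviations, with no direct link between them.

```python
from math import sqrt
from statistics import mean

def correlation(xs, ys):
    """Sample correlation coefficient r of two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical wealth index for eight made-up countries (the lurking variable).
wealth = [1, 2, 3, 4, 5, 6, 7, 8]

# Invented deviations so the points do not fall exactly on a line.
phone_noise = [2, -1, 3, -2, 1, -3, 2, -1]
life_noise = [-1, 1, 0, -1, 1, 0, -1, 1]

# Each outcome depends on wealth only; neither depends on the other.
phone_use = [10 * w + e for w, e in zip(wealth, phone_noise)]
life_expectancy = [50 + 3 * w + e for w, e in zip(wealth, life_noise)]

r_phone_life = correlation(phone_use, life_expectancy)
print(f"cell phone use vs life expectancy: r = {r_phone_life:.3f}")
```

The correlation comes out strong even though, by construction, changing phone use would do nothing to life expectancy in this toy setup.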

Example
The correlation between Age and Income as measured on 100 people is r = 0.75. Explain whether or not each of these conclusions is justified.
When Age increases, Income increases as well.
The form of the relationship between Age and Income is linear.
There are no outliers in the scatterplot of Income vs. Age.
Whether we measure Age in years or months, the correlation will still be 0.75.

Example
Explain the mistakes in the statements below:
“My correlation of -0.772 between GDP and Infant Mortality Rate shows that there is almost no association between GDP and Infant Mortality Rate.”
“There was a correlation of 0.44 between GDP and Continent.”
“There was a very strong correlation of 1.22 between Life Expectancy and GDP.”

Warnings about Statistical Significance
“Statistical significance” does not imply the relationship is strong enough to be considered “practically important.”
Even weak relationships may be labeled statistically significant if the sample size is very large.
Even very strong relationships may not be labeled statistically significant if the sample size is very small.
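The slides do not name a specific test, so the following is only a sketch under an assumption: one common choice for testing whether a correlation differs from zero is the statistic t = r * sqrt((n - 2) / (1 - r^2)), compared against a t distribution with n - 2 degrees of freedom. The function name and the r and n values are hypothetical.

```python
import math

def t_statistic(r, n):
    """t statistic for testing zero linear correlation (assumed test, n - 2 df)."""
    if n < 3:
        raise ValueError("need at least 3 observations")
    if not -1 < r < 1:
        raise ValueError("r must lie strictly between -1 and 1")
    return r * math.sqrt((n - 2) / (1 - r * r))

# Hypothetical weak correlation measured on a very large sample.
t_weak_large = t_statistic(0.10, 10_000)

# Hypothetical strong correlation measured on a very small sample.
t_strong_small = t_statistic(0.60, 5)

print(f"r = 0.10, n = 10000: t = {t_weak_large:.2f}")
print(f"r = 0.60, n = 5:     t = {t_strong_small:.2f}")
```

At the two-sided 5% level the cutoff is about 1.96 for very large samples and about 3.18 with 3 degrees of freedom, so the weak correlation would be labeled significant and the strong one would not.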

Key Concepts
Strength of Linear Relationship
Direction of Linear Relationship
Correlation Coefficient
Problems with Correlations
r can only be calculated for quantitative data.