Chapter 4 Scatterplots and Correlation

Presentation transcript:

Slide 1: Chapter 4 Scatterplots and Correlation

Slide 2: Section 4.1 Scatter Diagrams and Correlation. Topics: Scatterplots; Linear Correlation Coefficient.

Slide 3: Association between two variables: Size of diamond and price of ring. The source of the data is a full-page advertisement placed in the Straits Times newspaper issue of February 29, 1992, by a Singapore-based retailer of diamond jewelry. The variables are the size of the diamond in carats (1 carat = 0.2 gram) and the price of ladies' rings (single diamond stone) in Singapore dollars. [Data table: Carats, Singapore dollars, …] How would you describe the association between the two variables?

Slide 4: SCATTERPLOT: Diamond rings data. [Figure: Diamond carats vs Price in US$; x-axis: Carat; y-axis: Price in US dollars.] [Summary table, N = 48; columns: Average, s.d., Min, Max; rows: X Carat, Y Price in US $.]

Slide 5: Terminology. Response variable: measures the outcome of the study (dependent variable). Explanatory variable: explains or causes changes in the response variable (independent variable). Example: Carat = explanatory variable; Price = response variable.


Slide 7: EXAMPLE: Interpreting a Scatter Diagram. The data shown to the right are based on a study of rock drilling. The researchers wanted to determine whether the time it takes to dry drill a distance of 5 feet in rock increases with the depth at which the drilling begins. Depth, x, is the explanatory variable; time, y (in minutes), to drill five feet is the response variable. Draw a scatter diagram of the data. Source: Penner, R., and Watts, D.G. “Mining Information.” The American Statistician, Vol. 45, No. 1, Feb. 1991, p. 6.


Slide 9: Interpreting scatter plots
1. Look for the overall pattern and for striking deviations.
2. Describe the form, direction, and strength of the relationship:
   a. Form: roughly linear if the points follow a straight line, or nonlinear.
   b. Direction: positive or negative?
   c. Strength: how closely the points follow a clear form.
3. Check for the presence of outliers, individual values that fall outside the overall pattern.
4. Two variables are positively (negatively) associated if an increase in one variable corresponds to an increase (decrease) in the other variable.

Slide 10: Various Types of Relations in a Scatter Diagram

Slide 11: Example: 2000 Presidential Elections. Did the butterfly ballots confuse voters? Did voters for Al Gore instead cast their votes for other candidates? Bush spokesman Ari Fleischer stated in November that "Palm Beach County is a Pat Buchanan stronghold and that's why Pat Buchanan received 3,407 votes there." What is the level of support that Pat Buchanan enjoys in Palm Beach County? The published election results show the association between the vote totals for Pat Buchanan and the total population for Florida counties.

Slide 12: Is the association positive or negative? Is the form of the relationship almost linear? Is an outlier present?

Slide 13: Another example: The statistics of poverty and inequality. Data from U.N.E.S.C.O Demographic Year Book. For 97 countries in the world, data are given for birth rates and for an index of the Gross National Product.

Slide 14: Note: More information can be added to a graph by putting a categorical variable ON the scatter plot, either as a label on the points, as a symbol in place of the points themselves, or by the use of color (a different color for each category), as in the previous graph.

Slide 15: Linearization using Mathematical Transformations. The previous plot shows a non-linear association! Sometimes we can make it linear by applying a transformation to the variables. Possible transformations are, for example, “ln”, “exp”, and “sqrt”. Here we consider the natural log of GNP. [Figure: Birth rate vs Log G.N.P.]
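A minimal Python sketch of the linearization idea on slide 15. The GNP and birth-rate figures below are hypothetical values chosen for illustration (the slide's data are not part of this transcript), and `pearson_r` is a helper name introduced here, not something from the slides.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data (NOT the values behind the slides).
gnp = [100, 300, 1000, 3000, 10000, 30000]
birth_rate = [45, 40, 33, 27, 20, 14]

# Apply the "ln" transformation to the explanatory variable.
log_gnp = [math.log(g) for g in gnp]

r_raw = pearson_r(gnp, birth_rate)
r_log = pearson_r(log_gnp, birth_rate)
print(f"raw GNP: r = {r_raw:.3f}")
print(f"ln(GNP): r = {r_log:.3f}")
```

For data of this shape, the correlation with ln(GNP) is much closer to –1 than the correlation with raw GNP, which is what "making the association linear" looks like numerically.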

Slide 16: Measure of Linear Association. If there is a strong linear association between the variables, then the cloud of points on the scatter plot will be close to a line. [Figure: Birth rate (1,000 pop) vs Log G.N.P.]

Slide 17: The Correlation Coefficient r. The correlation coefficient r measures the direction and the strength of the linear relationship between two variables. It is a value between –1 and 1. The closer r is to 1 or –1, the stronger the linear association is. Positive values of r imply a positive association; negative values imply a negative association. Values of r close to 0 imply a weak linear association. The sample r is defined as $r = \frac{1}{n-1}\sum_{i=1}^{n}\left(\frac{x_i-\bar{x}}{s_x}\right)\left(\frac{y_i-\bar{y}}{s_y}\right)$, where n is the number of data pairs, the X data have average $\bar{x}$ and standard deviation $s_x$, and the Y data have average $\bar{y}$ and standard deviation $s_y$.
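The definition on slide 17 can be sketched directly in Python: standardize each variable, multiply the standardized values pair by pair, and divide the sum by n – 1. The function name `pearson_r` and the small data set are assumptions made for illustration only.

```python
import math

def pearson_r(xs, ys):
    """Sample correlation: sum of products of z-scores, divided by n - 1."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sample standard deviations (divisor n - 1), i.e. s_x and s_y.
    s_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    # One product of standardized values per data pair.
    products = [((x - mean_x) / s_x) * ((y - mean_y) / s_y)
                for x, y in zip(xs, ys)]
    return sum(products) / (n - 1)

# Hypothetical data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = pearson_r(xs, ys)
print(f"r = {r:.4f}")
```

The intermediate `products` list corresponds to the "product" column that a hand computation of r would tabulate.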

Slide 18: EXAMPLE: Determining the Linear Correlation Coefficient. Determine the linear correlation coefficient of the drilling data.

Slide 19: [Table with columns: $(x_i-\bar{x})/s_x$, $(y_i-\bar{y})/s_y$, product.]


Slide 21: Properties of r
• The correlation coefficient r varies between –1 and 1. r = 0 means there is no linear association between X and Y. If r = 1 or –1, then the points in a scatter plot lie on a straight line.
• Positive r indicates a positive association between X and Y. Negative r indicates a negative association between X and Y.
• Both variables X and Y must be quantitative. The correlation between X and Y is the same as the correlation between Y and X.
• r does not change if we change the units of measurement for X and Y.
• The correlation measures only the linear relationship between two variables.
• r can be strongly affected by the presence of outliers.
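Several of the properties on slide 21 can be checked numerically. The sketch below uses hypothetical data and a helper named `pearson_r` (both introduced here for illustration) to show symmetry, the values ±1 for points on a straight line, and the effect of one outlier.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

# Symmetry: swapping the roles of X and Y leaves r unchanged.
r_xy = pearson_r(xs, ys)
r_yx = pearson_r(ys, xs)

# Points exactly on a line give r = 1 (rising) or r = -1 (falling).
r_up = pearson_r(xs, [3 * x + 1 for x in xs])
r_down = pearson_r(xs, [-2 * x for x in xs])

# A single outlier can change r drastically, here even its sign.
r_outlier = pearson_r(xs + [20], ys + [0])

print(r_xy, r_yx, r_up, r_down, r_outlier)
```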

Slide 22: Example of correlation. [Figure: Birth rate (1,000 pop) vs Log G.N.P.] r = [value not recovered]. Negative association.

Slide 23: Diamond rings data. [Figure: Diamond carats vs Price in US$; x-axis: Carat; y-axis: Price in US dollars.] [Summary table, N = 48; columns: Average, s.d., Min, Max; rows: X Carat, Y Price in US $.] Strong positive association: r = [value not recovered].

Slide 24: Positive Correlation. In each plot there are 100 points. The correlation coefficient measures the amount of clustering around a line. If r is close to 1, then the points lie close to a straight line!

Slide 25: Negative Correlation. Negative correlation: as x increases, y tends to decrease. If r is close to –1, then the points lie close to a straight line!

Slide 26: Match the correlation with the plot! Match the diagrams with the following correlations: –0.93, –0.75, …

Slide 27: Change of scale. These are the low and high temperatures in Boulder (CO) for the month of April. The first scatter plot uses degrees Fahrenheit and the second plot uses degrees centigrade. Notice that °C = (5/9) × (°F – 32). Are the correlations between low and high temperatures in the two graphs different? First plot: r = 0.74. Second plot: r = ?
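The question on slide 27 can be explored with a short Python sketch. Because r is built from standardized values, converting both variables from Fahrenheit to centigrade leaves it unchanged, so the second plot also has r = 0.74. The temperatures below are hypothetical (the Boulder data are not in this transcript), and the helper names are introduced here for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def to_centigrade(deg_f):
    """Convert degrees Fahrenheit to degrees centigrade."""
    return 5 / 9 * (deg_f - 32)

# Hypothetical daily low and high temperatures in Fahrenheit.
lows_f = [30, 35, 28, 40, 45, 38]
highs_f = [55, 60, 50, 70, 72, 61]

lows_c = [to_centigrade(t) for t in lows_f]
highs_c = [to_centigrade(t) for t in highs_f]

r_f = pearson_r(lows_f, highs_f)
r_c = pearson_r(lows_c, highs_c)
print(f"Fahrenheit: r = {r_f:.4f}")
print(f"Centigrade: r = {r_c:.4f}")
```

The two printed values agree up to floating-point rounding.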

Slide 28: Different correlations? In which diagram below is the correlation coefficient the largest? The smallest?

Slide 29: Outliers and nonlinear association. How are the data sets different?

Slide 30: Plot the data: the nature of the association between x and y is very different. The correlation coefficient can be misleading in the presence of outliers or non-linear association. Check the scatter plot of the data. Perfect association! Why is r not equal to 1? Outliers change the value of r. What would the value of r be without the outliers? For each of these: r = 0.82.
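A small Python sketch of why a perfect but non-linear association does not give r = 1. The data are hypothetical points on the curve y = x², not the data sets shown on the slide, and `pearson_r` is a helper name introduced here.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# y is an exact function of x, but the relationship is a curve.
xs_sym = [-3, -2, -1, 0, 1, 2, 3]
r_sym = pearson_r(xs_sym, [x ** 2 for x in xs_sym])

# Same curve on positive x only: increasing, yet still not a line.
xs_pos = [1, 2, 3, 4, 5, 6, 7]
r_pos = pearson_r(xs_pos, [x ** 2 for x in xs_pos])

print(f"symmetric x: r = {r_sym:.4f}")
print(f"positive x:  r = {r_pos:.4f}")
```

In the symmetric case the falling and rising halves cancel, so r is 0 even though y is completely determined by x; in the second case r is high but below 1. Either way, r alone does not reveal the shape, which is why the scatter plot should be checked first.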

Slide 31: Which of the following diagrams should be summarized by r? (1) (2) (3)

Slide 32: Correlation does not mean causation!

Slide 33: Example. Ice cream sales and crime rates have a very high correlation. Does this mean that local governments should shut down all ice cream shops? Answer: There is another variable: temperature! As air temperatures rise, both ice cream sales and crime rates rise. Here, temperature is a lurking variable. Two variables can be related through a lurking variable even though there is no causal relation between them.

Slide 34: SCATTERPLOT and CORRELATION using Excel

Slide 35: To graph a Scatterplot
– Highlight the two data columns
– Use the Chart Wizard
– Choose: XY (Scatter)
– Follow the dialog window steps appropriately (label axes, etc.)

Slide 36: Computing the Correlation coefficient
• The correlation coefficient is computed using the Correlation tool in the Data Analysis ToolPak. Click on TOOLS > DATA ANALYSIS > Correlation.
• Or you can use the function: =CORREL(data range X, data range Y)
Example: If the X values are in B2:B25 and the Y values are in C2:C25, the correlation between the X data and Y data is obtained as follows: =CORREL(B2:B25, C2:C25)

Slide 37: SCATTERPLOT and CORRELATION using TI-83

Slide 38: Create the two Lists. To input data into the STAT list editor: Enter STAT edit mode by pressing [STAT] [1]. Enter the data in the L1 and L2 lists, pressing [ENTER] after each entry. Press [2nd] [MODE] to QUIT and return to the home screen. Example: L1: {7,2,4,2,5}, L2: {8,4,6,2,7}

Slide 39: Graph the ScatterPlot. Press [2nd] [Y=] to access the STAT PLOT editor. Press [ENTER] to edit Plot1. Press [ENTER] to turn ON Plot1. Scroll down and highlight the scatter plot graph type (first option in the first row). Press [ENTER] to select the scatter plot graph type. Scroll down and make sure Xlist: is set to L1 and Ylist: is set to L2. To input L1, press [2nd] [1]. To input L2, press [2nd] [2]. Press [GRAPH] to display the scatter plot. You may have to change the [WINDOW] settings to view your graph.

Slide 40: Get the Correlation Coefficient r. Turn on diagnostics with the DiagnosticOn command:
– [2nd] [0] opens the [CATALOG].
– Scroll down to DiagnosticOn and press [ENTER] twice.
Then press [STAT] and [►] to reach the CALC menu. Scroll down to 4: LinReg(ax+b) and press [ENTER] twice.
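As a cross-check on the calculator output, the example lists from slide 38 can be run through a short Python sketch. It computes r together with the slope a and intercept b of the least-squares line y = ax + b; the variable names are chosen here for illustration.

```python
import math

# Example lists from the transcript (L1 and L2 on the calculator).
l1 = [7, 2, 4, 2, 5]
l2 = [8, 4, 6, 2, 7]

n = len(l1)
mean_x = sum(l1) / n
mean_y = sum(l2) / n

# Sums of squares and cross-products about the means.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(l1, l2))
sxx = sum((x - mean_x) ** 2 for x in l1)
syy = sum((y - mean_y) ** 2 for y in l2)

r = sxy / math.sqrt(sxx * syy)  # correlation coefficient
a = sxy / sxx                   # least-squares slope
b = mean_y - a * mean_x         # least-squares intercept

print(f"a = {a:.4f}, b = {b:.4f}, r = {r:.4f}")
```

With diagnostics turned on, the LinReg(ax+b) screen should show matching values for a, b, and r, up to the calculator's display rounding.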