Sit in your permanent seat

Slides:



Advertisements
Similar presentations
Scatterplots and Correlation
Advertisements

Correlation and Regression. Relationships between variables Example: Suppose that you notice that the more you study for an exam, the better your score.
Simple ideas of correlation Correlation refers to a connection between two sets of data. We will also be able to quantify the strength of that relationship.
Lecture 3-2 Summarizing Relationships among variables ©
Introduction to Quantitative Data Analysis (continued) Reading on Quantitative Data Analysis: Baxter and Babbie, 2004, Chapter 12.
Math 15 Introduction to Scientific Data Analysis Lecture 5 Association Statistics & Regression Analysis University of California, Merced.
1 Chapter 7 Scatterplots, Association, and Correlation.
Research & Statistics Looking for Conclusions. Statistics Mathematics is used to organize, summarize, and interpret mathematical data 2 types of statistics.
Scatterplots and Correlation Section 3.1 Part 1 of 2 Reference Text: The Practice of Statistics, Fourth Edition. Starnes, Yates, Moore.
Relationships If we are doing a study which involves more than one variable, how can we tell if there is a relationship between two (or more) of the.
Lecture 29 Dr. MUMTAZ AHMED MTH 161: Introduction To Statistics.
Discovering Mathematics Week 9 – Unit 6 Graphs MU123 Dr. Hassan Sharafuddin.
26134 Business Statistics Week 4 Tutorial Simple Linear Regression Key concepts in this tutorial are listed below 1. Detecting.
Week 2 Normal Distributions, Scatter Plots, Regression and Random.
26134 Business Statistics Week 4 Tutorial Simple Linear Regression Key concepts in this tutorial are listed below 1. Detecting.
Welcome to Week 05 College Statistics
Chapter 3 Unusual points and cautions in regression.
Sit in your permanent seat
Scatter Plots and Correlation Coefficients
Sit in your permanent seat
Scatterplots Chapter 6.1 Notes.
CHAPTER 3 Describing Relationships
Describing Relationships
Variables Dependent variable: measures an outcome of a study
Two Quantitative Variables
Chapter 3: Describing Relationships
Correlation 10/27.

Correlations and Scatterplots
QM222 A1 Visualizing data using Excel graphs
Scatterplots A way of displaying numeric data
EDRS6208 Fundamentals of Education Research 1
Describing Relationships
Statistics for the Social Sciences
The Practice of Statistics in the Life Sciences Fourth Edition
Scatterplots and Correlation
Chapter 3: Describing Relationships
Variables Dependent variable: measures an outcome of a study
Correlation vs. Causation
Chapter 2 Looking at Data— Relationships
Chapter 3: Describing Relationships
Scatterplots and Correlation
Section 1.4 Curve Fitting with Linear Models
CHAPTER 3 Describing Relationships
CHAPTER 3 Describing Relationships
Chapter 4 - Scatterplots and Correlation
Chapter 3 Scatterplots and Correlation.
3.1: Scatterplots & Correlation
Correlation.
6.2 Determining the Four Characteristics of an Association
CHAPTER 3 Describing Relationships
CHAPTER 3 Describing Relationships
An Introduction to Correlational Research
Examining Relationships
CHAPTER 3 Describing Relationships
CHAPTER 3 Describing Relationships
Summarizing Bivariate Data
CHAPTER 3 Describing Relationships
Scatterplots Scatterplots may be the most common and most effective display for data. In a scatterplot, you can see patterns, trends, relationships, and.
CHAPTER 3 Describing Relationships
CHAPTER 3 Describing Relationships
CHAPTER 3 Describing Relationships
Describing Relationships
Association between 2 variables
Introduction to Excel 2007 Part 1: Basics and Descriptive Statistics Psych 209.
Basic Practice of Statistics - 3rd Edition
CHAPTER 3 Describing Relationships
Scatterplots, Association, and Correlation
Correlation and Prediction
Presentation transcript:

Sit in your permanent seat QM222 Class 4 Section A1 Summarizing the Relationship between Two Variables, and Why Correlation doesn’t tell us what causes what Sit in your permanent seat Open NYCUniversityAdmissionsScores.xlsx in sites.bu.edu/qm222projectcourse/Other Materials/Data and other materials used in class (then close your computer) QM222 Fall 2017 Section A1

Some to-do’s Do the Excel checklist with a TA (any QM222 TA) by the end of this week. Assignment 1 is due next Monday: For two topics (although you need only do 1 if you and I have already agreed on one) What specific question or questions will your project address?  What is the data set you plan to use?  You are going to measure a relationship between two variables. What are these 2 variables? (Note: a variable can be categorical, such as gender.)  What company, governmental body or other organization would be interested in knowing the answer to this question?  QM222 Fall 2017 Section A1

Today’s Objectives Scatter diagrams as one way to study the relationship between two variables Making scatter diagrams in Excel Correlation coefficients as a numeric way to measure the relationship between two variables. Making correlation coefficients in Excel Correlation is not causality: part 1 QM222 Fall 2017 Section A1

Scatterplots can tell us The direction (sign) of relationship between two variables (is the slope positive or negative?) The form of the relationship: linear vs. curved The strength of relationship If there are outliers QM222 Fall 2017 Section A1

Example: The Midwest seems to have the best SAT math scores Example: The Midwest seems to have the best SAT math scores. But is this because fewer high schoolers in the Midwest take the SAT? QM222 Fall 2017 Section A1

Example: The Midwest seems to have the best SAT math scores Example: The Midwest seems to have the best SAT math scores. But is this because fewer high schoolers in the Midwest take the SAT? QM222 Fall 2017 Section A1

Use a scatter plot! Each dot represents one “observation”, one data point QM222 Fall 2017 Section A1

Use a scatter plot! Each dot represents one “observation”, one data point What is an observation in this data set? A state. If I made a line, would the slope be positive or negative? Negative Would a line or a curve fit better? Probably a curve. Is the relationship strong? Hmmm…. kind of Are there outliers? Not really far out ones. QM222 Fall 2017 Section A1

Making scatter diagrams in Excel In class exercise: open NYCUniversityAdmissionsScores.xlsx (a data set on average SAT scores from each NYC high school) in on sites.bu.edu/qm222projectcourse/Other Materials/Data and other materials used in class Make a scatter diagram with the math score on the Y-Axis and the reading score on the X-axis. You need to place the two columns you want in your graph side-by-side. The variable you want on the x-axis should be on the left. Make sure the top row of each column has a descriptive label for the variable. On the Insert tab, click the picture of a scatter diagram and then click on the first scatter with only markers and with no connecting lines. What does each observation represent? Make a scatter diagram with the school’s math mean score on the Y-axis and the school’s reading score on the X-axis. QM222 Fall 2017 Section A1

Your scatter diagram from Excel… QM222 Fall 2017 Section A1

Today’s Objectives Scatter diagrams as one way to study the relationship between two variables Making scatter diagrams in Excel Correlation coefficients as a numeric way to measure the relationship between two variables. Making correlation coefficients in Excel Correlation is not causality: part 1 QM222 Fall 2017 Section A1

We’d also like a numerical measure of how closely two variables move together: the Correlation coefficient The correlation (coefficient) tells us two things: The direction of association: When X goes up, does Y go up or down? The strength of the association: How closely related are Y and X, or, how strong is the link? It doesn’t tell us if the relationship is linear or curved – In fact, it assumes that the relationship is linear. QM222 Fall 2017 Section A1

Correlation coefficient: notation r or ρ A positive correlation coefficient means: that when we see a higher value for one variable, we also tend to see a higher value for the other variable. A negative correlation coefficient means that when we see a higher value for one variable, we tend to see a lower value for the other variable. QM222 Fall 2017 Section A1

Correlation coefficient A correlation coefficient that is zero means that there is no correlation If you did a scatter of X and Y, the dots would seem to have no relationship. QM222 Fall 2017 Section A1

The correlation coefficient is between 1 & -1 Closer to |1| means a stronger association When r = 1 there is perfect positive correlation; if you did a scatter of X and Y, the dots would all lie exactly on an upward sloping line. When r = -1 there is perfect negative correlation; if you did a scatter of X and Y, the dots would all lie exactly on a downward sloping line. When r = 0 there is no correlation; if you did a scatter of X and Y, the dots would seem to have no relationship with each other. If you were to fit a line to the dots, it would be flat (since Y doesn’t change as X changes). QM222 Fall 2017 Section A1

Interpreting the values of correlation Measured correlations are almost never exactly 0, 1, or –1 A claim that two variables are uncorrelated typically means that the correlation is “near” 0 No absolute standard for what is a strong correlation, what is a weak correlation, and what is no correlation QM222 Fall 2017 Section A1

How do you think the correlation coefficients compare in Figure A and Figure B below? QM222 Fall 2017 Section A1

How do you think the correlation coefficients compare in Figure A and Figure B below? Both are positive. Figure B fits more tightly around the line – its correlation coefficient is closer to 1. The fact that one is steeper doesn’t affect the correlation. QM222 Fall 2017 Section A1

Correlation in Excel To get the correlation (between 2 variables in Excel, =CORREL(range X, range Y) In-Class exercise: with NYCUniversityAdmissionsScores.xlsx again: 1. Get the correlation between schools’ math and reading mean scores. 2. Get the correlation between the number of test takers and the reading mean scores. QM222 Fall 2017 Section A1

Correlation in Excel To get the correlation between all pairs of a bunch of variables in Excel, choose Excel Data→In Data Analysis → Correlation. Highlight all columns Do it with NYCUniversityAdmissionsScores.xlsx   Mathematics Mean Reading Mean Number of Test Takers 1 0.88312 0.07117 -0.00327 Understanding this: Why are there 1’s along the diagonal? Why are there no numbers above the diagonal? QM222 Fall 2017 Section A1

Correlation v. relation The correlation coefficient measures the strength of linear relationship. A low value is not enough to conclude a lack of a strong link between the two variables. This picture has a near zero correlation … The two variables are very related, but it’s not a line with a single slope, but. QM222 Fall 2017 Section A1

Today’s Objectives Scatter diagrams as one way to study the relationship between two variables Making scatter diagrams in Excel Correlation coefficients as a numeric way to measure the relationship between two variables. Making correlation coefficients in Excel Correlation is not causality: part 1 QM222 Fall 2017 Section A1

Correlation does not mean Causation (i.e. one thing causes another) https://www.youtube.com/watch?v=8B271L3NtAw QM222 Fall 2017 Section A1

Why correlation does not imply causation Possible explanations for correlation between x and y: X causes Y a change in X will change Y. Y causes X a change in Y will change X X causes Y AND Y causes X this is known as simultaneity Another variable(s) cause both X and Y this is called a confounding factor QM222 Fall 2017 Section A1

Let’s go through the examples in the video… which is it?: A. X causes Y B. Y causes X C. X causes Y AND Y causes X (simultaneity) D. Another variable(s) cause both X and Y (confounding factor) Ice cream (X) causes drownings (Y). Married men live longer than single men. Infants who sleep with the lights on tend to grow up short-sighted. Self esteem causes good grades. QM222 Fall 2017 Section A1