Chapter 2: Looking at Data - Relationships /true-fact-the-lack-of-pirates-is-causing-global-warming/

Presentation transcript:

General Procedure: 1. Plot the data. 2. Look for the overall pattern. 3. Calculate a numeric summary. 4. Answer the question (which will be defined shortly).

2.1: Relationships - Goals: Be able to define what is meant by an association between variables. Be able to categorize a variable as a response variable or an explanatory variable. Be able to identify the key characteristics of a data set.

Questions: What objects do the data describe? What variables are present and how are they measured? Are all of the variables quantitative? Are the variables associated with each other?

Association (cont.): Two variables are associated if knowing the value of one variable tells you something about the value of the other. 1. Do you want to explore the association? 2. Do you want to show causality?

Variable Types: Response variable (Y): the outcome of the study. Explanatory variable (X): explains or causes changes in the response variable.

Key Characteristics of Data: Cases: identify what they are and how many there are. Label: identify what the label variable is (if present). Categorical or quantitative: classify each variable as categorical or quantitative. Values: identify the possible values for each variable. Explanatory or response: classify each variable as explanatory or response.

2.2: Scatterplots - Goals: Be able to create a scatterplot (lab). Be able to interpret a scatterplot: pattern; outliers; form, direction, and strength of a relationship. Be able to interpret scatterplots that include categorical variables.

Scatterplot - Procedure: 1. Decide which variable is the explanatory variable and put it on the x-axis; the response variable goes on the y-axis. 2. Label and scale your axes. 3. Plot the (x, y) pairs.

Example: Scatterplot. The following data are used to determine the relationship between age and change in systolic blood pressure (BP, mm Hg) after 24 hours in response to a particular treatment. a) Draw a scatterplot of these data. (Data columns: Obs, Age, BP)

Example: Scatterplot (cont.) [Figure: scatterplot with Age on the horizontal axis]

Pattern: Form, Direction, Strength, Outliers

Pattern: Linear, Nonlinear, No relationship

Outliers

Example: Scatterplot (cont.) [Figure: scatterplot with Age on the horizontal axis]

Scatterplot with Categorical Variables

I am a Turkey, not Tukey! Thank you for not eating me!

2.3: Correlation - Goals: Be able to use (and calculate) the correlation to describe the direction and strength of a linear relationship. Be able to recognize the properties of the correlation. Be able to determine when (and when not) you can use correlation to measure the association.

Sample correlation, r (Pearson’s Sample Correlation Coefficient)
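
For reference, the usual textbook definition of the sample correlation is given here. The notation is an assumption and may differ from the slide: n is the number of (x, y) pairs, x̄ and ȳ are the sample means, and s_x and s_y are the sample standard deviations.

```latex
r = \frac{1}{n-1} \sum_{i=1}^{n}
    \left( \frac{x_i - \bar{x}}{s_x} \right)
    \left( \frac{y_i - \bar{y}}{s_y} \right)
```

Each factor is a standardized value, so r has no units and does not change when x or y is rescaled by a positive constant.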

Sum of Squares
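
A minimal sketch of how the correlation can be computed from sums of squares, assuming the common notation S_xx, S_yy, and S_xy (these names, and the small data set, are hypothetical and chosen only for illustration). With these quantities, r = S_xy / sqrt(S_xx * S_yy).

```python
from math import sqrt

# Hypothetical (x, y) pairs, invented for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]


def sums_of_squares(x, y):
    """Return (S_xx, S_yy, S_xy), the sums of squared and cross deviations."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return s_xx, s_yy, s_xy


def correlation(x, y):
    """Pearson's sample correlation, r = S_xy / sqrt(S_xx * S_yy)."""
    s_xx, s_yy, s_xy = sums_of_squares(x, y)
    return s_xy / sqrt(s_xx * s_yy)


r = correlation(x, y)
print(f"S_xx, S_yy, S_xy = {sums_of_squares(x, y)}")
print(f"r = {r:.4f}")
```

The (n - 1) factors in the standard deviations cancel, which is why this form gives the same value as the definition based on standardized values.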

Properties of Correlation: r > 0 ==> positive association; r < 0 ==> negative association. r is always a number between -1 and 1. The strength of the linear relationship increases as |r| approaches 1. |r| = 1 occurs only if there is a perfect linear relationship. r = 0 ==> x and y are uncorrelated.

Positive/Negative Correlation

Example: Positive/Negative Correlation: 1) Would the correlation between the age of a used car and its price be positive or negative? Why? 2) Would the correlation between the weight of a vehicle and its miles per gallon be positive or negative? Why?

Properties of Correlation: r > 0 ==> positive association; r < 0 ==> negative association. r is always a number between -1 and 1. The strength of the linear relationship increases as |r| approaches 1. |r| = 1 occurs only if there is a perfect linear relationship. r = 0 ==> x and y are uncorrelated.

Variety of Correlation Values

Value of r

Properties of Correlation: r > 0 ==> positive association; r < 0 ==> negative association. r is always a number between -1 and 1. The strength of the linear relationship increases as |r| approaches 1. |r| = 1 occurs only if there is a perfect linear relationship. r = 0 ==> x and y are uncorrelated.

Comments about Correlation

Cautions about Correlation: Correlation requires that both variables be quantitative. Correlation measures the strength of LINEAR relationships only. The correlation is not resistant to outliers. Correlation is not a complete summary of bivariate data.
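
The lack of resistance to outliers can be illustrated with a small sketch. The data are hypothetical: five points on a perfect line, then the same points plus one added observation far from the pattern.

```python
from math import sqrt


def correlation(x, y):
    """Pearson's sample correlation computed from deviations about the means."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return s_xy / sqrt(s_xx * s_yy)


# Hypothetical data: a perfect linear pattern.
x_clean = [1, 2, 3, 4, 5]
y_clean = [1, 2, 3, 4, 5]

# The same data with one outlying observation appended.
x_outlier = x_clean + [6]
y_outlier = y_clean + [0]

r_clean = correlation(x_clean, y_clean)
r_outlier = correlation(x_outlier, y_outlier)

print(f"r without the outlier: {r_clean:.3f}")
print(f"r with the outlier:    {r_outlier:.3f}")
```

A single point is enough to pull r from a perfect correlation down to a weak one, which is one reason to plot the data before relying on r.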

Datasets with r =

Questions about Correlation: Does a small r indicate that x and y are NOT associated? Does a large r indicate that x and y are linearly associated?
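
One way to explore both questions is with a curved relationship. This sketch uses hypothetical data in which y = x^2 exactly: over a range symmetric about zero, r is 0 even though y is completely determined by x; over a positive range, r is large even though the relationship is not linear.

```python
from math import sqrt


def correlation(x, y):
    """Pearson's sample correlation computed from deviations about the means."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return s_xy / sqrt(s_xx * s_yy)


# Case 1: y = x^2 on a range symmetric about 0 (hypothetical data).
x_sym = [-3, -2, -1, 0, 1, 2, 3]
y_sym = [xi ** 2 for xi in x_sym]
r_sym = correlation(x_sym, y_sym)

# Case 2: y = x^2 on a positive range (hypothetical data).
x_pos = list(range(1, 11))
y_pos = [xi ** 2 for xi in x_pos]
r_pos = correlation(x_pos, y_pos)

print(f"symmetric range: r = {r_sym:.3f}")
print(f"positive range:  r = {r_pos:.3f}")
```

In both cases a scatterplot shows the curve immediately, while r alone would be misleading.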

2.4: Least-Squares Regression - Goals: Be able to describe in general terms the method of least-squares regression. Be able to calculate and interpret the regression line. Using the least-squares regression line, be able to predict the value of y for any appropriate value of x. Be able to calculate r^2. Be able to explain the meaning of r^2. Be able to discern what r^2 does NOT explain.

Regression Line: A regression line is a straight line that describes how a response variable y changes as an explanatory variable x changes. We can use a regression line to predict the value of y for a given value of x.

Idea of Linear Regression

Linear Regression: ŷ = b_0 + b_1 x, where b_0 = ȳ - b_1 x̄
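
A minimal sketch of fitting the least-squares line. The intercept formula matches the slide; the slope formula b_1 = S_xy / S_xx (algebraically equal to r * s_y / s_x) is the standard one and is assumed here. The data and the helper names `least_squares` and `predict` are hypothetical.

```python
# Hypothetical data, invented for illustration (not the BP data from the slides).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]


def least_squares(x, y):
    """Return (b0, b1) for the least-squares line y_hat = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx          # slope (assumed standard formula)
    b0 = y_bar - b1 * x_bar   # intercept: b0 = y_bar - b1 * x_bar
    return b0, b1


b0, b1 = least_squares(x, y)


def predict(x_new):
    """Predicted response at x_new; only sensible within the observed x range."""
    return b0 + b1 * x_new


print(f"y_hat = {b0:.2f} + {b1:.2f} x")
print(f"prediction at x = 3.5: {predict(3.5):.2f}")
```

Because b_0 is defined through the means, the fitted line always passes through the point (x̄, ȳ).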

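Example: Regression Line [Figure: scatterplot with Age on the horizontal axis and the fitted regression line; the coefficients of the fitted equation are missing]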

Example: Regression Line. The following data are used to determine the relationship between age and change in systolic blood pressure (BP, mm Hg) after 24 hours in response to a particular treatment. x̄ = , ȳ = , s_x = , s_y = 9.688, r = ; b) What is the regression line for these data? c) What would the predicted value be for someone who is 51 years old? (Data columns: Obs, Age, BP)

Facts about Least-Squares Regression

r^2
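
For reference, r^2 in simple linear regression can be written in terms of the observed values y_i, the fitted values ŷ_i, and the mean ȳ. This is the standard identity; the notation is an assumption and may differ from the slide.

```latex
r^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}
```

The denominator is the total variation in y and the numerator is the variation left over after fitting the line, so r^2 is the fraction of the variation in y that the regression on x accounts for.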

Example: Regression Line. The following data are used to determine the relationship between age and change in systolic blood pressure (BP, mm Hg) after 24 hours in response to a particular treatment. d) What percent of the variation in Y is explained by the regression line? (Data columns: Obs, Age, BP)

Beware of interpretation of r^2: Linearity, Outliers, Good prediction

2.5: Cautions about Correlation and Regression - Goals: Be able to calculate the residuals. Be able to use a residual plot to assess the fit of a regression line. Be able to identify outliers and influential observations by looking at scatterplots and residual plots. Be able to determine when you can predict a new value. Be able to identify lurking variables that can influence the relationship between two variables. Be able to explain the difference between association and causation.

Residuals
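
A residual is the observed response minus the predicted response, y - ŷ. The following sketch computes residuals for a least-squares fit; the data and the helper name `fit_line` are hypothetical.

```python
# Hypothetical data, invented for illustration (not the BP data from the slides).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]


def fit_line(x, y):
    """Return (b0, b1) for the least-squares line y_hat = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    return b0, b1


b0, b1 = fit_line(x, y)

# Residual = observed y minus predicted y, one per observation.
fitted = [b0 + b1 * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

for xi, yi, fi, ei in zip(x, y, fitted, residuals):
    print(f"x = {xi}: observed {yi}, predicted {fi:.1f}, residual {ei:+.1f}")
print(f"sum of residuals: {sum(residuals):.6f}")
```

For a least-squares line the residuals sum to zero (up to rounding), so a residual plot is centered on the horizontal line at 0.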

Example: Regression Line. The following data are used to determine the relationship between age and change in systolic blood pressure (BP, mm Hg) after 24 hours in response to a particular treatment. e) What is the residual for someone who is 51 years old? (Data columns: Obs, Age, BP)

Residual Plots [Figure panels: Good; Linearity violation]

Residual Plots [Figure panels: Good; Constant variance violation]

Residual Plots - BP [Figure panels: Original; Y outlier]

Residual Plots - BP [Figure panels: Original; X outlier]

Influential Point: An outlier is an observation that lies outside the overall pattern of the other observations. An observation is influential for a statistical calculation if removing it would markedly change the result of the calculation.
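
Influence can be checked directly by recomputing a result with and without the observation in question. This sketch does so for the least-squares slope, using hypothetical data in which one added point lies far out in the x direction.

```python
def slope(x, y):
    """Least-squares slope b1 = S_xy / S_xx."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return s_xy / s_xx


# Hypothetical data: five points with an increasing pattern.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

# One extra observation, extreme in x and far below the pattern.
x_with = x + [10]
y_with = y + [0]

slope_without = slope(x, y)
slope_with = slope(x_with, y_with)

print(f"slope without the point: {slope_without:.3f}")
print(f"slope with the point:    {slope_with:.3f}")
```

Here the single point reverses the sign of the slope, so it is influential for the regression line. Points that are extreme in x tend to have this kind of leverage.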

Cautions about Correlation and Regression: Extrapolation

Cautions about Correlation and Regression: Both describe linear relationships. Both are affected by outliers. Always PLOT the data. Beware of extrapolation. Beware of lurking variables: lurking variables are important to the relationship being studied but are not among the variables included in the study; confounding variables are variables whose effects on the response cannot be separated from each other. Correlation (association) does NOT imply causation!

Lurking Variables: In each of these cases, identify the lurking variable. 1. For children, there is an extremely strong correlation between shoe size and math scores. 2. There is a very strong correlation between ice cream sales and the number of deaths by drowning. 3. There is a very strong correlation between the number of churches in a town and the number of bars in the town.

What is the lurking variable? /true-fact-the-lack-of-pirates-is-causing-global-warming/

2.6: Data Analysis for Two-Way Tables - Goals. Statements: The distribution of two random variables (bivariate) is called a joint distribution. Two random variables are similar to two events in that they can have conditional probabilities and can be independent of each other. Goal: Interpret examples of Simpson’s paradox.

Simpson’s Paradox: An association or comparison that holds for all of several groups can reverse direction when the data are combined to form a single group. This reversal is called Simpson’s paradox.

Simpson’s Paradox: Consider the acceptance rates for the following groups of men and women who applied to college.

Simpson’s Paradox [Tables: Business School; Art School]
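
The reversal can be reproduced with a small two-way table. The counts below are hypothetical, invented only to show the effect (they are not the numbers from the slides): women have the higher acceptance rate within each school, yet men have the higher rate when the schools are combined.

```python
# Hypothetical admissions counts per school and group: (applicants, admitted).
data = {
    "Business School": {"men": (400, 240), "women": (100, 65)},
    "Art School": {"men": (100, 10), "women": (400, 60)},
}


def rate(applicants, admitted):
    """Acceptance rate as a proportion."""
    return admitted / applicants


# Acceptance rate for each group within each school.
within = {
    school: {group: rate(*counts) for group, counts in groups.items()}
    for school, groups in data.items()
}


def combined_rate(group):
    """Acceptance rate for a group after pooling both schools."""
    applicants = sum(data[school][group][0] for school in data)
    admitted = sum(data[school][group][1] for school in data)
    return admitted / applicants


overall = {group: combined_rate(group) for group in ("men", "women")}

for school, rates in within.items():
    print(f"{school}: men {rates['men']:.0%}, women {rates['women']:.0%}")
print(f"Combined: men {overall['men']:.0%}, women {overall['women']:.0%}")
```

The lurking variable in this sketch is the school applied to: most of the hypothetical men applied to the school with the high acceptance rate, and most of the women applied to the school with the low one.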

2.7: The Question of Causation - Goals: Be able to explain an association in terms of causation, common response, or confounding variables. Be able to apply the criteria for establishing causation.

Causation: Association does not mean causation!

Establishing Causation: Perform an experiment! What do we need for causation? 1. The association is strong. 2. The association is consistent: the connection happens in repeated trials and under varying conditions. 3. Higher doses are associated with stronger responses. 4. The alleged cause precedes the effect. 5. The alleged cause is plausible.