Ch 2 and 9.1 Relationships Between 2 Variables

Presentation transcript:

Ch 2 and 9.1 Relationships Between 2 Variables More than one variable can be measured on each individual. Examples: Gender and Height Size and Cost Eye color and Major We want to look at the relationship between these variables. Is there an association between the two variables? Two variables measured on the same individuals are associated if some values of one variable tend to occur more often with some values of the second variable than with other values of that variable.

Relationships Between 2 Variables If we expect one variable to influence another, we call the influencing variable the ___________ variable. It explains or influences changes in the response variable. The variable that is influenced is called the ____________ variable. It measures an outcome of a study. In each of the following examples, identify the explanatory and response variables: Gender and blood pressure Class attendance and course grade Number of beers and BAC

Relationships Between 2 Variables We may be interested in relationships of different types of variables. Categorical and Numeric Categorical and Categorical Numeric and Numeric

Relationships between Categorical and Numeric Variables We are interested in comparing the numerical variable across each of the levels of the categorical variable. Examples: Compare high speeds for 4 different car brands Compare sucrose levels for 5 different types of fruit Compare GPR for 20 different majors

Relationships between Categorical and Numeric Variables Graphical Comparison Example: Sucrose levels of fruits (fictitious data)

Relationships between Categorical and Numeric Variables Numerical Comparison We could also look at summary statistics for each group.

Ch 9.1 Relationships Between Two Categorical Variables Depending on the situation, one of the variables is the explanatory variable and the other is the response variable. In this case, we look at the percentages of one variable for each level of the other variable. Examples: Gender and Soda Preference Country of Origin and Marital Status Smoking Habits and Socioeconomic Status

Two-Way Tables Two-way tables come about when we are interested in the relationship between two categorical variables. One of the variables is the _____________. The other is the _______________. The combination of a row variable and a column variable is a ______________.

Two-Way Tables Example: Column variable Cells Row Totals Column Totals Row variable Row Totals Column Totals Overall Total

Relationships between two categorical variables Example: Gender and Highest Degree Obtained Joint Distribution: How likely are you to have a bachelor’s degree and be a male? _____________ Marginal Distribution: What is the least likely highest degree obtained? _____________ Conditional Distribution: If you are a female, how likely are you to have obtained a graduate degree? ______________

Relationships between two categorical variables Shows the percentages for the joint, marginal, and conditional distributions.
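As a rough illustration of the three kinds of distributions, the sketch below computes one joint, one marginal, and one conditional proportion from a two-way table. The counts are made up for illustration (they are not the data shown on the slide), and all variable names are hypothetical.

```python
# Hypothetical counts (NOT the slide's data): rows are gender, columns are highest degree.
table = {
    "Male":   {"HS": 30, "Bachelor": 50, "Graduate": 20},
    "Female": {"HS": 25, "Bachelor": 45, "Graduate": 30},
}
degrees = ["HS", "Bachelor", "Graduate"]

# Overall total: sum of every cell in the table.
overall_total = sum(sum(row.values()) for row in table.values())

# Joint distribution: one cell divided by the overall total.
# P(Male and Bachelor) = 50 / 200 = 0.25
joint_male_bachelor = table["Male"]["Bachelor"] / overall_total

# Marginal distribution of degree: column totals divided by the overall total.
marginal_degree = {
    d: sum(table[g][d] for g in table) / overall_total for d in degrees
}
least_likely_degree = min(marginal_degree, key=marginal_degree.get)

# Conditional distribution: one cell divided by its ROW total.
# P(Graduate | Female) = 30 / 100 = 0.3
female_total = sum(table["Female"].values())
cond_graduate_given_female = table["Female"]["Graduate"] / female_total

print(joint_male_bachelor, marginal_degree, cond_graduate_given_female)
```

The only difference between the three calculations is the denominator: the overall total for joint and marginal proportions, and a single row (or column) total for conditional proportions.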

Ch 2 Relationships Between 2 Numeric Variables Depending on the situation, one of the variables is the explanatory variable and the other is the response variable. There is not always an explanatory-response relationship. Examples: Height and Weight Income and Age SAT scores on math exam and on verbal exam Amount of time spent studying for an exam and exam score

Relationships between 2 numeric variables Scatterplots Look for the overall pattern and any striking deviations from that pattern. Look for outliers: values falling outside the overall pattern of the relationship. You can describe the overall pattern of a scatterplot by the form, direction, and strength of the relationship. Form: linear or clusters. Direction: Two variables are _____________________ when above-average values of one tend to accompany above-average values of the other, and below-average values likewise tend to occur together. Two variables are _____________________ when above-average values of one variable accompany below-average values of the other variable, and vice versa. Strength: how close the points lie to a line.

Relationships between 2 numeric variables Example: Response: MPG Explanatory: Weight Response Variable (y-axis) ___________ Association Explanatory Variable (x-axis)

Relationships between 2 numeric variables Relationships between two numeric variables Example Vehicle Weight Horsepower __________ Association

Relationships between 2 numeric variables ___________ or r: measures the direction and strength of the linear relationship between two numeric variables. General Properties: r must be between -1 and 1 (-1 ≤ r ≤ 1). If r is negative, the relationship is negative. If r = -1, there is a perfect negative linear relationship (extreme case). If r is positive, the relationship is positive. If r = 1, there is a perfect positive linear relationship (extreme case). If r is 0, there is no linear relationship. The closer r is to -1 or 1, the stronger the linear relationship. If explanatory and response are switched, r remains the same. r has no units of measurement associated with it. Changing the units of measurement (a scale change) does not affect r.
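A few of these properties can be checked numerically. The sketch below is one way to compute r by hand, using made-up weight and MPG values (not the data behind the slides); the helper name `correlation` is hypothetical.

```python
import math

def correlation(xs, ys):
    """Correlation r between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: vehicle weight in pounds and fuel economy in MPG.
weight_lb = [2000, 2500, 3000, 3500, 4000]
mpg = [38, 33, 29, 24, 20]

r = correlation(weight_lb, mpg)          # negative: heavier cars, lower MPG
r_swapped = correlation(mpg, weight_lb)  # switching x and y leaves r unchanged

# Changing units (pounds to kilograms) leaves r unchanged.
weight_kg = [w / 2.2046 for w in weight_lb]
r_rescaled = correlation(weight_kg, mpg)

print(r, r_swapped, r_rescaled)
```

Because r is built from deviations divided by their overall spread, the units cancel, which is why r has no units and does not respond to a change of scale.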

Relationships between 2 numeric variables Examples of extreme cases r = 1 r = 0 r = -1

Relationships between 2 numeric variables Match each correlation to its scatterplot: r = 0.04 r = 0.43 r = -0.84 r = 0.76 r = 0.21

Relationships between 2 numeric variables It is possible for there to be a strong relationship between two variables and still have r ≈ 0. For example, points that lie along a symmetric curve, such as a U shape, can have r ≈ 0.
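One way to see this is with points that lie exactly on a parabola. The sketch below uses made-up values where y is completely determined by x, yet the correlation comes out as zero because the relationship is not linear. The helper name `correlation` is hypothetical.

```python
import math

def correlation(xs, ys):
    """Correlation r between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Illustrative data: y = x^2 on x values placed symmetrically around 0.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x * x for x in xs]

r = correlation(xs, ys)
print(r)  # the positive and negative halves cancel, so r is 0
```

The left half of the curve slopes down and the right half slopes up, so the two halves cancel in the calculation of r.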

Relationships between 2 numeric variables Important notes: Association does not imply causation. Correlation does not imply causation. Slope is not correlation. A scale change does not change the correlation. Correlation doesn’t measure the strength of a non-linear relationship.

Regression Line A regression line is a straight line that describes how a response variable y changes as an explanatory variable x changes. A regression line summarizes the relationship between two variables, but only in a specific setting: when one of the variables helps explain or predict the other. We often use a regression line to predict the value of y for a given value of x. Regression, unlike correlation, requires that we have an explanatory variable and a response variable

Regression Line Fitting a line to data means drawing a line that comes as close as possible to the points. Extrapolation: the use of a regression line for prediction far outside the range of values of the explanatory variable x that you used to obtain the line. Such predictions are often not accurate.

Least-Squares Regression Line The least-squares regression line of y on x is the line that makes the sum of the squares of the vertical distances of the data points from the line as small as possible. These vertical distances are called the residuals, or the errors in prediction, because they measure how far each point is from the line: residual = $y - \hat{y}$, where $y$ is the observed value and $\hat{y}$ is the value predicted by the line.

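Least-Squares Regression Line The equation of the least-squares regression line of y on x is $\hat{y} = b_0 + b_1 x$, with slope $b_1 = r \, \frac{s_y}{s_x}$ and intercept $b_0 = \bar{y} - b_1 \bar{x}$.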

Least-Squares Regression Line The expression for the slope, $b_1$, says that along the regression line, a change of one standard deviation in x corresponds to a change of r standard deviations in y. The slope, $b_1$, is the amount by which the predicted y changes when x increases by one unit. The intercept, $b_0$, is the predicted value of y when $x = 0$. The least-squares regression line ALWAYS passes through the point $(\bar{x}, \bar{y})$.
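The slope and intercept can be computed directly from the summary statistics. The sketch below assumes the standard formulas slope = r * (s_y / s_x) and intercept = mean of y minus slope times mean of x, and uses a small made-up data set. All helper names are hypothetical.

```python
import math

def mean(values):
    return sum(values) / len(values)

def sample_sd(values):
    """Sample standard deviation (divides by n - 1)."""
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))

def correlation(xs, ys):
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = correlation(x, y)
b1 = r * sample_sd(y) / sample_sd(x)   # slope, approximately 0.6
b0 = mean(y) - b1 * mean(x)            # intercept, approximately 2.2

# Residuals: observed y minus predicted y.
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

# The line passes through (x-bar, y-bar): predicting at x-bar gives y-bar.
prediction_at_mean_x = b0 + b1 * mean(x)

print(b0, b1, prediction_at_mean_x, sum(residuals))
```

The residuals of a least-squares line sum to zero (up to rounding), which is another way of saying the line goes through the point of means.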

$r^2$ in Regression The square of the correlation, $r^2$, is the fraction of the variation in the values of y that is explained by the least-squares regression of y on x. Use $r^2$ as a measure of how successfully the regression explains the response. Interpret $r^2$ as the “percent of variation explained.” For Simple Linear Regression, $r^2$ is simply the square of the correlation coefficient.
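The "fraction of variation explained" reading can be checked numerically. The sketch below fits a least-squares line to made-up data, then compares the squared correlation with 1 minus the ratio of unexplained variation (sum of squared residuals) to total variation in y. The variable names are hypothetical, and the slope is computed in the equivalent form sxy / sxx.

```python
import math

# Made-up data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
sxx = sum((xi - mean_x) ** 2 for xi in x)
syy = sum((yi - mean_y) ** 2 for yi in y)  # total variation in y

# Least-squares slope and intercept (sxy / sxx equals r * s_y / s_x).
b1 = sxy / sxx
b0 = mean_y - b1 * mean_x

# Unexplained variation: sum of squared residuals.
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

r = sxy / math.sqrt(sxx * syy)
r_squared = r ** 2
fraction_explained = 1 - sse / syy

print(r_squared, fraction_explained)  # both approximately 0.6
```

For these made-up values, about 60% of the variation in y is explained by the line, and the remaining 40% is left in the residuals.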

Relationships between 2 numeric variables Example How much of the variation is explained by the least-squares line of y on x? ______ What is the correlation coefficient? ______ Horsepower = -10.78 + 0.04*weight (Equation of the line.) __________: y-value or response (horsepower) where the line crosses the y-axis. _______: change in the predicted response for a unit increase in the explanatory variable. So if weight increases by one pound, horsepower increases by 0.04 units (on average).

Relationships between 2 variables Lurking Variable: A variable that is not among the explanatory or response variables in a study and yet may influence the interpretation of relationships among those variables. Simpson’s Paradox: An association or comparison that holds for all of several groups can reverse direction when the data are combined to form a single group. This reversal is called Simpson’s Paradox. This can happen when a lurking variable is present. Please see Examples 9.9 and 9.10 in the text.
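A small numerical sketch can make the reversal concrete. The counts below are illustrative only (they are not Examples 9.9 or 9.10 from the text): two treatments are compared within two patient groups, where the group (mild or severe cases) plays the role of the lurking variable. All names are hypothetical.

```python
# Illustrative counts: (successes, patients) for each treatment within each group.
data = {
    "mild":   {"A": (81, 87),   "B": (234, 270)},
    "severe": {"A": (192, 263), "B": (55, 80)},
}

def rate(successes, patients):
    return successes / patients

# Success rate within each group: A beats B in BOTH groups.
for group, treatments in data.items():
    rate_a = rate(*treatments["A"])
    rate_b = rate(*treatments["B"])
    print(group, round(rate_a, 3), round(rate_b, 3))

# Combine the groups into a single table for each treatment.
combined = {}
for t in ("A", "B"):
    successes = sum(data[g][t][0] for g in data)
    patients = sum(data[g][t][1] for g in data)
    combined[t] = rate(successes, patients)

# After combining, the direction reverses: B now has the higher rate.
print(round(combined["A"], 3), round(combined["B"], 3))
```

The reversal happens because treatment A was mostly used on severe cases, which have lower success rates regardless of treatment, so combining the groups mixes the treatment effect with the effect of severity.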

Outliers and Influential Observations in Regression An outlier is an observation that lies outside the overall pattern of the other observations. An observation is influential for a statistical calculation if removing it would markedly change the result of the calculation. Points that are outliers in the x direction of a scatterplot are often influential for the least-squares regression line.

Outliers and Influential Observations in Regression Child 18 is an outlier in the x direction. Because of its extreme position on the age scale, this point has a strong influence on the position of the regression line. $r^2$ is also affected by the influential observation. With Child 18, $r^2$ = 41%, but without Child 18, $r^2$ = 11%. The apparent strength of the association was largely due to a single influential observation. The dashed line was calculated leaving out Child 18. The solid line is with Child 18.
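The same effect can be reproduced with a toy data set. The sketch below uses made-up values (not the Child 18 data) in which six points show almost no linear pattern and a seventh point lies far out in the x direction. It fits the least-squares line with and without that point; the function name `fit_line` is hypothetical.

```python
def fit_line(xs, ys):
    """Return (intercept, slope, r_squared) of the least-squares line of y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return intercept, slope, r_squared

# Made-up data: six points with almost no linear pattern...
x = [1, 2, 3, 4, 5, 6]
y = [5, 3, 6, 4, 5, 4]

# ...plus one hypothetical point that is an outlier in the x direction.
x_with = x + [20]
y_with = y + [12]

b0_without, b1_without, r2_without = fit_line(x, y)
b0_with, b1_with, r2_with = fit_line(x_with, y_with)

print("without outlier:", round(b1_without, 3), round(r2_without, 3))
print("with outlier:   ", round(b1_with, 3), round(r2_with, 3))
```

With the extra point, the slope turns clearly positive and $r^2$ jumps from nearly 0 to above 0.8, even though the other six points are unchanged. This is why it is worth refitting the line without a suspected influential observation and comparing the results.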