Published by George Caldwell. Modified over 8 years ago.
Copyright © 2017, 2014 Pearson Education, Inc. Slide 1 Chapter 4 Regression Analysis: Exploring Associations between Variables
Slide 2: Chapter 4 Topics
Explore associations between numerical variables graphically and numerically.
Model linear trends using a regression line.
Slide 3: Section 4.1 VISUALIZING VARIABILITY WITH A SCATTERPLOT
Use Technology to Create a Scatterplot
Use Scatterplots to Investigate Associations Between Numerical Variables
Image credit: Rawpixel / Shutterstock
Slide 4: Visualizing Variability with a Scatterplot
Scatterplot: the primary tool for examining relationships between two numerical variables.
Each point in the scatterplot represents one observation.
Usually created using technology such as a computer software program or a graphing calculator.
Slide 5: Median Age of Marriage for Women
Each point in the scatterplot represents one US state or the District of Columbia.
Each point gives the median age of marriage for women and for men in that state.
Each data point has the form (median age for women, median age for men).
Slide 6: Examining Scatterplots
Note three features:
1. Trend (like center)
2. Strength (like spread)
3. Shape
Slide 7: Trend
The general tendency of the scatterplot as you read from left to right.
Typical trends:
1. Increasing (uphill), called a positive association
2. Decreasing (downhill), called a negative association
3. No trend, if there is neither an uphill nor a downhill tendency
Slide 8: Example: Positive Trend
This scatterplot shows a positive trend because the graph goes uphill as you scan from left to right. This means that as the age of the car increases, the mileage also tends to increase.
Slide 9: Example: Negative Trend
This scatterplot shows a negative trend because the graph goes downhill as you scan from left to right. This means that as the literacy rate increases, the number of births per woman tends to decrease.
Slide 10: Example: No Trend
This scatterplot shows no trend because the points seem to follow no predictable pattern. This means that in every age group we can find relatively fast and relatively slow runners. Marathon running speed does not seem to be related to the age of the runner.
Slide 11: Example: Trend, Neither Positive nor Negative
This data set shows an association between two variables, but the association cannot be characterized as either positive or negative.
Slide 12: Strength of an Association
Scatterplots with large amounts of scatter or vertical variation indicate a weak association.
Scatterplots with small amounts of scatter or little vertical variation indicate a strong association.
Slide 13: Example: Strength of Association
Is there a stronger association between height and weight or between waist size and weight?
Slide 14: Example: Strength of Association
There seems to be a stronger association between waist size and weight (less vertical variation in the graph).
Slide 15: Shape: Linear
Scatterplots whose points cluster around a line show linear trends.
This scatterplot shows a linear association between the volume of searches for the word "vampire" and for the word "zombie."
Slide 16: Shape: Non-Linear
Sometimes there are trends in data that are non-linear: trends that are better modeled by a curve than by a line.
This scatterplot shows a non-linear trend between temperature and pollutant ozone levels.
Slide 17: Writing Descriptions of Associations
When writing a description of an association between two numerical variables, always include:
1. Trend
2. Shape
3. Strength
In addition, mention any observations that don't fit the general trend.
Slide 18: Example: Describing Associations
How would you describe the association between median age of marriage for women and median age of marriage for men in the 50 states and the District of Columbia?
Slide 19: Example: Describing Associations
The association between the median age of marriage for women and the median age of marriage for men is positive and linear. In other words, women who marry at an older age tend to marry men who are older. The association is strong because there is very little vertical variation in the graph.
Slide 20: Be Careful Describing Associations
Always use a phrase like "tends to" when describing an association, because the trend you are describing has variability: the association may not hold for every individual.
Always point out any data points that appear to be unusual or not part of the general pattern.
Slide 21: Section 4.2 MEASURING STRENGTH OF ASSOCIATION WITH CORRELATION
Find and Interpret the Correlation Coefficient
Image credit: forestpath / Shutterstock
Slide 22: Correlation Coefficient
A number that measures the strength of a linear relationship.
Symbol: r
Always between -1 and +1.
r values close to -1 or +1 indicate a strong linear association.
r values close to 0 indicate a weak linear association.
Slide 23: r Values of 1 and -1
Correlation coefficients of 1 and -1 indicate perfect positive and perfect negative associations. The data points lie exactly on a line.
Slide 24: Visualizing the Correlation Coefficient
Notice that as r moves closer to 1 or -1, there is less vertical variation in the data (the trend is stronger).
Slide 25: Computing the Correlation Coefficient
Background: The x-values and y-values are each converted to z-scores, and the two z-scores for each observation are multiplied together. These products are then added, and the resulting sum is divided by n - 1.
In practice: The correlation coefficient is found using technology.
Slide 26: Example
The table below shows the heights and weights for 6 women. Compute and interpret r, the correlation coefficient.
Height (in.)   61   62   63   64   66   68
Weight (lb)   104  110  141  125  170  160
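As a cross-check on the technology-based steps, the z-score recipe described for computing r can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function name `correlation` and the variable names are assumptions made for this sketch, not part of the slides.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Correlation coefficient r: sum of z-score products divided by n - 1."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    mx, my = mean(xs), mean(ys)
    sx, sy = stdev(xs), stdev(ys)  # sample standard deviations (divide by n - 1)
    # Convert each observation to a pair of z-scores and multiply them.
    z_products = [((x - mx) / sx) * ((y - my) / sy) for x, y in zip(xs, ys)]
    # Add the products and divide by n - 1.
    return sum(z_products) / (len(xs) - 1)

heights = [61, 62, 63, 64, 66, 68]        # inches, from the table above
weights = [104, 110, 141, 125, 170, 160]  # pounds, from the table above

r = correlation(heights, weights)
print(round(r, 3))  # 0.881
```

A value this close to 1 suggests a strong positive linear association, provided a scatterplot confirms that the trend is linear.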
Slide 27: Computing r Using StatCrunch
1. Enter the data into StatCrunch.
2. Choose STAT > Regression > Simple Linear.
3. Select the x-variable, select the y-variable, and select COMPUTE.
Slide 28: StatCrunch Output
Simple linear regression results:
Dependent Variable: Weight
Independent Variable: Height
Weight = -442.88235 + 9.0294118 Height
Sample size: 6
R (correlation coefficient) = 0.88093363
R-sq = 0.77604407
Page 1 of the output has a lot of information, but we can see that r = 0.881. Since r is close to 1, we would say there is a strong linear association between height and weight.
Slide 29: StatCrunch Output
Page 2 provides a graph of the data, including a graph of the line that best fits the data.
Slide 30: Using a TI-84 Calculator
To find r, the correlation coefficient, on the TI-84 calculator:
1. Enter your data, using one list for the x-variable and one list for the y-variable. Remember to enter your data in order, as pairs.
2. Push STAT > CALC, then select option 8: LinReg(a+bx).
3. After the LinReg command, enter the names of the lists that hold your data, separated by a comma (for example, LinReg(a+bx) L1, L2), and press ENTER.
Slide 31: Notes about the Correlation Coefficient
Changing the order of the variables does not change r.
Adding a constant to a variable, or multiplying it by a positive constant, does not affect r.
r is unitless.
r is only useful for measuring a linear trend. Always graph your data before computing r to make sure the association is linear!
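The properties listed in these notes can be checked numerically. The sketch below reuses the six-women height and weight data from the earlier example and computes r with the sum-of-products formula, which is algebraically equivalent to averaging z-score products. The function and variable names are illustrative assumptions. The last check goes one step beyond the slide: it shows why the constant must be positive, since a negative multiplier flips the sign of r.

```python
from math import isclose, sqrt

def correlation(xs, ys):
    """Correlation coefficient r via sums of squares and cross-products."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

heights = [61, 62, 63, 64, 66, 68]        # inches
weights = [104, 110, 141, 125, 170, 160]  # pounds

r = correlation(heights, weights)

# Swapping the roles of x and y leaves r unchanged.
r_swapped = correlation(weights, heights)

# Adding a constant (here 10 inches) leaves r unchanged.
r_shifted = correlation([h + 10 for h in heights], weights)

# Multiplying by a positive constant (inches to centimeters) leaves r
# unchanged, which is another way of seeing that r is unitless.
r_scaled = correlation([h * 2.54 for h in heights], weights)

# Multiplying by a negative constant flips the sign of r.
r_negated = correlation([-h for h in heights], weights)

print(isclose(r, r_swapped), isclose(r, r_shifted),
      isclose(r, r_scaled), isclose(r, -r_negated))
```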
Slide 32: Section 4.3 MODELING LINEAR TRENDS
Use Technology to Write the Regression Equation
Use the Regression Equation to Make Appropriate Predictions
Image credit: Krom1975 / Shutterstock
Slide 33: Regression Line
A tool for making predictions about future observed values.
Has the form y = a + bx, where a is the y-intercept and b is the slope.
Usually generated using appropriate technology.
Slide 34: Example: Regression Equation
The scatterplot shows a fairly strong positive linear trend. The regression equation has a slope of 2.16 and a y-intercept of 30.46. The positive trend indicates that players who hit more home runs tend to have more RBIs.
Slide 35: Example: Using a Regression Equation
The scatterplot shows a negative linear trend. As the age of a car increases, its value tends to decrease. The regression equation is:
predicted value = 21375 - 1215 age
Slide 36: Example: Using the Regression Equation
Use the regression equation to predict the value of a car that is 12 years old.
predicted value = 21375 - 1215 age
predicted value = 21375 - 1215(12)
predicted value = $6795
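The same substitution can be written as a small Python function. This is only a sketch of the arithmetic in the example; the function name `predicted_car_value` is a hypothetical label, and the coefficients are the ones given in the regression equation for the car data.

```python
def predicted_car_value(age_years):
    """Predicted value in dollars from: predicted value = 21375 - 1215 age."""
    intercept = 21375  # predicted value when age is 0
    slope = -1215      # predicted change in value per additional year of age
    return intercept + slope * age_years

print(predicted_car_value(12))  # 6795
```

Writing the equation as a function makes it easy to try other ages, although predictions are only trustworthy for ages within the range of the original data.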
Slide 37: Finding the Regression Equation
To find the regression equation using technology, follow the same steps as for finding the correlation coefficient.
Slide 38: Example
The table below shows the heights and weights for six women. Find the regression equation that describes the relationship between height and weight.
NOTE: We previously determined that these data follow a linear trend, so it is appropriate to find the regression equation.
Height (in.)   61   62   63   64   66   68
Weight (lb)   104  110  141  125  170  160
Slide 39: Finding the Regression Equation Using StatCrunch
1. Enter the data into StatCrunch.
2. Choose STAT > Regression > Simple Linear.
3. Select the x-variable, select the y-variable, and select COMPUTE.
Slide 40: StatCrunch Output
Simple linear regression results:
Dependent Variable: Weight
Independent Variable: Height
Weight = -442.88235 + 9.0294118 Height
Sample size: 6
R (correlation coefficient) = 0.88093363
R-sq = 0.77604407
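The slides leave the computation of the intercept and slope to technology. For readers who want to see what the software is doing, here is a minimal sketch using the standard least-squares formulas (slope = sum of cross-products divided by sum of squared x-deviations; intercept = mean of y minus slope times mean of x). The function name `fit_line` is an assumption for this sketch; StatCrunch's internals are not described in the source.

```python
def fit_line(xs, ys):
    """Least-squares intercept a and slope b for the line y = a + b*x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx    # slope
    a = my - b * mx  # intercept: the line passes through (mean x, mean y)
    return a, b

heights = [61, 62, 63, 64, 66, 68]        # inches (x-variable)
weights = [104, 110, 141, 125, 170, 160]  # pounds (y-variable)

a, b = fit_line(heights, weights)
print(f"Weight = {a:.3f} + {b:.3f} Height")  # Weight = -442.882 + 9.029 Height
```

The result agrees with the StatCrunch output to the displayed precision; the slides round the slope to 9.03 in later examples.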
Slide 41: Example: Using the Regression Equation
Use the regression equation to predict the weight of a woman who is 65 inches tall.
Weight = -442.882 + 9.03 Height
Weight = -442.882 + 9.03(65)
Weight = 144.07 pounds
Slide 42: Notes about the Regression Equation
Order matters. If x and y are switched, the regression equation will change.
We use the x-variable to make predictions about the y-variable, so the x-variable is called the explanatory or predictor variable. It is also called the independent variable.
The y-variable is the response or predicted variable. It is also called the dependent variable.
Slide 43: Example
The table below shows the heights and weights for six women. Find the regression equation that describes the relationship between height and weight. This time use weight as the predictor or explanatory variable (x) and height as the predicted or response variable (y).
Height (in.)   61   62   63   64   66   68
Weight (lb)   104  110  141  125  170  160
Slide 44: Example
Simple linear regression results:
Dependent Variable: Height
Independent Variable: Weight
Height = 52.397256 + 0.085946249 Weight
Sample size: 6
R (correlation coefficient) = 0.88093363
R-sq = 0.77604407
Note: r (the correlation coefficient) remains the same; however, the regression equation is different from our previous result.
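To see concretely that order matters, the sketch below fits the line both ways on the same six observations. It uses the standard least-squares formulas, and `fit_line` is a hypothetical helper name. One observation that is not on the slides but follows from the formulas: the new slope is not simply the reciprocal of the old one, and the product of the two slopes equals r².

```python
def fit_line(xs, ys):
    """Least-squares intercept a and slope b for the line y = a + b*x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

heights = [61, 62, 63, 64, 66, 68]        # inches
weights = [104, 110, 141, 125, 170, 160]  # pounds

# Height as x, weight as y (the earlier example).
a_w, b_w = fit_line(heights, weights)
# Weight as x, height as y (this example).
a_h, b_h = fit_line(weights, heights)

print(f"Weight = {a_w:.3f} + {b_w:.3f} Height")
print(f"Height = {a_h:.3f} + {b_h:.4f} Weight")

# Solving the first equation for Height would give a slope of 1 / b_w,
# which is not the slope of the second regression line.
print(round(1 / b_w, 4), round(b_h, 4))

# The product of the two slopes is r squared (about 0.776 here).
print(round(b_w * b_h, 3))
```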
Slide 45: Interpreting the Slope of the Regression Equation
The slope tells us how much the y-variable is predicted to change, on average, when the x-variable is increased by 1 unit.
A slope of 0 means there is no linear relationship between x and y.
Slide 46: Example: Interpreting the Slope
Weight = -442.882 + 9.03 Height
The slope of this line is 9.03. The y-variable is weight and the x-variable is height.
Interpretation: For every additional inch in height, weight tends to increase by 9.03 pounds. Equivalently, every increase of 1 inch in height is associated with an increase in weight of 9.03 pounds.
Slide 47: Example: Interpreting Slope
In a previous example on the association between age of car and value of car, the regression equation was:
predicted value = 21375 - 1215 age
Interpret the slope of the regression equation.
Slope = -1215; the x-variable is age and the y-variable is value.
Interpretation: For each additional year of age, the value of a car tends to decrease by $1215. Equivalently, each additional year of age is associated with a decrease of $1215 in value.
Slide 48: Interpreting the y-Intercept of the Regression Equation
The y-intercept is the predicted value when x is 0.
The y-intercept is meaningful only if it makes sense for x to equal 0.
Slide 49: Example: Interpreting the y-Intercept
In a previous example on the association between age of car and value of car, the regression equation was:
predicted value = 21375 - 1215 age
Interpret the y-intercept of the equation, if appropriate.
y-intercept = 21375. It is the predicted value when x (age) is 0. In other words, when the car is new, its predicted value is $21,375.
Slide 50: Example: Interpreting the y-Intercept
In a previous example on the association between height and weight in women, the regression equation was:
Weight = -442.882 + 9.03 Height
Interpret the y-intercept, if appropriate.
y-intercept = -442.882. It is the predicted value for weight if x (height) is 0. It is impossible to weigh -442 pounds and it is impossible for a woman to be 0 inches tall, so in this case the y-intercept is meaningless.
Slide 51: Section 4.4 EVALUATING THE LINEAR MODEL
Use Linear Models to Describe Associations Only When Appropriate
Compute and Interpret the Coefficient of Determination
Image credit: violetkaipa / Shutterstock
Slide 52: Cautionary Notes Regarding Regression
Don't use linear models to describe non-linear associations. Always look at a scatterplot first!
Correlation is not causation! An association between two variables is not sufficient evidence to conclude that a cause-and-effect relationship exists between the variables.
Beware of outliers, which can have a big effect on r. Always check the scatterplot for outliers first.
Don't extrapolate! Don't make predictions beyond the range of the data, because we cannot be sure that the linear trend continues beyond that range.
Slide 53: Example: Extrapolation
In a previous example we found there was a strong linear relationship between heights and weights in women, and the regression equation is Weight = -442.882 + 9.03 Height. What weight does this equation predict for a woman who is 36 inches tall?
Slide 54: Example: Extrapolation
Weight = -442.882 + 9.03 Height
Weight = -442.882 + 9.03(36) = -117.8 pounds
Note: The data covered women 61 to 68 inches tall. It is not appropriate to use the regression equation to predict the weight of a 36-inch-tall woman, since 36 inches is outside the range of the data (extrapolation).
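One way to make the extrapolation warning hard to ignore is to build the observed range into the prediction function itself. The sketch below is an illustration only: the function name `predict_weight`, the constant `HEIGHT_RANGE`, and the `allow_extrapolation` flag are all hypothetical, while the coefficients and the 61 to 68 inch range come from the example.

```python
HEIGHT_RANGE = (61, 68)  # inches: smallest and largest heights in the data

def predict_weight(height_in, allow_extrapolation=False):
    """Predicted weight in pounds from Weight = -442.882 + 9.03 Height."""
    lowest, highest = HEIGHT_RANGE
    if not allow_extrapolation and not (lowest <= height_in <= highest):
        raise ValueError(
            f"height {height_in} in. is outside the data range "
            f"{lowest} to {highest} in.; the prediction would be an extrapolation"
        )
    return -442.882 + 9.03 * height_in

# Inside the range of the data: a reasonable prediction.
print(round(predict_weight(65), 2))  # 144.07

# Outside the range: the equation still returns a number, but it is nonsense.
print(round(predict_weight(36, allow_extrapolation=True), 1))  # -117.8
```

Calling `predict_weight(36)` without the flag raises an error instead of silently returning a negative weight.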
Slide 55: Coefficient of Determination: r²
The square of r, the correlation coefficient.
Usually converted to a percentage, so always between 0% and 100%.
Measures how much of the variation in the response variable is explained by the explanatory variable.
The larger r² is, the smaller the amount of variation or scatter about the regression line.
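The phrase "variation explained" can be made concrete. For a least-squares line, r² equals 1 minus the ratio of the leftover variation around the line to the total variation in y. The sketch below checks this on the six-women height and weight data used earlier; the function name `r_squared` is an assumption for illustration, and the formula is the standard definition rather than something stated on the slides.

```python
def r_squared(xs, ys):
    """Coefficient of determination: 1 - (variation about line) / (total variation)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx    # least-squares slope
    a = my - b * mx  # least-squares intercept
    # Variation left over after using the line to predict y.
    residual_variation = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    # Total variation of y about its mean.
    total_variation = sum((y - my) ** 2 for y in ys)
    return 1 - residual_variation / total_variation

heights = [61, 62, 63, 64, 66, 68]        # inches
weights = [104, 110, 141, 125, 170, 160]  # pounds

print(f"{r_squared(heights, weights):.1%}")  # 77.6%
```

This matches the R-sq value of 0.77604407 in the StatCrunch output: height explains about 77.6% of the variation in weight for these six women.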
Slide 56: Example: r²
For the data on car age and value, r = -0.778. Compute and interpret r².
r² = (-0.778)² = 0.605, so r² = 60.5%.
Car age explains about 60.5% of the variation in car value.