Download presentation
Presentation is loading. Please wait.
1
Summarizing Bivariate Data
Describing relationships
2
Response variable (y) measures an outcome (dependent) of a study
Variables: Response variable (y) measures an outcome (dependent) of a study Explanatory variable (x) helps explain or influence changes in a response variable (independent)
3
Identify the Explanatory and Response variables:
Amount of rain Weed growth Resting pulse rate Amount of daily exercise Amount of sleep Amount of time watching TV Winning percentage of basketball team Attendance at games
4
How to Make a Scatterplot
Displaying Relationships: Scatterplots The most useful graph for displaying the relationship between two quantitative variables is a scatterplot. Definition: A scatterplot shows the relationship between two quantitative variables measured on the same individuals. The values of one variable appear on the horizontal axis, and the values of the other variable appear on the vertical axis. Each individual in the data appears as a point on the graph. How to Make a Scatterplot Decide which variable should go on each axis. Remember, the eXplanatory variable goes on the X-axis! Label and scale your axes. Plot individual data values.
5
Scatterplots and Correlation
Displaying Relationships: Scatterplots Make a scatterplot of the relationship between body weight and pack weight. Since Body weight is our eXplanatory variable, be sure to place it on the X-axis! Scatterplots and Correlation Body weight (lb) 120 187 109 103 131 165 158 116 Backpack weight (lb) 26 30 24 29 35 31 28
6
Suppose we found the age and weight of a sample of 10 adults.
Create a scatterplot of the data below. Is there any relationship between the age and weight of these adults? Age 24 30 41 28 50 46 49 35 20 39 Wt 256 124 320 185 158 129 103 196 110 130
7
Does there seem to be a relationship between age and weight of these adults?
8
Create a scatterplot of the data below.
Suppose we found the height and weight of a sample of 10 adults. Create a scatterplot of the data below. Is there any relationship between the height and weight of these adults? Is it positive or negative? Weak or strong? Ht 74 65 77 72 68 60 62 73 61 64 Wt 256 124 320 185 158 129 103 196 110 130
9
Does there seem to be a relationship between height and weight of these adults?
10
And ALWAYS in context of the problem!
When describing relationships between two variables, you should address: Direction: the overall pattern moving left to right (positive, negative, or neither) Strength of the relationship: how closely the points follow a clear form (how much scattering?) Form:overall pattern (linear or some other pattern) Unusual features: individual values that fall outside overall pattern (outliers or influential points) And ALWAYS in context of the problem!
11
Identify as having a positive association, a negative association, or no association.
+ Heights of mothers & heights of their adult daughters - Age of a car in years and its current value + Weight of a person and calories consumed NO Height of a person and the person’s birth month - Number of hours spent in safety training and the number of accidents that occur
12
The farther away from a straight line – the weaker the relationship
The closer the points in a scatterplot are to a straight line - the stronger the relationship. The farther away from a straight line – the weaker the relationship
13
Interpreting Scatterplots
Outlier There is one possible outlier, the hiker with the body weight of 187 pounds seems to be carrying relatively less weight than are the other group members. Strength Direction Form There is a moderately strong, positive, linear relationship between body weight and pack weight. It appears that lighter students are carrying lighter backpacks.
14
Interpreting Scatterplots
Definition: Two variables have a positive association when above-average values of one tend to accompany above-average values of the other, and when below- average values also tend to occur together. Two variables have a negative association when above-average values of one tend to accompany below-average values of the other. Strength Direction Form There is a moderately strong, negative, curved relationship between the percent of students in a state who take the SAT and the mean SAT math score. Further, there are two distinct clusters of states and two possible outliers that fall outside the overall pattern.
15
What explains the clusters? What do the two circled points mean?
Interpreting Scatterplots There is a moderately strong, negative, curved relationship between the percent of students in a state who take the SAT and the mean SAT math score. Further, there are two distinct clusters of states and two possible outliers that fall outside the overall pattern. W. V. Maine What explains the clusters? What do the two circled points mean?
16
Association does not imply causation
17
Correlation Measures the direction and the strength of a linear relationship between 2 quantitative variables. Correlation coefficient (r): r is always a number between -1 and 1 r > 0 indicates a positive association. r < 0 indicates a negative association. Values of r near 0 indicate a very weak linear relationship. The strength of the linear relationship increases as r moves away from 0 towards -1 or 1. The extreme values r = -1 and r = 1 occur only in the case of a perfect linear relationship.
18
Properties of r (correlation coefficient)
legitimate values of r is [-1,1] Strong correlation No Correlation Moderate Correlation Weak correlation
19
Some Correlation Pictures
20
Some Correlation Pictures
21
Some Correlation Pictures
22
Some Correlation Pictures
23
Some Correlation Pictures
24
Some Correlation Pictures
25
GUESS THAT CORRELATION!
Correlation Practice For each graph, estimate the correlation r and interpret it in context. GUESS THAT CORRELATION!
26
Correlation Coefficient (r)
A quantitative assessment of the strength & direction of the linear relationship between bivariate, quantitative data Pearson’s sample correlation is used most parameter - r (rho) statistic - r
27
Heights and weights of 10 students:
Which variable is the explanatory variable? Response variable? Why? Make a scatterplot. Describe the relationship. Height (in) Weight (lb) 74 256 65 124 77 320 72 185 68 158 60 129 62 103 73 196 61 110 64 130 =
28
Find the mean and standard deviation of the heights and weights of the 10 students:
Height (in) Weight (lb) 74 256 65 124 77 320 72 185 68 158 60 129 62 103 73 196 61 110 64 130 =
29
Find the mean and standard deviation of the heights and weights of the 10 students:
Height (in) Weight (lb) 74 256 1.06 1.21 1.28 65 124 -.43 -.67 .29 77 320 1.55 2.12 3.29 72 185 .73 .20 .14 68 158 .07 -.19 -.01 60 129 -1.25 -.60 .75 62 103 -.92 -.97 .90 73 196 .89 .35 .32 61 110 -1.09 -.87 .95 64 130 -.59 = 8.24/9= .9157
30
Heights and weights of 10 students:
Horizontal line through mean of weights Vertical line through mean of heights Height (in) Weight (lb) 74 256 65 124 77 320 72 185 68 158 60 129 62 103 73 196 61 110 64 130 =
31
Calculate r. Interpret r in context.
Speed Limit (mph) 55 50 45 40 30 20 Avg. # of accidents (weekly) 28 25 21 17 11 6 Calculate r. Interpret r in context. r = .9964 There is a strong, positive, linear relationship between speed limit and average number of accidents per week.
32
Find r. Interpret r in context. x (in mm) 12 15 21 32 26 19 24
y Find r. Interpret r in context. .9181
33
value of r is not changed by any transformations
x (in mm) y Find r. Change to cm & find r. Do the following transformations & calculate r 1) 5(x + 14) 2) (y + 30) ÷ 4 .9181 .9181 The correlations are the same. STILL = .9181
34
value of r does not depend on which of the two variables is labeled x
Switch x & y & find r. Type: LinReg L2, L1 The correlations are the same.
35
value of r is non-resistant
x y Find r. Outliers affect the correlation coefficient
36
r = 0, but has a definite relationship!
value of r is a measure of the extent to which x & y are linearly related Find the correlation for these points: x Y What does this correlation mean? Sketch the scatterplot r = 0, but has a definite relationship!
37
4) Correlation requires both variables to be quantitative.
Correlation makes no distinction between explanatory and response variable. 2) Correlation coefficient has not unit of measurement. It is just a number. 3) Correlation does not change when we change the units of measurement of x, y, or both. 4) Correlation requires both variables to be quantitative. 5) Correlation does not describe curved relationship between variables, no matter how strong. Only the linear relationship between variables. 6) Like the mean and standard deviation, the correlation is not resistant: r is strongly affected by a few outlying observations.
38
Correlation does not imply causation
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.