Published by Clementine Anderson. Modified over 9 years ago.
1
The Big Picture Where we are coming from and where we are headed… Chapter 5 showed us methods for summarizing data using descriptive statistics, but only one variable at a time. In Chapter 6, we learn how to analyze the relationship between two quantitative variables using scatterplots, correlation, and regression. In Chapter 7, we will learn about probability, which we will need in order to perform statistical inference.
2
Section 6.1 Scatterplots and Correlation Objectives:
Construct and interpret scatterplots for two quantitative variables. Calculate and interpret the correlation coefficient. Determine whether a linear correlation exists between two variables.
3
Explanatory and Response Variables
A response variable measures an outcome of a study. An explanatory variable explains, influences, or causes change in a response variable. These are also called the independent variable (explanatory) and the dependent variable (response). Be careful! The relationship between two variables can be strongly influenced by other variables that are lurking in the background.
4
Explanatory and response variables
In each of the following examples, determine whether there is a clear explanatory and response variable, or whether it is best just to explore the relationship.
Price of a house and square footage of a house
The arm span and height of a person
Amount of snow in the Colorado mountains and the volume of water in area rivers
Answers:
Explanatory: square feet; response: price
Explanatory: arm span; response: height
Explore the relationship
5
Displaying relationships: Scatterplots
A scatterplot displays the relationship between two quantitative variables measured on the same individuals. It is the most common way to display the relation between two quantitative variables. It displays the form, direction, and strength of the relationship between two quantitative variables. The values of one variable appear on the horizontal axis, and the values of the other variable appear on the vertical axis. Each individual in the data appears as the point in the plot fixed by the values of both variables for that individual.
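As a rough illustration of how each individual becomes one point, the sketch below places (x, y) pairs on a character grid using only the Python standard library. The function name `text_scatter` and the study-hours data are hypothetical, made up for this example, and the sketch assumes the x values (and the y values) are not all identical.

```python
def text_scatter(xs, ys, width=30, height=10):
    """Render paired (x, y) values as a crude character-grid scatterplot."""
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        # Horizontal position comes from x, vertical position from y.
        col = round((x - x_min) / (x_max - x_min) * (width - 1))
        row = round((y - y_min) / (y_max - y_min) * (height - 1))
        grid[height - 1 - row][col] = "*"  # grid row 0 is the top of the plot
    return "\n".join("".join(line) for line in grid)


# Hypothetical data: hours studied (x) and exam score (y) for six students.
hours = [1, 2, 3, 4, 5, 6]
scores = [52, 58, 61, 70, 74, 83]

plot = text_scatter(hours, scores)
print(plot)
```

The points rise from the lower left to the upper right, which is what a positive association looks like on a scatterplot.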
6
Example:
Lot | x = square footage (100s of sq ft) | y = sales price ($1000s)
Harding St | 75 | 155
Newton Ave | 125 | 210
Stacy Ct | 290
Eastern Ave | 175 | 360
Second St | 250
Sunnybrook Rd | 225 | 450
Ahlstrand Rd | 530
275 | 635
7
Scatterplots
The relationship between two quantitative variables can take many different forms. Four of the most common are:
Positive linear relationship: As x increases, y also tends to increase.
Negative linear relationship: As x increases, y tends to decrease.
No apparent relationship: As x increases, y tends to remain unchanged.
Nonlinear relationship: The x and y variables are related, but not in a way that can be approximated using a straight line.
8
Interpreting scatterplots
How to examine a scatterplot: Determine the overall pattern, showing the form, direction, and strength of the relationship. Identify any outliers or other deviations from this pattern.
9
Interpreting scatterplots
Overall Pattern
Form: Linear relationships, where the points show a straight-line pattern, are an important form of relationship between two variables. Curved relationships and clusters (a number of similar individuals that occur together) are other forms to watch for.
Direction: If the relationship has a clear direction, we speak of either positive association (the more the x, the more the y) or negative association (the more the x, the less the y).
Strength: The strength of a relationship is determined by how close the points in the scatterplot lie to a line.
10
Describe the scatterplot:
Strong positive linear Strong negative linear
11
Strong negative linear
Strong positive linear Strong negative curved
12
Sketch a scatterplot of the data and then describe the overall pattern.
Is there an obvious explanatory and response variable?
14
Exercises: Pg. 337/6.1,6.3,6.4 Pg /6.5,6.6,6.8
15
Scatterplot & Correlation
Scatterplots provide a visual tool for looking at the relationship between two variables. Unfortunately, our eyes are not good tools for judging the strength of the relationship. Changes in the scale or the amount of white space in the graph can easily change our judgment of the strength of the relationship. Correlation is a numerical measure we use to show the strength of linear association.
16
A scatter plot is helpful in understanding the form, direction, and strength of the relationship between two variables. Correlation allows us to quantify the direction and strength of the relationship.
17
Ex 1: Describe the correlation illustrated by the scatter plot.
There is a positive correlation between the two data sets. As the average daily temperature increased, the number of visitors increased.
18
Ex. 2: Describe the correlation illustrated by the scatter plot.
There is a negative correlation between elevation and mean annual temperature. As the elevation in Nevada increases, the mean annual temperature decreases.
19
Facts about correlation
What kind of variables do we use?
1. No distinction between explanatory and response variables.
2. Both variables must be quantitative.
Numerical properties
1. -1 ≤ r ≤ 1
2. r > 0: positive association between variables
3. r < 0: negative association between variables
4. r = 1 or r = -1 indicates a perfect linear relationship.
5. The closer |r| is to 1, the stronger the linear relationship.
6. r is affected by a few outliers; it is not resistant.
7. r does not describe curved relationships.
8. r is not affected by changing units.
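Several of these facts can be checked numerically. The sketch below is only an illustration: `pearson_r` is a hand-written helper using the standard Pearson formula, and the height and weight values are hypothetical, not data from these slides.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical data: heights (inches) and weights (pounds) of six people.
heights_in = [60, 62, 65, 68, 70, 72]
weights_lb = [115, 120, 135, 150, 160, 175]

r = pearson_r(heights_in, weights_lb)

# No distinction between explanatory and response: swapping x and y keeps r.
r_swapped = pearson_r(weights_lb, heights_in)

# Changing units (inches to centimeters) does not change r.
heights_cm = [h * 2.54 for h in heights_in]
r_metric = pearson_r(heights_cm, weights_lb)

# Not resistant: one unusual individual pulls r down sharply.
r_outlier = pearson_r(heights_in + [74], weights_lb + [100])

# Curved relationships are missed: y = x^2 on symmetric x values gives r = 0.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x * x for x in xs]
r_curved = pearson_r(xs, ys)

print(r, r_swapped, r_metric, r_outlier, r_curved)
```

The last case is worth noting: y is completely determined by x, yet r is 0, so always look at the scatterplot before trusting r.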
21
Measuring linear association: correlation r
(The Pearson Product-Moment Correlation Coefficient, or Correlation Coefficient)
Don’t worry, that’s why we have graphing calculators!
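For reference, one standard way to write the Pearson correlation coefficient is shown below. The notation is an assumption made here rather than taken from the slide: n paired observations (x_i, y_i), sample means x̄ and ȳ, and sample standard deviations s_x and s_y.

```latex
r = \frac{1}{n-1} \sum_{i=1}^{n}
    \left( \frac{x_i - \bar{x}}{s_x} \right)
    \left( \frac{y_i - \bar{y}}{s_y} \right)
```

Each factor is a standardized value (a z-score), which is why r has no units and does not change when the units of x or y change.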
22
You can use a graphing calculator to perform a linear regression and find the correlation coefficient r. To display the correlation coefficient r, you may have to turn on the diagnostic mode. To do this, press and choose the DiagnosticOn mode. Press enter, and then press enter again to activate it.
23
Example 1: 1.) Sketch a scatterplot 2.) State the overall pattern 3.) Are there any outliers? 4.) Calculate the correlation coefficient
24
Example 2: In one of the Boston city parks, there has been a problem with muggings in the summer months. A police officer took a random sample of 10 days (out of the 90-day summer) and compiled the following data. For each day, x represents the number of police officers on duty in the park and y represents the number of reported muggings on that day.
x: 10 15 16 1 4 6 18 12 14
y: 5 2 9 7 8 3 7 6
25
Construct a scatterplot
Estimate a value for r. Calculate the actual r value.
26
A Caution
The correlation coefficient measures the strength of the linear relationship between two variables. A strong correlation does not imply a cause-and-effect relationship. A correlation between two variables may be caused by other variables (either known or unknown), called lurking variables.
27
Example Cause-Effect Relationship
During the months of March and April, the weekly weight increases of a puppy in New York were collected. For the same time frame, the retail price increases of snowshoes in Alaska were collected.
The weight of a growing puppy in NY: 8 pounds, 9
The retail price of snowshoes in Alaska: $32.45, $32.95, $33.45, $34.00, $34.50, $35.10, $35.63
28
Example Cause-Effect Relationship cont.
The data were examined and found to have a very strong linear correlation. So, this must mean that the weight increase of a puppy in New York is causing snowshoe prices in Alaska to increase. Of course this is not true! The moral of this example is: "be careful what you infer from your statistical analyses." Be sure your relationship makes sense. Also keep in mind that other factors may be involved in a cause-effect relationship.
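The puppy and snowshoe example can be sketched numerically. In the code below, the seven snowshoe prices are the ones listed on the slide, but the puppy weights and the week numbering are hypothetical values assumed for illustration, and `pearson_r` is a hand-written helper.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Assumed week index: time is the lurking variable here.
weeks = [1, 2, 3, 4, 5, 6, 7]

# Snowshoe prices as listed on the slide (dollars).
snowshoe_price = [32.45, 32.95, 33.45, 34.00, 34.50, 35.10, 35.63]

# Hypothetical puppy weights (pounds), NOT taken from the slide.
puppy_weight = [8.0, 8.5, 9.0, 9.4, 10.1, 10.5, 11.2]

r_puppy_price = pearson_r(puppy_weight, snowshoe_price)
r_week_puppy = pearson_r(weeks, puppy_weight)
r_week_price = pearson_r(weeks, snowshoe_price)

print(r_puppy_price, r_week_puppy, r_week_price)
```

Both series simply rise as the weeks pass, so each is strongly correlated with time, and therefore with the other, even though neither causes the other.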
29
Exercises: Pg.350/ Pg.355/ Pg.359/ (section review)