Scatterplots Chapter 6.1 Notes
~Relationships between variables are often at the heart of what we would like to learn from data.
· Are grades actually higher than they used to be?
· Do people tend to reach puberty at a younger age than in previous generations?
· Does applying strong magnets to parts of the body relieve pain? If so, are stronger magnets more effective?
· Do students learn better with the use of computer technology?
~Questions like these relate two quantitative variables and ask whether there is an association between them.
Scatterplot - A graph of the relationship between two quantitative variables
If you suspect that changes in one variable explain changes in the other, label the axes accordingly:
· x axis (explanatory variable or independent variable)
· y axis (response variable or dependent variable)
**If there is no such distinction, either variable can go on either axis.
Explanatory vs Response Examples
~Can we view one of the variables as explanatory (independent) and the other as a response (dependent) variable? If so, which is which?
a. The amount of time spent studying for a statistics exam and the grade on the exam
b. The weight in kilograms and height in centimeters of a person
c. Inches of rain in the growing season and the yield of corn in bushels per acre
d. A student’s score on the SAT math exam and the SAT verbal exam
e. A family’s income and the years of education their eldest child completes
Examining a Scatterplot
~Look for the overall pattern and for striking deviations from that pattern.
1. Form
· Straight line (linear) - will appear as a cloud or swarm of points stretched out in a generally consistent, straight form
· Are there clusters? (there can be curved patterns but we will not focus on those)
2. Direction
· Positive - slopes upward from left to right
· Negative - slopes downward from left to right
3. Strength
· Strong - if the points lie close to a straight line
· Moderate - not close to a straight line but not widely scattered
· Weak - if the points are widely scattered about a line
4. Outliers - any point that falls well outside the overall pattern of the rest of the data
Correlation (r) – a number that measures the relationship shown in a scatterplot
· Describes the direction and strength of a straight-line relationship (does not describe curved relationships)
· Positive association - positive “r” value; scatterplot slopes upward as we move from left to right
· Negative association - negative “r” value; scatterplot slopes downward as we move from left to right
The correlation “r” always falls between -1 and 1
· Strength increases as “r” moves toward either -1 or 1
· When “r” is near zero (0) - very weak straight-line relationship; points do not lie close to a straight line
· When “r” is close to -1 or 1 - a strong relationship; points lie close to a straight line
· When “r” = -1 or 1 - occurs only when all points lie exactly on a straight line (very rare)
Other information about “r”
· If we change units (inches vs. centimeters), “r” does not change because it is calculated from standard scores
· “r” has no units
· “r” will not change even if we reverse our explanatory and response variables
· “r” is strongly affected by outliers (as are the mean and standard deviation)
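The properties of “r” listed above can be checked numerically. The following is a minimal Python sketch, not part of the textbook: the helper name pearson_r and the height/weight numbers are made up for illustration, and the helper uses a sum-of-products form that is algebraically equivalent to averaging the products of standard scores.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Correlation of two equal-length lists (hypothetical helper for this sketch)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Made-up data for illustration only: heights (inches) and weights (pounds).
height_in = [60, 62, 65, 68, 70, 72]
weight_lb = [115, 120, 140, 155, 160, 180]

r = pearson_r(height_in, weight_lb)

# Changing units (inches -> centimeters) leaves r unchanged.
height_cm = [h * 2.54 for h in height_in]
r_cm = pearson_r(height_cm, weight_lb)

# Reversing the explanatory and response variables leaves r unchanged.
r_swapped = pearson_r(weight_lb, height_in)

# One outlier (a tall person with a low weight) changes r a great deal.
r_outlier = pearson_r(height_in + [74], weight_lb + [100])

print(f"r = {r:.3f}, in cm = {r_cm:.3f}, swapped = {r_swapped:.3f}")
print(f"r with one outlier added = {r_outlier:.3f}")
```

The first three values printed agree, while the value with the added outlier is much lower.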
r-squared
r-squared is a statistical measure of how well a regression line (line of best fit) approximates the real data points
Example: Let’s say we are trying to find a relationship between time spent studying and grades on a test.
Assume that r = .89; then r-squared = (.89)(.89) = .7921, or about .79
· The fact that r = .89 tells us that we have a strong positive relationship between time spent studying and scores on a test (i.e. students who study more tend to get better scores).
· r-squared = .79 tells us that about 79% of the variation in test scores can be explained by the straight-line relationship with time spent studying.
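The arithmetic in the r-squared example can be reproduced in a few lines of Python. This is only an illustrative check of the numbers in the example; the variable names are assumptions, not anything from the textbook.

```python
r = 0.89  # correlation assumed in the example above

# r-squared is simply r multiplied by itself.
r_squared = r ** 2

print(f"r-squared = {r_squared:.4f}")                         # prints r-squared = 0.7921
print(f"about {r_squared:.0%} of the variation is explained")  # prints about 79% ...

# Because r is squared, a negative correlation of the same size
# gives the same r-squared.
r_squared_negative = (-0.89) ** 2
```

Note that r-squared alone does not show direction: r = -.89 also gives r-squared of about .79.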
How to calculate “r” (see page 350 for formula)
· Find the mean and standard deviation of both sets of data and plug into formula
· Find the standard scores for each value in each set of data
· The correlation is the average of the products of these standard scores (as with standard deviation, we average by dividing by n-1)
OR
Use a calculator--steps to be given at a later date
CW/HW: page 355-356 Ex. #13, 15-17
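The hand calculation of “r” described in these notes (means and standard deviations, then standard scores, then the average of their products using n-1) can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function names and the study-time data are invented for illustration, and the textbook formula on page 350 is not reproduced here.

```python
from statistics import mean, stdev

def standard_scores(values):
    """Convert each value to a standard score: (value - mean) / standard deviation."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation, which divides by n - 1
    return [(v - m) / s for v in values]

def correlation(xs, ys):
    """Average the products of paired standard scores, dividing by n - 1."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    zx = standard_scores(xs)
    zy = standard_scores(ys)
    return sum(a * b for a, b in zip(zx, zy)) / (len(xs) - 1)

# Hypothetical data, made up for illustration: hours studied and exam scores.
hours = [1, 2, 3, 4, 5]
scores = [62, 70, 71, 80, 88]

r = correlation(hours, scores)
print(f"r = {r:.3f}")
```

For these made-up numbers “r” comes out close to 1, which matches the strong upward pattern in the data.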