Ch 3 – Examining Relationships
YMS – 3.1 Scatterplots
Some Vocabulary
  Response Variable
    Measures an outcome of a study
    AKA dependent variable
  Explanatory Variable
    Attempts to explain the observed outcomes
    AKA independent variable
  Scatterplot
    Shows the relationship between two quantitative variables measured on the same individuals
Scatterplots
  Examining
    Look for overall pattern and any deviations
    Describe pattern with form, strength, and direction
  Drawing
    Uniformly scale the vertical and horizontal axes
    Label both axes
    Adopt a scale that uses the entire available grid
  Categorical Variables
    Use a different color/shape for each category of a categorical variable
Classwork: p125 #3.7, 3.10-3.11
Homework: #3.16, 3.22 and 3.2 Blueprint
YMS – 3.2 Correlation
Correlation
  Measures the direction and strength of the linear relationship between two quantitative variables
Facts About Correlation
  Makes no distinction between explanatory and response variables
  Requires both variables be quantitative
  Does not change when we change units of measurement
  Sign of r indicates positive or negative association
  r is always between -1 and 1, inclusive
  Only measures strength of linear relationships
  Is not resistant
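The facts above can be checked numerically. The following is a minimal sketch in Python, assuming the usual textbook formula r = (1/(n-1)) Σ z_x z_y; the data values and the function names (`mean`, `sample_sd`, `correlation`) are made up for illustration and are not from the slides.

```python
from math import sqrt

def mean(values):
    return sum(values) / len(values)

def sample_sd(values):
    # Sample standard deviation (divides by n - 1).
    m = mean(values)
    return sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))

def correlation(xs, ys):
    # r = (1 / (n - 1)) * sum of products of standardized x and y values.
    n = len(xs)
    mx, my = mean(xs), mean(ys)
    sx, sy = sample_sd(xs), sample_sd(ys)
    return sum(((x - mx) / sx) * ((y - my) / sy) for x, y in zip(xs, ys)) / (n - 1)

# Hypothetical data: heights in inches and weights in pounds.
heights_in = [60, 62, 65, 68, 70, 72]
weights_lb = [110, 120, 140, 155, 160, 180]

r = correlation(heights_in, weights_lb)

# Swapping explanatory and response variables leaves r unchanged.
r_swapped = correlation(weights_lb, heights_in)

# Changing units (inches to cm, pounds to kg) leaves r unchanged.
heights_cm = [h * 2.54 for h in heights_in]
weights_kg = [w * 0.4536 for w in weights_lb]
r_metric = correlation(heights_cm, weights_kg)

# r is not resistant: one unusual point (74 in, 100 lb) lowers it sharply.
r_with_outlier = correlation(heights_in + [74], weights_lb + [100])

print(round(r, 3), round(r_swapped, 3), round(r_metric, 3), round(r_with_outlier, 3))
```

The first three printed values agree, while the fourth is much smaller, which is what "not resistant" means in practice.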
In Class Exercises
  p146 #3.28, 3.34 and 3.37
  Correlation Guessing Game
Homework: 3.3 Blueprint
YMS – 3.3 Least-Squares Regression
Regression
  Regression Line
    Describes how a response variable y changes as an explanatory variable x changes
  LSRL of y on x
    Makes the sum of the squares of the vertical distances of the data points from the line as small as possible
    Line should be as close as possible to the points in the vertical direction
    Error = Observed (Actual) – Predicted
LSRL
  Equation of the LSRL: $\hat{y} = a + bx$
  Slope: $b = r \frac{s_y}{s_x}$
  Intercept: $a = \bar{y} - b\bar{x}$
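As a rough illustration of the least-squares idea, the sketch below computes the line ŷ = a + bx from the standard formulas b = r·s_y/s_x and a = ȳ − b·x̄, then checks that nudging the intercept or slope makes the sum of squared vertical errors larger. The data set and all function names are hypothetical, chosen only to keep the arithmetic simple.

```python
from math import sqrt

def mean(values):
    return sum(values) / len(values)

def sample_sd(values):
    m = mean(values)
    return sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))

def correlation(xs, ys):
    mx, my = mean(xs), mean(ys)
    sx, sy = sample_sd(xs), sample_sd(ys)
    return sum(((x - mx) / sx) * ((y - my) / sy) for x, y in zip(xs, ys)) / (len(xs) - 1)

def lsrl(xs, ys):
    # Returns (a, b) for the line y-hat = a + b * x.
    b = correlation(xs, ys) * sample_sd(ys) / sample_sd(xs)
    a = mean(ys) - b * mean(xs)
    return a, b

def sum_squared_errors(xs, ys, a, b):
    # Sum of squares of (observed - predicted), measured vertically.
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = lsrl(xs, ys)
best = sum_squared_errors(xs, ys, a, b)

# Any other line does worse in the vertical direction.
shifted_up = sum_squared_errors(xs, ys, a + 0.1, b)
steeper = sum_squared_errors(xs, ys, a, b + 0.1)

print(f"y-hat = {a:.2f} + {b:.2f}x, SSE = {best:.2f}")
print(f"SSE with shifted intercept = {shifted_up:.2f}, with steeper slope = {steeper:.2f}")
```

Comparing against two nearby lines does not prove minimality, but it shows what "as small as possible" is claiming.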
Coefficient of determination – $r^2$
  The fraction of the variation in the values of y that is explained by the least-squares regression of y on x
  Measures the contribution of x in predicting y
  If x is a poor predictor of y, then the sum of the squares of the deviations about the mean (SST) and the sum of the squares of deviations about the regression line (SSE) would be approximately the same, so $r^2$ would be close to 0.
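One way to see the SST and SSE comparison is to compute both directly. This sketch assumes the standard identity r² = (SST − SSE)/SST for a least-squares line; the two data sets and the function names are invented for illustration.

```python
def mean(values):
    return sum(values) / len(values)

def least_squares_line(xs, ys):
    # b = Sxy / Sxx is algebraically the same as b = r * s_y / s_x.
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sxy / sxx
    return my - b * mx, b

def r_squared(xs, ys):
    a, b = least_squares_line(xs, ys)
    y_bar = mean(ys)
    sst = sum((y - y_bar) ** 2 for y in ys)                     # deviations about the mean
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))   # deviations about the line
    return sst, sse, (sst - sse) / sst

xs = [1, 2, 3, 4, 5]

# Hypothetical case 1: x is a reasonably good predictor of y.
sst_good, sse_good, r2_good = r_squared(xs, [2, 4, 5, 4, 5])

# Hypothetical case 2: x tells us nothing about y, so SSE equals SST.
sst_poor, sse_poor, r2_poor = r_squared(xs, [3, 5, 4, 5, 3])

print(f"good predictor: SST={sst_good:.1f} SSE={sse_good:.1f} r^2={r2_good:.2f}")
print(f"poor predictor: SST={sst_poor:.1f} SSE={sse_poor:.1f} r^2={r2_poor:.2f}")
```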
Understanding r-squared: A single point simplification
Al Coons, Buckingham Browne & Nichols School, Cambridge, MA, al_coons@bbns.org
[Figure: a single data point y, showing its error w.r.t. the mean model and the part of that error eliminated by the y-hat model]
Proportion of error eliminated by the y-hat model = (error eliminated by y-hat model) / (error w.r.t. mean model)
$r^2$ = proportion of variability accounted for by the given model (w.r.t. the mean model).
Facts about Least-Squares Regression
  Distinction between explanatory and response variables is essential
  A change of one standard deviation in x corresponds to a change of r standard deviations in y
  LSRL always passes through the point $(\bar{x}, \bar{y})$
  The square of the correlation is the fraction of the variation in the values of y that is explained by the least-squares regression of y on x
Classwork: Transformations and LSRL WS
Homework: #3.39 and ABS Matching to Plots Extension Question (we’ll finish the others in class)
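Several of the regression facts listed above can be verified on a small example. The sketch below is only an illustration with made-up data and hypothetical function names: it checks that the fitted line passes through (x̄, ȳ), that a one-standard-deviation increase in x moves the prediction by r standard deviations of y, and that regressing x on y gives a different line than regressing y on x.

```python
from math import sqrt

def mean(values):
    return sum(values) / len(values)

def sample_sd(values):
    m = mean(values)
    return sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))

def correlation(xs, ys):
    mx, my = mean(xs), mean(ys)
    sx, sy = sample_sd(xs), sample_sd(ys)
    return sum(((x - mx) / sx) * ((y - my) / sy) for x, y in zip(xs, ys)) / (len(xs) - 1)

def least_squares_line(xs, ys):
    # Returns (a, b) for predicting the second variable from the first.
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sxy / sxx
    return my - b * mx, b

# Hypothetical data.
xs = [2, 4, 6, 8]
ys = [3, 4, 8, 9]

a, b = least_squares_line(xs, ys)
r = correlation(xs, ys)

# The line passes through (x-bar, y-bar).
y_hat_at_x_bar = a + b * mean(xs)

# One standard deviation of x changes y-hat by r standard deviations of y.
change_in_y_hat = b * sample_sd(xs)
r_sds_of_y = r * sample_sd(ys)

# Regressing x on y instead: rewritten as a line in the same (x, y) plane,
# its slope is 1 / b_xy, which differs from b unless r is exactly +1 or -1.
a_xy, b_xy = least_squares_line(ys, xs)
slope_of_x_on_y_line = 1 / b_xy

print(f"y on x slope = {b:.3f}, x on y line has slope = {slope_of_x_on_y_line:.3f}")
```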
Residuals
  residual = observed y – predicted y, or $y - \hat{y}$
  Positive values show that the data point lies above the LSRL
  The mean of the least-squares residuals is always zero
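A short sketch of the residual calculation, using invented data and hypothetical function names, shows both the sign convention and the zero-mean property.

```python
def mean(values):
    return sum(values) / len(values)

def least_squares_line(xs, ys):
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sxy / sxx
    return my - b * mx, b

# Hypothetical data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = least_squares_line(xs, ys)

# residual = observed y - predicted y
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]

for x, y, e in zip(xs, ys, residuals):
    position = "above" if e > 0 else "below" if e < 0 else "on"
    print(f"x={x} y={y} residual={e:+.2f} ({position} the line)")

# Zero up to floating-point rounding.
mean_residual = mean(residuals)
```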
Residual Plots
  A scatterplot of the regression residuals against the explanatory variable
  Helps us assess the fit of a regression line
  Want a random pattern
  Watch for individual points with large residuals or that are extreme in the x direction
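To show what a non-random pattern looks like, this sketch fits a straight line to made-up data that actually follow a curve (y = x²) and prints a crude text stand-in for a residual plot. The data and function names are assumptions for illustration; a real residual plot would be drawn with graphing software or a calculator.

```python
def mean(values):
    return sum(values) / len(values)

def least_squares_line(xs, ys):
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sxy / sxx
    return my - b * mx, b

# Hypothetical curved data: y = x squared.
xs = [0, 1, 2, 3, 4, 5, 6]
ys = [x ** 2 for x in xs]

a, b = least_squares_line(xs, ys)
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]

# Text version of a residual plot: residual against x, with a bar for its size.
for x, e in zip(xs, residuals):
    sign = "+" if e > 0 else "-" if e < 0 else " "
    bar = "#" * int(round(abs(e)))
    print(f"x={x}  residual={e:+.1f}  {sign}{bar}")
```

The residuals run positive, then negative, then positive again. That U shape is a systematic pattern rather than random scatter, which suggests a straight line is not a good model for these data.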
Outliers vs. Influential Observations
  Outlier
    An observation that lies outside the overall pattern of the other observations
  Influential observation
    Removing this point would markedly change the result of the calculation
Classwork: Residual Plots WS
Homework: p177 #3.52 and 3.61
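The difference can be seen by refitting the line with and without a suspect point. In this sketch the base data, the two added points, and the function names are all hypothetical: one added point is an outlier in y near the middle of the x values, the other is extreme in the x direction.

```python
def mean(values):
    return sum(values) / len(values)

def least_squares_line(xs, ys):
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sxy / sxx
    return my - b * mx, b

# Hypothetical base data with a clear positive trend.
xs = [1, 2, 3, 4, 5]
ys = [2, 3, 5, 4, 6]
a_base, b_base = least_squares_line(xs, ys)

# Outlier in y at x = 3 (the mean of x): far from the pattern, but the
# slope stays the same and only the intercept shifts.
a_out, b_out = least_squares_line(xs + [3], ys + [10])

# Point extreme in x at (12, 2): including it flips the sign of the slope,
# so removing it would markedly change the result.
a_inf, b_inf = least_squares_line(xs + [12], ys + [2])

print(f"base data:          y-hat = {a_base:.2f} + {b_base:.2f}x")
print(f"with y outlier:     y-hat = {a_out:.2f} + {b_out:.2f}x")
print(f"with extreme-x pt:  y-hat = {a_inf:.2f} + {b_inf:.2f}x")
```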
Doctors for the Poor
  This will be graded for accuracy!
Ch 3 Review: p176 #3.50-3.51, 3.56, 3.59, 3.69, 3.76-3.77