M22- Regression & Correlation 1 Department of ISM, University of Alabama
Lesson Objectives
- Know what the equation of a straight line is, in terms of slope and y-intercept.
- Learn how to find the equation of the least squares regression line.
- Know how to draw a regression line on a scatterplot.
- Know how to use the regression equation to estimate the mean of Y for a given value of X.
Slide 2: Scatterplot
Best graphical tool for “seeing” the relationship between two quantitative variables. Use to identify:
- Patterns (relationships)
- Unusual data (outliers)
Slide 3: [Figure: four scatterplots of Y versus X. Panels: Positive Linear Relationship; Negative Linear Relationship; Nonlinear Relationship (need to change the model); No Relationship (X is not useful).]
Slide 4: Regression Analysis: mechanics
Slide 5: Equation of a straight line.
Days of algebra: $Y = mx + b$, where m = slope = “rate of change” and b = the “y” intercept.
Statistics form: $\hat{Y} = a + bx$, where b = slope and a = the “y” intercept.
$\hat{Y}$ = estimate of the mean of Y for some X value.
Slide 6: Equation of a straight line. How are the slope and y-intercept determined?
- by “eyeball”.
- by using equations by hand.
- by hand calculator.
- by computer: Minitab, Excel, etc.
Slide 7: Equation of a straight line. $\hat{Y} = a + bx$. [Figure: a line drawn against the X-axis, starting at 0. a = the “y” intercept; b = rise / run.]
Slide 9: Example 1: Is height a good estimator of mean weight?
Population: All ST 260 students.
Measure: Y = Weight in pounds, X = Height in inches.
Each value of X defines a subpopulation of weight values. The goal is to estimate the true mean weight for each of the infinite number of subpopulations.
Slide 10: Example 1: Sample of n = 5 students. Y = Weight in pounds, X = Height in inches. [Table: columns Case, Ht, Wt; the data values are not preserved.] Step 1?
Slide 11: DTDP
Slide 12: Example 1: [Figure: scatterplot of WEIGHT (Y) versus HEIGHT (X).] Where should the line go?
Slide 13: Equation of Least Squares Regression Line (page 615)
Slope: $b = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}$
y-intercept: $a = \bar{y} - b\bar{x}$
These are not the preferred computational equations.
Slide 14: Basic intermediate calculations (look at your formula sheet)
$S_{xy} = \sum (x_i - \bar{x})(y_i - \bar{y})$
$S_{xx} = \sum (x_i - \bar{x})^2$
$S_{yy} = \sum (y_i - \bar{y})^2$
$S_{xx}$ and $S_{yy}$ are the numerator part of $s^2$.
Slide 15: Alternate intermediate calculations (look at your formula sheet)
$S_{xy} = \sum xy - \frac{(\sum x)(\sum y)}{n}$
$S_{xx} = \sum x^2 - \frac{(\sum x)^2}{n}$
$S_{yy} = \sum y^2 - \frac{(\sum y)^2}{n}$
$S_{xx}$ and $S_{yy}$ are the numerator part of $s^2$.
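The basic and alternate forms give the same three quantities. The Python sketch below checks this on a small data set. The five (height, weight) pairs are hypothetical values chosen for illustration, not the class sample, and the variable names are illustrative.

```python
import math

# Hypothetical data for illustration only (NOT the class sample).
x = [60, 62, 65, 70, 73]       # heights in inches
y = [110, 125, 140, 170, 185]  # weights in pounds
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# Basic forms: sums of products of deviations from the means.
s_xy_basic = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx_basic = sum((xi - x_bar) ** 2 for xi in x)
s_yy_basic = sum((yi - y_bar) ** 2 for yi in y)

# Alternate forms: built from the raw sums of x, y, xy, x^2, and y^2.
s_xy_alt = sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n
s_xx_alt = sum(xi ** 2 for xi in x) - sum(x) ** 2 / n
s_yy_alt = sum(yi ** 2 for yi in y) - sum(y) ** 2 / n

print(s_xy_basic, s_xy_alt)
print(s_xx_basic, s_xx_alt)
print(s_yy_basic, s_yy_alt)
```

The alternate forms need only the column totals from a worksheet, which is why they are convenient for hand calculation.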
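Slide 16: Example 1: [Table: worksheet with columns Case, x = Ht, y = Wt, xy = Ht*Wt, x^2 = Ht^2, y^2 = Wt^2, and a totals row giving $\sum x$, $\sum y$, $\sum xy$, $\sum x^2$, $\sum y^2$; the data values are not preserved.]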
Slide 17: Example 1: Intermediate Summary Values
$S_{xy} = \sum xy - \frac{(\sum x)(\sum y)}{n} = \sum xy - \frac{(342)(795)}{5}$
$S_{xx} = \sum x^2 - \frac{(\sum x)^2}{n} = \sum x^2 - \frac{(342)^2}{5}$
$S_{yy} = \sum y^2 - \frac{(\sum y)^2}{n} = \sum y^2 - \frac{(795)^2}{5}$
Slide 18: Example 1: Intermediate Summary Values
$S_{xy}$ = [not preserved], $S_{xx} = 77.2$, $S_{yy}$ = [not preserved]
Once these values are calculated, the rest is easy!
Slide 19: Least Squares Regression Line
Prediction equation: $\hat{Y} = a + bX$, where
(1) Estimated slope: $b = \frac{S_{xy}}{S_{xx}}$
(2) Estimated Y-intercept: $a = \bar{y} - b\bar{x}$
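The two equations translate directly into a few lines of code. The sketch below is a minimal illustration in Python. The function name `least_squares_line` and the sample data are assumptions made for this example, not part of the lesson.

```python
def least_squares_line(x, y):
    """Return (a, b) for the line Y-hat = a + b*X fitted by least squares."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b = s_xy / s_xx          # (1) estimated slope
    a = y_bar - b * x_bar    # (2) estimated y-intercept
    return a, b

# Hypothetical data lying exactly on the line y = 1 + 2x,
# so the fit should recover intercept 1 and slope 2.
a, b = least_squares_line([1, 2, 3, 4], [3, 5, 7, 9])
print(a, b)
```

Using points that lie exactly on a known line is a quick sanity check: the fitted slope and intercept should match that line.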
Slide 20: Example 1: Slope, for Weight vs. Height
$b = \frac{S_{xy}}{S_{xx}} = 7.189$
Slide 21: Example 1: Intercept, for Weight vs. Height
$\bar{x} = 342/5 = 68.4$, $\bar{y} = 795/5 = 159$
$a = \bar{y} - b\bar{x} = 159 - (+7.189)(68.4) \approx -332.73$
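The intercept arithmetic can be checked with a short Python sketch. It uses only the values that survive in the slides (sums 342 and 795, n = 5, slope 7.189). The result is therefore a recomputation from the rounded slope, and may differ in the last digit from a value computed with the unrounded slope.

```python
n = 5
sum_x = 342      # total of the heights, from the summary slide
sum_y = 795      # total of the weights, from the summary slide
b = 7.189        # estimated slope as shown on the intercept slide (rounded)

x_bar = sum_x / n        # mean height
y_bar = sum_y / n        # mean weight
a = y_bar - b * x_bar    # estimated y-intercept

print(x_bar, y_bar, round(a, 2))
```

The negative intercept has no physical meaning here, because a height of 0 inches is far outside the range of the data.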
Slide 22: Example 1: Prediction equation
$\hat{Y} = a + bX$
With b = 7.189, $\bar{x} = 68.4$, and $\bar{y} = 159$: $\hat{Y} \approx -332.73 + 7.189X$, that is, $\widehat{Wt} \approx -332.73 + 7.189\,Ht$.
Slide 23: Example 1: Draw the line on the plot. [Figure: scatterplot of WEIGHT versus HEIGHT with the prediction equation $\hat{Y}$.]
Slide 24: Example 1: Draw the line on the plot. [Figure: the same scatterplot of WEIGHT versus HEIGHT, with $\hat{Y}$ evaluated at two values of X to place the line.]
Slide 25: What a regression equation gives you:
- The “line of means” for the Y population.
- A prediction of the mean of the population of Y-values defined by a specific value of X.
Each value of X defines a subpopulation of Y-values; the value of the regression equation is the “least squares” estimate of the mean of that Y subpopulation.
Slide 26: Example 2: Estimate the weight of a student 5’ 5” tall (X = 65 inches).
$\hat{Y} = a + bX \approx -332.73 + 7.189X$
Slide 27: Example 2: [Figure: regression line on the WEIGHT versus HEIGHT plot, read at X = 65.]
$\hat{Y} \approx -332.73 + 7.189(65) \approx 134.6$ pounds
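The prediction step can be sketched in Python as follows. The slope 7.189 and the means 68.4 and 159 come from Example 1. The intercept is recomputed from them rather than copied from a slide, and the function name `predict_weight` is an assumption for this example.

```python
b = 7.189                  # estimated slope from Example 1 (rounded)
x_bar, y_bar = 68.4, 159   # sample means from Example 1
a = y_bar - b * x_bar      # estimated y-intercept, recomputed here

def predict_weight(height_in):
    """Estimated mean weight (pounds) for students of the given height (inches)."""
    return a + b * height_in

height = 5 * 12 + 5        # 5 feet 5 inches = 65 inches
y_hat = predict_weight(height)
print(round(y_hat, 1))
```

The output is an estimate of the mean weight of all students who are 65 inches tall, not a guarantee for any one student.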
Slide 28: Calculate your own weight. Why was your estimate not exact?
Slide 29: Regression: Know How To:
1. Calculate the least squares regression line.
2. Plot the data and draw the line through the data.
3. Predict Y for a given X.
4. Interpret the meaning of the regression line.
Slide 31: Correlation
Slide 32: Sample Correlation Coefficient, r. A numerical summary statistic that measures the strength of the linear association between two quantitative variables.
Slide 33: Notation: r = sample correlation. $\rho$ = population correlation, “rho”. r is an “estimator” of $\rho$.
Slide 34: Interpreting correlation: $-1.0 \le r \le +1.0$
r > 0.0: Pattern runs upward from left to right; “positive” trend.
r < 0.0: Pattern runs downward from left to right; “negative” trend.
Slide 35: Upward & downward trends. [Figure: two scatterplots of Y against the X-axis, one with r > 0.0 and one with r < 0.0.] Slope and correlation must have the same sign.
Slide 36: All data exactly on a straight line: r = _____ [Figure: two scatterplots of Y against the X-axis. Panels: Perfect positive relationship; Perfect negative relationship.]
Slide 37: Which has stronger correlation? r = _____________ [Figure: two scatterplots of Y against the X-axis.]
Slide 38: "Strength": How tightly the data follow a straight line.
r close to -1 or +1 means _________________________ linear relation.
r close to 0 means _________________________ linear relation.
Slide 39: Which has stronger correlation? r = ________________ [Figure: two scatterplots of Y against the X-axis.]
Slide 40: Which has stronger correlation? r = ________________ [Figure: two scatterplots of Y against the X-axis; one shows a strong parabolic pattern.] Strong parabolic pattern! We can fix it.
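The parabolic case shows why r describes only linear association. The Python sketch below uses hypothetical points that lie exactly on the curve y = x^2 and computes r from the deviation sums. The data and names are illustrative assumptions.

```python
import math

# Hypothetical points lying exactly on the parabola y = x^2.
x = [-2, -1, 0, 1, 2]
y = [xi ** 2 for xi in x]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_yy = sum((yi - y_bar) ** 2 for yi in y)

r = s_xy / math.sqrt(s_xx * s_yy)
print(r)
```

Y is completely determined by X in this data set, yet the upward and downward halves of the curve cancel in the linear summary.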
Slide 41: Computing Correlation
- by hand using the formula
- using a calculator (built-in)
- using a computer: Excel, Minitab, ...
Slide 42: Formula for Sample Correlation (Page 627). Look at your formula sheet.
$r = \frac{S_{xy}}{\sqrt{S_{xx}\,S_{yy}}}$
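A minimal Python sketch of this formula follows. The function name `sample_correlation` and the two small data sets are assumptions for illustration; the data are chosen to lie exactly on a straight line so the result is easy to check.

```python
import math

def sample_correlation(x, y):
    """Sample correlation r = S_xy / sqrt(S_xx * S_yy)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    return s_xy / math.sqrt(s_xx * s_yy)

# Hypothetical data exactly on a rising line and on a falling line.
r_up = sample_correlation([1, 2, 3, 4], [3, 5, 7, 9])
r_down = sample_correlation([1, 2, 3, 4], [9, 7, 5, 3])
print(r_up, r_down)
```

The function is undefined when all x values or all y values are identical, because the denominator is then zero.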
Slide 43: Calculating Correlation. Example 1: Weight versus Height. Look at your formula sheet.
$r = \frac{S_{xy}}{\sqrt{S_{xx}\,S_{yy}}}$, using the intermediate summary values from slide 18.
Slide 44: Positive Linear Relationship. Example 6: Real estate data, previous section. [Figure: scatterplot with its value of r; the number is not preserved.]
Slide 45: Negative Linear Relationship. Example 7: AL school data, previous section. [Figure: scatterplot with its value of r; the number is not preserved.]
Slide 46: No Linear Relationship. Example 9: Rainfall data, previous section. [Figure: scatterplot with its value of r; the number is not preserved.]
Slide 47: Comment 1: Size of “r” does NOT reflect the steepness of the slope, “b”; but “r” and “b” must have the same sign.
$r = b\,\frac{s_x}{s_y}$ and $b = r\,\frac{s_y}{s_x}$
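Both parts of this comment can be checked numerically. The Python sketch below uses hypothetical data and an illustrative helper `slope_and_r`. It confirms b = r(s_y/s_x), then multiplies every y value by 10 to make the line ten times steeper without changing r.

```python
import math
import statistics

def slope_and_r(x, y):
    """Return (b, r): least squares slope and sample correlation."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    return s_xy / s_xx, s_xy / math.sqrt(s_xx * s_yy)

# Hypothetical data for illustration only.
x = [60, 62, 65, 70, 73]
y = [110, 125, 140, 170, 185]

b, r = slope_and_r(x, y)
s_x = statistics.stdev(x)   # sample standard deviation of x
s_y = statistics.stdev(y)   # sample standard deviation of y
b_from_r = r * s_y / s_x    # slope rebuilt from r and the two deviations

# Rescale y: the slope becomes 10 times steeper, r stays the same.
b10, r10 = slope_and_r(x, [10 * yi for yi in y])
print(b, b_from_r, b10, r, r10)
```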
Slide 48: Comment 2: Changing the units of Y and X does not affect the size of r.
- Inches to centimeters
- Pounds to kilograms
- Celsius to Fahrenheit
- X to Z (standardized)
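The unit-change claim is easy to check numerically. The Python sketch below uses hypothetical heights and weights, converts inches to centimeters and pounds to kilograms, and also applies a shift-and-scale change of the Celsius to Fahrenheit kind. The helper name `corr` is an assumption for this example.

```python
import math

def corr(x, y):
    """Sample correlation r = S_xy / sqrt(S_xx * S_yy)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    return s_xy / math.sqrt(s_xx * s_yy)

# Hypothetical data for illustration only.
inches = [60, 62, 65, 70, 73]
pounds = [110, 125, 140, 170, 185]

cm = [2.54 * v for v in inches]          # inches to centimeters
kg = [0.45359237 * v for v in pounds]    # pounds to kilograms
shifted = [1.8 * v + 32 for v in inches] # same form as Celsius to Fahrenheit

r_original = corr(inches, pounds)
r_metric = corr(cm, kg)
r_shifted = corr(shifted, pounds)
print(r_original, r_metric, r_shifted)
```

All three conversions multiply by a positive constant, with or without an added constant; multiplying one variable by a negative constant would flip the sign of r.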
Slide 49: Comment 3: High correlation does not always imply causation.
Causation: Changes in X actually do cause changes in Y. Example: X = dryer temperature, Y = drying time for clothes.
Consistency, responsiveness, mechanism.
Slide 50: Comment 4: Common Response. Both X and Y change as some unobserved third variable changes. Example: In basketball, there is a high correlation between points scored and personal fouls committed over a season. Third variable is ___?
Slide 51: Comment 5: Confounding. The effect of X on Y is "hopelessly" mixed up with the effects of other variables on Y. Example: Is adult behavior most affected by environment or genetics?
Slide 52: The end