Published by Shonda Warren. Modified over 9 years ago.
1
BPS - 3rd Ed. Chapter 5: Regression
2
Objectives of Regression
• To describe the change in Y per unit X
• To predict the average level of Y at a given level of X
3
“Returning Birds” Example
Plot the data first to see whether the relationship can be described by a straight line (important!)
Illustrative data from Exercise 4.4:
Y = adult birds joining colony
X = percent of birds returning, prior year
4
If the data can be described by a straight line
• … describe the relationship with the equation Y = (intercept) + (slope)(X)
• May also be written: Y = (slope)(X) + (intercept)
Intercept: where the line crosses the Y axis
Slope: “angle” (steepness) of the line
5
Linear Regression
• Algebraic line: every point falls on the line: exact y = intercept + (slope)(X)
• Statistical line: scatter cloud suggests a linear trend: “predicted y” = intercept + (slope)(X)
6
Regression Equation
ŷ = a + bx, where
– ŷ (“y-hat”) is the predicted value of Y
– a is the intercept
– b is the slope
– x is a value for X
• Determine a & b for the “best fitting line”
The TI calculators reverse a & b!
7
What Line Fits Best?
If we try to draw the line by eye, different people will draw different lines.
We need a method to draw the “best line”.
This method is called “least squares”.
8
The “least squares” regression line
Each point has:
Residual = observed y − predicted y = vertical distance of the point from the prediction line
The least squares line minimizes the sum of the squared residuals
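To make the "sum of squared residuals" idea concrete, here is a minimal sketch. The four data points are hypothetical (made up for illustration, not taken from the slides), and `sum_squared_residuals` is a helper name introduced here. It compares the least squares line for that data with a line someone might draw by eye.

```python
def sum_squared_residuals(xs, ys, intercept, slope):
    """Sum of (observed y - predicted y)^2 for the line y-hat = intercept + slope * x."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data, for illustration only (not from the slides).
xs = [1, 2, 3, 4]
ys = [2, 4, 5, 8]

# For this data the least squares line works out to y-hat = 0.0 + 1.9x.
ssr_least_squares = sum_squared_residuals(xs, ys, 0.0, 1.9)

# A line drawn "by eye" that passes exactly through three of the four points.
ssr_by_eye = sum_squared_residuals(xs, ys, 0.0, 2.0)

print(ssr_least_squares, ssr_by_eye)
```

The by-eye line hits three points exactly, yet its sum of squared residuals is still larger than that of the least squares line, because the one point it misses is missed by a full unit.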
9
Calculating Least Squares Regression Coefficients
• Formula (next slide)
• Technology
– TI-30XIIS
– Two variable Applet
– Other
10
Formulas
• b = slope coefficient: b = r (s_y / s_x)
• a = intercept coefficient: a = ȳ − b x̄
where s_x and s_y are the standard deviations of the two variables, r is their correlation, and x̄ and ȳ are their means
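The standard textbook formulas for the least squares coefficients are b = r (s_y / s_x) for the slope and a = ȳ − b x̄ for the intercept. The sketch below applies them with Python's `statistics` module. The function name `least_squares_coefficients` and the small data set are assumptions made for illustration; they do not come from the slides.

```python
from statistics import mean, stdev


def least_squares_coefficients(xs, ys):
    """Return (a, b) for y-hat = a + b*x using b = r * s_y / s_x and a = y_bar - b * x_bar."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations (divide by n - 1)
    # Correlation r: average product of standardized deviations, with n - 1 in the denominator.
    r = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / ((n - 1) * s_x * s_y)
    b = r * s_y / s_x
    a = y_bar - b * x_bar
    return a, b


# Hypothetical data, for illustration only.
a, b = least_squares_coefficients([1, 2, 3, 4], [2, 4, 5, 8])
print(f"intercept a = {a:.3f}, slope b = {b:.3f}")
```

Note that the function returns the intercept first and the slope second, matching the ŷ = a + bx convention rather than the calculator labeling.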
11
Technology: Calculator
BEWARE! TI calculators label the slope and intercept backwards!
12
Regression Line
• For the “bird data”:
• a = 31.9343
• b = −0.3040
• The linear regression equation is: ŷ = 31.9343 − 0.3040x
The slope (−0.3040) represents the average change in Y per unit X
13
Use of Regression for Prediction
Suppose an individual colony has 60% returning (x = 60). What is the predicted number of new birds for this colony?
Answer: ŷ = a + bx = 31.9343 + (−0.3040)(60) = 13.69
Interpretation: the regression model predicts 13.69 new birds (ŷ) for a colony with x = 60.
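The prediction above can be checked with a few lines of code. This is a minimal sketch that takes the intercept as 31.9343 and the slope as −0.3040 (the negative slope is the value that reproduces the slide's answer of 13.69). The function name `predict_new_birds` is introduced here for illustration.

```python
INTERCEPT = 31.9343  # a, from the bird data
SLOPE = -0.3040      # b, from the bird data (negative: more returning birds, fewer new birds)


def predict_new_birds(percent_returning):
    """Predicted number of new adult birds: y-hat = a + b * x."""
    return INTERCEPT + SLOPE * percent_returning


y_hat = predict_new_birds(60)
print(round(y_hat, 2))
```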
14
Prediction via Regression Line
Number of new birds and percent returning
When X = 60, the regression model predicts ŷ = 13.69
15
Case Study
Per Capita Gross Domestic Product and Average Life Expectancy for Countries in Western Europe
16
Regression Calculation Case Study

Country          Per Capita GDP (x)   Life Expectancy (y)
Austria          21.4                 77.48
Belgium          23.2                 77.53
Finland          20.0                 77.32
France           22.7                 78.63
Germany          20.8                 77.17
Ireland          18.6                 76.39
Italy            21.5                 78.51
Netherlands      22.0                 78.15
Switzerland      23.8                 78.99
United Kingdom   21.2                 77.37
17
Life Expectancy and GDP (Europe)
18
Regression Calculation by Hand (Life Expectancy Study)
Calculations: ŷ = 68.716 + 0.420x
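The hand calculation can be retraced in code. The sketch below uses the ten (GDP, life expectancy) pairs from the case study table and the equivalent sums-of-deviations form of the slope, b = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)². Variable names are chosen here for illustration. The result should land close to the coefficients on the slide; small differences in the last digit are to be expected from rounding.

```python
# Data from the case study table: per capita GDP (x) and life expectancy (y).
gdp = [21.4, 23.2, 20.0, 22.7, 20.8, 18.6, 21.5, 22.0, 23.8, 21.2]
life = [77.48, 77.53, 77.32, 78.63, 77.17, 76.39, 78.51, 78.15, 78.99, 77.37]

n = len(gdp)
x_bar = sum(gdp) / n
y_bar = sum(life) / n

# Sums of squared deviations and cross-products about the means.
s_xx = sum((x - x_bar) ** 2 for x in gdp)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(gdp, life))

b = s_xy / s_xx        # slope
a = y_bar - b * x_bar  # intercept

print(f"x_bar = {x_bar:.2f}, y_bar = {y_bar:.3f}")
print(f"y-hat = {a:.3f} + {b:.3f}x")
```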
19
BPS/3e Two Variable Applet
20
Applet: Data Entry
21
Applet: Calculations
22
Applet: Scatterplot
23
Applet: Least Squares Line
24
Interpretation: Life Expectancy Case Study
• Model: ŷ = 68.716 + (0.420)X
• Slope: for each one-unit increase in per capita GDP, predicted life expectancy increases by 0.420 years
• Prediction example: What is the predicted life expectancy in a country with a per capita GDP of 20.0?
ANSWER: ŷ = 68.716 + (0.420)(20.0) = 77.12
25
Coefficient of Determination (R²) (Fact 4 on p. 111)
• The coefficient of determination (R²) quantifies the fraction of the variation in Y “mathematically explained” by X
Examples:
– r = 1: R² = 1: regression line explains all (100%) of the variation in Y
– r = .7: R² = .49: regression line explains almost half (49%) of the variation in Y
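For a least squares line with one predictor, R² can be computed two ways that agree: as the square of the correlation r, or as 1 − (sum of squared residuals) / (total sum of squares of Y about its mean). The sketch below checks this on the GDP and life expectancy data from the case study. The helper names `fit_line`, `correlation`, and `r_squared` are introduced here for illustration.

```python
from math import sqrt

# Case study data: per capita GDP (x) and life expectancy (y).
gdp = [21.4, 23.2, 20.0, 22.7, 20.8, 18.6, 21.5, 22.0, 23.8, 21.2]
life = [77.48, 77.53, 77.32, 78.63, 77.17, 76.39, 78.51, 78.15, 78.99, 77.37]


def _deviation_sums(xs, ys):
    """Return (x_bar, y_bar, s_xx, s_yy, s_xy) computed about the means."""
    x_bar, y_bar = sum(xs) / len(xs), sum(ys) / len(ys)
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return x_bar, y_bar, s_xx, s_yy, s_xy


def fit_line(xs, ys):
    """Least squares intercept a and slope b."""
    x_bar, y_bar, s_xx, _, s_xy = _deviation_sums(xs, ys)
    b = s_xy / s_xx
    return y_bar - b * x_bar, b


def correlation(xs, ys):
    """Pearson correlation r."""
    _, _, s_xx, s_yy, s_xy = _deviation_sums(xs, ys)
    return s_xy / sqrt(s_xx * s_yy)


def r_squared(xs, ys):
    """Fraction of the variation in y explained by the least squares line."""
    a, b = fit_line(xs, ys)
    _, _, _, s_yy, _ = _deviation_sums(xs, ys)
    ss_residual = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return 1 - ss_residual / s_yy


r = correlation(gdp, life)
print(f"r = {r:.3f}, r^2 = {r ** 2:.3f}, R^2 from residuals = {r_squared(gdp, life):.3f}")
```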
26
We are NOT going to cover the analysis of residual plots (pp. 113-116)
27
Outliers and Influential Points
• An outlier is an observation that lies far from the regression line
• Outliers in the y direction have large residuals
• Outliers in the x direction are often influential
– removing an influential point would markedly change the regression and correlation values
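A small numeric experiment can show what "influential" means. The data below are hypothetical (constructed for illustration, not from the slides): four points lie exactly on a line with slope 1, and a fifth point sits far out in the x direction. The sketch fits the least squares slope with and without that fifth point.

```python
def least_squares_slope(xs, ys):
    """Slope of the least squares line: sum of cross-deviations over sum of squared x-deviations."""
    x_bar, y_bar = sum(xs) / len(xs), sum(ys) / len(ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    return s_xy / s_xx


# Hypothetical data: the last point (x = 10, y = 0) is an outlier in the x direction.
xs = [1, 2, 3, 4, 10]
ys = [1, 2, 3, 4, 0]

slope_all = least_squares_slope(xs, ys)
slope_without_outlier = least_squares_slope(xs[:-1], ys[:-1])

print(slope_all, slope_without_outlier)
```

With the outlying point included the fitted slope is negative; without it the slope is 1. A single point far from the others in x is enough to reverse the apparent direction of the relationship.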
28
Outliers: Case Study
Gesell Adaptive Score and Age at First Word
From all the data: r² = 41%
After removing child 18: r² = 11%
29
Cautions About Correlation and Regression
• Describe only linear relationships
• Are influenced by outliers
• Cannot be used to predict beyond the range of X (do not extrapolate)
• Beware of lurking variables (variables other than X and Y)
– Association does not always equal causation!
30
Do not extrapolate (Sarah’s height)
• Sarah’s height is plotted against her age
• Can you predict her height at age 42 months?
• Can you predict her height at age 30 years (360 months)?
31
Do not extrapolate (Sarah’s height)
• Regression equation: ŷ = 71.95 + .383(X)
• At age 42 months: ŷ = 71.95 + .383(42) = 88 (Reasonable)
• At age 360 months: ŷ = 71.95 + .383(360) = 209.8 (With height in centimeters, that’s nearly 6 feet 11 inches: not a reasonable prediction!)
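One way to guard against extrapolation in practice is to have the prediction code report whether x falls inside the range of ages that were actually observed. The sketch below uses the regression equation from the slide. The observed age range of 36 to 60 months is an assumption made for illustration (the slides do not state it), and the names `predict_height` and `within_observed_range` are hypothetical.

```python
INTERCEPT = 71.95
SLOPE = 0.383

# ASSUMPTION for illustration: ages (in months) covered by the data used to fit the line.
OBSERVED_AGE_RANGE = (36, 60)


def predict_height(age_months):
    """Predicted height from the regression equation y-hat = 71.95 + 0.383 * age."""
    return INTERCEPT + SLOPE * age_months


def within_observed_range(age_months):
    """True if the age lies inside the (assumed) range of the fitted data."""
    low, high = OBSERVED_AGE_RANGE
    return low <= age_months <= high


for age in (42, 360):
    note = "inside observed range" if within_observed_range(age) else "EXTRAPOLATION, do not trust"
    print(f"age {age} months: predicted height {predict_height(age):.1f} ({note})")
```

The arithmetic is equally easy at both ages; only the range check distinguishes the trustworthy prediction from the extrapolated one.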
32
Caution: Correlation does not always mean causation
Even very strong correlations may not correspond to a causal relationship between x and y (Beware of the lurking variable!)
33
Caution: Correlation Does Not Imply Causation
Social Relationships and Health
• Strong correlation between lack of social relationships and illness
• Does lack of social relationships cause people to become ill?
• Maybe(?)
– but perhaps unhealthy people are less likely to establish and maintain social relationships (reversed relationship)
– or some other factor (lurking variable) predisposes people both to have lower social activity and to become ill
House, J., Landis, K., and Umberson, D. “Social Relationships and Health,” Science, Vol. 241 (1988), pp. 540-545.
34
Criteria for causation (skip)
• Do not rely on statistical evidence alone for causal inference
• Here are some criteria to consider when trying to determine causality
– Strong relationships more likely to be causal
– Properly executed experiments needed (chapter 8)
– Replication under varying conditions needed
– Dose-response relationship found
– Cause precedes effect in time
– Plausible biological explanations needed