3.2 (part 2) 9.26.2017
An Example
             Test 1    Test 2
Student 1    83.73     90.57
Student 2    88.80     74.59
Student 3    82.65     76.23
Student 4    75.42     69.67
Let’s use scores on the Chapter 1 test to predict scores on the Chapter 2 test.
First step: we need to find the mean and standard deviation for each variable.
Test 1: $\bar{x} = 82.65$, $s_x = 5.52$
Test 2: $\bar{y} = 77.77$, $s_y = 8.98$
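The summary statistics for the first step can also be checked away from the calculator. This is a minimal sketch, assuming Python's standard `statistics` module; the names `test1` and `test2` are illustrative and do not come from the slides. `stdev` divides by n - 1 (the sample formula), which is consistent with the values shown above once they are rounded to two decimals.

```python
from statistics import mean, stdev

# Scores from the table (variable names are illustrative)
test1 = [83.73, 88.80, 82.65, 75.42]   # Chapter 1 test
test2 = [90.57, 74.59, 76.23, 69.67]   # Chapter 2 test

# Mean and sample standard deviation (n - 1 in the denominator)
x_bar, s_x = mean(test1), stdev(test1)
y_bar, s_y = mean(test2), stdev(test2)

print(f"Test 1: mean = {x_bar:.3f}, st. dev = {s_x:.3f}")
print(f"Test 2: mean = {y_bar:.3f}, st. dev = {s_y:.3f}")
```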
An Example
Second step: we need to find the correlation.
For now, I’m just going to give it to you: $r = .36$
Now we can find the slope: $b = r \cdot \frac{s_y}{s_x} = (.36)\left(\frac{8.98}{5.52}\right) = .59$
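The slide hands over r without showing the work. As a rough sketch of where it could come from, the code below computes the correlation from sums of deviations and then applies the slope formula twice: once with the rounded values used on the slide and once with nothing rounded. The variable names are assumptions made for illustration.

```python
from math import sqrt

test1 = [83.73, 88.80, 82.65, 75.42]   # x: Chapter 1 scores
test2 = [90.57, 74.59, 76.23, 69.67]   # y: Chapter 2 scores
n = len(test1)

x_bar = sum(test1) / n
y_bar = sum(test2) / n

# Sums of squared deviations and of cross-products
sxx = sum((x - x_bar) ** 2 for x in test1)
syy = sum((y - y_bar) ** 2 for y in test2)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(test1, test2))

r = sxy / sqrt(sxx * syy)        # correlation
s_x = sqrt(sxx / (n - 1))        # sample standard deviations
s_y = sqrt(syy / (n - 1))

b_rounded = 0.36 * 8.98 / 5.52   # slide's arithmetic, using rounded inputs
b_full = r * s_y / s_x           # same formula, no intermediate rounding

print(f"r = {r:.2f}")
print(f"b from rounded inputs   = {b_rounded:.2f}")
print(f"b from unrounded inputs = {b_full:.2f}")
```

The two slopes differ slightly in the second decimal place, which is worth keeping in mind whenever hand calculations are compared with technology.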
An Example
Now we can find the y-intercept: $a = \bar{y} - b\bar{x} = 77.77 - .59(82.65) = 28.94$
So our regression equation is: $\widehat{\text{Test2}} = 28.94 + .59(\text{Test1})$
An Example
So our regression equation is: $\widehat{\text{Test2}} = 28.94 + .59(\text{Test1})$
A fifth student got a 79.04 on the Chapter 1 test. What would we expect his/her grade to be on the Chapter 2 test?
$28.94 + (.59)(79.04) = 75.57$
We would expect him/her to get a 75.57.
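Plugging a new x-value into the regression equation is easy to wrap in a small function. The sketch below uses the intercept and slope exactly as rounded on the slide; the function name `predict_test2` is hypothetical.

```python
def predict_test2(test1_score):
    """Predicted Chapter 2 score from the slide's regression equation."""
    a, b = 28.94, 0.59   # intercept and slope as rounded on the slide
    return a + b * test1_score

# Fifth student: 79.04 on the Chapter 1 test
print(f"{predict_test2(79.04):.2f}")
```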
An Example
Let’s do the same thing on our calculator.
Enter our x-values (Test 1 scores) into L1 and our y-values (Test 2 scores) into L2
Stat---Calc---LinReg(a+bx)
a = 29.92, b = .58
Notice: very similar to our equation, $\widehat{\text{Test2}} = 28.94 + .59(\text{Test1})$
Why is it a bit different?
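One likely reason for the small difference is rounding: the hand calculation rounds the means, standard deviations, and r before combining them, while the calculator keeps full precision until the end. The sketch below fits the line with no intermediate rounding; `least_squares` is a hypothetical helper written for illustration, not a calculator or library function.

```python
def least_squares(xs, ys):
    """Return (a, b) for the line y-hat = a + b*x, with no intermediate rounding."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx            # slope
    a = y_bar - b * x_bar    # intercept
    return a, b

test1 = [83.73, 88.80, 82.65, 75.42]   # L1
test2 = [90.57, 74.59, 76.23, 69.67]   # L2

a, b = least_squares(test1, test2)
print(f"a = {a:.2f}, b = {b:.2f}")
```

Rounded to two decimals, these values agree with the calculator output.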
A New Example: Stadium Size
                  Stadium Size (thousands)    2016 Win Pct
Texas             100.1                       .417
Old Dominion      20.1                        .769
Auburn            87.5                        .615
Hawaii            50.0                        .500
Kansas            50.1                        .167
Colorado          50.2                        .714
Colorado State    41.2                        .538
Using your calculator, find the regression line that uses stadium size to predict 2016 winning percentage:
$\widehat{\text{win}} = .654 - .002(\text{Stadium})$
Use the regression line to predict the 2016 winning percentage for Tulane (stadium seats 30,000):
$.654 - .002(30) = .594$
Tulane’s actual winning percentage last year was .333. Did our model do a very good job of predicting this?
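To put a number on how far off the prediction was, it helps to compute the residual (actual minus predicted). The sketch below first refits the line from the table as a check, then uses the rounded equation from the slide for Tulane. Variable names are illustrative assumptions.

```python
stadium = [100.1, 20.1, 87.5, 50.0, 50.1, 50.2, 41.2]          # seats, in thousands
win_pct = [0.417, 0.769, 0.615, 0.500, 0.167, 0.714, 0.538]    # 2016 winning pct

# Least-squares fit with no intermediate rounding
n = len(stadium)
x_bar = sum(stadium) / n
y_bar = sum(win_pct) / n
b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(stadium, win_pct))
     / sum((x - x_bar) ** 2 for x in stadium))
a = y_bar - b * x_bar
print(f"win-hat = {a:.3f} + ({b:.3f})(Stadium)")

# Tulane: 30,000 seats, entered as 30 because x is measured in thousands
predicted = 0.654 - 0.002 * 30    # slide's rounded equation
residual = 0.333 - predicted      # actual minus predicted
print(f"predicted = {predicted:.3f}, residual = {residual:.3f}")
```

A residual this far below zero means the line overestimated Tulane's winning percentage by a wide margin.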
Residual Plots: No Pattern Is Good!
Residual Plots
What if the residual plot DOES have a pattern?
Linear regression is probably not appropriate.
For now, we just stop there.
There are non-linear regression methods (your calculator can even do some of them).
But for now we are focused only on linear regression.
Using Technology: Making a Scatterplot
We still have these data in L1 and L2 of our calculator, and we can use L1 and L2 to make a scatterplot.
Go to Y= and delete any equations you have in there
Go to 2nd---Stat Plot
Turn Plot1 on (make sure you choose L1 and L2)
Hit Graph. You will probably see no points. Why?
Click Window and change Xmin, Xmax, Ymin, and Ymax
Now let’s add our regression line: $\widehat{\text{win}} = .654 - .002(\text{Stadium})$
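The empty graph is most likely a window problem: assuming the calculator is still on its standard window (-10 to 10 on both axes), none of the stadium sizes fall inside the x-range. The sketch below picks window settings from the data with a little padding. `padded_window` is a hypothetical helper, and the 10% padding is an arbitrary choice.

```python
def padded_window(values, pad_fraction=0.10):
    """Return (low, high) covering all values, widened by pad_fraction of the range on each side."""
    low, high = min(values), max(values)
    pad = (high - low) * pad_fraction
    return low - pad, high + pad

stadium = [100.1, 20.1, 87.5, 50.0, 50.1, 50.2, 41.2]          # L1
win_pct = [0.417, 0.769, 0.615, 0.500, 0.167, 0.714, 0.538]    # L2

xmin, xmax = padded_window(stadium)
ymin, ymax = padded_window(win_pct)
print(f"Xmin = {xmin:.1f}, Xmax = {xmax:.1f}")
print(f"Ymin = {ymin:.2f}, Ymax = {ymax:.2f}")

# Assumed standard window: x from -10 to 10
visible = [x for x in stadium if -10 <= x <= 10]
print(f"points inside the standard x-range: {len(visible)}")
```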
Using Technology: Making a Residual Plot (Page 178)
Go to Stat---Edit
Leave data in L1 and L2
Scroll over to L3 and highlight “L3”
Type: .654 - .002*L1
Now L3 is holding predicted values
Now highlight “L4”
Type: L2 - L3
Now L4 is holding residuals
Go back to Stat Plot
Turn Plot1 off, and turn Plot2 on
Specify L1 as the x-variable and L4 as the y-variable
Go to Window and change Ymin (-1 works in this case)
Go to Y= and delete your regression line (or hide it)
Hit Graph
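The list arithmetic in these steps can be mirrored in a few lines of code. This is a sketch under the assumption that L1 holds stadium sizes and L2 holds winning percentages, as in the stadium example; the Python names stand in for the calculator lists.

```python
stadium = [100.1, 20.1, 87.5, 50.0, 50.1, 50.2, 41.2]          # L1
win_pct = [0.417, 0.769, 0.615, 0.500, 0.167, 0.714, 0.538]    # L2

# L3 = .654 - .002*L1  (predicted values)
predicted = [0.654 - 0.002 * x for x in stadium]

# L4 = L2 - L3  (residuals: actual minus predicted)
residuals = [y - y_hat for y, y_hat in zip(win_pct, predicted)]

# Each row is one point on the residual plot: (stadium size, residual)
for x, res in zip(stadium, residuals):
    print(f"{x:6.1f}  {res:+.3f}")
```

All of the residuals lie well above -1, which is why setting Ymin to -1 is enough to bring every point into view.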
HW Time