CPM and CCSS Statistics

Presentation on theme: "CPM and CCSS Statistics"— Presentation transcript:

1 CPM and CCSS Statistics
Residual Plots, r and r²: What are they? Core Connections Algebra (CCA)

2 Mike Long mikelong@cpm.org
CPM Regional Coordinator for Hawaii. Math Dept DH/CC for Kapolei High School. 27th year of teaching in Hawaii DOE. Helped open Kapolei High School (the newest high school in Hawaii) in 2000. I currently teach AP Calculus and AP Statistics. T3 Regional Instructor for Hawaii – Texas Instruments.

3 Earlier in Chapter 4…

4 Lesson Math Notes

5

6 Lesson Closure

7

8 Residuals

9 Battle Creek Cereal Scatterplot

10 Lesson Closure The first step in any statistical analysis is to graph the data. Graphs do not necessarily start at the origin; indeed, frequently in statistical analyses they do not. A residual is a measure of how far our prediction using the best-fit model is from what was actually observed. A residual has the same units as the y-axis. A residual can be graphed with a vertical segment. The length of this segment (in the units of the y-axis) is the residual. A positive residual means the actual observed y-value of a piece of data is greater than the y‑value that was predicted by the LSRL. A negative residual means the actual data is less than predicted. Extrapolation of a statistical model can lead to nonsensical results.
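The sign convention above can be sketched in a few lines of Python. This is only an illustration: the function name `residual` and the sample numbers are assumptions made for the example and do not come from the lesson.

```python
def residual(observed_y, predicted_y):
    """Residual = actual observed y-value minus the y-value predicted by the model."""
    return observed_y - predicted_y

# Hypothetical numbers for illustration only (not lesson data).
above = residual(observed_y=12.0, predicted_y=10.0)  # point lies above the line
below = residual(observed_y=7.5, predicted_y=10.0)   # point lies below the line

print("point above the line:", above)  # positive residual
print("point below the line:", below)  # negative residual
```

Because the residual is a difference of two y-values, it carries the same units as the y-axis.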

11 Total Points Scored in a Season
This table shows data for one season of the El Toro professional basketball team. El Toro team member Antonio Kusoc was inadvertently left off of the list. Antonio Kusoc played for 2103 minutes. We would like to predict how many points he scored in the season.

Player Name          Minutes Played   Total Points Scored in a Season
Sordan, Scottie      3090             2491
Lippen, Mike         2825             1496
Karper, Don          1886             594
Shortley, Luc        1641             564
Gerr, Bill           1919             688
Jodman, Dennis       2088             351
Kennington, Steve    1065             376
Bailey, John         7                5
Bookler, Jack        740              278
Dimkins, Rickie      685              216
Edwards, Jason       274              98
Gaffey, James        545              182
Black, Sandy         671              185
Talley, Dan          191              36
checksum             17627            7560

12 Your Task (6-30)
a. Obtain a Lesson Resource Page from your teacher. Draw a line of best fit for the data and then use it to write an equation that models the relationship between total points in the season and minutes played.
b. Which data point is an outlier for this data? Whose data does that point represent? What is his residual?
c. Would a player be more proud of a negative or positive residual?
d. Predict how many points Antonio Kusoc made.

13 LSRL on a Calculator
6-33. A least squares regression line (LSRL) is a unique line that has the smallest possible value for the sum of the squares of the residuals.
a. Your teacher will show you how to use your calculator to make a scatterplot. (Graphing calculator instructions can also be downloaded.) Be sure to use the checksum at the bottom of the table in problem 6-30 to verify that you entered the data into your calculator accurately.
b. Your teacher will show you how to find the LSRL and graph it on your calculator. Sketch your scatterplot and LSRL on your paper.
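For readers without a graphing calculator, the same steps can be sketched in Python using the El Toro table from problem 6-30. This is a minimal sketch, not the calculator's procedure: the helper name `lsrl` is an assumption, and the slope and intercept come from the standard least squares formulas.

```python
# Minutes played and total points, copied from the El Toro table (problem 6-30).
minutes = [3090, 2825, 1886, 1641, 1919, 2088, 1065, 7, 740, 685, 274, 545, 671, 191]
points = [2491, 1496, 594, 564, 688, 351, 376, 5, 278, 216, 98, 182, 185, 36]

# Checksum step: the column totals should match the bottom row of the table.
assert sum(minutes) == 17627
assert sum(points) == 7560

def lsrl(xs, ys):
    """Return (slope, intercept) of the least squares regression line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x  # the LSRL passes through (mean_x, mean_y)
    return slope, intercept

slope, intercept = lsrl(minutes, points)

# Antonio Kusoc played 2103 minutes; use the model to predict his points.
kusoc_prediction = intercept + slope * 2103

print(f"LSRL: points = {intercept:.1f} + {slope:.3f} * minutes")
print(f"Predicted points for Kusoc: {kusoc_prediction:.0f}")
```

The checksum assertions play the same role as on the calculator: if a value was mistyped, the script stops before fitting the line.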

14 Lesson Closure This is a two-day lesson. Problem 6-34 is a Least Squares Demo that can be teacher-led to summarize students' understanding of least squares regression lines. LeastSquaresDemo.html

15

16 Find the Correlation Coefficient for the El Toro Basketball team
Describe the form, direction, strength, and outliers of the association.
Form: could be linear, but the residual plot indicates that another model might be better.
Direction: positive, with a slope of 0.59; each additional minute played is associated with about 0.59 more points scored, on average.
Strength: a fairly strong positive linear association, because r = 0.865.
Outliers: Scottie Sordan (a.k.a. Michael Jordan).
TI-Nspire
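The correlation coefficient and the outlier can also be checked without a TI-Nspire. The sketch below is an assumption-laden illustration: it uses the El Toro table from problem 6-30, the standard formula for r, and treats "the point with the largest residual in absolute value" as a stand-in for the outlier, which is a simplification.

```python
from math import sqrt

# (player, minutes played, total points) from the El Toro table in problem 6-30.
players = [
    ("Sordan, Scottie", 3090, 2491), ("Lippen, Mike", 2825, 1496),
    ("Karper, Don", 1886, 594), ("Shortley, Luc", 1641, 564),
    ("Gerr, Bill", 1919, 688), ("Jodman, Dennis", 2088, 351),
    ("Kennington, Steve", 1065, 376), ("Bailey, John", 7, 5),
    ("Bookler, Jack", 740, 278), ("Dimkins, Rickie", 685, 216),
    ("Edwards, Jason", 274, 98), ("Gaffey, James", 545, 182),
    ("Black, Sandy", 671, 185), ("Talley, Dan", 191, 36),
]

xs = [mins for _, mins, _ in players]
ys = [pts for _, _, pts in players]
n = len(players)
mean_x, mean_y = sum(xs) / n, sum(ys) / n

sxx = sum((x - mean_x) ** 2 for x in xs)
syy = sum((y - mean_y) ** 2 for y in ys)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

r = sxy / sqrt(sxx * syy)            # correlation coefficient
slope = sxy / sxx                    # LSRL slope
intercept = mean_y - slope * mean_x  # LSRL intercept

# Residual = observed points minus points predicted by the LSRL.
residuals = {name: pts - (intercept + slope * mins) for name, mins, pts in players}
largest = max(residuals, key=lambda name: abs(residuals[name]))

print(f"r = {r:.3f}")
print(f"largest residual: {largest} ({residuals[largest]:.0f} points)")
```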

17 6-50. Help Giulia analyze the residuals for the pizza parlors in problem 6-48 as described below. Explore ideas using the 6-48 & 6-50 Student eTool (Desmos).
Mark the residuals on the scatterplot you created for the pizza parlor data. If you want to purchase an inexpensive pizza, should you go to a store with a positive or negative residual?
What is the sum of the residuals? Are you surprised at this result?
Your teacher will show you how to make a residual plot using your calculator, with the x-axis representing the number of pizza toppings, and the y-axis representing the residuals. Interpret the scatter of the points on the residual plot. Is a linear model a good fit for the data?
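The question about the sum of the residuals can be explored with a short Python sketch. The pizza data from problem 6-48 is not included in this transcript, so the toppings and prices below are hypothetical values invented for the illustration; only the method carries over.

```python
# Hypothetical data (NOT the data from problem 6-48): toppings vs. price in dollars.
toppings = [0, 1, 2, 3, 4, 5]
prices = [8.0, 9.5, 10.0, 12.5, 13.0, 15.5]

n = len(toppings)
mean_x = sum(toppings) / n
mean_y = sum(prices) / n
sxx = sum((x - mean_x) ** 2 for x in toppings)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(toppings, prices))
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Residual plot points: (number of toppings, residual).
residual_points = [(x, y - (intercept + slope * x)) for x, y in zip(toppings, prices)]
residual_sum = sum(res for _, res in residual_points)

for x, res in residual_points:
    print(f"{x} toppings: residual {res:+.2f}")
print(f"sum of residuals: {residual_sum:.10f}")
```

For a least squares regression line the residuals always sum to zero, apart from floating-point rounding, so the sum alone says nothing about fit. The pattern of the points in the residual plot is what indicates whether a linear model is appropriate.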

18 What is r? How Do I Make Sense of It?
Select any three points in the first quadrant. Use your calculator to make a scatterplot and find the LSRL. Make a sketch of the scatterplot in your notebook, and record the value of the correlation coefficient r. Your teacher will show you how to use your calculator to find r. Continue to investigate different combinations of three points and graph each of their scatterplots. Work with your team to discuss and record all of your conclusions about the value of r from this investigation
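The three-point investigation can also be run in Python. This is a sketch under stated assumptions: the function name `correlation` and the three sample sets of points are chosen for illustration and are not part of the lesson.

```python
from math import sqrt

def correlation(points):
    """Correlation coefficient r for a list of (x, y) points.

    Returns None when r is undefined (all x-values equal or all y-values equal).
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    if sxx == 0 or syy == 0:
        return None
    return sxy / sqrt(sxx * syy)

# Illustrative first-quadrant point sets (assumptions, not lesson data).
trials = {
    "increasing line": [(1, 1), (2, 2), (3, 3)],
    "decreasing line": [(1, 3), (2, 2), (3, 1)],
    "scattered": [(1, 1), (2, 3), (3, 2)],
}

results = {label: correlation(pts) for label, pts in trials.items()}
for label, r in results.items():
    print(f"{label}: r = {r:.2f}")
```

Trying more sets of three points, including ones that are nearly but not exactly collinear, shows how r moves between –1 and 1 as the scatter changes.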

19 Math Notes Correlation Coefficient
The correlation coefficient, r, is a measure of how much or how little data is scattered around the LSRL; it is a measure of the strength of a linear association. The correlation coefficient can take on values from –1 to 1, inclusive. If r = 1 or r = –1, the association is perfectly linear: there is no scatter about the LSRL at all. A positive correlation coefficient means the trend is increasing (slope is positive), while a negative correlation coefficient means the trend is decreasing (slope is negative). A correlation coefficient of zero means the LSRL is horizontal (its slope is zero) and there is no linear association between the variables. The correlation coefficient does not have units, so it is a useful way to compare scatter from situation to situation no matter what the units of the variables are. The correlation coefficient does not have a real-world meaning other than as an arbitrary measure of strength.

20 Math Notes (continued)
The value of the correlation coefficient squared, however, does have a real-world meaning. R-squared, the correlation coefficient squared, is written as R² and expressed as a percent. Its meaning is that R²% of the variability in the dependent variable can be explained by a linear relationship with the independent variable. For example, if the association between the amount of fertilizer and plant height has correlation coefficient r = 0.60, then R² = 0.60² = 0.36, and we can say that 36% of the variability in plant height can be explained by a linear relationship with the amount of fertilizer used. The rest of the variation in plant height is explained by other variables: amount of water, amount of sunlight, soil type, and so forth. The correlation coefficient, along with the interpretation of R², is used to describe the strength of a linear association.
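The "percent of variability explained" reading can be checked numerically. In the sketch below, the first part squares the r = 0.60 from the fertilizer example; the second part uses hypothetical fertilizer and height values (invented for illustration, not lesson data) to show that r² matches 1 minus the ratio of residual variation to total variation for an LSRL.

```python
from math import sqrt

# Part 1: the fertilizer example from the Math Notes.
r_example = 0.60
r_squared_example = r_example ** 2  # 0.36, read as 36%

# Part 2: hypothetical data (assumed values, for illustration only).
fertilizer = [1, 2, 3, 4, 5]
height = [2, 4, 5, 4, 6]

n = len(fertilizer)
mean_x = sum(fertilizer) / n
mean_y = sum(height) / n
sxx = sum((x - mean_x) ** 2 for x in fertilizer)
syy = sum((y - mean_y) ** 2 for y in height)  # total variation in y
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(fertilizer, height))

r = sxy / sqrt(sxx * syy)
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Variation left over after fitting the LSRL (sum of squared residuals).
sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(fertilizer, height))
explained_fraction = 1 - sse / syy

print(f"example: r = {r_example}, R^2 = {r_squared_example:.0%}")
print(f"data: r^2 = {r ** 2:.4f}, explained fraction = {explained_fraction:.4f}")
```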

21 Use your calculator to determine the value of r and r² for this data: Interpret each in context.

22 How Can Residual Plots and r Be Used When Curve Fitting?
Let’s take a look at two problems involving:
The relationship between the circumference of a piece of fruit and its mass
The length and weight of several alligators
I use the TI-Nspire for this activity.

23 Session Feedback Form: http://tinyurl.com/cpmcon2017
Session Number: 2D. Fill in your session number. This is what the Feedback Form asks:

