Introduction to Correlation and Regression Introduction to Correlation and Regression Ginger Holmes Rowell, Ph. D. Associate Professor of Mathematics.

Slides:



Advertisements
Similar presentations
HAWKES LEARNING SYSTEMS Students Matter. Success Counts. Copyright © 2013 by Hawkes Learning Systems/Quant Systems, Inc. All rights reserved. Section 12.2.
Advertisements

Least Squares Regression
Linear Statistical Model
Correlation & Regression Chapter 10. Outline Section 10-1Introduction Section 10-2Scatter Plots Section 10-3Correlation Section 10-4Regression Section.
Chapter 4 The Relation between Two Variables
Lesson Diagnostics on the Least- Squares Regression Line.
2nd Day: Bear Example Length (in) Weight (lb)
CHAPTER 3 Describing Relationships
Haroon Alam, Mitchell Sanders, Chuck McAllister- Ashley, and Arjun Patel.
1 Chapter 10 Correlation and Regression We deal with two variables, x and y. Main goal: Investigate how x and y are related, or correlated; how much they.
McGraw-Hill/Irwin Copyright © 2010 by The McGraw-Hill Companies, Inc. All rights reserved. Chapter 13 Linear Regression and Correlation.
Introduction to Linear Regression and Correlation Analysis
Linear Regression and Correlation
Introduction to Correlation and Regression Introduction to Correlation and Regression Ginger Holmes Rowell, Ph. D. Associate Professor of Mathematics.
Introduction to Correlation and Regression Introduction to Correlation and Regression Ginger Holmes Rowell, Ph. D. Associate Professor of Mathematics.
5-7 Scatter Plots. _______________ plots are graphs that relate two different sets of data by displaying them as ordered pairs. Usually scatter plots.
Researchers, such as anthropologists, are often interested in how two measurements are related. The statistical study of the relationship between variables.
1 Chapter 3: Examining Relationships 3.1Scatterplots 3.2Correlation 3.3Least-Squares Regression.
Warm Up 1.What is a scatterplot? 2.How do I create one? 3.What does it tell me? 4.What 3 things do we discuss when analyzing a scatterplot?
Regression Analysis: “Regression is that technique of statistical analysis by which when one variable is known the other variable can be estimated. In.
LECTURE UNIT 7 Understanding Relationships Among Variables Scatterplots and correlation Fitting a straight line to bivariate data.
Section Copyright © 2014, 2012, 2010 Pearson Education, Inc. Lecture Slides Elementary Statistics Twelfth Edition and the Triola Statistics Series.
Lesson Least-Squares Regression. Knowledge Objectives Explain what is meant by a regression line. Explain what is meant by extrapolation. Explain.
Least-Squares Regression Section 3.3. Why Create a Model? There are two reasons to create a mathematical model for a set of bivariate data. To predict.
1 Chapter 10 Correlation and Regression 10.2 Correlation 10.3 Regression.
Chap 12-1 A Course In Business Statistics, 4th © 2006 Prentice-Hall, Inc. A Course In Business Statistics 4 th Edition Chapter 12 Introduction to Linear.
Section 2.2 Correlation A numerical measure to supplement the graph. Will give us an indication of “how closely” the data points fit a particular line.
3.3 Least-Squares Regression.  Calculate the least squares regression line  Predict data using your LSRL  Determine and interpret the coefficient of.
Chapter 10 Correlation and Regression
Objectives (IPS Chapter 2.1)
Correlation Analysis. A measure of association between two or more numerical variables. For examples height & weight relationship price and demand relationship.
Notes Bivariate Data Chapters Bivariate Data Explores relationships between two quantitative variables.
3.2 Least Squares Regression Line. Regression Line Describes how a response variable changes as an explanatory variable changes Formula sheet: Calculator.
Modeling a Linear Relationship Lecture 47 Secs – Tue, Apr 25, 2006.
Chapters 8 & 9 Linear Regression & Regression Wisdom.
Introduction to Correlation and Regression Introduction to Correlation and Regression Rahul Jain.
Section 3.2C. The regression line can be found using the calculator Put the data in L1 and L2. Press Stat – Calc - #8 (or 4) - enter To get the correlation.
Chapter 11 Correlation and Simple Linear Regression Statistics for Business (Econ) 1.
Scatter Diagrams Objective: Draw and interpret scatter diagrams. Distinguish between linear and nonlinear relations. Use a graphing utility to find the.
The Practice of Statistics, 5th Edition Starnes, Tabor, Yates, Moore Bedford Freeman Worth Publishers CHAPTER 3 Describing Relationships 3.2 Least-Squares.
CORRELATION. Correlation key concepts: Types of correlation Methods of studying correlation a) Scatter diagram b) Karl pearson’s coefficient of correlation.
Modeling a Linear Relationship Lecture 44 Secs – Tue, Apr 24, 2007.
S CATTERPLOTS Correlation, Least-Squares Regression, Residuals Get That Program ANSCOMBE CRICKETS GESSEL.
9.1 - Correlation Correlation = relationship between 2 variables (x,y): x= independent or explanatory variable y= dependent or response variable Types.
AP Statistics HW: p. 165 #42, 44, 45 Obj: to understand the meaning of r 2 and to use residual plots Do Now: On your calculator select: 2 ND ; 0; DIAGNOSTIC.
Linear Regression Day 1 – (pg )
^ y = a + bx Stats Chapter 5 - Least Squares Regression
CHAPTER 3 Describing Relationships
1 Simple Linear Regression and Correlation Least Squares Method The Model Estimating the Coefficients EXAMPLE 1: USED CAR SALES.
Entry Task Write the equation in slope intercept form and find the x and y intercepts. 1. 4x + 6y = 12.
Scatter Plots and Correlations. Is there a relationship between the amount of gas put in a car and the number of miles that can be driven?
Response Variable: measures the outcome of a study (aka Dependent Variable) Explanatory Variable: helps explain or influences the change in the response.
Chapter 5 Lesson 5.2 Summarizing Bivariate Data 5.2: LSRL.
Section 1.3 Scatter Plots and Correlation.  Graph a scatter plot and identify the data correlation.  Use a graphing calculator to find the correlation.
BUSINESS MATHEMATICS & STATISTICS. Module 6 Correlation ( Lecture 28-29) Line Fitting ( Lectures 30-31) Time Series and Exponential Smoothing ( Lectures.
Copyright © 2009 Pearson Education, Inc. Chapter 8 Linear Regression.
11-1 Copyright © 2014, 2011, and 2008 Pearson Education, Inc.
Page 1 Introduction to Correlation and Regression ECONOMICS OF ICMAP, ICAP, MA-ECONOMICS, B.COM. FINANCIAL ACCOUNTING OF ICMAP STAGE 1,3,4 ICAP MODULE.
Two-Variable Data Analysis
CHAPTER 3 Describing Relationships
Warm Up Practice 6-5 (p. 78) #13, 15, 16, 18, 22, 25, 26, 27, 31 – 36
SIMPLE LINEAR REGRESSION MODEL
Introduction to Correlation and Regression
2.5 Scatterplots and Lines of Regression
Regression and Residual Plots
CHAPTER 3 Describing Relationships
11C Line of Best Fit By Eye, 11D Linear Regression
CHAPTER 3 Describing Relationships
Modeling a Linear Relationship
Presentation transcript:

Introduction to Correlation and Regression Introduction to Correlation and Regression Ginger Holmes Rowell, Ph. D. Associate Professor of Mathematics Middle Tennessee State University

Outline Outline Introduction Introduction Linear Correlation Linear Correlation Regression Regression Simple Linear Regression Simple Linear Regression Using the TI-83 Using the TI-83 Model/Formulas Model/Formulas

Outline continued Applications Applications Real-life Applications Real-life Applications Practice Problems Practice Problems Internet Resources Internet Resources Applets Applets Data Sources Data Sources

Correlation Correlation Correlation A measure of association between two numerical variables. A measure of association between two numerical variables. Example (positive correlation) Example (positive correlation) Typically, in the summer as the temperature increases people are thirstier. Typically, in the summer as the temperature increases people are thirstier.

Specific Example Specific Example For seven random summer days, a person recorded the temperature and their water consumption, during a three-hour period spent outside. For seven random summer days, a person recorded the temperature and their water consumption, during a three-hour period spent outside. Temperature (F) Water Consumption (ounces)

How would you describe the graph?

How “strong” is the linear relationship?

Measuring the Relationship Pearson’s Sample Correlation Coefficient, r measures the direction and the strength of the linear association between two numerical paired variables. measures the direction and the strength of the linear association between two numerical paired variables.

Direction of Association Positive Correlation Positive CorrelationNegative Correlation

Strength of Linear Association r value Interpretation 1 perfect positive linear relationship 0no linear relationship perfect negative linear relationship

Strength of Linear Association

Other Strengths of Association r value Interpretation 0.9strong association 0.5moderate association 0.25weak association

Other Strengths of Association

Formula = the sum n = number of paired items x i = input variabley i = output variable x = x-bar = mean of x’s y = y-bar = mean of y’s s x = standard deviation of x’s s y = standard deviation of y’s

Internet Resources Correlation Correlation Guessing Correlations - An interactive site that allows you to try to match correlation coefficients to scatterplots. University of Illinois, Urbanna Champaign Statistics Program. ava/guess/GCApplet.html Guessing Correlations - An interactive site that allows you to try to match correlation coefficients to scatterplots. University of Illinois, Urbanna Champaign Statistics Program. ava/guess/GCApplet.html ava/guess/GCApplet.html ava/guess/GCApplet.html

Regression Regression Regression Specific statistical methods for finding the “line of best fit” for one response (dependent) numerical variable based on one or more explanatory (independent) variables. Specific statistical methods for finding the “line of best fit” for one response (dependent) numerical variable based on one or more explanatory (independent) variables.

Curve Fitting vs. Regression Regression Regression Includes using statistical methods to assess the "goodness of fit" of the model. (ex. Correlation Coefficient) Includes using statistical methods to assess the "goodness of fit" of the model. (ex. Correlation Coefficient)

Regression: 3 Main Purposes To describe (or model) To describe (or model) To predict ( or estimate) To predict ( or estimate) To control (or administer) To control (or administer)

Simple Linear Regression Statistical method for finding Statistical method for finding the “line of best fit” the “line of best fit” for one response (dependent) numerical variable for one response (dependent) numerical variable based on one explanatory (independent) variable. based on one explanatory (independent) variable.

Least Squares Regression GOAL - minimize the sum of the square of the errors of the data points. GOAL - minimize the sum of the square of the errors of the data points. This minimizes the Mean Square Error This minimizes the Mean Square Error

Applet Demonstration Least “Squares” Error Least “Squares” Error Understanding the Least-Squares Regression Line with a Visual Model: Measuring Error in a Linear Model, NCTM Principles and Standards, Electronic Examples

Internet Resources Regression Regression Estimate the Regression Line. Compare the mean square error from different regression lines. Can you find the minimum mean square error? Rice University Virtual Statistics Lab. m/reg_by_eye/index.html Estimate the Regression Line. Compare the mean square error from different regression lines. Can you find the minimum mean square error? Rice University Virtual Statistics Lab. m/reg_by_eye/index.html m/reg_by_eye/index.html m/reg_by_eye/index.html

Example Example Plan an outdoor party. Plan an outdoor party. Estimate number of soft drinks to buy per person, based on how hot the weather is. Estimate number of soft drinks to buy per person, based on how hot the weather is. Use Temperature/Water data and regression. Use Temperature/Water data and regression.

Steps to Reaching a Solution Draw a scatterplot of the data. Draw a scatterplot of the data.

Steps to Reaching a Solution Draw a scatterplot of the data. Draw a scatterplot of the data. Visually, consider the strength of the linear relationship. Visually, consider the strength of the linear relationship.

Steps to Reaching a Solution Draw a scatterplot of the data. Draw a scatterplot of the data. Visually, consider the strength of the linear relationship. Visually, consider the strength of the linear relationship. If the relationship appears relatively strong, find the correlation coefficient as a numerical verification. If the relationship appears relatively strong, find the correlation coefficient as a numerical verification.

Steps to Reaching a Solution Draw a scatterplot of the data. Draw a scatterplot of the data. Visually, consider the strength of the linear relationship. Visually, consider the strength of the linear relationship. If the relationship appears relatively strong, find the correlation coefficient as a numerical verification. If the relationship appears relatively strong, find the correlation coefficient as a numerical verification. If the correlation is still relatively strong, then find the simple linear regression line. If the correlation is still relatively strong, then find the simple linear regression line.

Our Next Steps Estimate the line using algebra (i.e. practice equation of lines) Estimate the line using algebra (i.e. practice equation of lines) Learn to Use the TI-83/84 for Correlation and Regression. Learn to Use the TI-83/84 for Correlation and Regression. Interpret the Results (in the Context of the Problem). Interpret the Results (in the Context of the Problem).

Example Temperature (F) Water Consumption (ounces)

Using Pencil and Paper Draw a scatterplot Draw a scatterplot Draw your estimate of the line of best fit on the scatterplot Draw your estimate of the line of best fit on the scatterplot Find the equation of YOUR line Find the equation of YOUR line

Finding the Solution: TI-83/84 Using the TI- 83/84 calculator Using the TI- 83/84 calculator Turn on the calculator diagnostics. Turn on the calculator diagnostics. Enter the data. Enter the data. Graph a scatterplot of the data. Graph a scatterplot of the data. Find the equation of the regression line and the correlation coefficient. Find the equation of the regression line and the correlation coefficient. Graph the regression line on a graph with the scatterplot. Graph the regression line on a graph with the scatterplot.

Preliminary Step Turn the Diagnostics On. Turn the Diagnostics On. Press 2nd 0 (for Catalog). Press 2nd 0 (for Catalog). Scroll down to DiagnosticOn. The marker points to the right of the words. Scroll down to DiagnosticOn. The marker points to the right of the words. Press ENTER. Press ENTER again. Press ENTER. Press ENTER again. The word Done should appear on the right hand side of the screen. The word Done should appear on the right hand side of the screen.

Example Temperature (F) Water Consumption (ounces)

1. Enter the Data into Lists Press STAT. Press STAT. Under EDIT, select 1: Edit. Under EDIT, select 1: Edit. Enter x-values (input) into L1 Enter x-values (input) into L1 Enter y-values (output) into L2. Enter y-values (output) into L2. After data is entered in the lists, go to 2nd MODE to quit and return to the home screen. After data is entered in the lists, go to 2nd MODE to quit and return to the home screen. Note: If you need to clear out a list, for example list 1, place the cursor on L1 then hit CLEAR and ENTER. Note: If you need to clear out a list, for example list 1, place the cursor on L1 then hit CLEAR and ENTER.

2. Set up the Scatterplot. Press 2nd Y= (STAT PLOTS). Press 2nd Y= (STAT PLOTS). Select 1: PLOT 1 and hit ENTER. Select 1: PLOT 1 and hit ENTER. Use the arrow keys to move the cursor down to On and hit ENTER. Use the arrow keys to move the cursor down to On and hit ENTER. Arrow down to Type: and select the first graph under Type. Arrow down to Type: and select the first graph under Type. Under Xlist: Enter L1. Under Xlist: Enter L1. Under Ylist: Enter L2. Under Ylist: Enter L2. Under Mark: select any of these. Under Mark: select any of these.

3. View the Scatterplot Press 2nd MODE to quit and return to the home screen. Press 2nd MODE to quit and return to the home screen. To plot the points, press ZOOM and select 9: ZoomStat. To plot the points, press ZOOM and select 9: ZoomStat. The scatterplot will then be graphed. The scatterplot will then be graphed.

4. Find the regression line. Press STAT. Press STAT. Press CALC. Press CALC. Select 4: LinReg(ax + b). Select 4: LinReg(ax + b). Press 2nd 1 (for List 1) Press 2nd 1 (for List 1) Press the comma key, Press the comma key, Press 2nd 2 (for List 2) Press 2nd 2 (for List 2) Press ENTER. Press ENTER.

5. Interpreting and Visualizing Interpreting the result: Interpreting the result: y = ax + b y = ax + b The value of a is the slope The value of a is the slope The value of b is the y-intercept The value of b is the y-intercept r is the correlation coefficient r is the correlation coefficient r 2 is the coefficient of determination r 2 is the coefficient of determination

5. Interpreting and Visualizing Write down the equation of the line in slope intercept form. Write down the equation of the line in slope intercept form. Press Y= and enter the equation under Y1. (Clear all other equations.) Press Y= and enter the equation under Y1. (Clear all other equations.) Press GRAPH and the line will be graphed through the data points. Press GRAPH and the line will be graphed through the data points.

Questions ???

Interpretation in Context Regression Equation: Regression Equation: y=1.5*x y=1.5*x Water Consumption = 1.5*Temperature *Temperature

Interpretation in Context Slope = 1.5 (ounces)/(degrees F) Slope = 1.5 (ounces)/(degrees F) for each 1 degree F increase in temperature, you expect an increase of 1.5 ounces of water drank. for each 1 degree F increase in temperature, you expect an increase of 1.5 ounces of water drank.

Interpretation in Context y-intercept = y-intercept = For this example, when the temperature is 0 degrees F, then a person would drink about -97 ounces of water. For this example, when the temperature is 0 degrees F, then a person would drink about -97 ounces of water. That does not make any sense! That does not make any sense! Our model is not applicable for x=0. Our model is not applicable for x=0.

Prediction Example Predict the amount of water a person would drink when the temperature is 95 degrees F. Predict the amount of water a person would drink when the temperature is 95 degrees F. Solution: Substitute the value of x=95 (degrees F) into the regression equation and solve for y (water consumption). If x=95, y=1.5* = 45.6 ounces. Solution: Substitute the value of x=95 (degrees F) into the regression equation and solve for y (water consumption). If x=95, y=1.5* = 45.6 ounces.

Strength of the Association: r 2 Coefficient of Determination = r 2 Coefficient of Determination = r 2 General Interpretation: The coefficient of determination tells the percent of the variation in the response variable that is explained (determined) by the model and the explanatory variable. General Interpretation: The coefficient of determination tells the percent of the variation in the response variable that is explained (determined) by the model and the explanatory variable.

Interpretation of r 2 Example: r 2 =92.7%. Example: r 2 =92.7%. Interpretation: Interpretation: Almost 93% of the variability in the amount of water consumed is explained by outside temperature using this model. Almost 93% of the variability in the amount of water consumed is explained by outside temperature using this model. Note: Therefore 7% of the variation in the amount of water consumed is not explained by this model using temperature. Note: Therefore 7% of the variation in the amount of water consumed is not explained by this model using temperature.

Questions ???

Simple Linear Regression Model The model for simple linear regression is There are mathematical assumptions behind the concepts that we are covering today. we are covering today.

Formulas Prediction Equation:

Real Life Applications Cost Estimating for Future Space Flight Vehicles (Multiple Regression)

Nonlinear Application Predicting when Solar Maximum Will Occur solar/predict.htm solar/predict.htm

Real Life Applications Estimating Seasonal Sales for Department Stores (Periodic) Estimating Seasonal Sales for Department Stores (Periodic)

Real Life Applications Predicting Student Grades Based on Time Spent Studying Predicting Student Grades Based on Time Spent Studying

Real Life Applications What ideas can you think of? What ideas can you think of? What ideas can you think of that your students will relate to? What ideas can you think of that your students will relate to?

Practice Problems Measure Height vs. Arm Span Measure Height vs. Arm Span Find line of best fit for height. Find line of best fit for height. Predict height for one student not in data set. Check predictability of model. Predict height for one student not in data set. Check predictability of model.

Practice Problems Is there any correlation between shoe size and height? Is there any correlation between shoe size and height? Does gender make a difference in this analysis? Does gender make a difference in this analysis?

Practice Problems Can the number of points scored in a basketball game be predicted by Can the number of points scored in a basketball game be predicted by The time a player plays in the game? The time a player plays in the game? By the player’s height? By the player’s height? Idea modified from Steven King, Aiken, SC. NCTM presentation 1997.) Idea modified from Steven King, Aiken, SC. NCTM presentation 1997.)

Resources Data Analysis and Statistics. Curriculum and Evaluation Standards for School Mathematics. Addenda Series, Grades NCTM Data Analysis and Statistics. Curriculum and Evaluation Standards for School Mathematics. Addenda Series, Grades NCTM Data and Story Library. Internet Website. ASL/ Data and Story Library. Internet Website. ASL/ ASL/ ASL/

Internet Resources Regression Regression Effects of adding an Outlier. Effects of adding an Outlier. W. West, University of South Carolina. W. West, University of South Carolina. javahtml/Regression.html javahtml/Regression.html javahtml/Regression.html javahtml/Regression.html

Internet Resources: Data Sets Data and Story Library. Data and Story Library. Excellent source for small data sets. Search for specific statistical methods (e.g. boxplots, regression) or for data concerning a specific field of interest (e.g. health, environment, sports). Excellent source for small data sets. Search for specific statistical methods (e.g. boxplots, regression) or for data concerning a specific field of interest (e.g. health, environment, sports).

Internet Resources: Data Sets FEDSTATS. "The gateway to statistics from over 100 U.S. Federal agencies" FEDSTATS. "The gateway to statistics from over 100 U.S. Federal agencies" "Kid's Pages." (not all related to statistics) "Kid's Pages." (not all related to statistics)

Internet Resources Other Other Statistics Applets. Using Web Applets to Assist in Statistics Instruction. Robin Lock, St. Lawrence University. Statistics Applets. Using Web Applets to Assist in Statistics Instruction. Robin Lock, St. Lawrence University.

Internet Resources Other Other Ten Websites Every Statistics Instructor Should Bookmark. Robin Lock, St. Lawrence University. s.html Ten Websites Every Statistics Instructor Should Bookmark. Robin Lock, St. Lawrence University. s.html s.html s.html CAUSEweb – digital library for undergraduate statistics education CAUSEweb – digital library for undergraduate statistics education

For More Information… On-line version of this presentation /corregpres/index.html More information about regression Visit MTSU web site

Homework: Correlation/Regression Book 1-a Book 1-a Read Sections Read Sections Work Problems Work Problems Section 4.4 # 1, 2a & 2b, 3a Section 4.4 # 1, 2a & 2b, 3a Section 4.5 # 1, 2, 3 Section 4.5 # 1, 2, 3

Homework Preview Next Counting and Probability Next Counting and Probability If you want to read ahead: If you want to read ahead: Read Sections Read Sections 7.1, 7.2, 7.3, 7.4, , 8.2, 8.3, , 8.2, 8.3, 8.4