Presentation transcript:

Chapter 20 Linear Regression

What if… we believe that an important relation exists between two measures? For example, we ask 5 people about their salary and education level. For each observation we have two measures, and those two measures came from the same person.

What would we “predict”? Does more education mean more salary? Does more salary mean more education? Does more education mean less salary? Does more salary mean less education? Are salary and education related?

Regression: Descriptive vs. Inferential
Bivariate data: measurements on two variables for each observation
–Heights (X) and weights (Y)
–IQ (X) and SAT (Y) scores
–Years of education (X) and annual salary (Y)
–Number of policemen (X) and number of crimes (Y) in US cities

Regression: How are the two sets of scores related? Using a scatterplot we can “look” at the relationship. A scatterplot is constructed by plotting each of the bivariate observations (X, Y).

Regression: Which one’s X and which one’s Y? That’s up to you, but… Generally, the X variable is thought of as the “predictor” variable. We try to predict a Y score given an X score.

Regression: If the scores seem to “line up,” we call this a “linear relationship.”

Interpreting Scatterplots: If the following relations hold (low x with high y, mid x with mid y, high x with low y), we have “a negative linear relationship.”

Interpreting Scatterplots: If the following relations hold (low x with low y, mid x with mid y, high x with high y), we have “a positive linear relationship.”

Interpreting Scatterplots: However, there can also be “no relation.”

Interpreting Scatterplots: Curvilinear

Measuring Linear Relationships: The first measure of a linear relationship (not in the book) is COVARIANCE ($s_{XY}$)

Or: $SP_{XY}$ is known as the “Sum of Products,” the sum of the products of the deviations of X and Y from their means.
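The covariance formula on this slide did not survive extraction. As a minimal sketch, assuming the usual sample definition (the sum of products divided by n − 1), the calculation could look like this in Python. The function names and the five data points are made up for illustration and are not from the slides.

```python
def sum_of_products(xs, ys):
    """SP_XY: sum of the products of the deviations of X and Y from their means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))


def covariance(xs, ys):
    """Sample covariance s_XY = SP_XY / (n - 1) (assumed definition)."""
    return sum_of_products(xs, ys) / (len(xs) - 1)


# Hypothetical scores for 5 people (not from the slides).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

sp_xy = sum_of_products(xs, ys)
s_xy = covariance(xs, ys)
print("SP_XY =", sp_xy)
print("s_XY  =", s_xy)
```

Both values are positive here, which matches the upward drift of the made-up points.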

Easy Calculation
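The “easy” formula itself is missing from the transcript. A commonly used computational form, which is algebraically identical to the definition of the sum of products, is given here as an assumption about what the slide showed ($M_X$ and $M_Y$ denote the means of X and Y):

```latex
SP_{XY} = \sum (X_i - M_X)(Y_i - M_Y)
        = \sum X_i Y_i - \frac{\left(\sum X_i\right)\left(\sum Y_i\right)}{n}
```

This form needs only the column totals $\sum X_i$, $\sum Y_i$, and $\sum X_i Y_i$, which is why it is convenient for hand calculation.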

Covariance Interpretation:
–positive = positive linear relationship
–negative = negative linear relationship
–zero = no linear relationship
Magnitude (strength of the relationship)?
–Uninterpretable on its own, because it depends on the units of X and Y
–for example, a large covariance does not necessarily mean a strong relationship
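One way to see why the magnitude is hard to read is to change the units of one variable. The sketch below uses made-up education and salary figures and assumes the sample (n − 1) covariance. Expressing salary in dollars instead of thousands of dollars multiplies the covariance by 1000, while the correlation coefficient r stays the same.

```python
import math


def sum_of_products(xs, ys):
    """Sum of the products of the deviations of xs and ys from their means."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))


def covariance(xs, ys):
    """Sample covariance (assumed n - 1 denominator)."""
    return sum_of_products(xs, ys) / (len(xs) - 1)


def correlation(xs, ys):
    """Pearson r = SP_XY / sqrt(SS_X * SS_Y)."""
    ss_x = sum_of_products(xs, xs)
    ss_y = sum_of_products(ys, ys)
    return sum_of_products(xs, ys) / math.sqrt(ss_x * ss_y)


# Hypothetical data for 5 people (not from the slides).
years = [10, 12, 14, 16, 18]
salary_thousands = [20, 40, 50, 40, 50]
salary_dollars = [1000 * s for s in salary_thousands]

cov_thousands = covariance(years, salary_thousands)
cov_dollars = covariance(years, salary_dollars)
r_thousands = correlation(years, salary_thousands)
r_dollars = correlation(years, salary_dollars)

print("covariance (thousands):", cov_thousands)
print("covariance (dollars):  ", cov_dollars)
print("r (thousands):", r_thousands)
print("r (dollars):  ", r_dollars)
```

The relationship between the two variables is identical in both versions; only the measurement scale changed.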

But, we can use covariance. Which line best fits our data? Do we just draw one that looks good? No, we can use something called “least squares regression” to find the equation of the best-fit line (“Best-fit linear regression”).

Linear Equations: $Y_i = mX_i + b$, where m = slope and b = y-intercept

Finding the Slope

Or…

Finding the y-intercept (b) After finding the slope (m), find b using:
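The slope and intercept formulas were images and are missing from this transcript. The sketch below uses the standard least-squares results, m = SP_XY / SS_X and b = M_Y − m·M_X, which may differ in notation from what the slides showed. The function name `fit_line` and the data are hypothetical.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line Y = mX + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sum of squared deviations of X.
    sp_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    slope = sp_xy / ss_x
    # The least-squares line always passes through the point (M_X, M_Y).
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical scores for 5 people (not from the slides).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

m, b = fit_line(xs, ys)
print("slope m =", m)
print("intercept b =", b)
```

For these made-up points the slope is about 0.6 and the intercept about 2.2, so each one-unit increase in X predicts roughly a 0.6 increase in Y.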

Least Squares Criterion: The best line has the property of least squares: the sum of the squared deviations of the points from the line is a minimum.

What’s the “least” again? What are we trying to minimize?
–The best-fit line will be described by the function $Y_i = mX_i + b$
–Thus, for any $X_i$, we can estimate a corresponding $Y_i$ value
–Problem: for some $X_i$’s we already have $Y_i$’s
–So, let’s call the estimated value $\hat{Y}_i$ (“Y-sub-i-hat”), to differentiate it from the “real” $Y_i$

Least Squares Criterion: For example, when $X_i = 15$ we would estimate that $\hat{Y}_i$ = 44,000. But we have a “real” $Y_i$ value corresponding to $X_i = 15$ (35,000).

Minimize this… For every $X_i$, we have a value $Y_i$ and an estimate of $Y_i$ ($\hat{Y}_i$). Consider the quantity $Y_i - \hat{Y}_i$, which is the deviation of the real score from the estimated score for any given $X_i$ value. The sum of these deviations will be zero.

But, by squaring those deviations and summing, we get $\sum (Y_i - \hat{Y}_i)^2$. We want the line that makes this quantity a minimum (the least squares criterion). This is also called the sum of squares error or SSE (how much do our estimates “err” from our real values?).
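To make the criterion concrete, the sketch below fits a least-squares line to five made-up points, checks that the raw deviations sum to zero, and compares the SSE of the fitted line with that of another line picked by eye. The helper names and the alternative line (slope 0.5, intercept 2.5) are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept (standard formulas)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    slope = sp_xy / ss_x
    return slope, mean_y - slope * mean_x


def sse(xs, ys, slope, intercept):
    """Sum of squared deviations of the real Y values from the estimated ones."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


# Hypothetical scores for 5 people (not from the slides).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

m, b = fit_line(xs, ys)

# Deviations of real scores from estimated scores: they cancel out.
residuals = [y - (m * x + b) for x, y in zip(xs, ys)]
residual_sum = sum(residuals)

sse_best = sse(xs, ys, m, b)       # least-squares line
sse_other = sse(xs, ys, 0.5, 2.5)  # a line drawn "by eye"

print("sum of deviations:", residual_sum)
print("SSE, least-squares line:", sse_best)
print("SSE, eyeballed line:    ", sse_other)
```

The sum of deviations is zero up to floating-point rounding, and the eyeballed line has a larger SSE than the fitted one.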

How accurate are our Estimates? Two ways to measure how “good” our estimates are: –Standard Error of the Estimate –Coefficient of Determination (not covered in our book, yet)

Standard Error of the Estimate: on its own, this term is very hard to interpret. (Hurrah, there are better ways to measure the goodness of the fit!)
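The formula for this slide is missing from the transcript. A common definition for simple linear regression, given here as an assumption (some texts use a different denominator), divides the SSE by n − 2 and takes the square root:

```latex
s_{est} = \sqrt{\frac{SSE}{n - 2}} = \sqrt{\frac{\sum (Y_i - \hat{Y}_i)^2}{n - 2}}
```

The result is in the units of Y, so its size depends on how Y is measured, which is one reason it is hard to interpret on its own.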

Coefficient of Determination: cd = $r^2$
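As a rough illustration, $r^2$ can be computed either by squaring the correlation coefficient or as 1 − SSE / SS_Y, the proportion of the variability in Y that the line accounts for. The sketch below checks that both routes agree on made-up data; the function name is hypothetical.

```python
def coefficient_of_determination(xs, ys):
    """Return r^2 computed two ways: from r, and from 1 - SSE / SS_Y."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)

    # Route 1: square of the correlation coefficient, r = SP_XY / sqrt(SS_X * SS_Y).
    r2_from_r = sp_xy ** 2 / (ss_x * ss_y)

    # Route 2: fit the least-squares line and compare SSE with SS_Y.
    slope = sp_xy / ss_x
    intercept = mean_y - slope * mean_x
    sse = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    r2_from_sse = 1 - sse / ss_y

    return r2_from_r, r2_from_sse


# Hypothetical scores for 5 people (not from the slides).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r2_from_r, r2_from_sse = coefficient_of_determination(xs, ys)
print("r^2 from r:  ", r2_from_r)
print("r^2 from SSE:", r2_from_sse)
```

Both values come out at about 0.6 for these points, meaning the line accounts for roughly 60% of the variability in Y.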

Now You: ID INCOME NUMDRK

Practice: ID INCOME NUMDRK XY Σ n M SS(X)

Practice: ID INCOME NUMDRK XY Σ n 5 5 M 4.4 3 SS(X) 17.2 34
