Regression Basics: Predicting a DV with a Single IV

Questions
– What are predictors and criteria?
– Write an equation for the linear regression. Describe each term.
– How do changes in the slope and intercept affect (move) the regression line?
– What does it mean to test the significance of the regression sum of squares? Of R-square?
– What is R-square?
– What does it mean to choose a regression line to satisfy the loss function of least squares?
– How do we find the slope and intercept for the regression line with a single independent variable? (Either formula for the slope is acceptable.)
– Why does testing the regression sum of squares turn out to have the same result as testing R-square?

Basic Ideas Jargon:
– IV = X = Predictor (pl. predictors)
– DV = Y = Criterion (pl. criteria)
– Regression of Y on X, e.g., GPA on SAT
Linear model: the relation between IV and DV is represented by a straight line. A score on Y has two parts: (1) a linear function of X and (2) error. In population values: Y = α + βX + ε.

Basic Ideas (2) Sample values: Y = a + bX + e. Intercept (a): the value of Y where X = 0. Slope (b): the change in Y when X changes by 1 unit; rise over run. If error is removed, we have a predicted value for each person at X (the line): Y′ = a + bX. Suppose on average houses are worth about $75 a square foot. Then the equation relating price to size would be Y′ = 0 + 75X. The predicted price for a 2,000-square-foot house would be $150,000.
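The house-price line on this slide can be sketched directly. This is a minimal illustration of Y′ = a + bX using the slide's example values (intercept 0, slope 75 dollars per square foot); the function name is ours, not from the slides.

```python
def predict_price(sqft, intercept=0.0, slope=75.0):
    """Predicted value Y' = a + b*X, with a = 0 and b = 75 from the slide's example."""
    return intercept + slope * sqft

# A 2,000-square-foot house is predicted to cost $150,000.
print(predict_price(2000))  # 150000.0
```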

Linear Transformation A 1-to-1 mapping of variables via a line. The permissible operations are addition and multiplication (interval data): add a constant, multiply by a constant.

Linear Transformation (2) Centigrade to Fahrenheit. Note the 1-to-1 map. Intercept? Slope? At 0 degrees C the temperature is 32 degrees F; at 100 degrees C it is 212 degrees F. The intercept is 32: when X (Centigrade) is 0, Y (Fahrenheit) is 32. The slope is 1.8: when Centigrade goes from 0 to 100 (run), Fahrenheit goes from 32 to 212 (rise), and 212 − 32 = 180. Then 180/100 = 1.8, rise over run, is the slope. Y = 32 + 1.8X, i.e., F = 32 + 1.8C.

Review What are predictors and criteria? Write an equation for the linear regression with 1 IV. Describe each term. How do changes in the slope and intercept affect (move) the regression line?

Regression of Weight on Height
Ht: N = 10, M = 67, SD = 4.57
Wt: N = 10, M = 150, SD = …
Correlation (r) = .94. Regression equation: Y′ = … + …X

Illustration of the Linear Model. This concept is vital! Consider Y as a deviation from the mean. Part of that deviation can be associated with X (the linear part) and part cannot (the error).

Predicted Values & Residuals
(Table: for each of the N = 10 cases, columns Ht, Wt, Y′, and Residual, with M, SD, and V for each column; the values did not survive transcription.)
These are the numbers for the linear part and the error. Note the M of Y′ and of the residuals. Note that the variance of Y is V(Y′) + V(res).

Finding the Regression Line Need to know the correlation, the SDs, and the means of X and Y. The correlation is the slope when both X and Y are expressed as z scores. To translate to raw scores, just bring back the original SDs for both: b = r(SD_Y / SD_X) (rise over run). To find the intercept, use a = M_Y − b·M_X. Suppose r = .50, SD_X = .5, M_X = 10, SD_Y = 2, M_Y = 5. Slope: b = .50(2/.5) = 2. Intercept: a = 5 − 2(10) = −15. Equation: Y′ = −15 + 2X.
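The slope and intercept formulas above need only the summary statistics, so they can be sketched in a few lines. The function name is ours; the formulas b = r·SD_Y/SD_X and a = M_Y − b·M_X are the standard ones the slide uses.

```python
def regression_from_stats(r, sd_x, m_x, sd_y, m_y):
    """Slope b = r * SD_Y / SD_X; intercept a = M_Y - b * M_X."""
    b = r * sd_y / sd_x
    a = m_y - b * m_x
    return b, a

# The slide's example: r = .50, SD_X = .5, M_X = 10, SD_Y = 2, M_Y = 5.
b, a = regression_from_stats(0.50, 0.5, 10, 2, 5)
print(b, a)  # 2.0 -15.0, so Y' = -15 + 2X
```

The same function answers the review question on a later slide (r = .25, SD_X = 1, M_X = 10, SD_Y = 2, M_Y = 5) by a single call.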

Line of Least Squares We have some points. Assume a linear relation is reasonable, so the two variables can be represented by a line. Where should the line go? Place the line so that the errors (residuals) are small. The line we calculate has a sum of errors equal to 0, and a sum of squared errors that is as small as possible; the line provides the smallest sum of squared errors, or least squares.
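The two least-squares properties just stated can be verified numerically. This sketch uses small made-up data (not from the slides): the fitted line's residuals sum to zero, and nudging the slope away from the least-squares value raises the sum of squared errors.

```python
# Hypothetical data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 2, 3, 5, 4]

n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
# Least-squares slope and intercept from the deviation cross-products.
b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
a = my - b * mx

resid = [y - (a + b * x) for x, y in zip(xs, ys)]
sse = sum(e ** 2 for e in resid)
# Perturb the slope: any other line has a larger sum of squared errors.
sse_perturbed = sum((y - (a + (b + 0.1) * x)) ** 2 for x, y in zip(xs, ys))

print(abs(sum(resid)) < 1e-9)  # True: residuals sum to zero
print(sse < sse_perturbed)     # True: least squares is the minimum
```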

Least Squares (2)

Review What does it mean to choose a regression line to satisfy the loss function of least squares? What are predicted values and residuals? Suppose r = .25, SD_X = 1, M_X = 10, SD_Y = 2, M_Y = 5. What is the regression equation (line)?

Partitioning the Sum of Squares Definitions: y = Y − M_Y, the deviation from the mean. Sum of squares: Σ(Y − M_Y)² = Σ(Y′ − M_Y)² + Σ(Y − Y′)² (the cross products drop out). Sum of squared deviations from the mean = sum of squares due to regression + sum of squared residuals. Analog: SS_tot = SS_B + SS_W in ANOVA.

Partitioning SS (2) SS_Y = SS_Reg + SS_Res. Total SS is regression SS plus residual SS. Can also get proportions of each. Can get variance by dividing SS by N if you want. Proportion of total SS due to regression = proportion of total variance due to regression = R² (R-square).
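The partition SS_Y = SS_Reg + SS_Res can be checked on data. A minimal sketch with hypothetical numbers (not the weight-on-height data from the slides): fit the line, split the total sum of squares, and confirm the two parts add back up, with R² as the regression share.

```python
# Hypothetical data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 2, 3, 5, 4]

n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
a = my - b * mx
preds = [a + b * x for x in xs]

ss_total = sum((y - my) ** 2 for y in ys)           # Σ(Y - M)²
ss_reg = sum((p - my) ** 2 for p in preds)          # Σ(Y' - M)²
ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))  # Σ(Y - Y')²

r_squared = ss_reg / ss_total  # proportion of SS due to regression
print(abs((ss_reg + ss_res) - ss_total) < 1e-9)  # True: the partition holds
```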

Partitioning SS (3)
(Table: Wt (Y), with M = 150; columns Y′ and Residual (Y − Y′); sums and variances for each column; the values did not survive transcription.)

Partitioning SS (4)
(Table: SS, variance, proportion of SS, and proportion of variance for the total, regression, and residual parts.)
R² = .88. Note that Y′ is a linear function of X.

Significance Testing Testing for the SS due to regression = testing for the variance due to regression = testing the significance of R². All are the same test. F = (SS_Reg / k) / (SS_Res / (N − k − 1)), where k is the number of IVs (here it's 1) and N is the sample size (number of people); F has k and (N − k − 1) df. An equivalent test uses R² instead of SS: F = (R² / k) / ((1 − R²) / (N − k − 1)). Results will be the same within rounding error.
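The equivalence of the two F formulas is easy to demonstrate: dividing the SS form through by SS_Y turns it into the R² form. A sketch with hypothetical sums of squares (function names and the example numbers are ours, not from the slides):

```python
def f_from_ss(ss_reg, ss_res, n, k=1):
    """F statistic from sums of squares: (SS_reg / k) / (SS_res / (n - k - 1))."""
    return (ss_reg / k) / (ss_res / (n - k - 1))

def f_from_r2(r2, n, k=1):
    """Equivalent F statistic from R-square: (R²/k) / ((1 - R²)/(n - k - 1))."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))

# Hypothetical values: SS_reg = 4.9, SS_res = 1.9, so SS_total = 6.8 and R² = 4.9/6.8.
f1 = f_from_ss(4.9, 1.9, n=5)
f2 = f_from_r2(4.9 / 6.8, n=5)
print(abs(f1 - f2) < 1e-9)  # True: the two forms agree within rounding
```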

Review What does it mean to test the significance of the regression sum of squares? R-square? What is R-square? Why does testing for the regression sum of squares turn out to have the same result as testing for R-square?