Basic linear regression and multiple regression Psych 437 - Fraley.


Basic linear regression and multiple regression Psych 437 - Fraley

Example Let’s say we wish to model the relationship between coffee consumption and happiness

Some Possible Functions

Lines. Linear relationships: Y = a + bX, where a = Y-intercept (the value of Y when X = 0) and b = slope (the "rise over the run," the steepness of the line); b is a weight. Example: Y = 1 + 2X.
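As a quick, hypothetical illustration (not part of the original slides), the short Python sketch below computes the Y values implied by the line Y = 1 + 2X:

    # Predicted values implied by the line Y = a + bX, here with a = 1 and b = 2.
    def predict_line(x, a=1, b=2):
        return a + b * x

    for x in [0, 1, 2, 3]:
        print(x, predict_line(x))  # X of 0, 1, 2, 3 gives Y of 1, 3, 5, 7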

Lines and intercepts. Consider Y = a + 2X. Notice that the implied values of Y go up as we increase a. By changing a, we are changing the elevation of the line (compare Y = 1 + 2X, Y = 3 + 2X, and Y = 5 + 2X).

Lines and slopes. Slope as "rise over run": how much of a change in Y there is given a 1-unit increase in X. For Y = 1 + 2X, as we move from X = 0 to X = 1 (the run), Y rises from 1 to 3 (a 2-unit change), so the slope is 2/1 = 2.

Lines and slopes. Notice that as we increase the slope, b, we increase the steepness of the line (compare Y = 1 + 2X and Y = 1 + 4X, plotted with coffee on the x-axis and happiness on the y-axis).

Lines and slopes. We can also have negative slopes and slopes of zero. When the slope is zero, the predicted values of Y are equal to a: Y = a + 0X = a. (The plotted coffee-happiness lines show slopes of b = 4, 2, 0, -2, and -4.)

Other functions. Quadratic function: Y = a + bX^2, where a still represents the intercept (the value of Y when X = 0) and b still represents a weight, influencing the magnitude of the squaring function.

Quadratic and intercepts. As we increase a, the elevation of the curve increases (compare Y = 0 + 1X^2 and Y = 5 + 1X^2 for coffee predicting happiness).

Quadratic and weight. When we increase the weight, b, the quadratic effect is accentuated (compare Y = 0 + 1X^2 and Y = 0 + 5X^2 for coffee predicting happiness).

Quadratic and weight. As before, we can have negative weights for quadratic functions; in this case, negative values of b flip the curve upside-down. And as before, when b = 0, the value of Y = a for all values of X. (The plotted curves are Y = 0 - 5X^2, Y = 0 - 1X^2, Y = 0 + 0X^2, Y = 0 + 1X^2, and Y = 0 + 5X^2.)

Linear & quadratic combinations. When linear and quadratic terms are present in the same equation, one can derive J-shaped curves: Y = a + b1*X + b2*X^2, where b1 is the linear weight and b2 is the quadratic weight, and each weight can be positive, zero, or negative.
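To make the shapes concrete, here is a small, hypothetical Python sketch of such a combined function; the particular coefficient values are illustrative only:

    # Y = a + b1*X + b2*X**2: a positive linear weight with a negative quadratic
    # weight rises at first and then bends back down (a diminishing-returns shape).
    def predict_quadratic(x, a=0.0, b1=1.0, b2=-0.5):
        return a + b1 * x + b2 * x ** 2

    for x in [-2, -1, 0, 1, 2]:
        print(x, predict_quadratic(x))  # -4.0, -1.5, 0.0, 0.5, 0.0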

Some terminology. When the relations between variables are expressed in this manner, we call the relevant equation(s) mathematical models. The intercept and weight values are called parameters of the model. Although one can describe the relationship between two variables in the way we have done here, from now on we'll assume that our models are causal models, such that the variable on the left-hand side of the equation is assumed to be caused by the variable(s) on the right-hand side.

Terminology. The values of Y in these models are often called predicted values, sometimes abbreviated as Y-hat or Ŷ. Why? They are the values of Y that are implied by the specific parameters of the model.

Estimation Up to this point, we have assumed that our models are correct. There are two important issues we need to deal with, however: –Assuming the basic model is correct (e.g., linear), what are the correct parameters for the model? –Is the basic form of the model correct? That is, is a linear, as opposed to a quadratic, model the appropriate model for characterizing the relationship between variables?

Estimation The process of obtaining the correct parameter values (assuming we are working with the right model) is called parameter estimation.

Parameter Estimation example Let’s assume that we believe there is a linear relationship between X and Y. Assume we have collected the following data Which set of parameter values will bring us closest to representing the data accurately?

Estimation example. We begin by picking some values, plugging them into the linear equation, and seeing how well the implied values correspond to the observed values. We can quantify what we mean by "how well" by examining the difference between the model-implied Y and the actual Y value; this difference, Y - Y-hat, is often called the error in prediction.

Estimation example Let’s try a different value of b and see what happens Now the implied values of Y are getting closer to the actual values of Y, but we’re still off by quite a bit

Estimation example Things are getting better, but certainly things could improve

Estimation example Ah, much better

Estimation example. Now that's very nice. There is a perfect correspondence between the implied values of Y and the actual values of Y.

Estimation example Whoa. That’s a little worse. Simply increasing b doesn’t seem to make things increasingly better

Estimation example Ugg. Things are getting worse again.

Parameter Estimation example Here is one way to think about what we’re doing: –We are trying to find a set of parameter values that will give us a small—the smallest—discrepancy between the predicted Y values and the actual values of Y. How can we quantify this?

Parameter Estimation example. One way to do so is to find the difference between each value of Y and the corresponding predicted value (we called these differences "errors" before), square these differences, and average them together.
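In standard notation, that average of squared differences can be written as:

    \text{error variance} = \frac{1}{n} \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2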

Parameter Estimation example. The form of this equation should be familiar. Notice that it represents a kind of average of squared deviations. This average is often called the error variance.

Parameter Estimation example. In estimating the parameters of our model, we are trying to find a set of parameters that minimizes the error variance. In other words, we want the error variance to be as small as it possibly can be. The process of finding this minimum value is called least-squares estimation.

Parameter Estimation example. In this graph I have plotted the error variance as a function of the different parameter values we chose for b. Notice that our error was large at first (at b = -2), but got smaller as we made b larger. Eventually, the error reached a minimum when b = 2 and then began to increase again as we made b larger.

Parameter Estimation example. The minimum in this example occurred when b = 2. This is the "best" value of b, when we define "best" as the value that minimizes the error variance. There is no other value of b that will make the error smaller. (0 is as low as you can go.)
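A minimal Python sketch of this kind of search, using made-up data generated from Y = 2 + 2X with no noise, so the error bottoms out at exactly b = 2:

    # Sweep candidate slope values and compute the error variance for each,
    # holding the intercept fixed at a = 2 (the illustrative data follow Y = 2 + 2X).
    xs = [0, 1, 2, 3, 4]
    ys = [2 + 2 * x for x in xs]

    def error_variance(a, b):
        return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)

    for b in [-2, -1, 0, 1, 2, 3, 4]:
        print(b, error_variance(2, b))  # shrinks toward 0 at b = 2, then grows again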

Ways to estimate parameters. The method we just used is sometimes called the brute force or gradient descent method of estimating parameters. More formally, gradient descent involves starting with a viable parameter value, calculating the error using a slightly different value, moving the best-guess parameter value in the direction of the smallest error, and then repeating this process until the error is as small as it can be. Analytic methods: with simple linear models, the equations are so simple that brute force methods are unnecessary.
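A rough sketch of the gradient-descent idea on the same made-up data, adjusting only the slope b with a fixed intercept and a fixed step size (both arbitrary choices for illustration):

    # Repeatedly nudge b in whichever direction lowers the error variance.
    xs = [0, 1, 2, 3, 4]
    ys = [2 + 2 * x for x in xs]

    def error_variance(a, b):
        return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)

    b, step = -2.0, 0.01
    for _ in range(2000):
        # approximate the slope of the error curve at the current value of b
        grad = (error_variance(2, b + 1e-6) - error_variance(2, b - 1e-6)) / 2e-6
        b -= step * grad
    print(round(b, 3))  # converges to 2.0, the least-squares slope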

Analytic least-squares estimation Specifically, one can use calculus to find the values of a and b that will minimize the error function

Analytic least-squares estimation. When this is done (we won't actually do the calculus here), we obtain the following equations:
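In standard notation, writing r_XY for the correlation between X and Y, s_X and s_Y for their standard deviations, and bars for their means, the least-squares solutions for simple linear regression are:

    b = r_{XY} \frac{s_Y}{s_X}, \qquad a = \bar{Y} - b\,\bar{X}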

Analytic least-squares estimation. Thus, we can easily find the least-squares estimates of a and b from simple knowledge of (1) the correlation between X and Y, (2) the SDs of X and Y, and (3) the means of X and Y.

A neat fact. Notice what happens when X and Y are in standard score form: both means are 0 and both SDs are 1. Thus, the least-squares intercept is a = 0 and the slope b is simply the correlation, r.

In the parameter estimation example, we dealt with a situation in which a linear model of the form Y = 2 + 2X perfectly accounted for the data. (That is, there was no discrepancy between the values implied by the model and the actual data.) Even when this is not the case (i.e., when the model doesn’t explain the data perfectly), we can still find least squares estimates of the parameters.

Error Variance In this example, the value of b that minimizes the error variance is also 2. However, even when b = 2, there are discrepancies between the predictions entailed by the model and the actual data values. Thus, the error variance becomes not only a way to estimate parameters, but a way to evaluate the basic model itself.

R-squared. In short, when the model is a good representation of the relationship between Y and X, the error variance of the model should be relatively low. This is typically quantified by an index called the multiple R, or the squared version of it, R^2.

R-squared. R-squared represents the proportion of the variance in Y that is accounted for by the model. When the model doesn't do any better than guessing the mean, R^2 will equal zero. When the model is perfect (i.e., it accounts for the data perfectly), R^2 will equal 1.00.
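In the notation used above for the error variance, one standard way of writing this is:

    R^2 = 1 - \frac{\sum_i (Y_i - \hat{Y}_i)^2}{\sum_i (Y_i - \bar{Y})^2}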

Neat fact. When dealing with a simple linear model with one X, R^2 is equal to the correlation of X and Y, squared. Why? Keep in mind that R^2 is in a standardized metric in virtue of having divided the error variance by the variance of Y. Previously, when working with standardized scores in simple linear regression equations, we found that the parameter b is equal to r. Since b is estimated via least-squares techniques, it is directly related to R^2.

Why is R^2 useful? R^2 is useful because it is a standard metric for interpreting model fit. It doesn't matter how large the variance of Y is, because everything is evaluated relative to the variance of Y, and it has set end-points: 1 is perfect and 0 is as bad as a model can be.

Multiple Regression In many situations in personality psychology we are interested in modeling Y not only as a function of a single X variable, but potentially many X variables. Example: We might attempt to explain variation in academic achievement as a function of SES and maternal education.

Y = a + b1*SES + b2*MATEDU. Notice that "adding" a new variable to the model is simple. This equation states that academic achievement is a function of at least two things, SES and MATEDU.

However, what the regression coefficients now represent is not merely the change in Y expected given a 1-unit increase in X. They represent the change in Y given a 1-unit change in X, holding the other variables in the equation constant. In other words, these coefficients are kind of like partial correlations (technically, they are related to semi-partial correlations). We're statistically controlling for SES when estimating the effect of MATEDU.
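As a hypothetical sketch (the data below are invented purely for illustration and are not the course data), this is how such a model could be fit in Python, with numpy's least-squares routine standing in for SPSS:

    import numpy as np

    # Invented scores for six students: SES, maternal education, and achievement.
    ses     = np.array([1.0, 2.0, 2.0, 3.0, 3.0, 4.0])
    matedu  = np.array([10.0, 12.0, 14.0, 12.0, 16.0, 18.0])
    achieve = np.array([3.0, 5.0, 6.0, 6.0, 8.0, 10.0])

    # Design matrix: a column of 1s for the intercept a, then the two predictors.
    X = np.column_stack([np.ones_like(ses), ses, matedu])
    coefs, *_ = np.linalg.lstsq(X, achieve, rcond=None)
    a, b_ses, b_matedu = coefs
    print(a, b_ses, b_matedu)  # least-squares estimates of a, b1 (SES), b2 (MATEDU)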

Estimating regression coefficients in SPSS. [SPSS output: correlation matrix among SES, MATEDU, and ACHIEVEG5.]

Note: The regression parameter estimates are in the column labeled B. Constant = a = intercept

Achievement = *MATEDU +.539*SES

These parameter estimates imply that moving up one unit on SES leads to a 1.4 unit increase on achievement. Moreover, moving up 1 unit in maternal education corresponds to a half-unit increase in achievement.

Does this mean that Maternal Education matters more than SES in predicting educational achievement? Not necessarily. As it stands, the two variables might be on very different metrics. (Perhaps MATEDU ranges from 0 to 20 and SES ranges from 0 to 4.) To evaluate their relative contributions to Y, one can standardize both variables or examine standardized regression coefficients.
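One simple way to obtain standardized coefficients is to z-score every variable before fitting, as in this sketch (continuing the invented data from the earlier example):

    import numpy as np

    def zscore(v):
        return (v - v.mean()) / v.std()

    ses     = np.array([1.0, 2.0, 2.0, 3.0, 3.0, 4.0])
    matedu  = np.array([10.0, 12.0, 14.0, 12.0, 16.0, 18.0])
    achieve = np.array([3.0, 5.0, 6.0, 6.0, 8.0, 10.0])

    # With every variable z-scored, all coefficients are in SD units and the
    # intercept is exactly zero, so no intercept column is needed.
    Xz = np.column_stack([zscore(matedu), zscore(ses)])
    betas, *_ = np.linalg.lstsq(Xz, zscore(achieve), rcond=None)
    print(betas)  # standardized coefficients for MATEDU and SES, respectively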

Z(Achievement) = *Z(MATEDU) +.118*Z(SES)

The multiple R and the R-squared for the full model are listed here. This particular model explains 14% of the variance in academic achievement.

Adding SES*SES (SES2) improves R-squared by about 1%. These parameters suggest that higher SES predicts higher achievement, but in a limiting way: there are diminishing returns at the high end of SES.
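In the regression sketches above, the quadratic term is just one more column in the design matrix; a hypothetical example, again with the invented data:

    import numpy as np

    ses     = np.array([1.0, 2.0, 2.0, 3.0, 3.0, 4.0])
    matedu  = np.array([10.0, 12.0, 14.0, 12.0, 16.0, 18.0])
    achieve = np.array([3.0, 5.0, 6.0, 6.0, 8.0, 10.0])

    # Add SES*SES as an extra predictor; a negative weight on it would indicate
    # diminishing returns at the high end of SES.
    X = np.column_stack([np.ones_like(ses), matedu, ses, ses ** 2])
    coefs, *_ = np.linalg.lstsq(X, achieve, rcond=None)
    print(coefs)  # a, b1 (MATEDU), b2 (SES), b3 (SES*SES)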

[Table: Y-hat computed as a + B1*MATEDU + B2*SES + B3*SES*SES for SES values of -2, -1, 0, 1, and 2, with MATEDU held constant.]

[Plot: predicted Z(Achievement) as a function of Z(SES).]