Multiple Regression 2 Sociology 5811 Lecture 23 Copyright © 2005 by Evan Schofer Do not copy or distribute without permission.

Slides:



Advertisements
Similar presentations
CORRELATION. Overview of Correlation u What is a Correlation? u Correlation Coefficients u Coefficient of Determination u Test for Significance u Correlation.
Advertisements

The Regression Equation  A predicted value on the DV in the bi-variate case is found with the following formula: Ŷ = a + B (X1)
Dummy Variables Dummy variables refers to the technique of using a dichotomous variable (coded 0 or 1) to represent the separate categories of a nominal.
Irwin/McGraw-Hill © Andrew F. Siegel, 1997 and l Chapter 12 l Multiple Regression: Predicting One Factor from Several Others.
Soc 3306a Lecture 6: Introduction to Multivariate Relationships Control with Bivariate Tables Simple Control in Regression.
Multiple Regression Fenster Today we start on the last part of the course: multivariate analysis. Up to now we have been concerned with testing the significance.
Multiple Regression [ Cross-Sectional Data ]
Some Terms Y =  o +  1 X Regression of Y on X Regress Y on X X called independent variable or predictor variable or covariate or factor Which factors.
CORRELATION. Overview of Correlation u What is a Correlation? u Correlation Coefficients u Coefficient of Determination u Test for Significance u Correlation.
Statistics for the Social Sciences
Multiple Regression Involves the use of more than one independent variable. Multivariate analysis involves more than one dependent variable - OMS 633 Adding.
Department of Applied Economics National Chung Hsing University
Intro to Statistics for the Behavioral Sciences PSYC 1900 Lecture 7: Interactions in Regression.
Lecture 27 Polynomial Terms for Curvature Categorical Variables.
(Correlation and) (Multiple) Regression Friday 5 th March (and Logistic Regression too!)
Ch. 14: The Multiple Regression Model building
Multiple Regression 1 Sociology 8811 Copyright © 2007 by Evan Schofer Do not copy or distribute without permission.
Linear Regression 2 Sociology 5811 Lecture 21 Copyright © 2005 by Evan Schofer Do not copy or distribute without permission.
Relationships Among Variables
Review for Final Exam Some important themes from Chapters 9-11 Final exam covers these chapters, but implicitly tests the entire course, because we use.
Multiple Linear Regression A method for analyzing the effects of several predictor variables concurrently. - Simultaneously - Stepwise Minimizing the squared.
Dummies (no, this lecture is not about you) POL 242 Renan Levine February 13/15, 2007.
DUMMY VARIABLES BY HARUNA ISSAHAKU Haruna Issahaku.
Structural Equation Models – Path Analysis
Chapter 8: Bivariate Regression and Correlation
Multiple Regression. In the previous section, we examined simple regression, which has just one independent variable on the right side of the equation.
1 MF-852 Financial Econometrics Lecture 9 Dummy Variables, Functional Form, Trends, and Tests for Structural Change Roy J. Epstein Fall 2003.
Moderation & Mediation
Lecture 3-3 Summarizing r relationships among variables © 1.
Multiple Regression 1 Sociology 5811 Lecture 22 Copyright © 2005 by Evan Schofer Do not copy or distribute without permission.
Linear Functions 2 Sociology 5811 Lecture 18 Copyright © 2004 by Evan Schofer Do not copy or distribute without permission.
Regression Analysis. Scatter plots Regression analysis requires interval and ratio-level data. To see if your data fits the models of regression, it is.
Chapter 18 Four Multivariate Techniques Angela Gillis & Winston Jackson Nursing Research: Methods & Interpretation.
Soc 3306a Multiple Regression Testing a Model and Interpreting Coefficients.
Correlation and Linear Regression. Evaluating Relations Between Interval Level Variables Up to now you have learned to evaluate differences between the.
Sociology 5811: Lecture 14: ANOVA 2
Multiple Regression 5 Sociology 5811 Lecture 26 Copyright © 2005 by Evan Schofer Do not copy or distribute without permission.
Interactions POL 242 Renan Levine March 13/15, 2007.
Multiple Regression 3 Sociology 5811 Lecture 24 Copyright © 2005 by Evan Schofer Do not copy or distribute without permission.
Multiple Linear Regression. Purpose To analyze the relationship between a single dependent variable and several independent variables.
Multiple Regression. The test you choose depends on level of measurement: Independent VariableDependentVariableTest DichotomousInterval-Ratio Independent.
Ordinary Least Squares Estimation: A Primer Projectseminar Migration and the Labour Market, Meeting May 24, 2012 The linear regression model 1. A brief.
Multiple Regression 4 Sociology 5811 Lecture 25 Copyright © 2005 by Evan Schofer Do not copy or distribute without permission.
MGS3100_04.ppt/Sep 29, 2015/Page 1 Georgia State University - Confidential MGS 3100 Business Analysis Regression Sep 29 and 30, 2015.
Multiple Regression. Multiple Regression  Usually several variables influence the dependent variable  Example: income is influenced by years of education.
Lecture 15: Crosstabulation 1 Sociology 5811 Copyright © 2005 by Evan Schofer Do not copy or distribute without permission.
Review. POL 242 – Strong Correlation. Positive or Negative?
Chapter 16 Data Analysis: Testing for Associations.
Chapter 13 Multiple Regression
11/23/2015Slide 1 Using a combination of tables and plots from SPSS plus spreadsheets from Excel, we will show the linkage between correlation and linear.
University of Warwick, Department of Sociology, 2012/13 SO 201: SSAASS (Surveys and Statistics) (Richard Lampard) Week 5 Multiple Regression.
Multiple Regression Review Sociology 229A Copyright © 2008 by Evan Schofer Do not copy or distribute without permission.
Sociology 5811: Lecture 3: Measures of Central Tendency and Dispersion Copyright © 2005 by Evan Schofer Do not copy or distribute without permission.
Examining Relationships in Quantitative Research
1 Multivariable Modeling. 2 nAdjustment by statistical model for the relationships of predictors to the outcome. nRepresents the frequency or magnitude.
Regression Analysis: Part 2 Inference Dummies / Interactions Multicollinearity / Heteroscedasticity Residual Analysis / Outliers.
Correlation They go together like salt and pepper… like oil and vinegar… like bread and butter… etc.
Copyright © 2011 by The McGraw-Hill Companies, Inc. All rights reserved. McGraw-Hill/Irwin Simple Linear Regression Analysis Chapter 13.
DTC Quantitative Research Methods Regression I: (Correlation and) Linear Regression Thursday 27 th November 2014.
Multiple Regression Learning Objectives n Explain the Linear Multiple Regression Model n Interpret Linear Multiple Regression Computer Output n Test.
Biostatistics Regression and Correlation Methods Class #10 April 4, 2000.
Regression Analysis: A statistical procedure used to find relations among a set of variables B. Klinkenberg G
University of Warwick, Department of Sociology, 2014/15 SO 201: SSAASS (Surveys and Statistics) (Richard Lampard) Week 6 Regression: ‘Loose Ends’
Chapter 8 Multivariate Regression Analysis 8.3 Multiple Regression with K Independent Variables 8.4 Significance tests of Parameters.
Linear Regression 1 Sociology 5811 Lecture 19 Copyright © 2005 by Evan Schofer Do not copy or distribute without permission.
University of Warwick, Department of Sociology, 2014/15 SO 201: SSAASS (Surveys and Statistics) (Richard Lampard)   Week 5 Multiple Regression  
The Correlation Coefficient (r)
Regression 1 Sociology 8811 Copyright © 2007 by Evan Schofer
Multiple Regression 1 Sociology 8811 Copyright © 2007 by Evan Schofer
The Correlation Coefficient (r)
Presentation transcript:

Multiple Regression 2 Sociology 5811 Lecture 23 Copyright © 2005 by Evan Schofer Do not copy or distribute without permission

Announcements Today: More multivariate regression

The Multiple Regression Model A two-independent variable regression model: Note: There are now two X variables And a slope (b) is estimated for each one The full multiple regression model is: For k independent variables

Multiple Regression: Slopes Regression slope for the two variable case: b 1 = slope for X 1 – controlling for the other independent variable X 2 b 2 is computed symmetrically. Swap X 1 s, X 2 s Compare to bivariate slope:

Regression Slopes So, if two variables (X 1, X 2 ) are correlated and both predict Y: The X variable that is more correlated with Y will have a higher slope in multivariate regression –The slope of the less-correlated variable will shrink Thus, slopes for each variable are adjusted to how well the other variable predicts Y –It is the slope “controlling” for other variables

Standardized Regression Coefficients Standardized Coefficients –Also called “Betas” or Beta Weights” –Symbol: Greek b with asterisk:  * –Equivalent to Z-scoring (standardizing) all independent variables before doing the regression Formula of coeficient for X j : Result: The unit is standard deviations Betas: Indicates the effect a 1 standard deviation change in X j on Y

R-Square in Multiple Regression Example: R-square of.272 indicates that education, parents wealth explain 27% of variance in job prestige “Adjusted R-square” is a more conservative, more accurate measure in multiple regression –Generally, you should report Adjusted R-square.

Dummy Variables Question: How can we incorporate nominal variables (e.g., race, gender) into regression? Option 1: Analyze each sub-group separately –Generates different slope, constant for each group Option 2: Dummy variables –“Dummy” = a dichotomous variables coded to indicate the presence or absence of something –Absence coded as zero, presence coded as 1.

Dummy Variables Strategy: Create a separate dummy variable for all nominal categories Ex: Gender – make female & male variables –DFEMALE: coded as 1 for all women, zero for men –DMALE: coded as 1 for all men Next: Include all but one dummy variables into a multiple regression model If two dummies, include 1; If 5 dummies, include 4.

Dummy Variables Question: Why can’t you include DFEMALE and DMALE in the same regression model? Answer: They are perfectly correlated (negatively): r = -1 –Result: Regression model “blows up” For any set of nominal categories, a full set of dummies contains redundant information –DMALE and DFEMALE contain same information –Dropping one removes redundant information.

Dummy Variables: Interpretation Consider the following regression equation: Question: What if the case is a male? Answer: DFEMALE is 0, so the entire term becomes zero. –Result: Males are modeled using the familiar regression model: a + b 1 X + e.

Dummy Variables: Interpretation Consider the following regression equation: Question: What if the case is a female? Answer: DFEMALE is 1, so b 2 (1) stays in the equation (and is added to the constant) –Result: Females are modeled using a different regression line: (a+b 2 ) + b 1 X + e –Thus, the coefficient of b 2 reflects difference in the constant for women.

Dummy Variables: Interpretation Remember, a different constant generates a different line, either higher or lower –Variable: DFEMALE (women = 1, men = 0) –A positive coefficient (b) indicates that women are consistently higher compared to men (on dep. var.) –A negative coefficient indicated women are lower Example: If DFEMALE coeff = 1.2: –“Women are on average 1.2 points higher than men”.

Dummy Variables: Interpretation Visually: Women = blue, Men = red INCOME HAPPY Overall slope for all data points Note: Line for men, women have same slope… but one is high other is lower. The constant differs! If women=1, men=0: The constant (a) reflects men only. Dummy coefficient (b) reflects increase for women (relative to men)

Dummy Variables What if you want to compare more than 2 groups? Example: Race –Coded 1=white, 2=black, 3=other (like GSS) Make 3 dummy variables: –“DWHITE” is 1 for whites, 0 for everyone else –“DBLACK” is 1 for Af. Am., 0 for everyone else –“DOTHER” is 1 for “others”, 0 for everyone else Then, include two of the three variables in the multiple regression model.

Dummy Variables: Interpretation Ex: Job Prestige Negative coefficient for DBLACK indicates a lower level of job prestige compared to whites –T- and P-values indicate if difference is significant.

Dummy Variables: Interpretation Comments: 1. Dummy coefficients shouldn’t be called slopes –Referring to the “slope” of gender doesn’t make sense –Rather, it is the difference in the constant (or “level”) 2. The contrast is always with the nominal category that was left out of the equation –If DFEMALE is included, the contrast is with males –If DBLACK, DOTHER are included, coefficients reflect difference in constant compared to whites.

Interaction Terms Question: What if you suspect that a variable has a totally different slope for two different sub- groups in your data? Example: Income and Happiness –Perhaps men are more materialistic -- an extra dollar increases their happiness a lot –If women are less materialistic, each dollar has a smaller effect on income (compared to men) Issue isn’t men = “more” or “less” than women –Rather, the slope of a variable (income) differs across groups

Interaction Terms Issue isn’t men = “more” or “less” than women –Rather, the slope of a variable coefficient (for income) differs across groups Again, we want to specify a different regression line for each group –We want lines with different slopes, not parallel lines that are higher or lower.

Interaction Terms Visually: Women = blue, Men = red INCOME HAPPY Overall slope for all data points Note: Here, the slope for men and women differs. The effect of income on happiness (X1 on Y) varies with gender (X2). This is called an “interaction effect”

Interaction Terms Interaction effects: Differences in the relationship (slope) between two variables for each category of a third variable Option #1: Analyze each group separately Option #2: Multiply the two variables of interest: (DFEMALE, INCOME) to create a new variable –Called: DFEMALE*INCOME –Add that variable to the multiple regression model.

Interaction Terms Consider the following regression equation: Question: What if the case is male? Answer: DFEMALE is 0, so b 2 (DFEM*INC) drops out of the equation –Result: Males are modeled using the ordinary regression equation: a + b 1 X + e.

Interaction Terms Consider the following regression equation: Question: What if the case is male? Answer: DFEMALE is 1, so b 2 (DFEM*INC) becomes b 2 *INCOME, which is added to b 1 –Result: Females are modeled using a different regression line: a + (b 1 +b 2 ) X + e –Thus, the coefficient of b 2 reflects difference in the slope of INCOME for women.

Interaction Terms Interpreting interaction terms: A positive b for DFEMALE*INCOME indicates the slope for income is higher for women vs. men –A negative effect indicates the slope is lower –Size of coefficient indicates actual difference in slope Example: DFEMALE*INCOME. Observed b’s: –Income: b =.5 –DFEMALE * INCOME: b = -.2 Interpretation: Slope is.5 for men,.3 for women.

Interaction Terms Continuous variable can also interact Example: Effect of education and income on happiness –Perhaps highly educated people are less materialistic –As education increases, the slope between between income and happiness would decrease Simply multiply Education and Income to create the interaction term “EDUCATION*INCOME” –And add it to the model

Interaction Terms How do you interpret continuous variable interactions? Example: EDUCATION*INCOME: Coefficient = 2.0 Answer: For each unit change in education, the slope of income vs. happiness increases by 2 –Note: coefficient is symmetrical: For each unit change in income, education slope increases by 2 –Dummy interactions result in slopes for each group –Continuous interactions result in many slopes Each category of education*income has a different slope.

Interaction Terms Comments: 1. If you make an interaction you should also include the component variables in the model: –A model with “DFEMALE * INCOME” should also include DFEMALE and INCOME –There is some debate on this issue… but that is the safest course of action 2. Sometimes interaction terms are highly correlated with its components Watch out for that.

Interaction Terms Question: Can you think of examples of two variables that might interact? Either from your final project? Or anything else?