Multiple Regression in SPSS GV917

Multiple Regression

Multiple regression involves more than one predictor variable. For example, in the turnout model:

$Y_i = a + b_1 X_{i1} + b_2 X_{i2} + e_i$

If $\hat{Y}_i = a + b_1 X_{i1} + b_2 X_{i2}$, then $Y_i - \hat{Y}_i = e_i$, where:

$Y_i$ is the observed value of Reported Turnout
$X_{i1}$ is the observed value of Actual Turnout
$X_{i2}$ is the Effective Number of Parties Index
$a$ is the intercept and the $b_j$ are the slope coefficients of the relationships between Reported Turnout and Actual Turnout, and between Reported Turnout and the Effective Number of Parties
$\hat{Y}_i$ is the predicted value of Reported Turnout from the linear relationship with Actual Turnout and the Effective Number of Parties
$e_i$ is the residual or error term
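The relationship between the predicted value and the residual can be sketched in a few lines of Python. The coefficients and the single observation below are hypothetical numbers chosen only to make the arithmetic easy to follow; they are not estimates from the turnout data.

```python
def predict(a, b1, b2, x1, x2):
    """Predicted value: Y-hat = a + b1*X1 + b2*X2."""
    return a + b1 * x1 + b2 * x2


def residual(y, y_hat):
    """Residual: e = Y - Y-hat."""
    return y - y_hat


# Hypothetical coefficients (NOT taken from the SPSS output)
a, b1, b2 = 30.0, 0.6, -2.0

# Hypothetical country: actual turnout 80, 3 effective parties, reported turnout 75
actual_turnout, n_parties, reported_turnout = 80.0, 3.0, 75.0

y_hat = predict(a, b1, b2, actual_turnout, n_parties)  # 30 + 48 - 6
e = residual(reported_turnout, y_hat)

print(round(y_hat, 2), round(e, 2))
```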

Add an Effective Number of Parties Index to the Turnout Model

This measure was devised by Laakso and Taagepera (Comparative Political Studies, 1979). It is designed to summarize the degree of fragmentation of the party system in a country. It is defined as:

$\frac{1}{\sum_v (P_v)^2}$

where $P_v$ is each party's proportion of the total vote.

Two Examples

Suppose there is a two party system in a country and the votes are shared 60% to 40%. This is not a fragmented system, so that:

$\frac{1}{\sum_v (P_v)^2} = \frac{1}{(0.60)^2 + (0.40)^2} = 1.92$

Intuitively this means that the party system contains 1.92 'equally sized' parties.

But suppose in the country next door the vote is divided among four parties as follows: 35%, 30%, 20%, 15%. This is much more fragmented:

$\frac{1}{\sum_v (P_v)^2} = \frac{1}{(0.35)^2 + (0.30)^2 + (0.20)^2 + (0.15)^2} = 3.64$

In this case there are 3.64 'equally sized' parties.
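The two worked examples can be checked with a short Python function. This is a minimal sketch: the function name and the check that the shares sum to one are our own additions, not part of the original slides.

```python
def effective_number_of_parties(vote_shares):
    """Index: 1 divided by the sum of squared vote proportions.

    vote_shares: each party's proportion of the total vote (should sum to 1).
    """
    if abs(sum(vote_shares) - 1.0) > 1e-6:
        raise ValueError("vote shares should sum to 1")
    return 1.0 / sum(p * p for p in vote_shares)


two_party = effective_number_of_parties([0.60, 0.40])
four_party = effective_number_of_parties([0.35, 0.30, 0.20, 0.15])

print(round(two_party, 2))   # 1.92
print(round(four_party, 2))  # 3.64
```

With perfectly equal shares the index simply returns the number of parties, which is why it can be read as a count of 'equally sized' parties.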

[Table: Country, Reported Turnout, Actual Turnout and Effective No. of Parties for 21 countries: Austria, Belgium, Switzerland, Czech Republic, Germany, Denmark, Spain, Finland, France, Britain, Greece, Hungary, Ireland, Israel, Italy, Luxembourg, Netherlands, Norway, Poland, Portugal, Slovenia]

Reported Turnout Regression with Two Predictors

Why this effect?

Note that the fragmentation of parties tends to reduce reported turnout. This effect has been attributed to information processing costs: if the average citizen has to choose among a lot of alternatives before voting, this raises the costs of voting and so reduces turnout.

The parties effect is independent of the actual turnout effect, since in multiple regression we identify the effect of one predictor controlling for all other predictors.

In the Turnout model we are fitting a regression plane to a Three Dimensional Scattergram

How Does Controlling Work?

Step One: Regress Reported Turnout on the Effective Number of Parties:

$Y_i = a + b_1 X_{i2} + v_i$

Note that $v_i$ represents the variation in Reported Turnout NOT accounted for by the Effective Number of Parties. We have removed the number of parties as an influence on Reported Turnout.

Step Two: Regress Actual Turnout on the Effective Number of Parties:

$X_{i1} = a + b_2 X_{i2} + u_i$

Thus $u_i$ represents the variation in Actual Turnout NOT accounted for by the Effective Number of Parties. We have removed the number of parties as an influence on Actual Turnout.

In these two auxiliary regressions $a$, $b_1$ and $b_2$ are simply the intercept and slope of each separate regression; they are not the same numbers as the coefficients of the multiple regression model.

Controlling in Multiple Regression

Step Three: In the multiple regression model

$Y_i = a + b_1 X_{i1} + b_2 X_{i2} + e_i$

$b_1$, the effect of Actual Turnout on Reported Turnout, can be found by regressing the residuals $v_i$ on the residuals $u_i$, because both are independent of the Effective Number of Parties. This is in effect what multiple regression does.

[Diagram labels: Actual Turnout, Effective Number of Parties, Reported Turnout]

Controlling in Regression

In this model we are regressing the residuals of Reported Turnout ($v_i$) on the residuals of Actual Turnout ($u_i$), both taken from the regressions on the Effective Number of Parties. This produces the same regression coefficient (0.636) as in the earlier multiple regression model.
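The three steps can be reproduced in plain Python to see that the residual-on-residual slope matches the multiple regression slope. This is a sketch under stated assumptions: the data are synthetic (generated with a fixed random seed, not the 21-country turnout table), and the helper names are our own. The two-predictor slopes are obtained from the standard normal equations for centred variables.

```python
import random


def mean(xs):
    return sum(xs) / len(xs)


def simple_ols(y, x):
    """Return (intercept, slope) from regressing y on a single predictor x."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sxy / sxx
    return my - slope * mx, slope


def residuals(y, x):
    """Residuals of y after regressing it on x."""
    a, b = simple_ols(y, x)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


def two_predictor_slopes(y, x1, x2):
    """Slopes (b1, b2) of y on x1 and x2, solved from the normal equations."""
    m1, m2, my = mean(x1), mean(x2), mean(y)
    s11 = sum((p - m1) ** 2 for p in x1)
    s22 = sum((q - m2) ** 2 for q in x2)
    s12 = sum((p - m1) * (q - m2) for p, q in zip(x1, x2))
    s1y = sum((p - m1) * (w - my) for p, w in zip(x1, y))
    s2y = sum((q - m2) * (w - my) for q, w in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    return (s1y * s22 - s2y * s12) / det, (s2y * s11 - s1y * s12) / det


# Synthetic data standing in for the turnout table (NOT the real values)
random.seed(0)
n = 21
parties = [random.uniform(2, 6) for _ in range(n)]
actual = [85 - 3 * p + random.gauss(0, 5) for p in parties]
reported = [20 + 0.7 * t - 1.5 * p + random.gauss(0, 3)
            for t, p in zip(actual, parties)]

v = residuals(reported, parties)   # Step One
u = residuals(actual, parties)     # Step Two
_, b1_partial = simple_ols(v, u)   # Step Three: regress v on u
b1_full, _ = two_predictor_slopes(reported, actual, parties)

print(b1_partial, b1_full)  # the two slopes agree up to rounding error
```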

Another Look at ANOVA and the F test in Multiple Regression

The F test compares the Regression Mean Square with the Residual Mean Square. If F has a high value, the regression explains a lot more variation than is left unexplained. If it has a low value, the regression explains very little variation.

The theoretical F distribution gives the probability that the F statistic will take on a value of a given size or larger if the Null Hypothesis (the regression explains nothing) is correct.

F Test in Multiple Regression

$\text{Regression Mean Square} = \frac{\text{Regression Sum of Squares}}{\text{Degrees of Freedom}}$, with 2 degrees of freedom

$\text{Residual Mean Square} = \frac{\text{Residual Sum of Squares}}{\text{Degrees of Freedom}}$, with 18 degrees of freedom

$F = \frac{\text{Regression Mean Square}}{\text{Residual Mean Square}} = 55.86$

What are Degrees of Freedom? They are usable bits of information

Total: If we had one observation we could not say anything about the total variation; we need more than one case. This is why the degrees of freedom, or usable bits of information, are n-1, or 20 (given 21 cases).

Residual: If we had two observations we could fit the regression line in a bivariate model, since two points define a straight line, but there would be no residuals because the line would fit perfectly. In a three variable model we would need three observations to fit the regression plane, since it sits in a three dimensional space, and again there would be no residuals. So the residuals have n-3 degrees of freedom, or 18.

Explained: Since Total Variation = Explained Variation + Residual Variation, the degrees of freedom divide up in the same way:

Explained degrees of freedom = (n-1) - (n-3) = 2

The F test

F = Regression Mean Square / Residual Mean Square follows an F distribution. Even if we start by assuming that the regression explains nothing, the F ratio will not be exactly zero, because by chance we might get a small positive value.

The F distribution maps the probability that a ratio of a given size will occur if the regression actually explains nothing. The larger the value of F, the smaller the likelihood that it will occur by chance if the regression explains nothing.

In this case the probability of an F of 55.86 occurring due to chance is much smaller than 0.05, so we can say that the F statistic is significant at the 0.05 level.
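The degrees of freedom and the significance of F = 55.86 can be checked with a small Python sketch. Two assumptions should be flagged: the sums of squares in the last lines are hypothetical (the values from the SPSS output are not reproduced here), and the p-value function uses a closed form that holds only when the numerator has exactly 2 degrees of freedom, as in this model. For other models a statistics library would be needed.

```python
def f_statistic(ss_regression, df_regression, ss_residual, df_residual):
    """F = Regression Mean Square / Residual Mean Square."""
    return (ss_regression / df_regression) / (ss_residual / df_residual)


def f_pvalue_two_numerator_df(f, df_residual):
    """P(F >= f) for an F distribution with 2 and df_residual degrees of freedom.

    Closed form valid ONLY when the numerator degrees of freedom equal 2.
    """
    return (1.0 + 2.0 * f / df_residual) ** (-df_residual / 2.0)


n_cases, n_predictors = 21, 2
df_total = n_cases - 1                      # 20
df_residual = n_cases - n_predictors - 1    # 18
df_regression = df_total - df_residual      # 2

p_value = f_pvalue_two_numerator_df(55.86, df_residual)
print(df_regression, df_residual, p_value < 0.05)

# Hypothetical sums of squares, only to show how F is assembled
f_example = f_statistic(200.0, df_regression, 90.0, df_residual)  # 100 / 5
```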

The F Distribution – (named after Ronald Fisher)

Another Model: Explaining Happiness in the ESS 2002 Dataset

[SPSS frequency table for the variable happy ("How happy are you"), with columns Frequency, Percent, Valid Percent and Cumulative Percent. Valid values run from 0 (Extremely unhappy) up to Extremely happy; missing categories are Refusal, Don't know and No answer.]

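Income Scale in the European Social Survey 2002

[SPSS frequency table for the variable hinctnt ("Household's total net income, all sources"), with columns Frequency, Percent, Valid Percent and Cumulative Percent. Valid categories, starting from code 1, are lettered J, R, C, M, F, S, K, P, D, H, U, N; missing categories are Refusal, Don't know and No answer.]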

Does Money Buy Happiness?

[SPSS regression output. Model Summary: R, R Square, Adjusted R Square, Std. Error of the Estimate. ANOVA: Sum of Squares, df, Mean Square, F, Sig. for Regression, Residual and Total. Coefficients: Unstandardized Coefficients (B, Std. Error), Standardized Coefficients (Beta), t, Sig. for (Constant) and income. Predictors: (Constant), income. Dependent Variable: happy ("How happy are you").]

Is the Specification Correct?

Perhaps we should use a Quadratic Version of the Income Variable

    * Calculating Quadratic Functions in the ESS.
    compute income = hinctnt.
    compute incomsq = hinctnt*hinctnt.

Here incomsq is the square of the hinctnt (household income) variable. If we use incomsq in the model in addition to income, this captures a non-linear relationship between income and happiness: more income increases happiness, but at a declining rate of change.
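What "a declining rate of change" means can be illustrated in Python. The coefficients below are hypothetical, chosen only so that the squared term is negative; they are not the estimates from the ESS regression. With a model of the form happy = a + b_income*income + b_incomsq*income², the effect of one more unit of income is b_income + 2*b_incomsq*income.

```python
def predicted_happiness(income, a, b_income, b_incomsq):
    """Quadratic specification: a + b_income*income + b_incomsq*income^2."""
    return a + b_income * income + b_incomsq * income ** 2


def marginal_effect(income, b_income, b_incomsq):
    """Change in predicted happiness per unit of income at a given income level."""
    return b_income + 2 * b_incomsq * income


# Hypothetical coefficients (NOT taken from the SPSS output)
a, b_income, b_incomsq = 6.0, 0.30, -0.01

# The income scale has 12 categories, coded here as 1 to 12
effects = [marginal_effect(x, b_income, b_incomsq) for x in range(1, 13)]

for x, eff in zip(range(1, 13), effects):
    print(x, round(predicted_happiness(x, a, b_income, b_incomsq), 2), round(eff, 2))
```

With these illustrative numbers the effect stays positive across the whole scale but shrinks at every step, which is the pattern the quadratic term is meant to capture.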

Regression of Happiness on Income in the ESS 2002 – Does Money Buy Happiness?

Quadratic Relationship Between Two Variables

Suppose we want to use Occupational Status as a predictor in the Happiness model: we would have to create this variable

This is done with the assistance of the variable ISCOCO, a classification of the many occupations which exist in Europe. For example:

    iscoco  Occupation
    100     Armed forces
    1100    Legislators and senior officials
    1110    Legislators, senior government officials
    1140    Senior officials of special-interest org
    1141    Senior officials of political-party org
    1142    Senior officials of economic-interest org

To put this in a form which is usable in the regression model we recode it as follows:

    recode iscoco (2000 thru 2470=6)(1000 thru 1319=5)(3000 thru 3480=4)
      (4000 thru 4223=3)(5000 thru 8340=2)(9000 thru 9330=1)(else=sysmis) into occup.
    value labels occup
      1 'unskilled or semi-skilled manual workers'
      2 'skilled manual workers'
      3 'white collar clerical & administrative workers'
      4 'white collar technical workers'
      5 'middle managers'
      6 'professionals and senior managers'.
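For readers who want to check what the recode does to a particular code, the same mapping can be sketched in Python. The function name and the use of None to stand in for SPSS system-missing are our own choices; the ranges are copied from the recode command.

```python
# (low, high, new value) triples copied from the SPSS recode command
OCCUP_RANGES = [
    (2000, 2470, 6),  # professionals and senior managers
    (1000, 1319, 5),  # middle managers
    (3000, 3480, 4),  # white collar technical workers
    (4000, 4223, 3),  # white collar clerical & administrative workers
    (5000, 8340, 2),  # skilled manual workers
    (9000, 9330, 1),  # unskilled or semi-skilled manual workers
]


def recode_occup(iscoco):
    """Map an ISCOCO code to the six-point occup scale; None plays the role of sysmis."""
    if iscoco is None:
        return None
    for low, high, value in OCCUP_RANGES:
        if low <= iscoco <= high:
            return value
    return None  # (else=sysmis)


print(recode_occup(1110))  # legislators, senior government officials -> 5
print(recode_occup(100))   # armed forces fall outside every range -> None
```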

The Recoded Occupational Status Variable in the ESS 2002 Data

Suppose we want to add a gender variable, to see if women are happier than men

If statements can be used to create new variables in SPSS. These are recodes which are carried out if certain conditions are met. For example:

    compute female=0.
    if (gndr eq 2) female=1.

The first line creates a new variable consisting only of zeroes. The second line changes this new variable to a score of 1 if the existing variable gndr has a score of 2.
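The logic of the two SPSS lines can be mirrored in Python as a rough sketch. The function name is our own, and None stands in for a missing gndr value. Note one consequence that appears to follow from the syntax as written: a case with a missing gndr keeps the initial score of 0 rather than being set to missing.

```python
def female_dummy(gndr):
    """Dummy variable: 1 if gndr equals 2, otherwise 0."""
    female = 0       # compute female=0.
    if gndr == 2:    # if (gndr eq 2) female=1.
        female = 1
    return female


print([female_dummy(g) for g in [1, 2, 2, 1, None]])  # [0, 1, 1, 0, 0]
```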

If Statements in SPSS – gndr and Female

Revised Happiness Model

[SPSS regression output. Model Summary: R, R Square, Adjusted R Square, Std. Error of the Estimate. ANOVA: Sum of Squares, df, Mean Square, F, Sig. for Regression, Residual and Total. Coefficients: Unstandardized Coefficients (B, Std. Error), Standardized Coefficients (Beta), t, Sig. for (Constant), female, occup, income and incomsq. Predictors: (Constant), incomsq, female, occup, income. Dependent Variable: happy ("How happy are you").]

Conclusions

Multiple regression is a relatively simple extension of two variable regression.

Unlike two variable regression, in multiple regression we control for the influence of additional variables when examining the relationship between an independent variable and the dependent variable. It is a bit like a statistical experiment.

The great majority of social science models are multivariate models, so multiple regression is commonly used.