LESSON 4.1. MULTIPLE LINEAR REGRESSION
Design and Data Analysis in Psychology II
Salvador Chacón Moscoso
Susana Sanduvete Chaves
1. INTRODUCTION
[Diagram: predictors $X_1$, $X_2$, $X_3$, ..., $X_K$ each pointing to $Y$.]
Model components:
– More than one independent variable (X), qualitative or quantitative.
– A quantitative dependent variable (Y).
– Example: $X_1$: educational level. $X_2$: economic level. $X_3$: personality characteristics. $X_4$: gender. $Y$: drug dependence level.
1. INTRODUCTION
Reasons for extending the simple linear regression model to several predictors:
– Human behavior is complex, so multiple regression is more realistic.
– It increases statistical power (the probability of rejecting a false null hypothesis, that is, of making a correct decision).
1. INTRODUCTION
Regression equation for predicted scores, in raw scores:
$Y' = b_0 + b_1 X_1 + b_2 X_2 + \dots + b_k X_k$
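1. INTRODUCTION
Regression equation for predicted scores:
– Deviation scores: $y' = b_1 x_1 + b_2 x_2 + \dots + b_k x_k$
– Standard scores: $Z'_Y = \beta_1 Z_1 + \beta_2 Z_2 + \dots + \beta_k Z_k$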
2. ASSUMPTIONS
1. Linearity.
2. Independence of errors.
3. Homoscedasticity: the error variances are constant.
4. Normality: the scores are normally distributed.
5. The predictor variables must not be perfectly correlated with one another.
3. PROPERTIES
The errors do not correlate with the predictor variables or with the predicted scores.
4. INTERPRETATION
Example 1: quantitative variables
[Diagram: $X_1$, $X_2$ and $X_3$ pointing to $Y$. Labels: maternal stimulation, 3-year-old development level, paternal stimulation, 6-year-old development level.]
$b_0 = 20.8$
4. INTERPRETATION
Example 2: quantitative and qualitative variables
[Diagram: $X_1$ and $X_2$ pointing to $Y$. Labels: emotional tiredness; gender (0 = woman, 1 = man); stress symptoms.]
$b_0 = 1.987$
4. INTERPRETATION
The same slope with a different constant gives parallel lines.
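The parallel-lines idea can be sketched in a few lines of Python. This is only an illustration: the constant 1.987 comes from Example 2, but the two slopes are hypothetical values chosen for the sketch, and the assignment of emotional tiredness and gender to the two predictors is an assumption.

```python
# Sketch: a 0/1 dummy predictor shifts the constant but leaves the slope unchanged.
B0 = 1.987   # constant reported in Example 2
B1 = 0.50    # HYPOTHETICAL slope for emotional tiredness (not from the slides)
B2 = -0.80   # HYPOTHETICAL coefficient for gender (not from the slides)


def predict(tiredness, gender):
    """Predicted stress symptoms; gender is coded 0 = woman, 1 = man."""
    return B0 + B1 * tiredness + B2 * gender


def line_for(gender):
    """Return (constant, slope) of the tiredness-stress line for one gender group."""
    constant = predict(0, gender)
    slope = predict(1, gender) - predict(0, gender)
    return constant, slope


women = line_for(0)
men = line_for(1)
print("women: constant=%.3f slope=%.3f" % women)
print("men:   constant=%.3f slope=%.3f" % men)
```

Both groups share the slope B1; the constants differ by exactly B2, which is why the two lines never cross.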
4. INTERPRETATION
Example 3: two qualitative variables
[Diagram: $X_1$ and $X_2$ pointing to $Y$. Labels: gender (0 = woman, 1 = man); work (0 = public, 1 = private); stress symptoms.]
$b_0 = 5.206$
4. INTERPRETATION
– Women, public organization:
– Women, private organization:
– Men, public organization:
– Men, private organization:
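With two 0/1 predictors, each group's predicted score is obtained by substituting its codes into the equation. The sketch below assumes an additive model without an interaction term; the constant 5.206 is taken from Example 3, while the two coefficients are hypothetical placeholders, not the values from the slides.

```python
# Sketch: predicted stress symptoms for the four gender-by-work groups.
B0 = 5.206        # constant reported in Example 3
B_GENDER = 1.2    # HYPOTHETICAL coefficient for gender (0 = woman, 1 = man)
B_WORK = -0.7     # HYPOTHETICAL coefficient for work (0 = public, 1 = private)


def predict(gender, work):
    """Additive model, assuming no interaction between gender and work."""
    return B0 + B_GENDER * gender + B_WORK * work


groups = {
    "Women, public organization": (0, 0),
    "Women, private organization": (0, 1),
    "Men, public organization": (1, 0),
    "Men, private organization": (1, 1),
}

predictions = {name: predict(*codes) for name, codes in groups.items()}
for name, value in predictions.items():
    print(f"{name}: {value:.3f}")
```

The group coded 0 on both variables (women in public organizations) is predicted by the constant alone; every other group adds the coefficients of the variables on which it is coded 1.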
5. COMPONENTS OF VARIATION
$SS_{TOTAL} = SS_{EXPLAINED} + SS_{RESIDUAL}$
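The decomposition can be checked numerically. The following is a minimal sketch on made-up data with two predictors, fitted by ordinary least squares; the data, the function names and the closed-form solution for two predictors are illustrative choices, not part of the lesson.

```python
# Sketch: verify SS_total = SS_explained + SS_residual on made-up data.
def mean(values):
    return sum(values) / len(values)


def fit_two_predictors(x1, x2, y):
    """Least-squares b0, b1, b2 for Y' = b0 + b1*X1 + b2*X2 (deviation-score solution)."""
    m1, m2, my = mean(x1), mean(x2), mean(y)
    d1 = [v - m1 for v in x1]
    d2 = [v - m2 for v in x2]
    dy = [v - my for v in y]
    s11 = sum(a * a for a in d1)
    s22 = sum(a * a for a in d2)
    s12 = sum(a * b for a, b in zip(d1, d2))
    s1y = sum(a * b for a, b in zip(d1, dy))
    s2y = sum(a * b for a, b in zip(d2, dy))
    det = s11 * s22 - s12 ** 2  # zero when the predictors correlate perfectly
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    b0 = my - b1 * m1 - b2 * m2
    return b0, b1, b2


# Made-up data (illustration only).
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2, 1, 4, 3, 6, 5]
y = [3, 4, 6, 7, 10, 9]

b0, b1, b2 = fit_two_predictors(x1, x2, y)
predicted = [b0 + b1 * a + b2 * b for a, b in zip(x1, x2)]
y_mean = mean(y)

ss_total = sum((v - y_mean) ** 2 for v in y)
ss_explained = sum((p - y_mean) ** 2 for p in predicted)
ss_residual = sum((v - p) ** 2 for v, p in zip(y, predicted))

print(ss_total, ss_explained + ss_residual)
```

The two printed values agree up to rounding error. The equality holds for any least-squares fit that includes a constant.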
6. GOODNESS OF FIT = COEFFICIENT OF DETERMINATION
Two possibilities:
a) $r_{12} = 0$
b) $r_{12} \neq 0$
6. GOODNESS OF FIT = COEFFICIENT OF DETERMINATION
a) $r_{12} = 0$
[Venn diagram: $X_1$ and $X_2$ do not overlap each other; each overlaps $Y$, in areas a and b.]
6. GOODNESS OF FIT = COEFFICIENT OF DETERMINATION
b) $r_{12} \neq 0$
[Venn diagram: $X_1$ and $X_2$ overlap each other and $Y$, in areas a, b and c.]
(Area c would be counted twice.)
6. GOODNESS OF FIT = COEFFICIENT OF DETERMINATION
Squared semipartial correlation coefficient
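A small numerical sketch of why the squared semipartial correlation is needed when the predictors correlate. It uses made-up data and the standard two-predictor formulas; all names are illustrative. Simply adding the two squared correlations with Y counts the shared area twice, whereas adding the squared semipartial correlation of the second predictor to the squared correlation of the first reproduces the coefficient of determination.

```python
# Sketch: R^2 with two correlated predictors, via the squared semipartial correlation.
from math import sqrt


def pearson(a, b):
    """Pearson correlation between two equally long lists of scores."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    sab = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    saa = sum((u - ma) ** 2 for u in a)
    sbb = sum((v - mb) ** 2 for v in b)
    return sab / sqrt(saa * sbb)


# Made-up data (illustration only).
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2, 1, 4, 3, 6, 5]
y = [3, 4, 6, 7, 10, 9]

r_y1 = pearson(y, x1)
r_y2 = pearson(y, x2)
r_12 = pearson(x1, x2)

# Coefficient of determination from the three pairwise correlations.
r2 = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)

# Semipartial correlation of Y with X2 once X1 has been removed from X2.
sr_2 = (r_y2 - r_y1 * r_12) / sqrt(1 - r_12 ** 2)
r2_from_semipartial = r_y1 ** 2 + sr_2 ** 2

# Naive sum of squared correlations: double counts the overlap when r_12 != 0.
naive_sum = r_y1 ** 2 + r_y2 ** 2

print(r2, r2_from_semipartial, naive_sum)
```

In this data set the naive sum even exceeds 1, which is impossible for a proportion of explained variance.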
7. MODEL VALIDATION
Sources of variation | Sums of squares | df | Variances | F
Regression or explained | | k | |
Residual or unexplained | | N-k-1 | |
Total | | N-1 | |
7. MODEL VALIDATION
– If the observed F reaches the critical value $F_{(\alpha,\,k,\,N-k-1)}$: the null hypothesis is rejected. The variables are related. The model is valid.
– Otherwise: the null hypothesis is not rejected. The variables are not related. The model is not valid.
(k = number of independent variables)
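The validation step can be sketched as follows. The sums of squares and sample size are hypothetical; the critical value 3.32 is the usual tabled F for α = 0.05 with 2 and 30 degrees of freedom and is passed in by hand, since this sketch does not compute F distributions.

```python
# Sketch: F statistic from the ANOVA summary table and the validation decision.
def anova_f(ss_explained, ss_residual, n, k):
    """F = explained variance / residual variance, with df = k and n - k - 1."""
    df_explained = k
    df_residual = n - k - 1
    variance_explained = ss_explained / df_explained
    variance_residual = ss_residual / df_residual
    return variance_explained / variance_residual


def model_is_valid(f_observed, f_critical):
    """Reject the null hypothesis when the observed F reaches the critical value."""
    return f_observed >= f_critical


# HYPOTHETICAL values (not from the slides): N = 33 participants, k = 2 predictors.
f_observed = anova_f(ss_explained=40.0, ss_residual=120.0, n=33, k=2)
F_CRITICAL = 3.32  # tabled F(0.05; 2, 30), looked up by hand

print(f_observed, model_is_valid(f_observed, F_CRITICAL))
```

Statistical packages report the significance (Sig.) directly, in which case the decision reduces to checking whether Sig. is lower than α.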
7. MODEL VALIDATION: EXAMPLE
A linear regression equation was estimated to study the possible relationship between the level of family cohesion (Y) and the variables gender ($X_1$) and time spent working outside the home ($X_2$). Some of the most relevant results were the following:
7. MODEL VALIDATION: EXAMPLE
Sources of variation | Sums of squares | df | Variances | F | Sig.
Regression or explained | | | | |
Residual or unexplained | | | | |
Total | | | | |
7. MODEL VALIDATION: EXAMPLE
1. What proportion of the variability is left unexplained by the model?
2. Can the model be considered valid? Justify your answer (α = 0.05).
7. MODEL VALIDATION: EXAMPLE
1. What proportion of the variability is left unexplained by the model?
2. Can the model be considered valid? Justify your answer (α = 0.05).
Yes, because the significance (Sig.) is lower than α = 0.05.
8. SIGNIFICANCE OF REGRESSION PARAMETERS
1. Hypotheses: $H_0: \beta = 0$; $H_1: \beta \neq 0$.
2. Statistic: $t = b / S_b$, where $S_b$ is what SPSS calls the standard error (error típico).
8. SIGNIFICANCE OF REGRESSION PARAMETERS
3. Comparison and conclusions (for each independent variable):
– The null hypothesis is rejected. The slope is statistically different from 0, so there is a relationship between the variables. It is recommended to keep the variable in the model.
– The null hypothesis is not rejected. The slope is not statistically different from 0, so there is no relationship between the variables. It is recommended to remove the variable from the model.
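A sketch of the test for one slope, assuming the usual statistic (the coefficient divided by its standard error) with N - k - 1 degrees of freedom. The coefficient, standard error and sample size are hypothetical; the critical value 2.000 is the tabled two-tailed t for α = 0.05 with 60 degrees of freedom and is supplied by hand.

```python
# Sketch: significance test for a single regression coefficient.
def t_statistic(b, standard_error):
    """t = coefficient / its standard error (SPSS column 'Std. Error')."""
    return b / standard_error


def keep_variable(t_value, t_critical):
    """Two-tailed decision: keep the predictor when |t| reaches the critical value."""
    return abs(t_value) >= t_critical


# HYPOTHETICAL values (not from the slides): N = 63, k = 2, so df = 63 - 2 - 1 = 60.
n, k = 63, 2
df = n - k - 1
T_CRITICAL = 2.000  # tabled t(0.05 two-tailed; 60), looked up by hand

t_value = t_statistic(b=1.5, standard_error=0.5)
print(df, t_value, keep_variable(t_value, T_CRITICAL))
```

The same check is repeated for each independent variable, each with its own coefficient and standard error.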
8. SIGNIFICANCE OF REGRESSION PARAMETERS: EXAMPLE
We studied the relationship between the variables nationality (0: Moroccan, 1: Filipino) and gender (0: man, 1: woman) and the variable depression in a sample of 148 participants. We know that F is equal to 8.889, together with the values in the following table:
8. SIGNIFICANCE OF REGRESSION PARAMETERS: EXAMPLE
Columns: Unstandardized coefficients (B, Std. error), Standardized coefficients (Beta), t, Sig.
Rows: (Constant) ?, Gender ?, Nationality ?
1. Calculate $R^2$.
2. Calculate the regression equation in raw scores.
3. Would you remove any variable from the model? Justify your answer (α = 0.05).
8. SIGNIFICANCE OF REGRESSION PARAMETERS: EXAMPLE
1. Calculate $R^2$.
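One way to obtain the coefficient of determination from the information given (F = 8.889, N = 148, k = 2) is to solve the F formula for it: since F = (R²/k) / ((1 - R²)/(N - k - 1)), it follows that R² = kF / (kF + N - k - 1). The sketch below applies that rearrangement; the function name is illustrative, and this is not necessarily the route taken on the original slide.

```python
# Sketch: recover R^2 from the overall F statistic, sample size and number of predictors.
def r_squared_from_f(f_value, n, k):
    """Rearranged from F = (R2 / k) / ((1 - R2) / (n - k - 1))."""
    df_residual = n - k - 1
    return (f_value * k) / (f_value * k + df_residual)


# Values stated in the example: F = 8.889, 148 participants, 2 predictors.
r2 = r_squared_from_f(8.889, n=148, k=2)
print(round(r2, 3))  # prints 0.109
```

Under this rearrangement, about 11% of the variability in depression is explained by gender and nationality together.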
8. SIGNIFICANCE OF REGRESSION PARAMETERS: EXAMPLE
2. Calculate the regression equation in raw scores.
8. SIGNIFICANCE OF REGRESSION PARAMETERS: EXAMPLE
3. Would you remove any variable from the model? Justify your answer (α = 0.05).
No, because the t statistics of the three parameters all have a significance (Sig.) lower than α = 0.05.