1
3.2. SIMPLE LINEAR REGRESSION
Design and Data Analysis in Psychology II
Salvador Chacón Moscoso, Susana Sanduvete Chaves
2
1. Introduction
Specifying a regression equation is useful to:
- Describe relations between variables in a simple and precise way.
- Predict values of one variable as a function of another.
Only linear relationships can be considered. Simple linear regression specifies the relationship between just two variables: the simplest possible relationship between variables.
3
1. Introduction: VARIABLES TERMINOLOGY
X (explains variability in the other variable): predictor, regressor, explanatory, predetermined, independent, exogenous.
Y (its variability is explained by the other variable): criterion, explained, response, dependent, endogenous.
4
2. SCATTER PLOT INTERPRETATION
A scatter plot provides information about:
- Whether there is a linear relationship between the variables.
- The type (shape) of the relationship (linear or otherwise).
- The strength of the relationship (how tightly the points cluster).
- Outliers, which can distort an existing relationship.
- Whether the points are spread uniformly or not (homoscedasticity vs. heteroscedasticity).
5
3. SPECIFYING THE SIMPLE LINEAR REGRESSION MODEL
6
3. SPECIFYING THE SIMPLE LINEAR REGRESSION MODEL
The error term is usually labeled error, perturbation, or residual. It basically depends on:
- Inaccurate measurement of the variables.
- The influence of other variables not considered in the model.
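In symbols, the model that contains this error term can be sketched as follows. The notation is an assumption, not recovered from the slide: α and β are the population intercept and slope named in section 4, and ε_i stands for the error of case i.

```latex
Y_i = \alpha + \beta X_i + \varepsilon_i
```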
7
3. SPECIFYING THE SIMPLE LINEAR REGRESSION MODEL
8
3.1. Model Assumptions
Statistical conditions:
- Linearity.
- Homoscedasticity: the variances of Y at each value of X are ‘equal’ (statistically homogeneous).
- Lack of autocorrelation: Y values are uncorrelated with one another (an important issue in longitudinal studies).
- Normal distribution.
9
3.1. Model Assumptions
Features of the descriptive model:
- Adequate model specification: do not exclude relevant independent variables, and do not include irrelevant independent variables [‘INUS’; insufficient but necessary useful variables].
- The independent variable must not have any measurement error.
10
4. Parameters Estimation
α and β are estimated by the method of least squares.
Using raw scores:
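For reference, the standard least-squares estimators in raw scores can be written as below. The notation is assumed and the slide's layout may differ: a and b are the sample estimates of α and β, and N is the number of cases.

```latex
b = \frac{N \sum XY - \sum X \sum Y}{N \sum X^2 - \left(\sum X\right)^2},
\qquad
a = \bar{Y} - b\,\bar{X}
```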
11
4. Parameters Estimation
Using deviation scores:
‘b’ is the same in the formulas for raw and deviation scores.
Using standard scores:
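The usual textbook forms of these two equations are sketched below. The symbols are assumptions: lowercase x and y are deviations from the means, z denotes standard scores, and r_XY is the Pearson correlation.

```latex
\begin{aligned}
\text{Deviation scores:} \quad & y' = b\,x, \qquad b = \frac{\sum xy}{\sum x^2},
  \qquad x = X - \bar{X}, \; y = Y - \bar{Y} \\
\text{Standard scores:} \quad & z_Y' = r_{XY}\, z_X
\end{aligned}
```

The intercept disappears in both forms because deviation and standard scores have a mean of 0.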
12
4. An example of parameters estimation
Based on the data from the previous example (lesson 3.1), calculate the regression equation in raw, deviation and standard scores.
X: 2 4 6 8 10 12 14 16 18 20
Y: 1 13 22
13
4. An example of parameters estimation
Regression equation for raw scores :
14
4. An example of parameters estimation
Regression equation with deviation scores: Regression equation with standard scores:
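The three forms of the regression equation can be computed with a short Python sketch. The X values are the ones listed in the exercise. The Y values are an assumption: the Y row is incomplete in this text, so the list below is illustrative and the resulting coefficients should not be read as the slide's official solution.

```python
from math import sqrt

# X comes from the exercise. ASSUMPTION: the Y row is incomplete in this
# text, so these Y values are illustrative only.
X = [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]
Y = [1, 6, 8, 10, 12, 10, 12, 13, 10, 22]

n = len(X)
mean_x = sum(X) / n
mean_y = sum(Y) / n

# Deviation scores: distance of each score from its mean.
x = [xi - mean_x for xi in X]
y = [yi - mean_y for yi in Y]

sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_xx = sum(xi ** 2 for xi in x)
sum_yy = sum(yi ** 2 for yi in y)

b = sum_xy / sum_xx                  # slope (raw and deviation scores)
a = mean_y - b * mean_x              # intercept (raw scores only)
r = sum_xy / sqrt(sum_xx * sum_yy)   # slope in standard scores

print(f"raw scores:       Y' = {a:.3f} + {b:.3f} X")
print(f"deviation scores: y' = {b:.3f} x")
print(f"standard scores:  z_Y' = {r:.3f} z_X")
```

Only the raw-score equation needs an intercept; the slope b is shared with the deviation-score equation, and the standard-score slope equals the correlation.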
15
5. REGRESSION MODEL: INTERPRETATION
In the linear regression model, we distinguish the following elements:
e (error or residual): the random part; what the model does not explain.
16
5. REGRESSION MODEL: INTERPRETATION
Predicted score: the mean value expected for all participants who obtained the value Xi on variable X.
b (slope): change in Y for each unit of change in X.
a (constant): mean value of Y when X = 0.
17
5. REGRESSION MODEL: INTERPRETATION (EXAMPLE)
We have the following regression equation: Y' = 600 + 300X, where X is the number of years of professional experience and Y is the salary.
1. Interpret a and b.
2. Supposing a person has 3 years of professional experience, what salary will this person obtain? Interpret the result.
3. If a person with 3 years of professional experience obtained a salary of 1700 €, what would the error be? Interpret the result.
18
5. REGRESSION MODEL: INTERPRETATION (EXAMPLE)
1. Interpret a and b.
b = 300: change in Y for each unit of change in X. For each additional year of professional experience, the salary increases by 300 €.
a = 600: mean value of Y when X = 0. It is the mean salary of people without any professional experience.
19
5. REGRESSION MODEL: INTERPRETATION (EXAMPLE)
2. Supposing a person has 3 years of professional experience, what salary will this person obtain? Interpret the result.
Predicted score: the mean value expected for all participants who obtained the value Xi on variable X. People with 3 years of professional experience have a mean salary of 1500 €.
20
5. REGRESSION MODEL: INTERPRETATION (EXAMPLE)
3. If a person with 3 years of professional experience obtained a salary of 1700 €, what would the error be? Interpret the result.
The model estimates a salary of 1500 € for a person with 3 years of professional experience. If the actual salary is 1700 €, the difference of 200 € is the error: the part the model does not explain.
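The arithmetic of this example can be checked with a minimal Python sketch. The values a = 600, b = 300 and the observed salary of 1700 € come from the example; the function name `predict_salary` is hypothetical.

```python
def predict_salary(years, a=600.0, b=300.0):
    """Predicted mean salary (in euros) for a given number of years of experience."""
    return a + b * years


predicted = predict_salary(3)   # 600 + 300 * 3
observed = 1700.0
error = observed - predicted    # residual: observed minus predicted

print(f"predicted = {predicted:.0f} EUR, error = {error:.0f} EUR")
```

A positive error means the person earns more than the model anticipates for that level of experience.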
21
6. COMPONENTS OF VARIATION
22
6. COMPONENTS OF VARIATION
Total sum of squares = explained sum of squares + residual sum of squares Total variation = explained variation + residual variation
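Written with sums, the decomposition takes the following standard form. The notation is assumed: Y' is the predicted score and the bar denotes the mean of Y.

```latex
\sum (Y_i - \bar{Y})^2 = \sum (Y_i' - \bar{Y})^2 + \sum (Y_i - Y_i')^2
```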
23
6. COMPONENTS OF VARIATION: EXAMPLE
Determine the components of variation from the data used in the first example.
24
6. COMPONENTS OF VARIATION: EXAMPLE
Total sum of squares calculation:
25
6. COMPONENTS OF VARIATION: EXAMPLE
Explained sum of squares calculation:
26
6. COMPONENTS OF VARIATION: EXAMPLE
Residual sum of squares calculation:
27
6. COMPONENTS OF VARIATION: EXAMPLE
Check: Total SS = Explained SS + Residual SS
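The three sums of squares and the final check can be sketched in Python. The X values are those of the first example. The Y values are an assumption: the Y row is incomplete in this text, so illustrative values were chosen that reproduce the residual sum of squares of 77.018 reported in the section 8 table. Small differences from the slides' rounded figures are expected.

```python
# X comes from the first example. ASSUMPTION: the Y row is incomplete in
# this text, so these Y values are illustrative only.
X = [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]
Y = [1, 6, 8, 10, 12, 10, 12, 13, 10, 22]

n = len(X)
mean_x = sum(X) / n
mean_y = sum(Y) / n

# Least-squares slope and intercept.
sum_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(X, Y))
sum_xx = sum((xi - mean_x) ** 2 for xi in X)
b = sum_xy / sum_xx
a = mean_y - b * mean_x
predicted = [a + b * xi for xi in X]

ss_total = sum((yi - mean_y) ** 2 for yi in Y)
ss_explained = sum((pi - mean_y) ** 2 for pi in predicted)
ss_residual = sum((yi - pi) ** 2 for yi, pi in zip(Y, predicted))

print(f"total SS     = {ss_total:.3f}")
print(f"explained SS = {ss_explained:.3f}")
print(f"residual SS  = {ss_residual:.3f}")
print(f"check        = {ss_explained + ss_residual:.3f}")
```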
28
7. GOODNESS OF FIT
- Coincides with the coefficient of determination (R²).
- The proportion of unexplained variability = 1 − R²
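Three common, algebraically equivalent ways of writing the coefficient of determination in simple linear regression are shown below. Whether these match the formulas on the following slide is an assumption, as is the SS notation.

```latex
R^2 = \frac{SS_{\text{explained}}}{SS_{\text{total}}}
    = 1 - \frac{SS_{\text{residual}}}{SS_{\text{total}}}
    = r_{XY}^2
```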
29
7. GOODNESS OF FIT
30
7. GOODNESS OF FIT : EXAMPLE
Calculate the goodness of fit (with the three proposed formulas) and the proportion of unexplained variability.
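A rough numerical check is possible from the figures that survive in the section 8 table (residual SS = 77.018, total variance = 28.908 with 9 degrees of freedom). In this Python sketch the total sum of squares is recovered from the rounded variance, which is an assumption about how the table was built, so the result is approximate.

```python
# Figures taken from the section 8 table.
ss_residual = 77.018
var_total = 28.908
df_total = 9

# ASSUMPTION: total SS is recovered as variance * degrees of freedom.
ss_total = var_total * df_total
ss_explained = ss_total - ss_residual

r_squared = ss_explained / ss_total   # proportion of explained variability
unexplained = 1 - r_squared           # proportion of unexplained variability

print(f"R^2 = {r_squared:.3f}, 1 - R^2 = {unexplained:.3f}")
```

The rounded value agrees with the effect size of 0.704 quoted in the final conclusion of section 8.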
31
7. GOODNESS OF FIT : EXAMPLE
32
8. MODEL VALIDATION
How can I tell if my data fit a model?
Sources of variation | Sums of squares | Degrees of freedom | Variances | F
Regression or explained | | k | |
Residual or unexplained | | N-k-1 | |
Total | | N-1 | |
33
8. MODEL VALIDATION
(k = number of independent variables)
If the null hypothesis is rejected: the variables are related; the model is valid.
If the null hypothesis is accepted: the variables are not related; the model is not valid.
34
8. MODEL VALIDATION Other possible formulas of ‘F’ :
‘F’ calculation with raw scores:
35
8. MODEL VALIDATION
Using variances:
Using the coefficient of determination (R²):
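The standard forms of these two expressions are sketched below. The notation is assumed: k is the number of independent variables and N the number of cases, as in the table of sources of variation.

```latex
F = \frac{SS_{\text{explained}} / k}{SS_{\text{residual}} / (N - k - 1)}
  = \frac{R^2 / k}{(1 - R^2) / (N - k - 1)}
```

Dividing the numerator and denominator of the first expression by the total sum of squares gives the second.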
36
8. MODEL VALIDATION: EXAMPLE
With the above data, calculate the ‘F’ (using the 4 proposed formulas) and conclude on the validity of the model.
37
8. MODEL VALIDATION: EXAMPLE
Sources of variation | Sums of squares | Degrees of freedom | Variances | F
Regression or explained | | 1 | |
Residual or unexplained | 77.018 | 8 | 9.627 |
Total | (approx.) | 9 | 28.908 |
38
8. MODEL VALIDATION: EXAMPLE
Conclusion: Null hypothesis is rejected. The variables X and Y are related. The model is valid.
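This conclusion can be reproduced approximately from the surviving table entries. The Python sketch below computes F from variances and from R². It assumes the total sum of squares can be recovered from the rounded total variance, and it takes the critical value F(1, 8) = 5.32 at α = 0.05 from a standard F table; the significance level is also an assumption.

```python
# Figures taken from the table above.
k = 1                  # number of independent variables
df_residual = 8
df_total = 9
ss_residual = 77.018
var_total = 28.908

# ASSUMPTION: total SS is recovered as variance * degrees of freedom.
ss_total = var_total * df_total
ss_explained = ss_total - ss_residual

# F as a ratio of explained to residual variance.
f_from_variances = (ss_explained / k) / (ss_residual / df_residual)

# F from the coefficient of determination.
r_squared = ss_explained / ss_total
f_from_r2 = (r_squared / k) / ((1 - r_squared) / df_residual)

F_CRITICAL = 5.32      # F(1, 8), alpha = 0.05, from a standard F table
reject_h0 = f_from_variances > F_CRITICAL

print(f"F (variances) = {f_from_variances:.3f}")
print(f"F (R^2)       = {f_from_r2:.3f}")
print("reject H0" if reject_h0 else "do not reject H0")
```

Both routes give the same F because they are algebraic rearrangements of one another.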
39
8. MODEL VALIDATION: EXAMPLE
Other formulas:
40
8. MODEL VALIDATION: EXAMPLE
41
8. MODEL VALIDATION: EXAMPLE
Which is the final conclusion?
 | Significant effect | Non-significant effect
High effect size (≥ 0.67) | The effect probably exists | The non-significance may be due to low statistical power
Low effect size (≤ 0.18) | The statistical significance may be due to excessively high statistical power | The effect probably does not exist
42
8. MODEL VALIDATION: EXAMPLE
Which is the final conclusion?
 | Significant effect | Non-significant effect
High effect size = 0.704 (≥ 0.67) | The effect probably exists | The non-significance may be due to low statistical power
Low effect size (≤ 0.18) | The statistical significance may be due to excessively high statistical power | The effect probably does not exist
43
9. SIGNIFICANCE OF REGRESSION PARAMETERS
Study of b (the parameter related to the independent variable).
In simple linear regression, this test of significance is equivalent to the F test and to the test of significance of rXY.
It is more interesting in multiple linear regression, where, in spite of a significant global F, some of the regression parameters may not be significant.
44
9. SIGNIFICANCE OF REGRESSION PARAMETERS
Hypotheses:
H0: β = 0
H1: β ≠ 0
45
9. SIGNIFICANCE OF REGRESSION PARAMETERS
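A common form of the test statistic for the slope in simple linear regression is sketched below. The notation is assumed and may differ from the slide: S_b is the standard error of b and N is the number of cases.

```latex
t = \frac{b}{S_b},
\qquad
S_b = \sqrt{\frac{SS_{\text{residual}} / (N - 2)}{\sum (X_i - \bar{X})^2}},
\qquad
df = N - 2
```

With a single predictor, t² equals the F of the model validation test, which is why the two tests are equivalent.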
46
9. SIGNIFICANCE OF REGRESSION PARAMETERS
If the null hypothesis is rejected: the model is valid. The slope is statistically different from 0. There is, therefore, a relationship between the variables.
If the null hypothesis is accepted: the model is not valid. The slope is statistically equal to 0. There is, therefore, no relationship between the variables.
47
9. SIGNIFICANCE OF REGRESSION PARAMETERS: EXAMPLE
Using previous data, determine ‘b’ parameter significance.
48
9. SIGNIFICANCE OF REGRESSION PARAMETERS: EXAMPLE
Conclusion: reject the null hypothesis. The model is valid. The slope is statistically different from 0. There is, therefore, a relationship between the variables.
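The test of the slope can be sketched in Python as follows. The X values come from the first example. The Y values are an assumption: the Y row is incomplete in this text, so illustrative values were chosen that reproduce the residual variance of 9.627 in the section 8 table. The critical value 2.306 (two-tailed, α = 0.05, 8 degrees of freedom) is taken from a standard t table.

```python
from math import sqrt

# X comes from the first example. ASSUMPTION: the Y row is incomplete in
# this text, so these Y values are illustrative only.
X = [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]
Y = [1, 6, 8, 10, 12, 10, 12, 13, 10, 22]

n = len(X)
mean_x = sum(X) / n
mean_y = sum(Y) / n

sum_xx = sum((xi - mean_x) ** 2 for xi in X)
sum_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(X, Y))
b = sum_xy / sum_xx
a = mean_y - b * mean_x

# Residual variance with N - 2 degrees of freedom.
ss_residual = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(X, Y))
ms_residual = ss_residual / (n - 2)

se_b = sqrt(ms_residual / sum_xx)   # standard error of the slope
t = b / se_b

T_CRITICAL = 2.306                  # two-tailed, alpha = 0.05, df = 8
reject_h0 = abs(t) > T_CRITICAL

print(f"b = {b:.3f}, SE(b) = {se_b:.3f}, t = {t:.3f}")
print("reject H0" if reject_h0 else "do not reject H0")
```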
49
10. PREDICTION
A specific value: what Y value will a person with X = 4 have?
50
10. PREDICTION
Within a confidence interval:
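One widely used interval for the score of an individual with X = X_0 is sketched below. This is the standard textbook prediction interval; the slide's formula did not survive in this text and may be a simplified version, so both the formula and its notation are assumptions.

```latex
Y_0' \pm t_{\alpha/2,\, N-2}\; S_{\text{res}}
\sqrt{1 + \frac{1}{N} + \frac{(X_0 - \bar{X})^2}{\sum (X_i - \bar{X})^2}},
\qquad
S_{\text{res}} = \sqrt{\frac{SS_{\text{residual}}}{N - 2}}
```

The interval widens as X_0 moves away from the mean of X.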
51
10. PREDICTION: EXAMPLE
What will be the interval of Y scores for a person who obtained X = 4?
52
10. PREDICTION: EXAMPLE
53
10. PREDICTION: EXAMPLE Conclusion: There is a probability of 0.95 that a person having a value of X = 4, obtains a score between and
54
10. PREDICTION: LIMITATIONS
Do not extrapolate values beyond the observed data.
What would happen if there were a quadratic relationship?
[Figure: predicted score vs. real score, with yield plotted against anxiety.]
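The quadratic case can be illustrated with a small Python sketch. The data are hypothetical, invented only for this illustration: yield follows an inverted U, highest at moderate anxiety. A straight line fitted to such data has a slope of 0 and misses the relationship entirely.

```python
# HYPOTHETICAL data (not from the slides): inverted-U relation.
anxiety = [1, 2, 3, 4, 5, 6, 7]
yield_scores = [10 - (x - 4) ** 2 for x in anxiety]   # 1, 6, 9, 10, 9, 6, 1

n = len(anxiety)
mean_x = sum(anxiety) / n
mean_y = sum(yield_scores) / n

sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(anxiety, yield_scores))
sum_xx = sum((x - mean_x) ** 2 for x in anxiety)

b = sum_xy / sum_xx       # least-squares slope
a = mean_y - b * mean_x   # least-squares intercept

# The line predicts the same score for everyone, so the errors are large
# even though yield is fully determined by anxiety.
residuals = [y - (a + b * x) for x, y in zip(anxiety, yield_scores)]

print(f"slope = {b:.3f}, intercept = {a:.3f}")
print("residuals:", residuals)
```

Inspecting the scatter plot before fitting, as described in section 2, is the simplest way to detect this situation.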