Slide 1 Hierarchical Multiple Regression
Slide 2 Differences between standard and hierarchical multiple regression Standard multiple regression is used to evaluate the relationship between a set of independent variables and a dependent variable. Hierarchical regression is used to evaluate the relationship between a set of independent variables (predictors) and the dependent variable, controlling for or taking into account the impact of a different set of independent variables (control variables) on the dependent variable. For example, a research hypothesis might state that there are differences in the average salary between male employees and female employees, even after we take into account differences in education levels and prior work experience. In hierarchical regression, the independent variables are entered into the analysis in a sequence of blocks, or groups, each of which may contain one or more variables. In the example above, education and work experience would be entered in the first block and sex would be entered in the second block. Generally, our interest is in R² change, i.e. the increase in R² when the predictor variables are added to the analysis, rather than the overall R² for the model that includes both controls and predictors. Moreover, the interpretation of individual relationships may focus on the relationship between the predictors and the dependent variable, and ignore the significance and interpretation of control variables. However, in our problems, we will interpret both controls and predictors.
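To make the block structure concrete, here is a minimal sketch of the salary example in Python with statsmodels, fitting the control block first and then adding the predictor. The file and column names (salaries.csv, salary, educ, exper, male) are hypothetical stand-ins for the example above, not part of the original problem.

    import pandas as pd
    import statsmodels.api as sm

    df = pd.read_csv("salaries.csv")  # hypothetical data file

    # Block 1: control variables only (education and work experience)
    X1 = sm.add_constant(df[["educ", "exper"]])
    model1 = sm.OLS(df["salary"], X1).fit()

    # Block 2: controls plus the predictor of interest (sex, coded 1 = male)
    X2 = sm.add_constant(df[["educ", "exper", "male"]])
    model2 = sm.OLS(df["salary"], X2).fit()

    # The key statistic: the increase in R-squared from block 1 to block 2
    r2_change = model2.rsquared - model1.rsquared
    print(f"R-squared change: {r2_change:.3f}")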
Slide 3 Differences in statistical results SPSS shows the statistical results (Model Summary, ANOVA, Coefficients, etc.) as each block of variables is entered into the analysis. In addition (if requested), SPSS prints and tests the key statistic used in evaluating the hierarchical hypothesis: change in R² for each additional block of variables. The null hypothesis for the addition of each block of variables to the analysis is that the change in R² (contribution to the explanation of the variance in the dependent variable) is zero. If the null hypothesis is rejected, then our interpretation indicates that the variables in block 2 had a relationship to the dependent variable, after controlling for the relationship of the block 1 variables to the dependent variable, i.e. the variables in block 2 explain something about the dependent variable that was not explained in block 1. The key statistic in hierarchical regression is R² change (the increase in R² when the predictor variables are added to the model that includes only the control variables). If R² change is significant, the R² for the overall model that includes both controls and predictors will usually be significant as well, since R² change is part of the overall R².
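The significance test for R² change is an F test comparing the nested models. Continuing the hypothetical sketch above, it can be computed by hand or with statsmodels' built-in nested-model comparison; m is the number of variables added in block 2 and k is the total number of independent variables in the full model.

    from scipy import stats

    m = X2.shape[1] - X1.shape[1]   # variables added in block 2
    n = len(df)
    k = X2.shape[1] - 1             # IVs in the full model (excluding the constant)

    f_change = ((model2.rsquared - model1.rsquared) / m) / \
               ((1 - model2.rsquared) / (n - k - 1))
    p_change = stats.f.sf(f_change, m, n - k - 1)
    print(f"F({m}, {n - k - 1}) = {f_change:.2f}, p = {p_change:.4f}")

    # Equivalent one-liner for nested OLS models:
    # f_change, p_change, df_diff = model2.compare_f_test(model1)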
Slide 4 Variations in hierarchical regression A hierarchical regression can have as many blocks as there are independent variables, i.e. the analyst can test a hypothesis that specifies an exact order of entry for the variables. A more common hierarchical regression specifies two blocks of variables: a set of control variables entered in the first block and a set of predictor variables entered in the second block. Control variables are often demographics which are thought to make a difference in scores on the dependent variable. Predictors are the variables whose effects our research question is really about, but which we want to separate out from the effects of the control variables. Hierarchical regression specifies the order in which the variables are added to the regression analysis. However, once the regression analysis includes all of the independent variables, the variables will have the same coefficients and significance regardless of whether the variables were entered simultaneously or sequentially.
Slide 5 Confusion in Terminology over the Meaning of the Designation of a Control Variable Control variables are entered in the first block in a hierarchical regression, and we describe this as “controlling for the effect of some variables…” Sometimes, we mistakenly interpret this to imply that the shared effects between a variable and the control variables are removed from the explanation of the dependent variable credited to the predictor. In fact, the coefficient for each variable is computed to control for all other variables in the analysis, whether the variable is designated as a control or a predictor variable, and regardless of when it was entered into the equation. In a general sense, each variable in a regression analysis is credited only with what it uniquely explains, and the shared variance with all other variables is removed.
Slide 6 Research Questions for which Hierarchical Regression is Useful Previous research has found that variables a, b, and c have statistically significant relationships to the dependent variable z, both collectively and individually. You believe that variables d and e should also be included and, in fact, would substantially increase the proportion of variability in z that we can explain. You treat a, b, and c as controls and d and e as predictors in a hierarchical regression. You have found that variables a and b account for a substantial proportion of the differences in the dependent variable z. In presenting your findings, someone asserts that while a and b may have a relationship to z, differences in z are better explained by the relationships that variables d and e have with z. You treat d and e as controls to show that predictors a and b have a significant relationship to z that is not explicable by d and e. You find that there are significant differences in the demographic characteristics (a, b, and c) of the subjects in your control and treatment groups (d). To isolate the relationship between group membership (d) and the treatment effect (z), you do a hierarchical regression with a, b, and c as controls and d as the predictor.
Slide 7 The Problem in Blackboard The problem statement tells us: the variables included in the analysis; whether each variable should be treated as metric or non-metric; the reference category for non-metric variables to be dummy coded; and the alpha for both the statistical relationships and for the diagnostic tests. NOTE: these problems use a data set, GSS2002_PrejudiceAndAltruism.SAV, that we have not used before.
Slide 8 The Statement about Level of Measurement The first statement in the problem asks about level of measurement. Hierarchical multiple regression requires that the dependent variable and the metric independent variables be interval level, and that the non-metric independent variables be dummy-coded if they are not dichotomous. The only way we would violate the level of measurement requirement would be to use a nominal variable as the dependent variable, or to attempt to dummy-code an interval level variable that was not grouped.
Slide 9 Marking the Statement about Level of Measurement Mark the check box as a correct statement because: "Accuracy of the description of being a pretty soft-hearted person" [empathy7] is ordinal level, but the problem calls for treating it as metric, applying the common convention of treating ordinal variables as interval level. "Degree of religious fundamentalism" [fund] and "description of political views" [polviews] are also ordinal level, and the problem calls for treating them as metric under the same convention. The non-metric independent variable "sex" [sex] is dichotomous, satisfying the requirement for independent variables. The non-metric independent variable "race of the household" [hhrace] is nominal level, but will satisfy the requirement for independent variables when dummy coded.
Slide 10 Satisfying the Assumptions of Multiple Regression The next four statements identify the strategies that we will follow to attempt to satisfy the assumptions of multiple regression. If we fail to satisfy the requirement for independence of variables, we halt interpretation of the problem. Once we satisfy the assumptions, we skip the remaining strategies. If we do not satisfy the assumptions with any of the strategies, we will continue to interpret the analysis with all cases and untransformed variables, but we should mention the failure to satisfy assumptions in our findings.
Slide 11 Using the Script to Evaluate Assumptions We will use the new script to evaluate regression assumptions for analyses that include metric and non-metric variables. Assuming that you have downloaded the script from the course web site to My Documents, select the Run Script command from the Utilities menu.
Slide 12 Opening the Script In the Run Script dialog box, navigate to My Documents and highlight the file named: SatisfyingRegressionAssumptionsWithMetricAndNonmetricVariables.SBS With the script file highlighted, click on the Run button to use the script.
Slide 13 Select the Variables First, move the dependent variable "accuracy of the description of being a pretty soft-hearted person" [empathy7] to the DV text box. Second, move the variables "degree of religious fundamentalism" [fund] and "description of political views" [polviews] to the metric independent variables list box. Third, move the variables "sex" [sex] and "race of the household" [hhrace] to the non-metric independent variables list box.
Slide 14 Select the Reference Category for Sex With the variable "sex" [sex] highlighted, select 2=FEMALE as the reference category.
Slide 15 Select the Reference Category for Race With the variable "race of the household" [hhrace] highlighted, select 3=OTHER as the reference category.
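For readers who want to reproduce the dummy coding outside the script, here is a minimal sketch in Python with pandas. The file name is hypothetical, and the category codes assume the usual GSS coding (sex: 1 = MALE, 2 = FEMALE; hhrace: 1 = WHITE, 2 = BLACK, 3 = OTHER); the reference categories are the ones selected above, so they receive no indicator of their own.

    import pandas as pd

    df = pd.read_csv("gss2002.csv")  # hypothetical export of the .SAV file

    # sex is dichotomous: one indicator, with 2=FEMALE as the reference
    df["sex_1"] = (df["sex"] == 1).astype(int)        # 1 = male

    # hhrace has three categories: two indicators, with 3=OTHER as the reference
    df["hhrace_1"] = (df["hhrace"] == 1).astype(int)  # 1 = white
    df["hhrace_2"] = (df["hhrace"] == 2).astype(int)  # 1 = black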
Slide 16 Request the Tests of Assumptions for Multiple Regression Having included all of the variables mentioned in the problem and identified the reference categories for non-metric variables, we click on the OK button to request the output.
Slide 17 The Variables Included in the Analysis We can look at the variables listed in the table of Descriptive Statistics to make certain we have included the correct variables. The list includes three dummy-coded variables for sex and race. The list includes two independent variables treated as metric.
Slide 18 Evaluating the Assumption of Independence of Variables The tolerance values for all of the independent variables are larger than 0.10: "degree of religious fundamentalism" [fund] (0.876), "description of political views" [polviews] (0.952), "survey respondents who were male" [sex_1] (0.976), "survey respondents who were white" [hhrace_1] (0.970) and "survey respondents who were black" [hhrace_2] (0.896). Multicollinearity is not a problem in this regression analysis.
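The tolerance values can be checked outside SPSS as well: tolerance is the reciprocal of the variance inflation factor. A minimal sketch with statsmodels, continuing the hypothetical df from the dummy-coding sketch and assuming listwise deletion of missing values; the fitted model object is reused by the diagnostic sketches that follow.

    import statsmodels.api as sm
    from statsmodels.stats.outliers_influence import variance_inflation_factor

    ivs = ["fund", "polviews", "sex_1", "hhrace_1", "hhrace_2"]
    data = df[["empathy7"] + ivs].dropna()     # listwise deletion, as in SPSS
    X = sm.add_constant(data[ivs])
    model = sm.OLS(data["empathy7"], X).fit()  # full model, reused below

    for i, name in enumerate(ivs, start=1):    # column 0 is the constant
        tolerance = 1.0 / variance_inflation_factor(X.values, i)
        print(f"{name}: tolerance = {tolerance:.3f}")  # flag values <= 0.10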
Slide 19 Evaluating the Assumption of Linearity In the lack of fit test, the probability of the F test statistic (F = 1.66) was p = .004, less than or equal to the alpha level of significance of 0.01. The null hypothesis that "a linear regression model is appropriate" is rejected. The research hypothesis that "a linear regression model is not appropriate" is supported by this test. The assumption of linearity is violated.
Slide 20 Evaluating the Assumption of Homoscedasticity The homogeneity of error variance is tested with the Breusch-Pagan test. For this analysis, the Breusch-Pagan statistic was 17.643. The probability of the statistic was p = .003, which was less than or equal to the alpha level for diagnostic tests (p = .010). The null hypothesis that "the variance of the residuals is the same for all values of the independent variable" is rejected. The research hypothesis that "the variance of the residuals is different for some values of the independent variable" is supported. The assumption of homogeneity of error variance is violated.
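statsmodels ships a version of the Breusch-Pagan test; a minimal sketch, continuing the model object fitted in the tolerance sketch above. The script's variant of the test may differ in detail, so the statistics will not necessarily match SPSS exactly.

    from statsmodels.stats.diagnostic import het_breuschpagan

    lm_stat, lm_p, f_stat, f_p = het_breuschpagan(model.resid, model.model.exog)
    print(f"Breusch-Pagan LM = {lm_stat:.3f}, p = {lm_p:.3f}")
    # Violated when p <= .01, the alpha level for diagnostic tests.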
Slide 21 Evaluating the Assumption of Normality Regression analysis assumes that the errors or residuals are normally distributed. The Shapiro-Wilk test of studentized residuals yielded a statistical value of 0.868, which had a probability of p < .001, which was less than or equal to the alpha level for diagnostic tests (p = .010). The null hypothesis that "the distribution of the residuals is normally distributed" is rejected. The research hypothesis that "the distribution of the residuals is not normally distributed" is supported. The assumption of normality of errors is violated.
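The same check can be reproduced with scipy and statsmodels; a minimal sketch, again continuing the hypothetical model object and using externally studentized residuals (an assumption on our part; the slide does not say which studentization the script uses).

    from scipy import stats
    from statsmodels.stats.outliers_influence import OLSInfluence

    studentized = OLSInfluence(model).resid_studentized_external
    w, p = stats.shapiro(studentized)
    print(f"Shapiro-Wilk W = {w:.3f}, p = {p:.4f}")  # violated if p <= .01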
Slide 22 Evaluating the Assumption of Independence of Errors Regression analysis assumes that the errors (residuals) are independent and there is no serial correlation. No serial correlation implies that the size of the residual for one case has no impact on the size of the residual for the next case. The Durbin-Watson statistic tests for the presence of serial correlation among the residuals. The value of the Durbin-Watson statistic ranges from 0 to 4. As a general rule of thumb, the residuals are not correlated if the Durbin-Watson statistic is approximately 2, and an acceptable range is 1.50 - 2.50. The Durbin-Watson statistic for this problem is 1.94, which falls within the acceptable range from 1.50 to 2.50. The analysis satisfies the assumption of independence of errors.
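A one-line equivalent in statsmodels, continuing the hypothetical model object:

    from statsmodels.stats.stattools import durbin_watson

    dw = durbin_watson(model.resid)
    print(f"Durbin-Watson = {dw:.2f}")  # acceptable if roughly 1.50 <= dw <= 2.50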
Slide 23 Excluding Extreme Outliers to Try to Achieve Normality Three assumptions were violated: linearity, homogeneity of error variance, and normality of the residuals. We will run the analysis without extreme outliers to see if we can satisfy the assumptions with that model. Click on the Exclude extreme outliers button.
Slide 24 Feedback on Extreme Outliers When we clicked on the Exclude extreme outliers button, the script provided feedback that there were no extreme outliers to remove.
Slide 25 Outliers and Extreme Outliers in SPSS Output If we look back at the output from the previous regression analysis, we see that it produced a list of 16 outliers. The absence of a table of extreme outliers implies that none were found for the analysis.
Slide 26 Testing Normality for Variables Treated as Metric: empathy7 Since there are no extreme outliers to remove, we will test transformations of the metric variables. First, highlight the dependent variable, empathy7. Second, click on the Test normality button.
Slide 27 Test of Normality in SPSS Output The logarithmic transformation of "accuracy of the description of being a pretty soft-hearted person" [empathy7] was used in this analysis because it had the largest value of the Shapiro-Wilk statistic (0.758).
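The selection rule can be mimicked outside the script: compute the Shapiro-Wilk W for the original variable and for common transformations, then keep the transformation with the largest W (the script requires an improvement of at least .01, as slide 30 notes). A minimal sketch, continuing the hypothetical df and assuming empathy7 takes only positive values (log, square root, and inverse require that):

    import numpy as np
    from scipy import stats

    x = df["empathy7"].dropna()
    candidates = {
        "none": x,
        "logarithm": np.log(x),
        "square root": np.sqrt(x),
        "inverse": 1.0 / x,
    }
    for name, values in candidates.items():
        w, _ = stats.shapiro(values)
        print(f"{name}: W = {w:.3f}")
    # Keep a transformation only if its W exceeds the untransformed W by >= .01.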
Slide 28 Adding the Transformation of Empathy7 to Data Set First, make certain the variable that we want to transform is still selected. Second, select the option button for the transformation we want to use - Logarithm in this example. Third, click on the Apply transformation button.
Slide 29 Testing Normality for Variables Treated as Metric: polviews The name of the variable in the Dependent variable text box is changed to reflect the transformation. Next, we select the first Metric independent variable to test for normality, polviews. And, we click on the Test normality button.
Slide 30 Test of Normality in SPSS Output No transformation of "description of political views" [polviews] was used in this analysis. None of the transformations had a value of the Shapiro-Wilk statistic that was at least 0.01 larger than the value for description of political views (0.924).
Slide 31 Testing Normality for Variables Treated as Metric: fund Next, we select the second Metric independent variable to test for normality, fund. And, we click on the Test normality button.
Slide 32 Test of Normality in SPSS Output No transformation of "degree of religious fundamentalism" [fund] was used in this analysis. None of the transformations had a value of the Shapiro-Wilk statistic that was at least 0.01 larger than the value for degree of religious fundamentalism (0.809).
Slide 33 The Hierarchical Regression with Transformed Variables Having completed the tests for normality and having transformed empathy7, we click on the OK button to run the regression with the transformed variables. If none of the transformations had improved normality, this would be the same model as the one with untransformed variables and we would conclude that we cannot improve the model with transformations.
Slide 34 Evaluating the Assumption of Independence of Variables The tolerance values for all of the independent variables are larger than 0.10: "degree of religious fundamentalism" [fund] (0.876), "description of political views" [polviews] (0.952), "survey respondents who were male" [sex_1] (0.976), "survey respondents who were white" [hhrace_1] (0.970) and "survey respondents who were black" [hhrace_2] (0.896). Multicollinearity is not a problem in this regression analysis.
Slide 35 Evaluating the Assumption of Linearity In the lack of fit test, the probability of the F test statistic (F = 1.51) was p = .015, greater than the alpha level of significance of 0.01. The null hypothesis that "a linear regression model is appropriate" is not rejected. The research hypothesis that "a linear regression model is not appropriate" is not supported by this test. The assumption of linearity is satisfied.
Slide 36 Evaluating the Assumption of Homoscedasticity The homogeneity of error variance is tested with the Breusch-Pagan test. For this analysis, the Breusch-Pagan statistic was 8.811. The probability of the statistic was p = .117, which was greater than the alpha level for diagnostic tests (p = .010). The null hypothesis that "the variance of the residuals is the same for all values of the independent variable" is not rejected. The research hypothesis that "the variance of the residuals is different for some values of the independent variable" is not supported. The assumption of homogeneity of error variance is satisfied.
Slide 37 Evaluating the Assumption of Normality Regression analysis assumes that the errors or residuals are normally distributed. The Shapiro-Wilk test of studentized residuals yielded a statistical value of 0.908, which had a probability of p < .001, which was less than or equal to the alpha level for diagnostic tests (p = .010). The null hypothesis that "the distribution of the residuals is normally distributed" is rejected. The research hypothesis that "the distribution of the residuals is not normally distributed" is supported. The assumption of normality of errors is violated.
Slide 38 Evaluating the Assumption of Independence of Errors Regression analysis assumes that the errors (residuals) are independent and there is no serial correlation. No serial correlation implies that the size of the residual for one case has no impact on the size of the residual for the next case. The Durbin-Watson statistic tests for the presence of serial correlation among the residuals. The value of the Durbin-Watson statistic ranges from 0 to 4. As a general rule of thumb, the residuals are not correlated if the Durbin-Watson statistic is approximately 2, and an acceptable range is 1.50 - 2.50. The Durbin-Watson statistic for this problem is 1.96, which falls within the acceptable range from 1.50 to 2.50. The analysis satisfies the assumption of independence of errors.
Slide 39 Excluding Extreme Outliers One assumption was violated: normality of the residuals. We will run the analysis without extreme outliers to see if we can satisfy all of the assumptions with that model. Click on the Exclude extreme outliers button.
Slide 40 Feedback on Extreme Outliers in the Script When we clicked on the Exclude extreme outliers button, the script provided feedback that there were no extreme outliers to remove.
Slide 41 Outliers and Extreme Outliers in SPSS Output If we look back at the output from the regression analysis with the transformed dependent variable, we see that it produced a list of 14 outliers. The absence of a table of extreme outliers implies that none were found for the analysis.
Slide 42 Marking the Check Boxes for Regression Assumptions The model including the original variables and all cases will be interpreted. The violations of assumptions should be mentioned as limitations to the analysis. None of the models using transformed variables and excluding extreme outliers satisfied all of the assumptions of multiple regression. None of the check boxes are marked.
Slide 43 Removing the Transformed Variable Since the transformation of the dependent variable did not enable us to satisfy all of the assumptions, we will remove it from the data set. First, make certain the transformed variable is highlighted. Second, click on the No transformation option button to select it. Third, click on the Apply transformation button, which will restore the original variable name and delete the transformed variable from the data set.
Slide 44 Retaining the Dummy-coded Variables for the Hierarchical Regression To retain the dummy-coded variables so that they can be used in the hierarchical regression, clear the check box for Delete variables created in this analysis. Click on the Cancel button to close the script dialog box.
Slide 45 The Dummy-coded Variables in the Data Editor If we scroll to the right in the data editor, we see that the dummy-coded variables have been retained.
Slide 46 Running the Hierarchical Regression Model in SPSS To initiate a hierarchical regression in SPSS, select the Regression > Linear command from the Analyze menu.
Slide 47 Including the Dependent and Control Variables in the Analysis First, move the dependent variable, empathy7, to the Dependent text box. Second, move the control variables (hhrace_1, hhrace_2, and polviews) to the Independent(s) list box. Third, click on the Next button to start a new block of variables. We will put the predictor independent variables in the second block.
Slide 48 Adding the Predictor Independent Variables to the Analysis First, note that the block label changed to Block 2 of 2 when we clicked on the Next button. Second, note that the list box for independent variables is empty. To see the list we entered in Block 1, click on the Previous button. To add the two predictors (sex_1 and fund) to the analysis, highlight the variable names and click on the right arrow button.
Slide 49 Requesting Additional Statistical Output To request additional statistical output, including the very important R squared change statistic, click on the Statistics button.
Slide 50 Additional Statistical Output Estimates and Model fit are selected by default. We mark the check boxes for Durbin-Watson, R squared change, Descriptives, and Collinearity diagnostics. When we have selected the statistics, we click on the Continue button to close the dialog box. We could bypass the selection of Collinearity diagnostics and Durbin-Watson, since the script requested these statistics and we have already evaluated our variables' ability to satisfy the assumptions of regression.
Slide 51 Completing the Request for the Hierarchical Regression We have completed the specifications for the hierarchical regression. Click on the OK button to generate the output.
Slide 52 The Statement for Sample Size The statement for sample size indicates that the available data satisfies the requirement.
Slide 53 Satisfying the Sample Size Requirement The number of cases available for the analysis was 291, satisfying the requirement for 110 cases based on the rule of thumb that the required number of cases should be the larger of the number of independent variables × 8 + 50 or the number of independent variables + 105. The analysis included 5 independent variables: 2 for the covariates "degree of religious fundamentalism" [fund] and "description of political views" [polviews], plus 1 dummy-coded variable for the factor "sex" [sex] and 2 dummy-coded variables for the factor "race of the household" [hhrace].
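As a quick arithmetic check of the rule of thumb (the rule itself is as stated above; only the snippet is new):

    # Larger of (8 * k + 50) and (k + 105), with k = 5 independent variables
    k = 5
    required = max(8 * k + 50, k + 105)
    print(required)  # 110; the available n of 291 satisfies it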
Slide 54 Marking the Check Box for Sample Size Requirement Since we satisfied the needed sample size for regression, we mark the check box.
Slide 55 Statement about Predictors and Controls as Independent Variables The next statement identifies the predictor independent variables and the control independent variables, with the expectation that there will be a statistically significant relationship between the predictors and the dependent variable.
Slide 56 Significance of the Predictor Variables The increase in R² (.05) associated with adding "degree of religious fundamentalism" [fund] and "survey respondents who were male" [sex_1] after controlling for "description of political views" [polviews], "survey respondents who were white" [hhrace_1], and "survey respondents who were black" [hhrace_2] was statistically significant (F(2, 285) = 7.32, p < .001). The null hypothesis that "the increase in R² by adding the predictor variables = 0" is rejected, supporting the research hypothesis that "the increase in R² by adding the predictor variables was greater than 0".
Slide 57 Strength of the Relationship of the Predictor Variables Applying Cohen's criteria for effect size (less than .10 = trivial; .10 up to .30 = weak; .30 up to .50 = moderately strong; .50 or greater = strong), the increase was correctly characterized as weak (Multiple R = .220). NOTE: the Multiple R for model 2 includes the Multiple R for model 1, so it does not represent the change in Multiple R. To compute the change in R, find the square root of R² Change (.048), which equals .220.
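A minimal sketch of that computation, including the adjective lookup from the criteria listed above (the thresholds are Cohen's as stated on this slide; the rounding note is an observation, not from the original):

    import math

    r2_change = 0.048
    r_change = math.sqrt(r2_change)  # ~0.219; the slide's .220 reflects the
                                     # unrounded R-squared change from SPSS
    if r_change < 0.10:
        size = "trivial"
    elif r_change < 0.30:
        size = "weak"
    elif r_change < 0.50:
        size = "moderately strong"
    else:
        size = "strong"
    print(f"Change in Multiple R = {r_change:.3f} ({size})")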
Slide 58 Marking the Statement for Predictors and Controls as Independent Variables The increase in R² was statistically significant when the predictors were added to the analysis. In addition, the adjective used to describe the change in Multiple R was correct. We mark the check box.
Slide 59 Statements about Individual Relationships The remaining statements are about the relationship between individual independent variables and the dependent variable. Though we are not always concerned with the significance of control variables, we will interpret them in these problems.
Slide 60 Statements about Individual Relationships The first statement focuses on the relationship between religious fundamentalism and empathy.
Slide 61 Output Before and After Predictors are Included. Though there are research questions that focus on the change in significance for a variable in model 2 versus model 1, our questions are concerned with the relationships when all variables are included. We base our interpretation of the individual relationships on the output for Model 2.
Slide 62 Religious Fundamentalism and Empathy - 1 The statement that "survey respondents who were more liberal religiously said that being a pretty soft-hearted person was a less accurate description of their feelings" is correct. The individual relationship between the predictor variable "degree of religious fundamentalism" [fund] and the dependent variable "accuracy of the description of being a pretty soft-hearted person" [empathy7] was statistically significant, β = -.184, t(285) = -3.00, p = .003. We reject the null hypothesis that the partial slope (b coefficient) for the variable degree of religious fundamentalism = 0 and conclude that the partial slope (b coefficient) for the variable degree of religious fundamentalism is not equal to 0.
Slide 63 Religious Fundamentalism and Empathy - 2 The negative sign of the b coefficient (-0.270) means that higher values of degree of religious fundamentalism were associated with lower values of accuracy of the description of being a pretty soft-hearted person.
Slide 64 Marking the Statement on Religious Fundamentalism and Empathy The relationship in the statement was statistically significant and the direction of the relationship is stated correctly. The check box is marked.
Slide 65 Statement on Political Conservatism and Empathy The next statement focuses on the relationship between political conservatism and empathy.
Slide 66 Political Conservatism and Empathy The statement that "survey respondents who were more conservative said that being a pretty soft-hearted person was a more accurate description of their feelings" is not correct. The individual relationship between the control variable "description of political views" [polviews] and the dependent variable "accuracy of the description of being a pretty soft-hearted person" [empathy7] was not statistically significant, β = .022, t(285) = .37, p = .713. We are not able to reject the null hypothesis that the partial slope (b coefficient) for the variable "description of political views" [polviews] = 0.
Slide 67 Marking the Check Box for Political Conservatism and Empathy Since the relationship was not statistically significant, the check box is not marked.
Slide 68 Statement on the Relationship between Sex and Empathy The next statement focuses on the relationship between gender and empathy.
Slide 69 Sex and Empathy - 1 The statement that "survey respondents who were male said that being a pretty soft-hearted person was a less accurate description of their feelings compared to the average for all survey respondents" is correct. The individual relationship between the predictor variable "survey respondents who were male" [sex_1] and the dependent variable "accuracy of the description of being a pretty soft-hearted person" [empathy7] was statistically significant, β = -.141, t(285) = -2.43, p = .016. We reject the null hypothesis that the partial slope (b coefficient) for the variable survey respondents who were male = 0 and conclude that the partial slope (b coefficient) for the variable survey respondents who were male is not equal to 0.
Slide 70 Sex and Empathy - 2 The negative sign of the b coefficient (-0.160) means that survey respondents who were male said that being a pretty soft-hearted person was a less accurate description of their feelings compared to the average for all survey respondents.
Slide 71 Marking the Statement on the Relationship between Sex and Empathy7 The relationship in the statement was statistically significant and the direction of the relationship is stated correctly. The check box is marked.
Slide 72 Statements on the Relationship between Race and Empathy The next two statements focus on the relationship between race and empathy. It is possible that one, both, or neither of the statements is correct.
Slide 73 Race and Empathy - 1 The statement that "survey respondents who were white said that being a pretty soft-hearted person was a more accurate description of their feelings compared to the average for all survey respondents" is not correct. The individual relationship between the control variable "survey respondents who were white" [hhrace_1] and the dependent variable "accuracy of the description of being a pretty soft-hearted person" [empathy7] was not statistically significant, β = .102, t(285) = 1.75, p = .081. We are not able to reject the null hypothesis that the partial slope (b coefficient) for the variable survey respondents who were white = 0.
Slide 74 Race and Empathy - 2 The statement that "survey respondents who were black said that being a pretty soft-hearted person was a less accurate description of their feelings compared to the average for all survey respondents" is not correct. The individual relationship between the control variable "survey respondents who were black" [hhrace_2] and the dependent variable "accuracy of the description of being a pretty soft-hearted person" [empathy7] was not statistically significant, β = -.073, t(285) = -1.20, p = .232. We are not able to reject the null hypothesis that the partial slope (b coefficient) for the variable survey respondents who were black = 0.
Slide 75 Marking Statements on the Relationship between Race and Empathy Neither relationship was statistically significant. The check boxes are not marked.
Slide 76 The Problem Graded in Blackboard When the problem was submitted to Blackboard for grading, all of the answers were found to be correct.
Slide 77 Removing the Variables Created by the Script The dummy-coded variables are still in the data set. To remove them, highlight the columns containing them and select the Clear command from the Edit menu.
Slide 78 Logic Diagram for Solving Homework Problems: Level of Measurement [Flowchart: Is the level of measurement ok? If no, do not mark the check box, mark "Inappropriate application of the statistic," and stop. If an ordinal variable is treated as metric, consider the limitation in the discussion of findings.]
Slide 79 Regression Assumptions for Model 1: Original Variables, All Cases [Flowchart: Run the script to evaluate the regression assumptions for Model 1 (original variables, all cases). If there is a problem with multicollinearity (independence of variables), stop. If linearity, homoscedasticity, normality, and independence of errors are all ok, mark the check box for Model 1; otherwise, do not mark the check box for Model 1.]
Slide 80 Regression Assumptions for Model 2: Original Variables, Excluding Extreme Outliers [Flowchart: If no extreme outliers are identified in Model 1, skip this step. Otherwise, run the script to evaluate the regression assumptions for Model 2 (original variables, excluding extreme outliers). If there is a problem with multicollinearity, stop. If linearity, homoscedasticity, normality, and independence of errors are all ok, mark the check box for Model 2; otherwise, do not mark the check box for Model 2.]
Slide 81 Regression Assumptions for Model 3: Transformed Variables, All Cases [Flowchart: Run the script to evaluate the regression assumptions for Model 3 (transformed variables, all cases). If there is a problem with multicollinearity, stop. If linearity, homoscedasticity, normality, and independence of errors are all ok, mark the check box for Model 3; otherwise, do not mark the check box for Model 3.]
Slide 82 Regression Assumptions for Model 4: Transformed Variables, Excluding Extreme Outliers [Flowchart: If no extreme outliers are identified in Model 3, skip this step. Otherwise, run the script to evaluate the regression assumptions for Model 4 (transformed variables, excluding extreme outliers). If there is a problem with multicollinearity, stop. If linearity, homoscedasticity, normality, and independence of errors are all ok, mark the check box for Model 4; otherwise, do not mark the check box for Model 4, interpret Model 1, and mention the violations of assumptions as limitations to the analysis.]
Slide 83 Sample Size Requirement [Flowchart: Is the sample size ok? If yes, mark the check box for correct sample size; if no, do not mark the check box and consider the limitation in the discussion of findings. Then run the hierarchical regression with the control and predictor variables.]
Slide 84 Overall Relationship – Change in R² [Flowchart: Using the alpha for statistical tests, is the change in R² significant (F-test sig ≤ α)? If no, do not mark the check box for the overall relationship and stop. If yes, is the correct adjective used to describe the effect size? If no, do not mark the check box for the overall relationship; if yes, mark the check box for the overall relationship.]
Slide 85 Relationships with Individual Controls and Predictors [Flowchart: Using the alpha for statistical tests, is the individual relationship significant (t-test sig ≤ α)? If no, do not mark the check box for the individual relationship. If yes, is the direction of the relationship interpreted correctly? If yes, mark the check box for the individual relationship; if no, do not. Repeat the evaluation of individual relationships for the other variables.]