Introduction to testing statistical significance of interactions Jane E. Miller, PhD The Chicago Guide to Writing about Multivariate Analysis, 2nd Edition.



Overview
Testing statistical significance of individual coefficients
Testing effect of interaction terms on overall model fit
Approaches to testing statistical significance of interactions
  – Alternative model specification
  – The “TEST” statement
  – Simple slopes calculations for compound coefficients
  – Changing the reference category

Statistical significance of an interaction
To evaluate the statistical significance of an interaction, use a combination of approaches:
t-tests for individual coefficients
F-tests for the collective contribution of a set of terms to the overall fit of an OLS model
The corresponding statistics for a logistic model are
  – z-statistics for individual coefficients
  – –2 log likelihood statistic for overall model fit
Methods for testing differences among values of the variables involved in the interaction
  – Contrasts within the overall shape of the pattern

Estimated coefficients from an OLS model of birth weight in grams

                                  Model A: Without interactions    Model B: With interactions
                                  β           t-statistic          β           t-statistic
Main effects terms
Race (ref. = non-Hisp. white)
  Non-Hispanic Black (NHB)        –172.6**    –9.86                –168.1**    –5.66
  Mexican American (MA)           –23.1       –1.02                –104.2**    –2.16
Mother’s ed. (ref. = > HS)
  Less than high school (<HS)     –55.5**     –2.88                –54.2**     –2.35
  High school graduate (=HS)      –53.9**     –3.64                –62.0**     –3.77
Interactions: race & education
  NHB_<HS                                                          –38.5       –0.88
  MA_<HS
  NHB_=HS
  MA_=HS
F-statistic                       94.1                             65.6
Degrees of freedom (df)           9                                13

Statistical significance of βs on individual interaction terms
Statistical significance of the coefficient on each interaction term is assessed as for any other independent variable in a multivariate regression model
In the example from the previous slide, none of the βs on the individual interaction terms between race/ethnicity and mother’s education achieves statistical significance as assessed by its t-statistic
  – E.g., β_NHB_<HS = –38.5, with a t-statistic of –0.88
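The reading of the t-statistic in the example above can be sketched in a few lines of Python. This is only an illustration: the coefficient and t-statistic come from the slide, while the 1.96 cutoff assumes a large-sample, two-sided test at the 0.05 level, and the standard error is not reported on the slide but implied by β / t.

```python
# Values for the NHB_<HS interaction term, taken from Model B on the slide.
beta = -38.5
t_stat = -0.88

# Assumption: large-sample, two-sided test at the 0.05 level (critical |t| ~ 1.96).
CRITICAL_T = 1.96

# The standard error is not shown on the slide; it is implied by beta / t.
implied_se = beta / t_stat

# A coefficient is conventionally called significant when |t| exceeds the cutoff.
is_significant = abs(t_stat) > CRITICAL_T

print(f"implied SE = {implied_se:.2f}")
print(f"|t| = {abs(t_stat):.2f}, significant at 0.05: {is_significant}")
```

The same comparison applies to any other individual coefficient in the table, which is why it says nothing yet about sums of coefficients.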

Overall shape of an interaction
But recall that βs on main effect and interaction terms cannot be interpreted in isolation from one another
E.g., in a model of birth weight with an interaction between race and education, the difference in birth weight for non-Hispanic black infants born to mothers with < HS compared to the reference category involves βs on three variables
  = β_NHB + β_<HS + β_NHB_<HS
More than one β is involved in this calculation, so looking only at the statistical significance of each of those three βs does not tell us the statistical significance of differences between groups defined by combinations of the two IVs in the interaction
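As a small illustration of the three-coefficient sum described above, the following Python sketch adds up the Model B coefficients from the table of estimated coefficients. The dictionary keys are just labels chosen for this example.

```python
# Model B coefficients (grams) from the table of estimated coefficients.
model_b = {
    "NHB": -168.1,     # main effect: non-Hispanic Black
    "<HS": -54.2,      # main effect: mother has less than high school
    "NHB_<HS": -38.5,  # interaction term
}

# Difference implied by the two main effects alone.
main_effects_only = round(model_b["NHB"] + model_b["<HS"], 1)

# Difference from the reference category once the interaction term is included.
with_interaction = round(main_effects_only + model_b["NHB_<HS"], 1)

print("main effects only:", main_effects_only)
print("main effects + interaction:", with_interaction)
```

The sum is a point estimate only; its statistical significance cannot be read off the three separate t-statistics.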

What do inferential statistics for individual terms in a model tell us?
If the coefficient on an interaction term is statistically significant in a model that includes the corresponding main effects terms
  – We know only that that combination of characteristics has a joint effect on the DV over and above the main effects
E.g., if β_NHB_<HS is statistically significant in a model of birth weight that also includes the main effects of education and race
  – We know only that that combination of race and education has a different effect on birth weight than would be implied by the βs on the main effects of NHB and <HS alone

Assessing effects of interactions on overall model goodness-of-fit (GOF)
To assess whether the interaction terms collectively improve overall model fit, calculate the difference in F-statistics for models with and without those terms
  – Model A: Main effects only
  – Model B: Main effects and interactions
Compare against the critical value of the F-statistic for the relevant number of degrees of freedom (df)
  – df for the numerator is based on the difference in the number of covariates in models with and without interaction terms
  – df for the denominator depends on the sample size
If the difference in F > the critical value, the interaction terms statistically significantly improve the overall fit of the model
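For comparison, a commonly used way to test whether a set of added terms improves the fit of nested OLS models is the extra-sum-of-squares (partial) F-test. This is not the difference-in-F calculation described on the slide; it is shown here only as a hedged sketch. The function name and all input values below are hypothetical and do not come from the birth weight example.

```python
def partial_f(sse_restricted, sse_full, n_added_terms, df_resid_full):
    """Extra-sum-of-squares F statistic for two nested OLS models.

    sse_restricted: residual sum of squares of the model without the added terms
    sse_full:       residual sum of squares of the model with the added terms
    n_added_terms:  number of terms added (numerator df)
    df_resid_full:  residual df of the full model (denominator df)
    """
    if sse_full <= 0 or n_added_terms <= 0 or df_resid_full <= 0:
        raise ValueError("sse_full, n_added_terms and df_resid_full must be positive")
    numerator = (sse_restricted - sse_full) / n_added_terms
    denominator = sse_full / df_resid_full
    return numerator / denominator


# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
f_stat = partial_f(sse_restricted=120.0, sse_full=100.0,
                   n_added_terms=4, df_resid_full=200)
print(f_stat)
```

The resulting statistic would be compared against an F distribution with `n_added_terms` and `df_resid_full` degrees of freedom.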

Example difference in model GOF
The difference in F-statistics between models A and B
  = F model A – F model B = 94.1 – 65.6 = 28.5
The difference in the number of degrees of freedom between models A and B
  = 13 df – 9 df = 4 df
For an F distribution with
  – 4 degrees of freedom for the numerator
  – ∞ degrees of freedom for the denominator (based on the sample size used to estimate the model)
The critical value for p = is 10.8
The difference in F exceeds the critical value (28.5 > 10.8)
  – Thus we conclude that inclusion of interaction terms improves the overall fit of the model at p <
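The arithmetic on this slide can be reproduced with a short Python sketch. All numbers are the ones quoted on the slide, including the critical value of 10.8; the variable names are chosen for this example.

```python
# F-statistics and model degrees of freedom as quoted on the slide.
f_model_a, df_model_a = 94.1, 9    # Model A: main effects only
f_model_b, df_model_b = 65.6, 13   # Model B: main effects and interactions

# Critical value as quoted on the slide.
critical_value = 10.8

# Round to one decimal place to match the precision of the inputs.
f_difference = round(f_model_a - f_model_b, 1)
df_difference = df_model_b - df_model_a
exceeds_critical = f_difference > critical_value

print(f"difference in F = {f_difference}, difference in df = {df_difference}")
print(f"exceeds critical value: {exceeds_critical}")
```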

How can overall fit improve if individual terms aren’t statistically significant?
In models that include several main effect and interaction terms, one or more of those terms may not be needed to capture relevant variation in the DV
  – Could be collapsed into the reference category or combined with other subgroups based on empirical testing
  – Might yield statistical significance for some interaction terms
Models that include many interaction terms may be affected by multicollinearity
  – Can explain why the t-statistics show a lack of significance even if the F-statistic indicates statistical significance

What do inferential statistics for individual terms in a model tell us?
If β_NHB_<HS is statistically significant in a model of birth weight that also includes the main effects of education and race
  – We know only that that combination of race and education has a different effect on birth weight than would be implied by the βs on the main effects of NHB and <HS alone

What don’t inferential statistics for individual terms in a model tell us?
The separate test statistics for the individual main effect and interaction βs cannot, on their own, assess the statistical significance of differences in predicted birth weight
  – For example, for non-Hispanic blacks born to mothers with < HS compared with non-Hispanic whites born to mothers with > HS (the reference category)
  – Across racial/ethnic groups within the < HS group
  – Across education levels among non-Hispanic blacks
Remember: each of these comparisons involves comparing values calculated from more than one β, e.g.,
  – For non-Hispanic black < HS: β_<HS + β_NHB + β_NHB_<HS

Calculating overall effect for non-Hispanic blacks with < HS education
β_NHB = –168*
β_<HS = –54*
β_NHB_<HS = –39
Overall effect = β_NHB + β_<HS + β_NHB_<HS = (–168) + (–54) + (–39) = –261
We want to know whether that sum is statistically significantly different from 0; e.g., no difference in birth weight compared to infants born to non-Hispanic white women with more than a high school education (= the reference category)
* p < 0.05 based on t-tests for individual coefficients
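Testing whether a sum of coefficients differs from 0 needs the variance of the sum, which depends on the covariances among the estimated coefficients as well as their individual standard errors. The sketch below shows the general calculation under that assumption. The function name and the toy coefficient and covariance values are hypothetical, since the slides do not report a covariance matrix for the birth weight model.

```python
import math


def linear_combination_test(coefs, cov, weights):
    """Estimate, standard error, and t-statistic for sum(w_i * b_i).

    The variance of the combination is w' V w, where V is the
    variance-covariance matrix of the estimated coefficients.
    """
    n = len(weights)
    estimate = sum(w * b for w, b in zip(weights, coefs))
    variance = sum(weights[i] * cov[i][j] * weights[j]
                   for i in range(n) for j in range(n))
    se = math.sqrt(variance)
    return estimate, se, estimate / se


# Hypothetical toy values (NOT from the birth weight model).
coefs = [-2.0, -3.0, -7.0]
cov = [
    [4.0, 1.0, 0.5],
    [1.0, 4.0, 0.5],
    [0.5, 0.5, 4.0],
]

# Weights of 1 on each term give the simple sum of the three coefficients.
estimate, se, t_stat = linear_combination_test(coefs, cov, [1, 1, 1])
print(estimate, se, t_stat)
```

Ignoring the off-diagonal covariances would give a variance of 12 rather than 16 in this toy case, which is one reason the individual t-tests cannot stand in for a test of the sum.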

Substantive question behind the interaction model: “Does race modify the association between education and birth weight?”
The bar for each race/education combination involves the sum of the intercept and one to three other coefficients
t-tests for individual βs won’t tell us about statistical significance of differences in those sums

Tests of differences across groups other than the reference category
Conduct formal inferential tests of whether the predicted value of the dependent variable is statistically significantly different across categories
Possible approaches
  – Use “TEST” statement to contrast coefficients
  – Revise the model specification
      Estimate a model with dummies for all interaction combinations
      Reestimate the model with different reference categories (see separate podcast on that topic)
  – Conduct post-hoc tests of differences between βs from one model (see separate podcast on simple slopes)
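One of the approaches listed above, estimating a model with dummies for all interaction combinations, can be sketched as follows. This is a minimal illustration of building the indicator variables only, not of estimating the model. The function name and the category labels (including "NHW" for non-Hispanic white) are assumptions made for this example.

```python
from itertools import product

# Category labels; "NHW" (non-Hispanic white) is a label assumed for this sketch.
RACES = ["NHW", "NHB", "MA"]
EDUCATION = [">HS", "<HS", "=HS"]


def combination_dummies(race, education, reference=("NHW", ">HS")):
    """Return one 0/1 indicator per race-by-education combination.

    The reference combination is omitted, so a case in the reference
    category has all indicators equal to 0.
    """
    return {
        f"{r}_{e}": int((r, e) == (race, education))
        for r, e in product(RACES, EDUCATION)
        if (r, e) != reference
    }


row = combination_dummies("NHB", "<HS")
print(row)

# Changing the reference category only changes which combination is omitted.
row_new_ref = combination_dummies("NHW", ">HS", reference=("NHB", "<HS"))
print(row_new_ref)
```

With this coding, each coefficient compares one combination directly with the omitted combination, so its own t-statistic tests that contrast; re-running with a different `reference` gives contrasts against another group.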

Summary
Inferential tests for individual coefficients in a regression model test whether each β is statistically significantly different from 0
In models using main effects and interaction terms, calculating the overall shape of an interaction requires summing several βs
  – Tests of the individual component βs don’t address statistical significance of differences in the overall interaction pattern

Suggested resources
Miller, J. E. The Chicago Guide to Writing about Multivariate Analysis, 2nd Edition. University of Chicago Press, chapters 11, 15, and 16.
Cohen, Jacob, Patricia Cohen, Stephen G. West, and Leona S. Aiken. Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences, 3rd Edition. Florence, KY: Routledge, chapters 7 and 9.

Suggested online resources
Podcasts on
  – Specifying models to test for interactions
  – Calculating the overall shape of an interaction pattern from regression coefficients
  – Comparing overall goodness-of-fit across models
  – Approaches to testing statistical significance of interactions
  – Conducting post-hoc tests of compound coefficients using the simple slopes technique
  – Using alternative reference categories to test statistical significance of interactions

Contact information
Jane E. Miller, PhD
Online materials available at