Published by Sigve Engen. Modified over 5 years ago.
1
Overview of categorical by continuous interactions: Part II: Variables, specifications, and calculations Interactions in regression models occur when the association between one independent variable and the dependent variable DIFFERS depending on values of a second independent variable. The example we will trace in this series of lectures investigates whether the association between socioeconomic status and birth weight is the same for all racial/ethnic groups, where birth weight is our dependent variable (or outcome), and race and socioeconomic status are our independent variables (or predictors). Later in this lecture, I will mention a couple of other examples of questions that can be addressed using interaction specification in a regression model. Jane E. Miller, PhD
2
Continued from Part I Part I covers
Definitions and concepts for interactions Possible shapes of patterns for interactions between one categorical and one continuous independent variable
3
Creating variables and specifying models to test for interactions involving continuous independent variables This lecture is the third in the series on interactions. Before you watch this one, please watch the introduction to interactions and visualizing shapes of interaction patterns.
4
Interaction between a continuous and a categorical independent variable (IV)
Example: Race and income-to-poverty ratio (IPR).
Race is a 2-category IV, classified as non-Hispanic black (NHB) or non-Hispanic white (NHW).
IPR is a continuous variable calculated as annual family income (in $) divided by the Federal Poverty Level for a family of that size and age composition. IPR ranges from 0 to more than 10 in this sample. The Federal Poverty Level for a family of 2 adults and 2 children in 2010 was about $22,000.
Next, we turn to how to define variables for an interaction between one continuous and one categorical independent variable. They are coded as shown here.
5
Independent variables: continuous by continuous interaction
Mother’s age at time of child’s birth, in years. One continuous variable for the main effect: age.
Family income-to-poverty ratio, in multiples of the Federal Poverty Level. One continuous variable for the main effect: IPR.
Interaction: Mother’s age and IPR. Age_IPR = age × IPR. The resulting interaction term variable will also be continuous.
Finally, the specification for an interaction between two continuous IVs …
6
Model specification to test an interaction between one continuous and one categorical independent variable
For a model with an interaction between two independent variables, we need ALL of the main effects and interaction term variables related to those two independent variables. E.g., for a model of birth weight by race and IPR, include the main effect and interaction terms related to race and the family income-to-poverty ratio: BW = f (NHB, IPR, NHB_IPR)
Consistent with what we saw for the categorical by categorical interaction, the model specification for the categorical by continuous interaction includes all of the pertinent main effects and interaction terms: in this case, two main effects variables and one interaction variable.
7
Coding of variables The NHB main effect variable is defined as in the previous example (of categorical by categorical interaction). 1 = non-Hispanic black. 0 = all others, the reference category, in this example, non-Hispanic white. However, for a continuous variable like income that takes on many possible numeric values, it doesn’t make sense to create a lot of dummy variables. Instead, use income-poverty ratio in its continuous form. The source variable RACE has already been used to create the main effect dummy variable NHB, which we can use again.
8
Calculating an interaction term from a dummy and a continuous main effects term
The value of the interaction term variable is defined as the product of the two component main effects variables: X1_X2 = X1 × X2. The result will be one continuous interaction term variable.
Thus NHB_IPR is the product of NHB and IPR. If NHB = 1 and IPR = 2.3, then the interaction term NHB_IPR = 2.3. If NHB = 0 and IPR = 2.3, then NHB_IPR = 0.
The way we implement this is to create a new variable, called an interaction term, that is calculated as the product of the two component variables. E.g., the interaction term X1_X2 takes on the value X1 × X2 for each case. I often use the naming convention [read]
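The product rule on this slide can be sketched in a few lines of Python. This is a minimal illustration, not code from the lecture. The function name `interaction_term` is an assumption introduced here, and the two calls reproduce the worked values given above (IPR = 2.3 for a non-Hispanic black and a non-Hispanic white infant).

```python
def interaction_term(nhb, ipr):
    """Return NHB_IPR, the product of the dummy NHB and the continuous IPR."""
    if nhb not in (0, 1):
        raise ValueError("NHB is a dummy variable and must be 0 or 1")
    return nhb * ipr

# Worked values from the slide.
black_case = interaction_term(1, 2.3)  # NHB = 1, so NHB_IPR takes the value of IPR
white_case = interaction_term(0, 2.3)  # NHB = 0, so NHB_IPR is 0
print(black_case, white_case)
```

Because the dummy is either 0 or 1, the interaction term is 0 for every case in the reference category and equals IPR for every other case.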
9
Coding of main effects and interaction term variables: race and IPR
Case characteristics          Main effects terms     Interaction term
(SELECTED VALUES)             NHB      IPR           NHB_IPR
Non-H white & IPR = 0.5       0        0.5           0
Non-H white & IPR = 1.0       0        1.0           0
Non-H white & IPR = 2.0       0        2.0           0
Non-H white & IPR = 5.0       0        5.0           0
Non-H black & IPR = 0.5       1        0.5           0.5
Non-H black & IPR = 1.0       1        1.0           1.0
Non-H black & IPR = 2.0       1        2.0           2.0
Non-H black & IPR = 5.0       1        5.0           5.0
E.g., IPR = 0.5 means income is half the Federal Poverty Level (FPL); IPR = 2.0 means income is twice the FPL. For a two-category race variable (non-Hispanic white = reference category).
This table shows the values taken by the three variables involved in the race by income interaction for our model. Each row is a separate case example, with four rows for whites (top 4 rows of numbers) and four for blacks, each with a selected value of IPR to use as an illustration. I repeat the same set of values of IPR for whites and for blacks, so that is NOT an aspect of my example that differs by race. As in the previous example, race is a two-category variable, so we create one dummy for NHB, which takes on the value 0 for all non-Hispanic white infants (the reference category), and the value 1 for all non-Hispanic black infants.
10
Coding of race and IPR variables: Non-Hispanic white infants
Case characteristics          Main effects terms     Interaction term
                              NHB      IPR           NHB_IPR
Non-H white & IPR = 0.5       0        0.5           0
Non-H white & IPR = 1.0       0        1.0           0
Non-H white & IPR = 2.0       0        2.0           0
Non-H white & IPR = 5.0       0        5.0           0
Now let’s look more closely at the values of the three variables involved in the interaction, specifically for white infants. As we saw on the previous slide, the dummy variable NHB = 0 for all non-Hispanic white infants because they are in the reference category for that variable. IPR, being specified as a continuous variable in this model, does not have a reference category, but rather takes on a set of continuous values. In this table, we look specifically at white families earning half the poverty level, exactly the poverty level, twice the poverty level, and five times the poverty level. For a family earning half the poverty line, the variable IPR takes on the value 0.5. The interaction term NHB_IPR equals 0 for all non-Hispanic white infants because it is the product of the two variables NHB and IPR. Since NHB = 0 for all whites, the product by definition must equal 0 for them as well.
11
Coding of race and IPR variables: Non-Hispanic black infants
Case characteristics          Main effects terms     Interaction term
                              NHB      IPR           NHB_IPR
Non-H black & IPR = 0.5       1        0.5           0.5
Non-H black & IPR = 1.0       1        1.0           1.0
Non-H black & IPR = 2.0       1        2.0           2.0
Non-H black & IPR = 5.0       1        5.0           5.0
This table shows the values of the three variables involved in the interaction, but this time for black infants. The dummy variable NHB = 1 for all non-Hispanic black infants. The IPR variable is calculated using the same logic as on the previous slide, ranging in these examples from 0.5 for an infant born into a family earning half the poverty level, up to 5.0. The interaction term NHB_IPR takes on the SAME VALUE AS THE IPR VARIABLE for all non-Hispanic black infants because it is the product of the two variables NHB and IPR. Since NHB = 1 for all blacks, the product by definition must be equal to the value of the IPR variable. For instance, the interaction term NHB_IPR takes on the value 5.0 for a black infant born to a family earning 5 times the FPL.
12
General equation for predicted value of DV based on an interaction model
The general equation to calculate the predicted value of the dependent variable includes the main effects coefficients, the interaction term coefficients, and the values of the independent variables:
Predicted DV = β0 + (βNHB × NHB) + (βIPR × IPR) + (βNHB_IPR × NHB_IPR)
13
Calculating overall effect of interaction for specific case characteristics
Predicted DV = β0 + (βNHB × NHB) + (βIPR × IPR) + (βNHB_IPR × NHB_IPR)
Each coefficient is multiplied by the value of the associated variable for cases with the characteristics of interest. To see which coefficients pertain to which cases, fill in values of variables for different combinations of race and the income-to-poverty ratio (IPR).
14
Example: Estimated coefficients
                                        β
Intercept                           3,106
Main effect terms
  Non-Hispanic black (NHB)           –177
  Income-to-poverty ratio (IPR)        23
Interaction term
  NHB_IPR                              –5
IPR = family income ($) / Federal Poverty Level for a family of that size and age composition. Reference category: Non-Hispanic whites.
This table displays the estimated coefficients (abbreviated B) from our OLS model of birth weight in grams. I have used color coding to indicate which coefficients are for main effects terms (yellow) and which are for interaction terms (green). The main effects terms for this interaction include a B for the dummy variable for non-Hispanic black, and a main effect of family IPR, measured in multiples of the Federal Poverty Level. This means that the values of the variable used in estimating the model are the original family income in $, divided by the Federal Poverty Level for a family of that size and age composition. The interaction term is a continuous variable that is the product of the non-Hispanic black dummy variable with the continuous measure of IPR. For a more detailed explanation of how the race by IPR interaction variable is calculated, see the module on creating variables for use in interactions. The footnote reminds us of the reference category, which is the omitted category for race.
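To make the role of each coefficient concrete, the general equation (intercept plus each coefficient times the value of its variable) can be written as a small Python function. This is a sketch under the assumption that the four estimates in the table are used exactly as printed. The names `COEF` and `predict_bw` are introduced here for illustration and do not come from the lecture.

```python
# Estimated coefficients from the table (birth weight in grams).
COEF = {"intercept": 3106, "NHB": -177, "IPR": 23, "NHB_IPR": -5}

def predict_bw(nhb, ipr):
    """Predicted birth weight from the main effects and interaction terms."""
    nhb_ipr = nhb * ipr  # interaction term = product of the two main effects variables
    return (COEF["intercept"]
            + COEF["NHB"] * nhb
            + COEF["IPR"] * ipr
            + COEF["NHB_IPR"] * nhb_ipr)

# Reference category (non-Hispanic white) and non-Hispanic black infants at IPR = 0.
print(predict_bw(0, 0))
print(predict_bw(1, 0))
```

Passing `nhb=0` zeroes out both the NHB main effect and the interaction term, which is why only the intercept and the IPR coefficient matter for the reference category.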
15
Interpreting the intercept
The intercept β0 from an OLS model is an estimate of the level of the dependent variable when continuous variables take the value 0, for infants in the reference category for all categorical variables.
In a model where the dependent variable is birth weight in grams and the reference category is specified to be non-Hispanic white infants, β0 is an estimate of birth weight when IPR = 0, for non-Hispanic white infants.
16
Review: Coding of main effect and interaction term variables: race and income
Case characteristics                             Main effects terms     Interaction term
(SELECTED VALUES)                                NHB      IPR           NHB_IPR
Non-H white & IPR = 0.0 (reference category)     0        0.0           0.0
Non-H white & IPR = 0.5                          0        0.5           0.0
Non-H white & IPR = 1.0                          0        1.0           0.0
E.g., IPR = 0.5 means family income is half the Federal Poverty Level (FPL); IPR = 2.0 means family income is twice the FPL. For a two-category race variable (non-Hispanic white = reference category).
Here, we return to a grid we saw in the podcast on creating variables to test for interactions. It shows the values of each of the 3 variables involved in the main effects and interactions specification. For instance, non-Hispanic white infants born into families with an IPR of 0 have values of 0 for each of the three variables because they are in the reference category for race and have a value of 0 on the continuous IPR variable.
17
Calculating the value of the intercept for one group
Predicted BW = β0 + (βNHB × NHB) + (βIPR × IPR) + (βNHB_IPR × NHB_IPR)
                              NHB      IPR      NHB_IPR
Non-H white & IPR = 0.0       0        0.0      0.0
The intercept for non-Hispanic whites is calculated:
= β0 + (βNHB × 0) + (βIPR × 0.0) + (βNHB_IPR × 0.0)
= β0
Thus, the intercept for non-Hispanic white infants (when IPR = 0) collapses to include only β0 because all of the other coefficients in the formula are multiplied by a value of 0.
Let’s see what that does for our overall calculation. Here I repeat the general equation showing how to calculate predicted birth weight for a given case, based on the estimated coefficients and the values of the pertinent independent variables. We see that for NHW with IPR = 0, the equation collapses to β0 since all of the other coefficients are multiplied by 0. In other words, the intercept β0 is the predicted value of the DV (birth weight in this example) for infants in the reference category of the categorical IV and with a value of 0 for the second IV in the interaction.
18
Interpreting the IPR/birth weight pattern
IPR is a continuous variable. The coefficient is an estimate of the effect on the dependent variable of a 1-unit increase in the continuous IV, with categorical variables set to their reference category values.
So βIPR estimates the increment in birth weight for every one-unit increase in IPR (e.g., from family income at the poverty line to twice the poverty line). It is the slope of the IPR/birth weight curve for infants in the reference category, in this case, non-Hispanic white infants.
19
Calculating values for the IPR/birth weight curve for white infants
Predicted BW = β0 + (βNHB × NHB) + (βIPR × IPR) + (βNHB_IPR × NHB_IPR)
                              NHB      IPR      NHB_IPR
Non-H white & IPR = 1.5       0        1.5      0.0
= β0 + (βNHB × 0) + (βIPR × 1.5) + (βNHB_IPR × 0)
= β0 + (βIPR × 1.5)
Because non-Hispanic whites are the reference category for race, the equation collapses to include only the intercept and the IPR main effect (βIPR), since the other coefficients are multiplied by 0. In general, for non-Hispanic white infants:
= β0 + (βIPR × IPR)
To see the shape of the IPR/BW curve for NHW infants, we need to calculate the predicted value of the DV (birth weight) for several selected values of IPR. The slope of that curve will be equal to βIPR because the interaction coefficient is multiplied by 0.
20
Interpreting the race main effect
The main effect βNHB estimates the difference in birth weight between non-Hispanic black infants and those in the reference category (non-Hispanic whites), when continuous variables are set at the value 0. It is an estimate of the difference in intercept between black and white infants when IPR is 0.
21
Calculating the intercept for different values of the categorical variable
                              NHB      IPR      NHB_IPR
Non-H white & IPR = 0.0       0        0.0      0.0
As we saw a moment ago, the intercept for non-Hispanic whites is calculated:
= β0 + (βNHB × 0) + (βIPR × 0.0) + (βNHB_IPR × 0.0)
= β0
                              NHB      IPR      NHB_IPR
Non-H black & IPR = 0.0       1        0.0      0.0
For non-Hispanic blacks, the intercept is calculated:
= β0 + (βNHB × 1) + (βIPR × 0.0) + (βNHB_IPR × 0.0)
= β0 + βNHB
On this slide, we compare two infants born into families with IPR = 0, one of whom is white (top example, repeated from an earlier slide) and the other of whom is black. As before, the equation for calculating the intercept for whites = β0. For blacks, however, the intercept = β0 plus the coefficient on the NHB variable. So βNHB is the difference in predicted BW for blacks compared to whites when IPR = 0.
22
More on the race main effect
It is an estimate of the difference in intercept between black and white infants when IPR is 0.
= β0 + βNHB = 3,106 + (–177) = 2,929
In other words, black infants born to families with an IPR of zero have a predicted birth weight of 2,929 grams, or 177 grams LOWER than that of their white counterparts.
23
Calculating values for the IPR/birth weight curve for white infants
Predicted BW = β0 + (βNHB × NHB) + (βIPR × IPR) + (βNHB_IPR × NHB_IPR)
= β0 + (βNHB × 0) + (βIPR × IPR) + (βNHB_IPR × 0)
= β0 + (βIPR × IPR)
Because non-Hispanic whites are the reference category for race, the equation collapses to include only the intercept and the IPR main effect (βIPR), since the other coefficients are multiplied by 0.
To see the shape of the IPR/BW curve for NHW infants, we need to calculate the predicted value of the DV (birth weight) for several selected values of IPR. The slope of that curve will be equal to βIPR because the interaction coefficient is multiplied by 0.
24
Calculating values for the IPR birth weight curve for black infants
                              NHB      IPR      NHB_IPR
Non-H black & IPR = 1.5       1        1.5      1.5
= β0 + (βNHB × 1) + (βIPR × 1.5) + (βNHB_IPR × 1.5)
For non-Hispanic blacks, the equation includes all three terms (βNHB, βIPR, and βNHB_IPR) because each of those coefficients is multiplied by a non-zero value.
When we look to calculate the shape of the IPR/BW curve for blacks, however, three coefficients remain in the equation: the main effects coefficients for NHB and IPR, and the interaction term between them. As we saw earlier, the intercept for blacks is β0 + βNHB. The slope of the IPR/BW curve for blacks is βIPR + βNHB_IPR.
25
Interpreting the coefficient on the interaction between race and IPR
The NHB_IPR coefficient tests whether the slope of the IPR/birth weight pattern is different for non-Hispanic black infants than for their non-Hispanic white counterparts.
The slope
for blacks = βIPR + βNHB_IPR = 23 + (–5) = 18
for whites = βIPR = 23
βNHB_IPR is thus the estimated difference in slope for blacks compared to whites.
Solving for specific values of the slope and intercept for black infants
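The slope and intercept for each group follow from collecting terms in the general equation. The short derivation below is an illustration added here. It uses \hat{y} for predicted birth weight, a symbol that does not appear in the slides, and plugs in the estimated coefficients from the lecture (β0 = 3,106; βNHB = –177; βIPR = 23; βNHB_IPR = –5).

```latex
\begin{align*}
\hat{y}_{\text{black}}
  &= \beta_0 + \beta_{NHB}\,(1) + \beta_{IPR}\,IPR + \beta_{NHB\_IPR}\,(1 \times IPR) \\
  &= (\beta_0 + \beta_{NHB}) + (\beta_{IPR} + \beta_{NHB\_IPR})\,IPR \\
  &= (3106 - 177) + (23 - 5)\,IPR = 2929 + 18\,IPR \\[4pt]
\hat{y}_{\text{white}}
  &= \beta_0 + \beta_{NHB}\,(0) + \beta_{IPR}\,IPR + \beta_{NHB\_IPR}\,(0 \times IPR) \\
  &= \beta_0 + \beta_{IPR}\,IPR = 3106 + 23\,IPR
\end{align*}
```

Read this way, βNHB shifts the intercept and βNHB_IPR shifts the slope relative to the reference category.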
26
More on the race/IPR interaction
The estimated coefficients mean that each 1-unit increase in IPR is associated with 23 grams more birth weight among non-Hispanic white infants, and 18 grams more birth weight among non-Hispanic black infants. Those values are the slopes of the respective IPR/BW curves for the two racial/ethnic groups.
27
Preparing to graph the slope of IPR/birth weight by race
For infants in the reference category (non-Hispanic white), multiply selected values of IPR by βIPR and add to β0 to obtain predicted birth weight at interesting values of IPR.
For non-Hispanic black infants, multiply selected values of IPR by βIPR + βNHB_IPR, then add to β0 + βNHB.
28
Calculated birth weight by race for selected values of IPR
IPR (family income in multiples of the FPL), with the formula and result for each group:
IPR = 0
  Non-Hispanic white: = β0 + 0 × βIPR = 3,106 + 0 × 23 = 3,106
  Non-Hispanic black: = β0 + βNHB + 0 × (βIPR + βNHB_IPR) = 3,106 – 177 + 0 × (23 – 5) = 2,929
IPR = 1
  Non-Hispanic white: = β0 + 1 × βIPR = 3,106 + 1 × 23 = 3,106 + 23 = 3,129
  Non-Hispanic black: = β0 + βNHB + 1 × (βIPR + βNHB_IPR) = 3,106 – 177 + 1 × (23 – 5) = 2,929 + 1 × (18) = 2,929 + 18 = 2,947
…
IPR = 6
  Non-Hispanic white: = β0 + 6 × βIPR = 3,106 + 6 × 23 = 3,106 + 138 = 3,244
  Non-Hispanic black: = β0 + βNHB + 6 × (βIPR + βNHB_IPR) = 3,106 – 177 + 6 × (23 – 5) = 2,929 + 6 × (18) = 2,929 + 108 = 3,037
β0 = 3,106; βIPR = 23; βNHB = –177; βNHB_IPR = –5
Here is a grid to summarize the terms involved in calculating birth weight by race for selected values of IPR, which are shown in the leftmost column. The middle two columns are for non-Hispanic white infants, with the left column for the formula with coefficients and values typed in, and the right column for the numeric result. The right two columns are for non-Hispanic black infants, again with a column for the formula and one for the result. The coefficients β0, βIPR, βNHB, and βNHB_IPR are color coded as shown below the table. Walk through the formulas: IPR = 0 gives the intercept for whites; IPR = 0 gives the intercept for blacks; then IPR = 1 and IPR = 6.
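The grid can be reproduced programmatically by first collapsing the coefficients into an intercept and a slope for each group, then evaluating each line at the selected IPR values. The sketch below is an illustration rather than the author's method. The constant names and the helper `line_for_group` are assumptions introduced here.

```python
# Estimated coefficients from the lecture (birth weight in grams).
B0, B_IPR, B_NHB, B_NHB_IPR = 3106, 23, -177, -5

def line_for_group(nhb):
    """Return (intercept, slope) of the IPR/birth weight line for NHB = 0 or 1."""
    intercept = B0 + B_NHB * nhb        # beta0, plus beta_NHB for black infants
    slope = B_IPR + B_NHB_IPR * nhb     # beta_IPR, plus beta_NHB_IPR for black infants
    return intercept, slope

white_int, white_slope = line_for_group(0)
black_int, black_slope = line_for_group(1)

rows = []
for ipr in (0, 1, 6):
    rows.append((ipr, white_int + white_slope * ipr, black_int + black_slope * ipr))

for ipr, white_bw, black_bw in rows:
    print(f"IPR = {ipr}: non-Hispanic white {white_bw:,} g, non-Hispanic black {black_bw:,} g")
```

Collapsing to an intercept and slope per group mirrors the two-step description on the preceding slide: multiply IPR by the group's slope, then add the group's intercept.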
29
Use a spreadsheet to calculate and graph the interaction
Spreadsheets can store the estimated coefficients, the input values of the independent variables, and the correct generalized formula to calculate the predicted values for many combinations of the IVs involved in the interaction. They can also graph the overall pattern.
See spreadsheet template and voice-over explanation.
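For readers who prefer a script to a hand-built worksheet, the same three ingredients (coefficients, input values, and the generalized formula) can be used to generate a CSV file that any spreadsheet program can open and chart. This is a hedged sketch using only the Python standard library. It is not the spreadsheet template mentioned above, and the column names and the IPR range of 0 to 6 are assumptions chosen for illustration.

```python
import csv
import io

# Stored coefficients (from the lecture's table of estimates).
COEF = {"b0": 3106, "b_nhb": -177, "b_ipr": 23, "b_nhb_ipr": -5}

def predicted_bw(nhb, ipr):
    """Generalized formula: intercept + main effects + interaction term."""
    return (COEF["b0"] + COEF["b_nhb"] * nhb + COEF["b_ipr"] * ipr
            + COEF["b_nhb_ipr"] * nhb * ipr)

def build_grid(ipr_values):
    """One row per IPR value: (IPR, predicted BW for NHW, predicted BW for NHB)."""
    return [(ipr, predicted_bw(0, ipr), predicted_bw(1, ipr)) for ipr in ipr_values]

def to_csv(rows):
    """Serialize the grid as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["IPR", "BW_non_Hispanic_white", "BW_non_Hispanic_black"])
    writer.writerows(rows)
    return buffer.getvalue()

grid = build_grid(range(0, 7))   # input values of the continuous IV
csv_text = to_csv(grid)
print(csv_text)
```

Saving `csv_text` to a file and plotting the two birth weight columns against IPR gives one line per racial/ethnic group.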
30
Predicted birth weight by race/ethnicity and IPR
[Figure: line chart of predicted birth weight in grams (y-axis, 2,800 to 3,300) by IPR (x-axis), with a solid white line for non-Hispanic white infants and a dashed yellow line for non-Hispanic black infants.]
Annotations on the chart:
Intercept for ref cat* = β0 = 3,106 = predicted BW for ref cat
Slope of IPR/BW curve for ref cat* = βIPR = 23
Intercept for black infants = β0 + βNHB = 3,106 + (–177) = 2,929
Slope of IPR/BW curve for non-Hispanic black infants = βIPR + βNHB_IPR = 23 – 5 = 18
* Ref cat = Reference category = non-Hispanic white infants.
Graphically, here is the pattern that emerges from the estimated coefficients we’ve been working with. IPR is on the x-axis and birth weight is on the y-axis. Start with whites, who are the reference category and so involve fewer coefficients. Intercept for whites = β0. Slope for whites = βIPR; the line slopes upward because the coefficient is positive. Intercept for blacks = β0 + βNHB; it is lower than the intercept for whites because βNHB < 0. Slope for blacks = βIPR + βNHB_IPR; it is shallower than for whites because the coefficient on NHB_IPR is negative.
31
Overall shape of the race/IPR/ birth weight pattern
Based on this set of βs, black infants have (1) a lower birth weight than whites at all IPR levels, because the negative coefficient on the NHB main effect yields a lower intercept for blacks than for whites, and (2) a slower rate of birth weight increase as IPR rises, because the negative coefficient on NHB_IPR yields a shallower slope of the IPR/birth weight curve for blacks than for whites.
Thus the deficit in birth weight for blacks widens with increasing IPR.
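The widening deficit can be checked directly. Subtracting the equation for non-Hispanic white infants from the one for non-Hispanic black infants cancels β0 and βIPR, leaving a gap of βNHB + βNHB_IPR × IPR. The sketch below evaluates that expression with the lecture's estimates. The function name `black_white_gap` is introduced here for illustration.

```python
# Only the race main effect and the interaction coefficient survive the subtraction.
B_NHB, B_NHB_IPR = -177, -5

def black_white_gap(ipr):
    """Predicted BW for non-Hispanic black minus non-Hispanic white infants, in grams."""
    return B_NHB + B_NHB_IPR * ipr

for ipr in (0, 1, 6):
    print(f"IPR = {ipr}: gap = {black_white_gap(ipr)} g")
```

Because both coefficients are negative, the gap is negative at every non-negative IPR and grows by 5 grams for each additional unit of IPR.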
32
Summary An interaction between a continuous and a categorical independent variable will yield differences in the intercept and/or slope of the association between the continuous IV and the DV. Calculating the overall shape of an interaction requires adding together the pertinent main effects and interaction term βs for combinations of the categorical IV and selected values of the continuous IV in the interaction. A spreadsheet can be helpful for storing and organizing the βs, input values, and formulas.
33
Be parsimonious in deciding which interactions to test
The number of variables in the regression model proliferates rapidly with each additional interaction. Specify interactions only between key independent variables. Communicating results becomes unwieldy: Considerable behind-the-scenes calculations. Extra tables or charts to convey the shape of the interaction.
34
Suggested resources
Chapter 16, Miller, J. E. The Chicago Guide to Writing about Multivariate Analysis, 2nd Edition.
Chapters 8 and 9 of Cohen et al. Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences, 3rd Edition. Florence, KY: Routledge.
35
Suggested online resources
Podcasts on: creating charts to present interactions; writing prose to present results of interactions; introduction to testing statistical significance of interactions; approaches to testing statistical significance of interactions; using simple slopes for compound coefficients; using alternative reference categories to test contrasts within interactions.
36
Suggested exercises Study guide to The Chicago Guide to Writing about Multivariate Analysis, 2nd Edition. Problem set for chapter 16 Suggested course extensions for chapter 16
37
Contact information Jane E. Miller, PhD Online materials available at The Chicago Guide to Writing about Multivariate Analysis, 2nd Edition.