Multivariate Relationships
Goal: Show a causal relationship between two variables (X → Y).
Elements of a cause-and-effect relationship:
  Association between the variables (assessed with the methods we've covered this semester)
  Correct time order (X occurs before Y)
  Elimination of alternative explanations (for example, a variable Z that acts on both X and Y, making them appear to be associated)
Anecdotal evidence does not rule out causality: a few contrary cases cannot disprove a relationship that holds on average.
Controlling for Other Variables
Observational studies: Researchers cannot assign the levels of variables and can only observe them as they occur in nature.
Statistical control: Identifying individuals (cases) by their level of an alternative explanatory (control) variable, without assigning subjects to those levels, and examining the association of interest within each level.
Spurious association: Both variables of interest depend on a third variable, and their association vanishes when that variable is controlled.
  Diagram: X2 → X1 and X2 → Y (no arrow between X1 and Y)
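A spurious association can be illustrated with simulated data. The sketch below is only an illustration under stated assumptions: the data are generated, not observed, X2 is binary, and the effect sizes, sample size, and seed are arbitrary choices. X2 drives both X1 and Y, and X1 has no effect on Y.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Simulated (not real) data: X2 drives both X1 and Y; X1 has no effect on Y.
x2 = [i % 2 for i in range(4000)]
x1 = [2 * z + rng.gauss(0, 1) for z in x2]
y = [2 * z + rng.gauss(0, 1) for z in x2]

# Association when X2 is ignored.
r_overall = pearson_r(x1, y)

# Statistical control: recompute the association within each level of X2.
r_within = {}
for level in (0, 1):
    idx = [i for i, z in enumerate(x2) if z == level]
    r_within[level] = pearson_r([x1[i] for i in idx], [y[i] for i in idx])

print(f"ignoring X2: r = {r_overall:.2f}")
for level, r in r_within.items():
    print(f"within X2 = {level}: r = {r:.2f}")
```

Under these assumptions the overall correlation should come out clearly positive (about 0.5 in theory), while the correlation within each level of X2 should be close to zero.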
Controlling for Other Explanatory Variables
Categorical (qualitative) variables:
  Partial tables: contingency tables showing the X1-Y relationship separately for each level of X2
Numeric (quantitative) variables:
  Mean and standard deviation of the responses (Y) by group (X1), separately for each level of X2
  Regression of Y on X1, controlling for the level of X2 (multiple regression)
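Partial tables can be built by filtering the cases on the control variable before tabulating. The sketch below uses hypothetical counts invented for illustration; the category labels ("high"/"low", "urban"/"rural", "yes"/"no") and the helper prop_yes are assumptions, not part of the course material.

```python
from collections import Counter

# Hypothetical counts, invented for illustration: (x1, x2, y) -> frequency.
counts = Counter({
    ("high", "urban", "yes"): 40, ("high", "urban", "no"): 10,
    ("low", "urban", "yes"): 16, ("low", "urban", "no"): 4,
    ("high", "rural", "yes"): 4, ("high", "rural", "no"): 16,
    ("low", "rural", "yes"): 10, ("low", "rural", "no"): 40,
})


def prop_yes(x1_level, x2_level=None):
    """Proportion with y == 'yes' among cases at x1_level.

    With x2_level=None the control variable is ignored (marginal table);
    otherwise only cases at that level of X2 are used (partial table).
    """
    yes = total = 0
    for (x1, x2, y), n in counts.items():
        if x1 != x1_level or (x2_level is not None and x2 != x2_level):
            continue
        total += n
        if y == "yes":
            yes += n
    return yes / total


print("marginal:", {x1: round(prop_yes(x1), 3) for x1 in ("high", "low")})
for x2 in ("urban", "rural"):
    print(x2 + ":", {x1: round(prop_yes(x1, x2), 3) for x1 in ("high", "low")})
```

In these made-up counts the marginal table shows a higher "yes" proportion for X1 = high than for X1 = low (44/70 versus 26/70), but within each level of X2 the two proportions are equal, so the X1-Y association disappears once X2 is controlled.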
Types of Multivariate Relationships
Chain relationships: X1 leads to changes in (causes) X2, which in turn leads to changes in (causes) Y. X1 has an indirect effect on Y through the intervening variable X2. The X1-Y association vanishes after controlling for X2.
  Diagram: X1 → X2 → Y
Multiple causes: X1 and X2 each have a direct effect on Y. A variable can also have both direct and indirect effects, for example X1 acting on Y directly and through X2.
  Diagrams: X1 → Y ← X2 (direct effects only); X1 → Y together with X1 → X2 → Y (direct and indirect effects)
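A chain relationship can be illustrated by simulation. The sketch below assumes quantitative variables and controls for X2 with the first-order partial correlation, which is one possible way to control for a quantitative variable and is not a method named on these slides. The data are generated, and the coefficients, sample size, and seed are arbitrary.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def partial_r(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y, controlling for z."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


rng = random.Random(1)  # fixed seed so the sketch is reproducible
n = 5000

# Simulated chain: X1 -> X2 -> Y, with no direct X1 -> Y effect.
x1 = [rng.gauss(0, 1) for _ in range(n)]
x2 = [a + rng.gauss(0, 1) for a in x1]
y = [b + rng.gauss(0, 1) for b in x2]

r_x1_y = pearson_r(x1, y)
r_partial = partial_r(r_x1_y, pearson_r(x1, x2), pearson_r(y, x2))

print(f"X1-Y correlation ignoring X2:    {r_x1_y:.2f}")
print(f"X1-Y correlation controlling X2: {r_partial:.2f}")
```

With this setup the X1-Y correlation should be sizeable when X2 is ignored and near zero once X2 is controlled, which is the pattern expected when X1 affects Y only through X2.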
Types of Multivariate Relationships
Suppressor variables: No association appears between X1 and Y until we control for X2.
Statistical interaction: The association between X1 and Y depends on the level of X2.
  Diagram: X2 acts on the X1 → Y relationship
Simpson’s paradox: The direction of the X1-Y association at every level of X2 is opposite to the direction of the association when X2 is not controlled.
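Simpson's paradox is easiest to see with counts. The numbers below are hypothetical and chosen only to produce the reversal; the group labels "A" and "B" (levels of X1) and "low" and "high" (levels of X2) are assumptions made for the sketch.

```python
# Hypothetical (successes, trials) counts, invented for illustration.
# Keys: level of X2 -> group of X1 -> (successes, trials).
data = {
    "low": {"A": (90, 100), "B": (800, 1000)},
    "high": {"A": (300, 1000), "B": (20, 100)},
}


def rate(successes, trials):
    """Success proportion."""
    return successes / trials


# Partial association: success rate of A minus B within each level of X2.
within_diff = {
    level: rate(*groups["A"]) - rate(*groups["B"])
    for level, groups in data.items()
}

# Marginal association: pool the counts over X2 before comparing A and B.
pooled = {
    g: tuple(sum(data[level][g][k] for level in data) for k in (0, 1))
    for g in ("A", "B")
}
overall_diff = rate(*pooled["A"]) - rate(*pooled["B"])

for level, diff in within_diff.items():
    print(f"X2 = {level}: A - B = {diff:+.2f}")
print(f"ignoring X2: A - B = {overall_diff:+.2f}")
```

Here group A has the higher success rate at both levels of X2, yet the lower rate when X2 is ignored, because most A cases fall at the level of X2 where success is rare.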
Other Inferential Issues
Sample size: When controlling for X2, the sample size within each level can be quite small, so the X1-Y association may not reach statistical significance (lack of power).
Categorization: When X2 is quantitative, there can be many partial tables/associations, each with few observations. Multiple regression models help avoid this problem.
Comparing measures: Often we wish to compare estimates of a parameter across levels of the control variable. A 2-sample z-test (ch. 7) can be used.
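The comparison of two estimates across levels of the control variable can be sketched as a two-sample z statistic, z = (est1 - est2) / sqrt(se1^2 + se2^2). The sketch below assumes independent samples at the two levels and approximately normal estimates; the function name two_sample_z and the input values are hypothetical.

```python
from math import erfc, sqrt


def two_sample_z(est1, se1, est2, se2):
    """z statistic and two-sided p-value for H0: the two parameters are equal.

    Assumes independent samples and approximately normal estimates.
    """
    z = (est1 - est2) / sqrt(se1 ** 2 + se2 ** 2)
    p_value = erfc(abs(z) / sqrt(2))  # two-sided standard normal tail area
    return z, p_value


# Hypothetical estimates of the same X1-Y measure at two levels of X2.
z, p = two_sample_z(est1=0.10, se1=0.03, est2=0.02, se2=0.04)
print(f"z = {z:.2f}, two-sided p = {p:.3f}")
```

With these made-up inputs the standard error of the difference is 0.05, so z is 1.6 and the difference between the two levels would not be significant at the 0.05 level.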