Multivariate Relationships
Goal: Show a causal relationship between two variables (X → Y)
Elements of a cause-and-effect relationship:
–Association between the variables (based on methods we’ve covered this semester)
–Correct time order (X occurs before Y)
–Elimination of alternative explanations (a variable Z that acts on both X and Y, making them appear to be associated)
Note: Anecdotal evidence alone does not rule out causality
Controlling for Other Variables
Observational studies: Researchers cannot control the levels of variables and can only observe them as they occur in nature
Statistical Control: Grouping individuals (cases) by their level of an alternative explanatory (control) variable, without assigning subjects to those levels
Spurious Association: Both variables of interest depend on a third variable, and their association vanishes when controlling for that third variable
Diagram: X2 → X1 and X2 → Y
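The spurious-association idea can be checked on simulated data. The sketch below is only an illustration under stated assumptions: the data are randomly generated (not from any study), X2 is made to drive both X1 and Y, and the first-order partial correlation is used as one possible way to "control for" X2. The helper names `pearson` and `partial_corr` are hypothetical.

```python
import random
from math import sqrt

def pearson(a, b):
    """Sample Pearson correlation between two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    s_ab = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    s_aa = sum((u - mean_a) ** 2 for u in a)
    s_bb = sum((v - mean_b) ** 2 for v in b)
    return s_ab / sqrt(s_aa * s_bb)

def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y, controlling for z."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Simulated (assumed) data: X2 causes both X1 and Y; X1 has no effect on Y.
random.seed(1)
n = 5000
x2 = [random.gauss(0, 1) for _ in range(n)]
x1 = [z + random.gauss(0, 1) for z in x2]
y = [z + random.gauss(0, 1) for z in x2]

r_x1y = pearson(x1, y)            # clearly nonzero (about 0.5 by construction)
r_x1x2 = pearson(x1, x2)
r_yx2 = pearson(y, x2)
r_partial = partial_corr(r_x1y, r_x1x2, r_yx2)  # close to zero

print(f"X1-Y correlation ignoring X2:     {r_x1y:.3f}")
print(f"X1-Y correlation controlling X2:  {r_partial:.3f}")
```

In this setup the raw X1-Y correlation is sizable even though X1 never enters the equation for Y; once X2 is controlled, the association essentially disappears.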
Controlling for Other Explanatory Variables
Categorical (Qualitative) Variables:
–Partial Tables: Contingency tables showing the X1-Y relationship, separately for each level of X2
Numeric (Quantitative) Variables:
–Mean and standard deviation of responses (Y) by group (X1), separately for each level of X2
–Regression of Y on X1, controlling for the level of X2 (Multiple Regression)
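As a rough illustration of partial tables, the sketch below splits a list of (X1, Y, X2) records into one X1-by-Y table per level of X2 and reports the proportion of each Y category within each X1 row. The records are invented purely for illustration, and the function names `partial_tables` and `row_proportions` are hypothetical.

```python
from collections import Counter

# Invented records for illustration: each tuple is (x1, y, x2).
records = (
    [("a", "yes", "low")] * 30 + [("a", "no", "low")] * 10
    + [("b", "yes", "low")] * 15 + [("b", "no", "low")] * 5
    + [("a", "yes", "high")] * 5 + [("a", "no", "high")] * 15
    + [("b", "yes", "high")] * 10 + [("b", "no", "high")] * 30
)

def partial_tables(rows):
    """Return {x2_level: Counter({(x1, y): count})}, one table per X2 level."""
    tables = {}
    for x1, y, x2 in rows:
        tables.setdefault(x2, Counter())[(x1, y)] += 1
    return tables

def row_proportions(table):
    """Proportion of each y category within each x1 row of one table."""
    row_totals = Counter()
    for (x1, _y), count in table.items():
        row_totals[x1] += count
    return {(x1, y): count / row_totals[x1] for (x1, y), count in table.items()}

tables = partial_tables(records)
props = {level: row_proportions(table) for level, table in tables.items()}

# Marginal table: the same records with X2 ignored.
marginal = row_proportions(Counter((x1, y) for x1, y, _x2 in records))

for level in tables:
    print(level, {k: round(v, 3) for k, v in sorted(props[level].items())})
print("marginal", {k: round(v, 3) for k, v in sorted(marginal.items())})
```

With these made-up counts the "yes" proportion is identical for groups a and b within each level of X2, yet differs in the marginal table, which is the pattern partial tables are meant to expose.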
Types of Multivariate Relationships
Chain Relationships: X1 leads to changes in (causes) X2, which in turn leads to changes in (causes) Y. X1 has an indirect effect on Y through the intervening variable X2. The X1-Y association vanishes after controlling for X2
Diagram: X1 → X2 → Y
Multiple Causes: X1 and X2 each have a direct effect on Y. A variable can also have both direct and indirect effects (for example, X1 affecting Y directly and also through X2)
Diagrams: (a) X1 → Y ← X2; (b) X1 → X2 → Y together with X1 → Y
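A chain relationship can be sketched with a small regression comparison. The example below is a hedged illustration, not a prescribed method: the data are simulated so that X1 affects Y only through X2, the coefficients 1.5 and 2.0 are arbitrary assumptions, and the two-predictor least-squares slopes are computed from the closed-form normal equations. The function names are hypothetical.

```python
import random

def cross(a, b):
    """Sum of cross-products of deviations from the means."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))

def simple_slope(x, y):
    """Least-squares slope from regressing y on x alone."""
    return cross(x, y) / cross(x, x)

def two_predictor_slopes(x1, x2, y):
    """Least-squares slopes (b1, b2) from regressing y on x1 and x2 together."""
    s11, s22, s12 = cross(x1, x1), cross(x2, x2), cross(x1, x2)
    s1y, s2y = cross(x1, y), cross(x2, y)
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return b1, b2

# Simulated (assumed) chain: X1 -> X2 -> Y, with no direct X1 -> Y path.
random.seed(2)
n = 5000
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [1.5 * a + random.gauss(0, 1) for a in x1]
y = [2.0 * b + random.gauss(0, 1) for b in x2]

slope_alone = simple_slope(x1, y)          # near 1.5 * 2.0 = 3.0
b1, b2 = two_predictor_slopes(x1, x2, y)   # b1 near 0, b2 near 2.0

print(f"Slope of Y on X1 alone:            {slope_alone:.3f}")
print(f"Slope of Y on X1 controlling X2:   {b1:.3f}")
print(f"Slope of Y on X2 controlling X1:   {b2:.3f}")
```

The slope on X1 alone reflects the full indirect effect; with X2 in the model it drops to roughly zero, matching the statement that the X1-Y association vanishes after controlling for the intervening variable.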
Types of Multivariate Relationships
Suppressor Variables: No association appears between X1 and Y until we control for X2
Statistical Interaction: The statistical association between X1 and Y depends on the level of X2
Diagram: X2 acts on the X1 → Y relationship
Simpson’s Paradox: The direction of the association between Y and X1 at every level of X2 is the opposite of the direction observed when X2 is not controlled
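Simpson's paradox is easiest to see with a small numeric example. The counts below are invented for illustration only: group "a" has the higher success proportion at each level of X2, yet the lower proportion once the levels are pooled, because the two groups are spread very differently across the levels.

```python
# Invented counts for illustration: (successes, trials) by X2 level and X1 group.
counts = {
    "level 1": {"a": (90, 100), "b": (800, 1000)},
    "level 2": {"a": (300, 1000), "b": (20, 100)},
}

def proportion(successes, trials):
    """Success proportion for one cell."""
    return successes / trials

# Difference in success proportions (a minus b) within each level of X2.
within = {
    level: proportion(*groups["a"]) - proportion(*groups["b"])
    for level, groups in counts.items()
}

def pooled(group):
    """Success proportion for one group after collapsing over X2."""
    successes = sum(counts[level][group][0] for level in counts)
    trials = sum(counts[level][group][1] for level in counts)
    return successes / trials

# Difference in success proportions (a minus b) ignoring X2.
marginal_diff = pooled("a") - pooled("b")

for level, diff in within.items():
    print(f"{level}: a - b = {diff:+.3f}")
print(f"ignoring X2: a - b = {marginal_diff:+.3f}")
```

Here the within-level differences are both positive while the pooled difference is negative, so the direction of the association reverses when X2 is ignored.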
Other Inferential Issues
Sample Size: When controlling for X2, the sample size within each level can be quite small, so the X1-Y association may fail to reach statistical significance (lack of power)
Categorization: When X2 is quantitative, there can be many partial tables/associations, each with few observations. Multiple regression models help avoid this problem
Comparing Measures: Often we wish to compare estimates of a parameter across levels of the control variable. A 2-sample z-test (ch. 7) can be used
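One common form of the 2-sample z comparison divides the difference between two estimates by the square root of the sum of their squared standard errors. The sketch below assumes the two levels of the control variable are independent samples and large enough for the normal approximation; the estimates and standard errors are made-up numbers, and `compare_estimates` is a hypothetical helper name.

```python
from math import erfc, sqrt

def compare_estimates(est1, se1, est2, se2):
    """Two-sample z statistic and two-sided p-value for est1 - est2.

    Assumes independent samples and approximately normal estimates.
    """
    z = (est1 - est2) / sqrt(se1 ** 2 + se2 ** 2)
    p_two_sided = erfc(abs(z) / sqrt(2))  # equals 2 * P(Z > |z|)
    return z, p_two_sided

# Made-up estimates of the X1-Y association at two levels of X2.
z, p = compare_estimates(est1=0.30, se1=0.03, est2=0.20, se2=0.04)
print(f"z = {z:.3f}, two-sided p = {p:.4f}")
```

A small p-value here would suggest that the X1-Y association differs across levels of X2, which is evidence of statistical interaction.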