Path Analysis and Structured Linear Equations
  Biologists are interested in complex phenomena
  Entails hypothesis testing
    – Deriving causal linkages between interacting systems
  Simple linear causal relations are often not realistic
  Unknown and possibly reticulate correlations among variables
  Predictor A, Intermediate B, Response C
    – Numerous possible interactions
    – Correlations among variables with differing magnitudes
Path Analysis
  What tools are available to ecologists and evolutionary biologists for analyzing systems with multiple causality?
  Multiple Regression?
  Path Analysis
    – Increasingly common
  The two methods are related
    – Use the former to estimate the latter
Goals of Path Analysis
  Hypothesis Testing
  Exploratory Data Analysis
Origins of Path Analysis
  Developed by Sewall Wright
    – Formulated in a series of papers published in 1918, 1921, 1934, 1960
  Derived to partition direct and indirect relationships among variables
  Path Analysis deals with dependency relationships among variables
  Key is that the investigator specifies the order of dependency
Mechanics of Path Analysis
  Derive a model of dependency
  Partition relationships among the different pathways
  Not necessarily a simultaneous method
  Originally did not include overall tests of model fit to the data
  Recently Path Analysis has been superseded by SEM
    – Structural Equation Modelling
Meaning of Path Models
  Path Models are presumed to represent causal hypotheses
  A significant path model does not imply causality
    – Rather, one can use the model to test for causality with experimental data or in a confirmatory model with additional data
Indirect and Direct Effects
  Two ways that a predictor variable may affect a response variable
  First, there is a direct effect of variable $x_1$ on $y$
    – I.e., $x_1 \to y$
  Second, there is an indirect effect of variable $x_1$ on $y$ through another correlated predictor variable
General Path Model
  [Path diagram: variables $X_i$, $Z$, $Y_j$; unexplained terms $U_1$, $U_2$; path coefficients $p_1$ through $p_7$]
Elaboration of the Path Model
  Path coefficients are designated by $p_i$
  Unexplained variation is given by $U$
  Correlations are designated by $r_i$
  Correlations are shown by double arrows
  Paths are shown by single arrows
  Negative paths are traditionally designated with dashed lines
Estimation of Path Coefficients
  Typically use Multiple Regression to estimate path coefficients
    – Either standardize the x and y variables and then run the regression, or
    – Request the output of standardized partial regression coefficients
  Decomposition of Correlations
  Factor Analysis
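The first estimation route (standardize, then regress) can be sketched as follows. This is a minimal illustration rather than a prescribed workflow: the function name `standardized_paths` and the simulated variables `x1`, `x2`, `y` are assumptions made up for the example, and NumPy's least-squares solver stands in for whatever regression software is actually used.

```python
import numpy as np

def standardized_paths(X, y):
    """Estimate path coefficients as standardized partial regression
    coefficients of y on the columns of X (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Convert every predictor and the response to z-scores.
    Zx = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    zy = (y - y.mean()) / y.std(ddof=1)
    # All variables are centered, so no intercept column is needed.
    coefs, *_ = np.linalg.lstsq(Zx, zy, rcond=None)
    return coefs

# Simulated data (illustrative only): two correlated predictors, one response.
rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)
y = 0.6 * x1 + 0.3 * x2 + rng.normal(size=n)

paths = standardized_paths(np.column_stack([x1, x2]), y)
print("path x1 -> y:", paths[0])
print("path x2 -> y:", paths[1])
```

Because the variables are standardized first, the fitted coefficients are on a common scale and can be compared directly as paths.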
Assumptions of Path Model
  Assume linear and additive relationships
    – Excludes curvilinear and multiplicative models
  Error terms are uncorrelated with one another
  Recursive models only: one-way causal flows
  Observed variables are measured without error
  Model is correctly specified
    – All causal determinants are properly included in the model
    – If causal variables are excluded, it is because they are independent of those that were included
Path Coefficients
  Can compute from simple correlations
    – For one x and one y, the path is: $p_{YX} = r_{XY}$
    – For two x variables and a single y:
      $Y_1 = p_{Y_1X_1} x_1 + p_{Y_1X_2} x_2 + e_{Y_1}$
      $r_{X_1Y_1} = p_{Y_1X_1} + p_{Y_1X_2} r_{X_1X_2}$
  This shows that the correlation between x and y has a direct and an indirect component
  Residual is given by $1 - R^2_{y_i \cdot jkl \ldots p}$
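The split of a correlation into direct and indirect components can be checked numerically. The sketch below assumes the two-predictor case and uses made-up correlation values chosen only for illustration. The helper name `paths_from_correlations` is hypothetical. It solves the two decomposition equations for the paths, then confirms that the direct and indirect parts add back up to the observed correlation.

```python
import numpy as np

def paths_from_correlations(r_x1y, r_x2y, r_x1x2):
    """Solve the two correlation-decomposition equations
    r_x1y = p1 + p2 * r_x1x2 and r_x2y = p2 + p1 * r_x1x2
    for the path coefficients p1 and p2 (hypothetical helper)."""
    Rxx = np.array([[1.0, r_x1x2],
                    [r_x1x2, 1.0]])
    rxy = np.array([r_x1y, r_x2y])
    return np.linalg.solve(Rxx, rxy)

# Made-up correlations, for illustration only.
r_x1y, r_x2y, r_x1x2 = 0.6, 0.5, 0.4

p1, p2 = paths_from_correlations(r_x1y, r_x2y, r_x1x2)

direct = p1                 # x1 -> y
indirect = p2 * r_x1x2      # x1 <-> x2 -> y
r_squared = p1 * r_x1y + p2 * r_x2y
residual = 1.0 - r_squared  # unexplained variation, 1 - R^2

print("direct:", direct, "indirect:", indirect)
print("direct + indirect:", direct + indirect, "observed r:", r_x1y)
print("residual (1 - R^2):", residual)
```

Solving the system through the predictor correlation matrix gives the same coefficients a standardized multiple regression would, which is why regression is the usual estimation route.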
Dark Side of Path Analysis
  Collinearity
  Unstable beta weights (paths)
  Incompletely specified path models
  Use of categorical variables in paths
  Low sample size
Path Analysis of Morphology and Performance
  Morphological variables from juvenile Urosaurus ornatus
  Performance variables
    – Initial Velocity
    – Maximum Velocity
    – Stride Length
    – Stride Frequency