Structural Equation Models: The General Case


Structural Equation Models: The General Case STA431: Spring 2015 See last slide for copyright information

An Extension of Multiple Regression
- More than one regression-like equation
- Includes latent variables
- Variables can be explanatory in one equation and response in another
- Modest changes in notation
- Vocabulary
- Path diagrams
- No intercepts; all expected values zero
- Serious modeling (compared to ordinary statistical models)
- Parameter identifiability

Variables can be response in one equation and explanatory in another

Variables (IQ = Intelligence Quotient):
- X1 = Mother's adult IQ
- X2 = Father's adult IQ
- Y1 = Person's adult IQ
- Y2 = Child's IQ in Grade 8

Of course all these variables are measured with error. We will lose the intercepts very soon.

Exercise: Make a path diagram for

\begin{eqnarray*}
Y_1 &=& \alpha_1 + \gamma_1 X_1 + \gamma_2 X_2 + \epsilon_1 \\
Y_2 &=& \alpha_2 + \beta Y_1 + \epsilon_2
\end{eqnarray*}

(Speaker notes: What about the other parent? Education might be better.)
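Below is a minimal simulation sketch of this two-equation system in Python (my own illustration; the coefficients, intercepts, and error standard deviations are made-up values, not estimates from real data):

import numpy as np

rng = np.random.default_rng(431)
n = 10_000

# Hypothetical parameter values, for illustration only
gamma1, gamma2, beta = 0.4, 0.4, 0.7

X1 = rng.normal(100, 15, n)   # mother's adult IQ
X2 = rng.normal(100, 15, n)   # father's adult IQ
eps1 = rng.normal(0, 8, n)
eps2 = rng.normal(0, 8, n)

# Y1 is a response variable in the first equation ...
Y1 = 20 + gamma1 * X1 + gamma2 * X2 + eps1
# ... and an explanatory variable in the second
Y2 = 15 + beta * Y1 + eps2

print(np.corrcoef([X1, X2, Y1, Y2]).round(2))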

Modest changes in notation
- Regression coefficients are now called gamma instead of beta.
- Betas are used for links between Y variables.
- Intercepts are alphas, but they will soon disappear.
- We feel free to drop the subscript i; implicitly, everything is independent and identically distributed for i = 1, …, n.

Vocabulary
Variables can be latent or manifest. Manifest means observable; all error terms are latent.
Variables can be exogenous or endogenous:
- Exogenous variables appear only on the right side of the = sign. Think "X" for explanatory variable. All error terms are exogenous.
- Endogenous variables appear on the left of at least one = sign. Think "end" of an arrow pointing from exogenous to endogenous.
- Betas link endogenous variables to other endogenous variables.

Path diagrams
This is the IQ example, but with an implausible lack of covariance between the error terms.

Path Diagram Rules
- Latent variables are enclosed by ovals.
- Observable (manifest) variables are enclosed by rectangles.
- Error terms are not enclosed. Sometimes the arrows from the error terms seem to come from nowhere; the symbol for the error term does not appear in the path diagram. Sometimes there are no arrows for the error terms at all; it is just assumed that such an arrow points to each endogenous variable.
- Straight, single-headed arrows point from each variable on the right side of an equation to the endogenous variable on the left side. Sometimes the coefficient is written on the arrow, but sometimes it is not.
- A curved, double-headed arrow between two variables (always exogenous variables) means they have a non-zero covariance. Sometimes the symbol for the covariance is written on the curved arrow, but sometimes it is not.

Causal Modeling (cause and effect)
The arrows deliberately imply that if A → B, we are saying A contributes to B, or partly causes it. There may be other contributing variables. All the ones that are unknown are lumped together in the error term. Are these unknown variables independent of the variables in the model? Probably not. Sometimes we can get around the problem with instrumental variables.

Serious Modeling
Once you accept that model equations are statements about what contributes to what, you realize that structural equation models represent a rough theory of the data, with some parts (the parameter values) unknown. They are somewhere between ordinary statistical models, which are like one-size-fits-all clothing, and true scientific models, which are like tailor-made clothing. So they are very flexible and potentially valuable. It is good to combine what the data can tell you with what you already know. But structural equation models can require a lot of input and careful thought to construct. In this course, we will get by mostly on common sense. In general, the parameters of the most reasonable model need not be identifiable; this depends on the form of the data as well as on the model. Identifiability needs to be checked, and frequently this can be done by inspection.

Example: Halo Effects in Real Estate
(Think of the halo in religious art. The notation in this example is older.) Reported service and reported satisfaction are correlated. Is it because true (perceived) service causes satisfaction, or is it a halo effect? You just can't tell.

Losing the intercepts and expected values
Mostly, the intercepts and expected values are not identifiable anyway, as in multiple regression with measurement error. We do have a chance to identify a function of the parameter vector: the parameters that appear in the covariance matrix Σ = V(D). So re-parameterize. The new parameter vector consists of the parameters in Σ, together with μ = E(D). Estimate μ with x̄, forget it, and concentrate on inference for the parameters in Σ. To make calculation of the covariance matrix easier, write the model equations in centered form. (The little letters c over the centered variables are invisible in these slides.)
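In practice, "estimate μ with x̄ and forget it" looks like the following sketch in Python (the data matrix here is hypothetical):

import numpy as np

rng = np.random.default_rng(0)
D = rng.normal(loc=5.0, size=(500, 3))   # hypothetical n x k data matrix

mu_hat = D.mean(axis=0)   # estimate mu = E(D) with the sample mean vector ...
Dc = D - mu_hat           # ... center the data, then forget about the means

# All remaining inference is based on the sample covariance matrix
Sigma_hat = (Dc.T @ Dc) / (len(Dc) - 1)   # same as np.cov(D, rowvar=False)
print(Sigma_hat.round(3))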

From this point on, the models have no means and no intercepts. Now for more examples.

Multiple Regression
Path diagram: straight arrows from X1 and X2 to Y.

Y = \gamma_1 X_1 + \gamma_2 X_2 + \epsilon

- Implicitly i = 1, …, n
- Unconditional
- No intercept
- The information is in the covariances.
- Usually, means and intercepts are more trouble than they are worth, so they are omitted (centering).
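To make "the information is in the covariances" concrete, here is the implied covariance structure of this centered model (a routine calculation of mine, writing \phi_{jk} for the variances and covariance of X_1 and X_2, and \psi for V(\epsilon)):

\begin{eqnarray*}
C(X_1, Y) &=& \gamma_1 \phi_{11} + \gamma_2 \phi_{12} \\
C(X_2, Y) &=& \gamma_1 \phi_{12} + \gamma_2 \phi_{22} \\
V(Y) &=& \gamma_1^2 \phi_{11} + 2\gamma_1\gamma_2\phi_{12} + \gamma_2^2 \phi_{22} + \psi
\end{eqnarray*}

Solving the first two equations for \gamma_1 and \gamma_2 recovers the regression coefficients from covariances alone; the means and intercepts are never needed.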

Regression with measurement error
Latent variables are in ovals or circles.

\begin{eqnarray*}
Y &=& \gamma_1 X_1 + \gamma_2 X_2 + \epsilon \\
W_1 &=& X_1 + e_1 \\
W_2 &=& X_2 + e_2 \\
V &=& Y + e_3
\end{eqnarray*}

A Path Model with Measurement Error
(A latent path model; these are causal models.)

\begin{eqnarray*}
Y_1 &=& \gamma_1 X + \epsilon_1 \\
Y_2 &=& \beta Y_1 + \gamma_2 X + \epsilon_2 \\
W &=& X + e_1 \\
V_1 &=& Y_1 + e_2 \\
V_2 &=& Y_2 + e_3
\end{eqnarray*}

A Factor Analysis Model
The observed variables X1, …, X5 are mental tests. F is Spearman's "general intelligence" factor G (1904). The lambdas are called factor loadings. There can be more than one factor; more on this later.

\begin{eqnarray*}
X_1 &=& \lambda_1 F + e_1 \\
X_2 &=& \lambda_2 F + e_2 \\
X_3 &=& \lambda_3 F + e_3 \\
X_4 &=& \lambda_4 F + e_4 \\
X_5 &=& \lambda_5 F + e_5
\end{eqnarray*}
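The covariance matrix implied by this one-factor model is λλ′φ + Ω, where φ = V(F) and Ω = V(e) is diagonal. A quick numerical sketch (my own, with made-up loadings and variances):

import numpy as np

# Hypothetical factor loadings and variances, for illustration only
lam = np.array([[0.9], [0.8], [0.7], [0.6], [0.5]])   # 5 x 1
phi = 1.0                                             # V(F), often set to 1
Omega = np.diag([0.3, 0.4, 0.5, 0.6, 0.7])            # error variances

# Implied covariance matrix of X1, ..., X5
Sigma = phi * (lam @ lam.T) + Omega
print(Sigma.round(2))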

A Longitudinal Model
(Path diagram: P = Pain and M = Mobility, each measured on four occasions, giving P1–P4 and M1–M4.)

Estimation and Testing as Before
Consider a nice simple path model for observed variables: X → Y1 → Y2. This is a toy example, because there is ALWAYS measurement error.

\begin{eqnarray*}
Y_1 &=& \gamma X + \epsilon_1 \\
Y_2 &=& \beta Y_1 + \epsilon_2
\end{eqnarray*}

- All expected values equal zero (data have been "centered").
- V(X) = \phi, V(\epsilon_1) = \psi_1, V(\epsilon_2) = \psi_2.
- X, \epsilon_1, \epsilon_2 are all independent.
- Everything is normal.

Distribution of the data
The data vectors

\left( \begin{array}{c} X_1 \\ Y_{1,1} \\ Y_{1,2} \end{array} \right), \ldots,
\left( \begin{array}{c} X_n \\ Y_{n,1} \\ Y_{n,2} \end{array} \right)

are independent normal with mean zero and covariance matrix

\boldsymbol{\Sigma} = \left( \begin{array}{ccc}
\phi & \gamma\phi & \beta\gamma\phi \\
\gamma\phi & \gamma^2\phi+\psi_1 & \beta(\gamma^2\phi+\psi_1) \\
\beta\gamma\phi & \beta(\gamma^2\phi+\psi_1) & \beta^2(\gamma^2\phi+\psi_1) + \psi_2
\end{array} \right),

with parameter vector \boldsymbol{\theta} = (\gamma, \beta, \phi, \psi_1, \psi_2). Estimate \boldsymbol{\theta} numerically by maximum likelihood (other methods are available), and carry out large-sample tests and confidence intervals.

(Speaker notes: The word "structural" has at least 3 distinct meanings in the literature:
- Arising from the physical or social structure being modeled (Koopmans and Reiersøl, 1950, Annals of Mathematical Statistics)
- Structural as opposed to functional
- Structural model vs. measurement model)
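Here is a rough sketch of what "estimate numerically by maximum likelihood" can look like for this toy model, in Python with NumPy and SciPy (my own illustration, fit to simulated data; real SEM software does much more):

import numpy as np
from scipy.optimize import minimize

def sigma_of(theta):
    # Model covariance matrix Sigma(theta) from the slide above
    g, b, phi, psi1, psi2 = theta
    s22 = g * g * phi + psi1
    return np.array([[phi,         g * phi,  b * g * phi],
                     [g * phi,     s22,      b * s22],
                     [b * g * phi, b * s22,  b * b * s22 + psi2]])

def neg_loglik(theta, S, n):
    Sig = sigma_of(theta)
    sign, logdet = np.linalg.slogdet(Sig)
    if sign <= 0:
        return np.inf   # keep the search in the positive definite region
    # Mean-zero multivariate normal log likelihood, up to a constant
    return (n / 2.0) * (logdet + np.trace(np.linalg.solve(Sig, S)))

rng = np.random.default_rng(431)
true_theta = np.array([0.8, 0.5, 1.0, 0.4, 0.3])   # hypothetical true values
n = 5000
D = rng.multivariate_normal(np.zeros(3), sigma_of(true_theta), size=n)
S = np.cov(D, rowvar=False)

fit = minimize(neg_loglik, x0=np.array([0.5, 0.5, 1.0, 1.0, 1.0]),
               args=(S, n), method='Nelder-Mead')
print(fit.x.round(3))   # should be close to true_theta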

A General Two-Stage Model

\begin{eqnarray*}
\mathbf{Y}_i &=& \boldsymbol{\beta} \mathbf{Y}_i + \boldsymbol{\Gamma} \mathbf{X}_i + \boldsymbol{\epsilon}_i \\
\mathbf{F}_i &=& \left( \begin{array}{c} \mathbf{X}_i \\ \mathbf{Y}_i \end{array} \right) \\
\mathbf{D}_i &=& \boldsymbol{\Lambda}\mathbf{F}_i + \mathbf{e}_i
\end{eqnarray*}

- The D_i (the data) are observable. All other variables are latent.
- Y_i = βY_i + ΓX_i + ε_i is called the Latent Variable Model.
- The latent vectors X_i and Y_i are collected into a "factor" F_i. This is not a categorical explanatory variable, the usual meaning of "factor" in experimental design.
- D_i = ΛF_i + e_i is called the Measurement Model.

More Details
- Y_i is a q × 1 random vector.
- β is a q × q matrix of constants with zeros on the main diagonal.
- Γ is a q × p matrix of constants.
- X_i is a p × 1 random vector.
- ε_i is a q × 1 random vector.
- F_i (F for Factor) is just X_i stacked on top of Y_i. It is a (p + q) × 1 random vector.
- Λ is a k × (p + q) matrix of constants.
- D_i is a k × 1 random vector. Sometimes D_i is partitioned as W_i stacked on top of V_i.
- e_i is a k × 1 random vector.
- X_i, ε_i and e_i are independent.

\begin{eqnarray*}
\mathbf{Y}_i &=& \boldsymbol{\beta} \mathbf{Y}_i + \boldsymbol{\Gamma} \mathbf{X}_i + \boldsymbol{\epsilon}_i \\
\mathbf{F}_i &=& \left( \begin{array}{c} \mathbf{X}_i \\ \mathbf{Y}_i \end{array} \right) \\
\mathbf{D}_i &=& \boldsymbol{\Lambda}\mathbf{F}_i + \mathbf{e}_i
\end{eqnarray*}

- V(\mathbf{X}_i) = \boldsymbol{\Phi}_x
- V(\boldsymbol{\epsilon}_i) = \boldsymbol{\Psi}
- V(\mathbf{F}_i) = \left( \begin{array}{cc} V(\mathbf{X}_i) & C(\mathbf{X}_i,\mathbf{Y}_i) \\ C(\mathbf{Y}_i,\mathbf{X}_i) & V(\mathbf{Y}_i) \end{array} \right) = \boldsymbol{\Phi} = \left( \begin{array}{cc} \boldsymbol{\Phi}_{11} & \boldsymbol{\Phi}_{12} \\ \boldsymbol{\Phi}_{12}^\prime & \boldsymbol{\Phi}_{22} \end{array} \right)
- V(\mathbf{e}_i) = \boldsymbol{\Omega}
- V(\mathbf{D}_i) = \boldsymbol{\Sigma}

Recall the example. (Write the equations up on the board; the LaTeX is just copied from earlier.)

\begin{eqnarray*}
\mathbf{Y} &=& \boldsymbol{\beta} \mathbf{Y} + \boldsymbol{\Gamma} \mathbf{X} + \boldsymbol{\epsilon} \\
\mathbf{D} &=& \boldsymbol{\Lambda}\mathbf{F} + \mathbf{e}
\end{eqnarray*}

\left( \begin{array}{c} Y_1 \\ Y_2 \end{array} \right) =
\left( \begin{array}{cc} 0 & 0 \\ \beta & 0 \end{array} \right)
\left( \begin{array}{c} Y_1 \\ Y_2 \end{array} \right) +
\left( \begin{array}{c} \gamma_1 \\ \gamma_2 \end{array} \right) X +
\left( \begin{array}{c} \epsilon_1 \\ \epsilon_2 \end{array} \right)

\left( \begin{array}{c} W \\ V_1 \\ V_2 \end{array} \right) =
\left( \begin{array}{ccc} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{array} \right)
\left( \begin{array}{c} X \\ Y_1 \\ Y_2 \end{array} \right) +
\left( \begin{array}{c} e_1 \\ e_2 \\ e_3 \end{array} \right)

- V(\mathbf{X}) = \boldsymbol{\Phi}_x = \phi
- V(\boldsymbol{\epsilon}) = \boldsymbol{\Psi} = \left( \begin{array}{cc} \psi_1 & 0 \\ 0 & \psi_2 \end{array} \right)
- V(\mathbf{e}) = \boldsymbol{\Omega} = \left( \begin{array}{ccc} \omega_1 & 0 & 0 \\ 0 & \omega_2 & 0 \\ 0 & 0 & \omega_3 \end{array} \right)
- V(\mathbf{D}) = \boldsymbol{\Sigma}
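Here is a numerical sketch (mine, with made-up parameter values) that builds these matrices in Python and computes Σ, using the V(Y) and C(X, Y) formulas derived on the slides below:

import numpy as np

# Hypothetical parameter values, for illustration only
beta_mat = np.array([[0.0, 0.0],
                     [0.5, 0.0]])        # beta, zeros on the diagonal
Gamma = np.array([[0.8],
                  [0.3]])                # gamma1, gamma2
Lam = np.eye(3)                          # Lambda: D measures F directly
Phi_x = np.array([[1.0]])                # V(X) = phi
Psi = np.diag([0.4, 0.3])                # V(epsilon)
Omega = np.diag([0.2, 0.2, 0.2])         # V(e)

# Latent variable model: Y = (I - beta)^{-1} (Gamma X + epsilon)
A = np.linalg.inv(np.eye(2) - beta_mat)
V_Y = A @ (Gamma @ Phi_x @ Gamma.T + Psi) @ A.T
C_XY = Phi_x @ Gamma.T @ A.T             # C(X, Y)

# Phi = V(F), in partitioned form
Phi = np.block([[Phi_x, C_XY],
                [C_XY.T, V_Y]])

# Measurement model: Sigma = Lambda Phi Lambda' + Omega
Sigma = Lam @ Phi @ Lam.T + Omega
print(Sigma.round(3))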

Observable variables in the latent variable model (fairly common)
These present no problem. Let P(e_j = 0) = 1, so V(e_j) = 0. Also, Cov(e_i, e_j) = 0, because if P(e_j = 0) = 1,

\begin{eqnarray*}
Cov(e_i, e_j) &=& E(e_i e_j) - E(e_i)E(e_j) \\
&=& E(e_i \cdot 0) - E(e_i) \cdot 0 \\
&=& 0 - 0 = 0.
\end{eqnarray*}

So in the covariance matrix Ω = V(e), just set ω_{jj} = 0 and ω_{ij} = ω_{ji} = 0 for i = 1, …, k.

What should you be able to do?
- Given a path diagram, write the model equations and say which exogenous variables are correlated with each other.
- Given the model equations and information about which exogenous variables are correlated with each other, draw the path diagram.
- Given either piece of information, write the model in matrix form and say what all the matrices are.
- Calculate model covariance matrices.
- Check identifiability.

Recall the notation. Split the task up and develop RULES!

For the latent variable model, calculate Φ = V(F)

\begin{eqnarray*}
& & \mathbf{Y} = \boldsymbol{\beta} \mathbf{Y} + \boldsymbol{\Gamma} \mathbf{X} + \boldsymbol{\epsilon} \\
&\Rightarrow& \mathbf{Y} - \boldsymbol{\beta} \mathbf{Y} = \boldsymbol{\Gamma} \mathbf{X} + \boldsymbol{\epsilon} \\
&\Rightarrow& (\mathbf{I} - \boldsymbol{\beta})\mathbf{Y} = \boldsymbol{\Gamma} \mathbf{X} + \boldsymbol{\epsilon} \\
&\Rightarrow& (\mathbf{I} - \boldsymbol{\beta})^{-1}(\mathbf{I} - \boldsymbol{\beta})\mathbf{Y} = (\mathbf{I} - \boldsymbol{\beta})^{-1}(\boldsymbol{\Gamma} \mathbf{X} + \boldsymbol{\epsilon}) \\
&\Rightarrow& \mathbf{Y} = (\mathbf{I} - \boldsymbol{\beta})^{-1}\boldsymbol{\Gamma} \mathbf{X} + (\mathbf{I} - \boldsymbol{\beta})^{-1}\boldsymbol{\epsilon}
\end{eqnarray*}

(The inverse of I − β almost always exists; more details later.)

So,

\begin{eqnarray*}
V(\mathbf{Y}) &=& V\left((\mathbf{I} - \boldsymbol{\beta})^{-1}\boldsymbol{\Gamma} \mathbf{X} + (\mathbf{I} - \boldsymbol{\beta})^{-1}\boldsymbol{\epsilon}\right) \\
&=& V\left((\mathbf{I} - \boldsymbol{\beta})^{-1}\boldsymbol{\Gamma} \mathbf{X}\right) + V\left((\mathbf{I} - \boldsymbol{\beta})^{-1}\boldsymbol{\epsilon}\right) \\
&=& (\mathbf{I} - \boldsymbol{\beta})^{-1}\boldsymbol{\Gamma}\,\boldsymbol{\Phi}_{11}\,\boldsymbol{\Gamma}^\top(\mathbf{I} - \boldsymbol{\beta})^{-1\top} + (\mathbf{I} - \boldsymbol{\beta})^{-1}\boldsymbol{\Psi}(\mathbf{I} - \boldsymbol{\beta})^{-1\top}
\end{eqnarray*}
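A quick Monte Carlo sanity check of this formula (my own sketch, with arbitrary parameter values):

import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# Arbitrary illustrative parameters (q = 2, p = 1)
beta_mat = np.array([[0.0, 0.0], [0.5, 0.0]])
Gamma = np.array([[0.8], [0.3]])
Phi11 = np.array([[1.0]])
Psi = np.diag([0.4, 0.3])

A = np.linalg.inv(np.eye(2) - beta_mat)   # (I - beta)^{-1}

# Closed form: V(Y) = A Gamma Phi11 Gamma' A' + A Psi A'
V_Y = A @ Gamma @ Phi11 @ Gamma.T @ A.T + A @ Psi @ A.T

# Simulation: generate X and epsilon, then Y = A (Gamma X + epsilon)
X = rng.multivariate_normal([0.0], Phi11, size=n)
eps = rng.multivariate_normal([0.0, 0.0], Psi, size=n)
Y = (A @ (Gamma @ X.T + eps.T)).T

print(V_Y.round(3))                        # closed form
print(np.cov(Y, rowvar=False).round(3))    # Monte Carlo estimate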

For the measurement model, calculate Σ = V(D)

\begin{eqnarray*}
\mathbf{D} &=& \boldsymbol{\Lambda}\mathbf{F} + \mathbf{e} \\
\Rightarrow V(\mathbf{D}) &=& V(\boldsymbol{\Lambda}\mathbf{F} + \mathbf{e}) \\
&=& V(\boldsymbol{\Lambda}\mathbf{F}) + V(\mathbf{e}) \\
&=& \boldsymbol{\Lambda}V(\mathbf{F})\boldsymbol{\Lambda}^\top + V(\mathbf{e}) \\
&=& \boldsymbol{\Lambda}\boldsymbol{\Phi}\boldsymbol{\Lambda}^\top + \boldsymbol{\Omega} \\
&=& \boldsymbol{\Sigma}
\end{eqnarray*}

Two-stage Proofs of Identifiability
1. Show the parameters of the measurement model (Λ, Φ, Ω) can be recovered from Σ = V(D).
2. Show the parameters of the latent variable model (β, Γ, Φ11, Ψ) can be recovered from Φ = V(F).

This means all the parameters can be recovered from Σ. Break a big problem into two smaller ones, and develop rules for checking identifiability at each stage. Split the task up and develop RULES! A worked example of checking identifiability by inspection appears below.
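For the toy path model from the "Estimation and Testing" slide, the by-inspection check can even be automated symbolically. A sketch of mine using SymPy:

import sympy as sp

gamma, beta, phi, psi1, psi2 = sp.symbols('gamma beta phi psi1 psi2', positive=True)

# Entries of the model covariance matrix Sigma(theta) for (X, Y1, Y2)
s11 = phi
s12 = gamma * phi
s22 = gamma**2 * phi + psi1
s23 = beta * s22
s33 = beta**2 * s22 + psi2

# Recover each parameter from the sigma_ij "by inspection"
phi_hat   = s11
gamma_hat = s12 / s11
beta_hat  = s23 / s22
psi1_hat  = s22 - s12**2 / s11
psi2_hat  = s33 - s23**2 / s22

for expr, target in [(phi_hat, phi), (gamma_hat, gamma), (beta_hat, beta),
                     (psi1_hat, psi1), (psi2_hat, psi2)]:
    assert sp.simplify(expr - target) == 0
print("All five parameters are identifiable from Sigma.")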

Copyright Information
This slide show was prepared by Jerry Brunner, Department of Statistics, University of Toronto. It is licensed under a Creative Commons Attribution - ShareAlike 3.0 Unported License. Use any part of it as you like and share the result freely. These PowerPoint slides are available from the course website: http://www.utstat.toronto.edu/~brunner/oldclass/431s15