Advanced Statistical Methods: Continuous Variables


1 Advanced Statistical Methods: Continuous Variables http://statisticalmethods.wordpress.com
Structural Equation Modeling, Part I
Materials for this session are based on Dr. Bart Meuleman’s introductory lectures on SEM, delivered at the 2011 QMSS2 Summer School in Leuven.

2 Ex: how are education, income & threat related?
SEM = multivariate analytical technique used to gain insight into the relations between multiple variables
Data layout: one row per observation (OBS), with columns Education, Income and Ethnic threat

3 Rather than single equations, systems of equations are modeled
Ex. regression:
Y = a + b1X1 + b2X2 + e
Ex. SEM:
Y1 = λ1F + ε1
Y2 = λ2F + ε2
Y3 = λ3F + ε3
F = b1X1 + b2X2 + ε4
X2 = b1X1 + ε5
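The SEM system above can be made concrete by simulating data from it. The following is a minimal sketch in Python and is not part of the original lecture: the coefficient values, the error standard deviations and the function name `simulate_sem` are all hypothetical choices for illustration. The coefficient in the X2 equation is called `b_x2` here to keep it apart from b1 in the equation for F.

```python
import random


def simulate_sem(n, lam=(1.0, 0.8, 0.9), b_f=(-0.4, -0.3), b_x2=0.5, seed=0):
    """Draw n cases from the slide's system of equations.

    All parameter values are hypothetical; errors are independent normals.
    Returns a list of tuples (x1, x2, y1, y2, y3).
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x1 = rng.gauss(0, 1)                              # exogenous variable
        x2 = b_x2 * x1 + rng.gauss(0, 1)                  # X2 = b*X1 + e5
        f = b_f[0] * x1 + b_f[1] * x2 + rng.gauss(0, 1)   # F = b1*X1 + b2*X2 + e4
        ys = [l * f + rng.gauss(0, 0.5) for l in lam]     # Yj = lambda_j*F + e_j
        rows.append((x1, x2, *ys))                        # F itself stays unobserved
    return rows


data = simulate_sem(1000)
```

F is generated but not returned: as a latent variable it is never observed, only its indicators Y1 to Y3 are.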

4 Advantages of modeling systems of equations:
– Latent variable estimation
E.g.: values, attitudes, socioeconomic status, IQ
Multiple indicators that contain measurement error (random & non‐random)
SEM makes it possible to estimate relations between latent variables instead of between unreliable indicators

5 Ex: perceived economic threat scale in the European Social Survey:
– Would you say it is generally bad or good for [country’s] economy that people come to live here from other countries? (0 = bad for the economy – 10 = good for the economy)
– Would you say that [country’s] cultural life is generally undermined or enriched by people coming to live here from other countries? (0 = cultural life undermined – 10 = cultural life enriched)
– Is [country] made a worse or a better place to live by people coming to live here from other countries? (0 = worse place to live – 10 = better place to live)

6 X1 = education X2 = income F = perceived economic threat Y1, Y2, Y3 = ESS indicators

8 Path Diagrams: Graphical notation of SEM – Types of variables
– Latent (unobserved) variable, measured indirectly (diagram label: F)
– Observed (manifest) variable, measured directly (diagram label: X1)
– Measurement/stochastic error (diagram label: e1)

9 Graphical notation of SEM – Types of relations
– Uni‐directional relation / effect
– Correlation
– Feedback relation
– No relation

10 Exogenous vs. Endogenous variables
exogenous: not explained by the model
endogenous: explained by the model
X1 = education; X2 = income; F = perceived economic threat; Y1, Y2, Y3 = ESS indicators

14 Types of Models 1. Path models: direct & indirect effects btw. manifest variables

15 2. Confirmatory Factor Analysis: Construct Validation

16 3. Full SEM: combination of 1 & 2

17 4. Advanced Models

18 Confirmatory Factor Analysis (CFA)
– A way of representing latent constructs that are measured through multiple indicators containing measurement error:
X1 = T + E1
X2 = T + E2
CFA vs. sum scores and indices: CFA tests assumptions that sum scores and indices simply take for granted a priori:
– equal weighting
– equal measurement errors
– unidimensionality

19 Notation of CFA model:

20 Parameters in the model:
– Factor loadings: relations between the latent construct & the observed indicators: λ
– Residual variances: measurement error in the observed indicators: Var(δ)
– Variance of the latent construct: Var(ξ)

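21 Model Estimation – 1
SEM (CFA) models covariance (and mean) structures rather than working with raw data.
[Figure: the raw data matrix (OBS by Educ, Inc, Ethnic threat) is summarised as a covariance matrix of Educ, Inc and EthnT; the only entry that survives in this transcript is 2.00 in the Educ row]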

22 Model Estimation – 2
Every set of parameters implies a certain covariance (& mean) structure = implied or reproduced covariance (mean) structure
Estimate the parameters so that the implied structure approaches the observed covariance (mean) structure as closely as possible
Mostly: Maximum Likelihood Estimation
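To illustrate what "implied covariance structure" means: in a one‐factor CFA with uncorrelated measurement errors, the implied covariance between indicators i and j is λi · λj · Var(ξ), and the implied variance of indicator i adds Var(δi) on the diagonal. The sketch below is an illustration only; the function name `implied_cov` and the parameter values are hypothetical and do not come from the lecture.

```python
def implied_cov(loadings, factor_var, resid_vars):
    """Implied covariance matrix of a one-factor CFA.

    Assumes uncorrelated measurement errors:
    Sigma[i][j] = lambda_i * lambda_j * Var(xi), plus Var(delta_i) when i == j.
    """
    k = len(loadings)
    return [
        [
            loadings[i] * loadings[j] * factor_var
            + (resid_vars[i] if i == j else 0.0)
            for j in range(k)
        ]
        for i in range(k)
    ]


# Hypothetical parameter values for two indicators
sigma = implied_cov(loadings=[1.0, 0.5], factor_var=2.0, resid_vars=[1.0, 0.5])
```

Estimation then searches for the parameter values whose implied matrix lies closest to the observed covariance matrix.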

23 Model Estimation – 3

24 Model Estimation – 4

25 Evaluation of Model Fit – 1
Look at the discrepancy between the observed & implied covariance matrices
Chi‐square test
– tests whether the assumed linear structure holds in the population
– if the chi2 value is statistically significant, reject the model
– degrees of freedom: if there are k observed variables in a linear model with t unconstrained parameters to be estimated, then: df = k(k+1)/2 – t
However: sensitive to large sample sizes (& to deviations from multivariate normality)
– Chi2 difference test for nested models
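The df formula and the chi2 difference test for nested models can be sketched as follows. This is an illustration under stated assumptions, not the lecture's own code: the function names are hypothetical, the chi2 values in the example are made up, and the test is run at the 5% level using standard chi‐square critical values for 1 to 3 df.

```python
# Upper 5% critical values of the chi-square distribution for small df
CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815}


def sem_df(k, t):
    """df = k(k+1)/2 - t for k observed variables and t free parameters."""
    return k * (k + 1) // 2 - t


def chi2_difference(chi2_restricted, df_restricted, chi2_free, df_free):
    """Compare a restricted model with a less restricted model it is nested in.

    Returns (delta_chi2, delta_df, significant_at_5_percent).
    Only delta_df of 1, 2 or 3 is covered by the lookup table above.
    """
    delta_chi2 = chi2_restricted - chi2_free
    delta_df = df_restricted - df_free
    significant = delta_chi2 > CRITICAL_05[delta_df]
    return delta_chi2, delta_df, significant


# Made-up fit results: the restricted model has one parameter fewer
result = chi2_difference(25.0, 3, 20.0, 2)
```

A significant difference indicates that the extra restriction worsens the fit, which favours the less restricted model.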

26 Evaluation of Model Fit – 2
Alternative fit indices:

27 Evaluation of Model Fit – 3
Previous measures are indices of global fit (the fit of the model as a whole)
One should also look at measures of local (mis)fit:
Modification indices:
– available for constrained & fixed parameters (except the one used for scaling the latent variable)
– give the expected chi² reduction if the corresponding parameter is set free
– are chi² tests with 1 df
– also check the corresponding expected parameter change (EPC)!

28 Restrictions & identification – 1
3 possibilities for parameters:
– free parameters
– fixed parameters (parameters fixed to a certain constant)
– constrained parameters (parameters that have been set equal)
RESTRICTIONS reflect a priori knowledge
Functions of restrictions:
– identification of the model
– statistical assumptions of the model
– transformations of theoretical assumptions

29 Restrictions & identification - 2
Is there enough information available in the data to estimate all free parameters? The model needs to be identified!
A model is identified if the sample variances and covariances contain enough information to obtain unique estimates of all free model parameters.

30 Conditions for model identification:
1: Latent variables should be scaled by…
– …constraining the factor loading of one item (i.e. the marker item) to 1, OR
– …constraining the variance of the latent variable to 1
2: Statistical identification
– the no. of freely estimated parameters (t) should not exceed the number of pieces of information [k(k+1)/2], where k = no. of observed variables
– the no. of dfs should not be negative
– in terms of equations: the number of unknowns cannot be larger than the number of knowns
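Either way of scaling the latent variable leaves the same number of free parameters, which a short count makes visible. The sketch below assumes a one‐factor CFA of a covariance structure only (no means) with uncorrelated measurement errors; the function name and the `scaling` labels are hypothetical.

```python
def free_parameters_one_factor(k, scaling="marker"):
    """Count free parameters of a one-factor CFA with k indicators.

    Assumes a covariance structure only and uncorrelated residuals.
    scaling="marker":   one loading fixed to 1, factor variance free
    scaling="variance": factor variance fixed to 1, all loadings free
    """
    if scaling == "marker":
        loadings, factor_variance = k - 1, 1
    elif scaling == "variance":
        loadings, factor_variance = k, 0
    else:
        raise ValueError("scaling must be 'marker' or 'variance'")
    residual_variances = k
    return loadings + residual_variances + factor_variance


t_marker = free_parameters_one_factor(3, "marker")
t_variance = free_parameters_one_factor(3, "variance")
```

With three indicators both scalings give t = 6, to be compared with the k(k+1)/2 = 6 pieces of information.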

31 Under‐identified models:
– fewer pieces of information than free parameters; df < 0
– infinite number of solutions
– the model is more complex than the data structure
Just‐identified models:
– as many pieces of information as free parameters; df = 0
– a unique perfect solution
– the model is as complex as the data structure
Over‐identified models:
– more pieces of information than free parameters; df > 0
– no perfect solution possible
– the model is less complex than the data structure
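The three cases follow directly from the sign of df. A minimal sketch, with a hypothetical function name and using the three‐indicator threat scale (k = 3, t = 6 in a one‐factor CFA with uncorrelated errors) as the example:

```python
def identification_status(k, t):
    """Classify a model by df = k(k+1)/2 - t.

    k = number of observed variables, t = number of free parameters.
    Returns (df, label).
    """
    df = k * (k + 1) // 2 - t
    if df < 0:
        label = "under-identified"
    elif df == 0:
        label = "just-identified"
    else:
        label = "over-identified"
    return df, label


# One factor measured by three indicators: 6 pieces of information, 6 free parameters
status = identification_status(k=3, t=6)
```

This is only the counting rule: a non‐negative df is necessary for identification, and the latent variable still has to be scaled.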

32 Exercise
How many degrees of freedom does a CFA model with 1 latent variable & 4 indicators have? What kind of model would this be?
Remember: df = k(k+1)/2 – t, where k = no. of observed variables & t = no. of free parameters

