Ordinal data, matrix algebra & factor analysis

Slides:



Advertisements
Similar presentations
Normal Distribution 2 To be able to transform a normal distribution into Z and use tables To be able to use normal tables to find and To use the normal.
Advertisements

Latent normal models for missing data Harvey Goldstein Centre for Multilevel Modelling University of Bristol.
Multivariate Mx Exercise D Posthuma Files: \\danielle\Multivariate.
Factor analysis Caroline van Baal March 3 rd 2004, Boulder.
Different chi-squares Ulf H. Olsson Professor of Statistics.
Stat 200b. Chapter 8. Linear regression models.. n by 1, n by 2, 2 by 1, n by 1.
(Re)introduction to Mx Sarah Medland. KiwiChinese Gooseberry.
Power and Sample Size + Principles of Simulation Benjamin Neale March 4 th, 2010 International Twin Workshop, Boulder, CO.
Extended sibships Danielle Posthuma Kate Morley Files: \\danielle\ExtSibs.
Statistics 350 Lecture 14. Today Last Day: Matrix results and Chapter 5 Today: More matrix results and Chapter 5 Please read Chapter 5.
(Re)introduction to Mx. Starting at the beginning Data preparation Mx expects 1 line per case/family Almost limitless number of families and variables.
Different chi-squares Ulf H. Olsson Professor of Statistics.
Univariate Analysis in Mx Boulder, Group Structure Title Type: Data/ Calculation/ Constraint Reading Data Matrices Declaration Assigning Specifications/
The General LISREL Model Ulf H. Olsson Professor of statistics.
Thresholds and ordinal data Sarah Medland – Boulder 2010.
Introduction to Multivariate Genetic Analysis Kate Morley and Frühling Rijsdijk 21st Twin and Family Methodology Workshop, March 2008.
Multivariate Threshold Models Specification in Mx.
Factor Analysis Ulf H. Olsson Professor of Statistics.
Slide 1 Detecting Outliers Outliers are cases that have an atypical score either for a single variable (univariate outliers) or for a combination of variables.
Wednesday PM  Presentation of AM results  Multiple linear regression Simultaneous Simultaneous Stepwise Stepwise Hierarchical Hierarchical  Logistic.
Copy the folder… Faculty/Sarah/Tues_merlin to the C Drive C:/Tues_merlin.
G Lecture 121 Analysis of Time to Event Survival Analysis Language Example of time to high anxiety Discrete survival analysis through logistic regression.
Applications The General Linear Model. Transformations.
The Use of Dummy Variables. In the examples so far the independent variables are continuous numerical variables. Suppose that some of the independent.
Two Approaches to Calculating Correlated Reserve Indications Across Multiple Lines of Business Gerald Kirschner Classic Solutions Casualty Loss Reserve.
Continuously moderated effects of A,C, and E in the twin design Conor V Dolan & Sanja Franić (BioPsy VU) Boulder Twin Workshop March 4, 2014 Based on PPTs.
Ordinal (yet again) Sarah Medland – Boulder 2010.
Univariate modeling Sarah Medland. Starting at the beginning… Data preparation – The algebra style used in Mx expects 1 line per case/family – (Almost)
University of Warwick, Department of Sociology, 2014/15 SO 201: SSAASS (Surveys and Statistics) (Richard Lampard) Week 7 Logistic Regression I.
F:\sarah\fri_MV. Multivariate Linkage and Association Sarah Medland and Manuel Ferreira -with heavy borrowing from Kate Morley and Frühling Rijsdijk.
Power and Sample Size Boulder 2004 Benjamin Neale Shaun Purcell.
The importance of the “Means Model” in Mx for modeling regression and association Dorret Boomsma, Nick Martin Boulder 2008.
Regression Analysis Part C Confidence Intervals and Hypothesis Testing
Analysis Overheads1 Analyzing Heterogeneous Distributions: Multiple Regression Analysis Analog to the ANOVA is restricted to a single categorical between.
G Lecture 81 Comparing Measurement Models across Groups Reducing Bias with Hybrid Models Setting the Scale of Latent Variables Thinking about Hybrid.
G Lecture 91 Measurement Error Models Bias due to measurement error Adjusting for bias with structural equation models Examples Alternative models.
SP 225 Lecture 5 Percentages, Proportions, Rates of Change and Frequency Distributions.
Estimation Kline Chapter 7 (skip , appendices)
Mx modeling of methylation data: twin correlations [means, SD, correlation] ACE / ADE latent factor model regression [sex and age] genetic association.
Means, Thresholds and Moderation Sarah Medland – Boulder 2008 Corrected Version Thanks to Hongyan Du for pointing out the error on the regression examples.
Mx Practical TC20, 2007 Hermine H. Maes Nick Martin, Dorret Boomsma.
Assumptions of Multiple Regression 1. Form of Relationship: –linear vs nonlinear –Main effects vs interaction effects 2. All relevant variables present.
Frühling Rijsdijk & Kate Morley
Categorical Data Frühling Rijsdijk 1 & Caroline van Baal 2 1 IoP, London 2 Vrije Universiteit, A’dam Twin Workshop, Boulder Tuesday March 2, 2004.
Welcome  Log on using the username and password you received at registration  Copy the folder: F:/sarah/mon-morning To your H drive.
Lecture 3 (Chapter 4). Linear Models for Longitudinal Data Linear Regression Model (Review) Ordinary Least Squares (OLS) Maximum Likelihood Estimation.
Linkage in Mx & Merlin Meike Bartels Kate Morley Hermine Maes Based on Posthuma et al., Boulder & Egmond.
Copy folder (and subfolders) F:\sarah\linkage2. Linkage in Mx Sarah Medland.
More on thresholds Sarah Medland. A plug for OpenMx? Very few packages can handle ordinal data adequately… OpenMx can also be used for more than just.
QTL Mapping Using Mx Michael C Neale Virginia Institute for Psychiatric and Behavioral Genetics Virginia Commonwealth University.
Multivariate Genetic Analysis (Introduction) Frühling Rijsdijk Wednesday March 8, 2006.
Categorical Data HGEN
ScWk 298 Quantitative Review Session
Regression Models for Linkage: Merlin Regress
HGEN Thanks to Fruhling Rijsdijk
Power and p-values Benjamin Neale March 10th, 2016
Ordinal Data Sarah Medland.
Re-introduction to openMx
More on thresholds Sarah Medland.
Univariate modeling Sarah Medland.
Liability Threshold Models
Why general modeling framework?
David M. Evans Sarah E. Medland
(Re)introduction to Mx Sarah Medland
Sarah Medland faculty/sarah/2018/Tuesday
More on thresholds Sarah Medland.
Regression Statistics
Individual Assignment 6
Brad Verhulst & Sarah Medland
Presentation transcript:

Ordinal data, matrix algebra & factor analysis Sarah Medland – Boulder 2008 Thursday morning

This morning Fitting the regression model with ordinal data Factor Modelling Continuous Ordinal

Binary Data… 1 variable Thresholds T ; Standard normal distribution Mean = 0 SD =1 Non Smokers =53% Threshold =.074 Thresholds T ;

Binary Data… adding a regression Thresholds T + D*B ; .0422 51.6%

What about more than 2 categories? Thresholds = L*T; ~15% in each tail Thresholds: ~-1.03 ~1.03 Displacement = ~2.06

What about more than 2 categories? Thresholds = L*T; ~15% in each tail Thresholds: ~-1.03 ~1.03 Displacement = ~2.06

Adding a regression L*T + G@(D*B); maxth =2, ndef=2, nsib=1, nthr=2

Adding a regression

Adding a regression

Multivariate Threshold Models Specification in Mx Thanks Kate Morley for these slides

Thresholds = L*T +G@((\vec(B*K))’) #define nsib 1 ! number of siblings = 1 #define maxth 2 ! Maximum number of thresholds #define nvar 2 ! Number of variables #define ndef 1 ! Number of definition variables #define nthr 2 ! nsib x nvar T Full maxth nthr Free ! Thresholds B Full nvar ndef Free ! Regression betas L lower maxth maxth ! For converting incremental to cumulative thresholds G Full maxth 1 ! For duplicating regression betas across thresholds K Full ndef nsib ! Contains definition variables Thresholds = L*T +G@((\vec(B*K))’)

L*T +G@((\vec(B*K))’) Threshold model for multivariate, multiple category data with definition variables: Part 2 Part 1 L*T +G@((\vec(B*K))’) We will break the algebra into two parts: 1 - Definition variables; 2 - Uncorrected thresholds; and go through it in detail.

Definition variables Threshold correction Twin 1 Variable 1 Twin 1 Twin 2 Threshold correction Twin 1 Variable 1 Threshold correction Twin 2 Variable 1 Threshold correction Twin 1 Variable 2 Threshold correction Twin 2 Variable 2

Transpose:

Thresholds 1 & 2 Twin 2 Variable 2 Thresholds 1 & 2 Twin 1 Variable 1 Thresholds 1 & 2 Twin 2 Variable 1 Thresholds 1 & 2 Twin 1 Variable 2

=

Factor Analysis Suppose we have a theory that the covariation between self reports of depression, anxiety and stress levels is due to one underlying factor C Depression Anxiety Stress R1 R2 R3

Factor Analysis…. Our data (simulated) Five variables – Three traits Depression, Anxiety & Stress Transformed to Z-scores

In Spss…

And we get…

c_factor.mx

c_factor.mx

c_factor.mx Plus a standardisation group so that our estimates can be compared to those from spss

What do we get?

What if our data was ordinal? Depression Yes/No 0/1 Anxiety and Stress Low / Average / High 0/1/2

Spss says no

Mx can do this Data file: ord.dat Script file: o_factor.mx Five variables ID, Depression, Anxiety, Stress, Sex Data is sorted to make it run faster!!! Script file: o_factor.mx

O_factor.mx

O_factor.mx Set to 0 because depression has 2 categories

O_factor.mx

Answer Ordinal data Continuous data Difference due to loss of information with ordinal data & slightly different fit function

If we have time Test to see if adding another factor improves the fit