Univariate Modeling in umx

Slides:



Advertisements
Similar presentations
Bivariate analysis HGEN619 class 2007.
Advertisements

Fitting Bivariate Models October 21, 2014 Elizabeth Prom-Wormley & Hermine Maes
Phenotypic factor analysis
Jack Davis Andrew Henrey FROM N00B TO PRO. PURPOSE Create a simulator from scratch that: Generates data from a variety of distributions Makes a response.
Univariate Model Fitting
Hypothesis Testing Steps in Hypothesis Testing:
Univariate Analysis in Mx Boulder, Group Structure Title Type: Data/ Calculation/ Constraint Reading Data Matrices Declaration Assigning Specifications/
Univariate Analysis Hermine Maes TC19 March 2006.
Raw data analysis S. Purcell & M. C. Neale Twin Workshop, IBG Colorado, March 2002.
Karri Silventoinen University of Helsinki Osaka University.
Karri Silventoinen University of Helsinki Osaka University.
Introduction to R Part 2. Working Directory The working directory is where you are currently saving data in R. What is the current working directory?
Confirmatory Factor Analysis Using OpenMx & R
Karri Silventoinen University of Helsinki Osaka University.
 Go to Faculty/marleen/Boulder2012/Moderating_cov  Copy all files to your own directory  Go to Faculty/sanja/Boulder2012/Moderating_covariances _IQ_SES.
Cholesky decomposition May 27th 2015 Helsinki, Finland E. Vuoksimaa.
Univariate modeling Sarah Medland. Starting at the beginning… Data preparation – The algebra style used in Mx expects 1 line per case/family – (Almost)
Practical SCRIPT: F:\meike\2010\Multi_prac\MultivariateTwinAnalysis_MatrixRaw.r DATA: DHBQ_bs.dat.
Chapter 11 Linear Regression Straight Lines, Least-Squares and More Chapter 11A Can you pick out the straight lines and find the least-square?
Power and Sample Size Boulder 2004 Benjamin Neale Shaun Purcell.
CFA: Basics Beaujean Chapter 3. Other readings Kline 9 – a good reference, but lumps this entire section into one chapter.
Comparison of different output options from Stata
Univariate Analysis Hermine Maes TC21 March 2008.
SEM Basics 2 Byrne Chapter 2 Kline pg 7-15, 50-51, ,
Mx modeling of methylation data: twin correlations [means, SD, correlation] ACE / ADE latent factor model regression [sex and age] genetic association.
Introduction to Multivariate Genetic Analysis Danielle Posthuma & Meike Bartels.
© 2012 DigitalDay | MOBILE WEB DESIGN PRINCIPLES Best Practices Workshop 1.
QTL Mapping Using Mx Michael C Neale Virginia Institute for Psychiatric and Behavioral Genetics Virginia Commonwealth University.
The SweSAT Vocabulary (word): understanding of words and concepts. Data Sufficiency (ds): numerical reasoning ability. Reading Comprehension (read): Swedish.
March 7, 2012M. de Moor, Twin Workshop Boulder1 Copy files Go to Faculty\marleen\Boulder2012\Multivariate Copy all files to your own directory Go to Faculty\kees\Boulder2012\Multivariate.
CFA: Basics Byrne Chapter 3 Brown Chapter 3 (40-53)
1 Consider the k studies as coming from a population of effects we want to understand. One way to model effects in meta-analysis is using random effects.
Categorical Data HGEN
Extended Pedigrees HGEN619 class 2007.
Programming in R Intro, data and programming structures
Pre-Workshop Survey Results
Structural Equation Modeling in R with OpenMx and umx
R programming language
business analytics II ▌assignment one - solutions autoparts 
Univariate Twin Analysis
Introduction to Multivariate Genetic Analysis
Fitting Univariate Models to Continuous and Categorical Data
Re-introduction to openMx
Heterogeneity HGEN619 class 2007.
Univariate Analysis HGEN619 class 2006.
MRC SGDP Centre, Institute of Psychiatry, Psychology & Neuroscience
Behavior Genetics The Study of Variation and Heredity
Correlation A bit about Pearson’s r.
Elementary Statistics
Descriptive Analysis and Presentation of Bivariate Data
Univariate modeling Sarah Medland.
Unreal Engine and C++ We’re finally at a point where we can start working in C++ with Unreal Engine To get everything set up for this properly, we’re going.
Probabilistic Models with Latent Variables
GxE and GxG interactions
Trainings 11/18 Advanced Java Things.
Pak Sham & Shaun Purcell Twin Workshop, March 2002
(Re)introduction to Mx Sarah Medland
Sarah Medland faculty/sarah/2018/Tuesday
(Virginia Commonwealth University)
umxCP & umxIP practical
p(A) = p(AA) + ½ p(Aa) p(a) = p(aa)+ ½ p(Aa)
Introduction to SAS Essentials Mastering SAS for Data Analytics
Amos Introduction In this tutorial, you will be briefly introduced to the student version of the SEM software known as Amos. You should download the current.
BOULDER WORKSHOP STATISTICS REVIEWED: LIKELIHOOD MODELS
Power Calculation Practical
Power Calculation Practical
Multivariate Genetic Analysis: Introduction
Andrea Friese, Silvia Artuso, Danai Laina
Heritability of Subjective Well-Being in a Representative Sample
Evaluation David Kauchak CS 158 – Fall 2019.
Presentation transcript:

Univariate Modeling in umx Timothy C. Bates tim.bates@ed.ac.uk

Roadmap for this interactive tutorial on Univariate Twin Analysis Introducing and installing umx Using umx to model genetic and environmental effects on total variance of a trait umxACEv models with ACE and ADE decomposition Comparing models umxCompare to compare umxACEv() and umxACE() models Testing fit of reduced models umxModify to create AE, CE and E-Only Models

Installing umx Typically installing umx from CRAN will be fine install.packages("umx") At the workshop, we’ll use the development version so new features can be rolled out to everyone during the workshop devtools::install_github("tbates/umx") Potential hassle: Dependencies umx will try and install all its dependencies, but cannot always catch everything it needs The response is straightforward: Read those messages and just install the package(s) being requested So if the message “There is no package "XML" ” install.packages("XML")

What version of umx do I have? umxVersion() umx version: 2.2.0 OpenMx version: 2.9.4 [GIT v2.9.4] R version: R version 3.4.3 (2017-11-30) Platform: x86_64-apple-darwin15.6.0 MacOS: 10.13.4 Default optimiser: CSOLNP NPSOL-enabled?: Yes OpenMP-enabled?: Yes You can update OpenMx thusly: install.OpenMx(loc = c("NPSOL", "travis", "CRAN", "open travis build page")

If you don’t have umx 2.2.0 (Jeff already pushed this to shop laptops) If using your own laptop, get the latest version with: # if you have not already install.packages("devtools") devtools:install_github("tbates/umx") Whenever you update a package, pays to restart R

A good editor is very helpful: TextMate 2 bundle Scripting and analysis are greatly enhanced by working in a great environment. My favorite is TextMate 2 It has bundles supporting R and for OpenMx (maintained by us)

General production line make  modify  compare  summarize  plot m1 = umxACE(mzData = mzData, dzData= dzData) m2 = umxModify(m1, "c_r1c1", comparison = TRUE) umxCompare(m1, m2) umxSummary(m1) plot(m1)

Twin Modeling Functions umxACEv umxACE umxCP umxIP umxGxE umxGxE_biv umxGxE_window umxSexLim Each has a umxSummary() and plot() method umxSummaryACEcov umxSummaryACEv umxSummaryACE umxSummaryCP umxSummaryGxE_biv umxSummaryGxE umxSummaryIP umxSummarySexLim

Many other handy functions Other Reporting Functions umxAPA: formats many things parameters() Show model parameters umx_set_optimizer() Current Optimizer is: 'CSOLNP'. Options are: 'CSOLNP', 'SLSQP', and 'NPSOL' Twin Data functions umx_long2wide() umx_wide2long() umx_make_TwinData() Simulate twin data, inc GxE umx_make_MR_data() umx_residualize() Flexible residualization of data, inc. twin data umx_scale_wide_twin_data() Allows scaling of wide data

Let’s run an ACE analysis umxACEv() Read the help ?umxACEv umxACE() Read the help ?umxACE Let’s run an ACE analysis

umxACEv: Lots of parameters… most can be ignored! umxACEv(name = "ACE", selDVs, selCovs = NULL, covMethod = c("fixed", "random"), dzData, mzData, suffix = NULL, dzAr = 0.5, dzCr = 1, addStd = TRUE, addCI = TRUE, numObsDZ = NULL, numObsMZ = NULL, boundDiag = NULL, weightVar = NULL, equateMeans = TRUE, bVector = FALSE, thresholds = c("deviationBased", "WLS"), autoRun = getOption("umx_auto_run"), sep = NULL, optimizer = NULL) nb: Not all options are completed as yet. e.g. fixed-effect covariates; WLS-based optimization

umxACEv: just the parameters we need m1 = umxACEv(selDVs = "wt", suffix ="", dzData=dzData, mzData=mzData) All variables continuous treating data as raw Running ACE with 4 parameters ACE -2×log(Likelihood) 27278.55 (df=4) Standardized solution | | A1| C1| E1| |:---|----:|-----:|----:| |wt1 | 0.95| -0.18| 0.23|

plot(model) plot() will make a graph, and open it in your browser Can also output pdf or .dot OmniGraffle or VISIO to edit plots for publication. umx_set_auto_plot(TRUE/FALSE) umx_set_plot_file_suffix()

What’s in the file plot() saves What’s in the file plot() saves? Graphviz “dot” language (like S (R’s parent), invented at Bell Laboratories digraph G {     splines = "FALSE";     # Latents     a1 [shape = circle];     c1 [shape = circle];     e1 [shape = circle];     # Manifests     ht1 [shape = square]; a1 -> ht1 [label = "0.92"]; c1 -> ht1 [label = "0.14"]; e1 -> ht1 [label = "0.36"];     {rank = same;  ht1 };     {rank = min ;  a1 };     {rank = max ;  c1; e1 }; }

Table output umx_set_table_format() Markdown by default Can be latex, html,… umx_set_table_format() Current format is'markdown'. Valid options are 'latex', 'html', 'markdown', 'pandoc', and 'rst'>

What’s inside this model? “top” is a model that contains all the shared matrices and algebras MZ is a model contains the MZ data and how to optimize this DZ is a model contains the DZ data and how to optimize this

What’s inside top? A, C, and E are matrices for our three variance components. dzAr and dzCr are our .5 and 1 expected correlations for DZs

FullMatrix 'expMean’ $labels wt1 wt2 means "expMean_r1c1" "expMean_r1c1” $values wt1 wt2 means 58.81 58.81 $free wt1 wt2 means TRUE TRUE

SymmMatrix 'A' $labels [,1] [1,] "A_r1c1” $values [,1] [1,] 82.36 $free [,1] [1,] TRUE $lbound: No lower bounds assigned. $ubound: No upper bounds assigned.

m1$top$ACE mxAlgebra 'ACE’ $formula: A + C + E $result: [,1] [1,] 86.61276 dimnames: NULL

m1$top$expCovMZ mxAlgebra 'expCovMZ’ $formula: rbind(cbind(ACE, AC), cbind(AC, ACE)) $result: wt1 wt2 wt1 86.61 66.35 wt2 66.35 86.61 dimnames: [1] "wt1" "wt2" [2] "wt1" "wt2"

Univariate model of weight #' # ======================= #' # = Univariate model of weight = #' # ======================= #' require(umx) #' data(twinData) # ?twinData from Australian twins. #' #' # Things to note: ACE model of weight will return a NEGATIVE variance in C. #' #  This is exactly why we have ACEv! It suggests we need a different model #' #  In this case: ADE. #' # Other things to note: #' # 1. umxACEv can figure out variable names: provide "sep" and "wt" -> "wt1" "wt2" #' # 2. umxACEv picks the variables it needs from the data. #' #' selDVs = "wt" #' mzData <- twinData[twinData$zygosity %in% "MZFF", ] #' dzData <- twinData[twinData$zygosity %in% "DZFF", ] #' m1 = umxACEv(selDVs = selDVs, sep = "", dzData = dzData, mzData = mzData)

ADE model ADE model What’s the difference for ACE and ADE? Just change dzCr to .25

Evidence for dominance ? (DZ correlation set to .25) m2 = umxACEv("ADE", selDVs ="wt", sep = "", dzData = dzData, mzData = mzData, dzCr = .25) note: the underlying matrices are still called A, C, and E. I catch this in the summary table, so columns are labeled A, D, and E. However, currently, the plot will say A, C, E.

Table and plot: what changed? | | A1| D1| E1| |:---|---:|----:|----:| |wt1 | 0.4| 0.37| 0.23|

Something to Try: umxSummary(m2, report="html ") What happened?

Modify model umxModify allows you to modify, re-run and summarize a model. umxModify(lastFit, update = NULL, master = NULL, regex = FALSE, free = FALSE, value = 0, newlabels = NULL, freeToStart = NA, name = NULL, verbose = FALSE, intervals = FALSE, comparison = FALSE, autoRun = TRUE)

Modify model Drop parameter C_r1c1, run model, and show comparison m3 = umxModify(m2, update = "C_r1c1", name = ”AE", comparison = T)

We can modify this model, dropping dominance component (matrix is still called C) m3= umxModify(m2,update="C_r1c1",comparison=T, name="AE") | | A1|D1 | E1| |:---|----:|:--|----:| |wt1 | 0.76|. | 0.24| |Model | EP|∆ -2LL |∆ df |p | AIC | Compare with Model| |:-----|--:|:------|:----|:-----|--------:|:------------------| |ADE | 4| | | | 19506.55| | |AE | 3|8.675 |1 |0.003 | 19513.23|ADE |

We can modify this model, dropping A component m4= umxModify(m2,update="A_r1c1",comparison=T, name="DE") umxCompare(m2, c(m3, m4)) |Model | EP|∆ -2LL |∆ df |p | AIC|Compare with Model | |:-----|--:|:---------|:----|:-----|--------:|:------------------| |ADE | 4| | | | 19506.55| | |AE | 3|8.6750179 |1 |0.003 | 19513.23|ADE | |DE | 3|8.2983642 |1 |0.004 | 19512.85|ADE |

Well done! #================================================== # = Well done! # = Now you can make and modify twin models # = in umx  #==================================================

umxACE ?umxACE ACE -2 × log(Likelihood)'log Lik.' 27287.23 (df=4) m1=umxACE(selDVs="wt", sep="", dzData=dzData,mzData=mzData) ACE -2 × log(Likelihood)'log Lik.' 27287.23 (df=4) Standardized solution | | a1|c1 | e1| |:---|----:|:--|----:| |wt1 | 0.87|. | 0.49|

umxACE with dominance instead of C m2 = umxACE("ADE", selDVs = selDVs, sep="", dzData = dzData, mzData = mzData, dzCr = .25) umxCompare(m2, m1) # ADE seems better? | | a1| d1| e1| |:---|----:|----:|----:| |wt1 | 0.63| 0.61| 0.48|

What about if we compare umxACEv and umxACE? m1 = umxACEv("ADEvari", selDVs = selDVs, sep="", dzData = dzData, mzData = mzData, dzCr = .25) m2 = umxACE("ADEchol", selDVs = selDVs, sep="", dzData = dzData, mzData = mzData, dzCr = .25) umxCompare(m1,m2) |Model | EP|∆ -2LL |∆ df |p | AIC|Compare with Model | |:-------|--:|:------|:----|:--|--------:|:------------------| |ADEvari | 4| | | | 19506.55| | |ADEchol | 4|0 |0 | | 19506.55|ADEvari |

.48 = sqrt(.23)

We can do much more… and will sample across the week. Common Pathway umxCP() Independent Pathway umxIP() Gene-environment interaction umxGxE() Path-based SEM umxRAM + umxPath EFA umxEFA()