Means, Thresholds and Moderation Sarah Medland – Boulder 2008 Corrected Version Thanks to Hongyan Du for pointing out the error on the regression examples.


This morning
• Fitting a mean and regression with continuous data
• Modelling ordinal data
• Fitting the regression model with ordinal data

Let's start with the data…
• File: Wednesday.dat
• Contains 6 of the variables from Dorret's example
• Labels: ntrid zygMZDZ age1 sekse1 AQ1 age2 sekse2 AQ2

If this were a pedigree data file…
Columns: Famid, Ind, Father, Mother, Zyg, Sex, Age, Trait

How can we make this data file?
• Assume we have data with 3 variables:

How do we make this data?
• SPSS
  SORT CASES BY Family Individual.
  CASESTOVARS /ID = Family /INDEX = Individual /GROUPBY = VARIABLE.
• SAS?
• R?
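The same restructuring (one line per family, as Mx expects) can be sketched outside SPSS. This is only an illustration in Python: the column names `Family`, `Individual` and `Trait`, the example values, and the use of `.` as the missing-value code are all assumptions, not taken from the slides.

```python
from collections import defaultdict

# Hypothetical long-format records: (Family, Individual, Trait).
# "Trait" is an assumed name for the third variable; values are made up.
long_rows = [
    (1, 1, 10.0),
    (1, 2, 12.0),
    (2, 1, 7.0),
    (2, 2, 9.0),
    (3, 1, 11.0),  # a family with only one measured twin
]

def cases_to_vars(rows, max_individuals=2, missing="."):
    """Return one row per family: [Family, Trait.1, Trait.2, ...]."""
    by_family = defaultdict(dict)
    for family, individual, trait in sorted(rows):  # sort by Family, Individual
        by_family[family][individual] = trait
    wide = []
    for family in sorted(by_family):
        members = by_family[family]
        # Fill absent family members with the (assumed) missing-value code.
        traits = [members.get(i, missing) for i in range(1, max_individuals + 1)]
        wide.append([family] + traits)
    return wide

wide_rows = cases_to_vars(long_rows)
for row in wide_rows:
    print(*row)
```

Each printed line holds one family, with the second twin's trait marked missing when that twin has no record.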

Means…
• In SPSS, SAS, etc. we calculate the mean
• In Mx and other ML programs we estimate the mean

Spss…

Mx… Means.mx

• SPSS assumes this is a sample
• Mx assumes this is a population
• Slightly different algebra
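One place the "slightly different algebra" typically shows up is the variance: a sample formula divides by n - 1, while a maximum-likelihood (population) formula divides by n. The sketch below uses made-up scores and Python's standard library only to illustrate that difference; it is not the computation performed by either SPSS or Mx.

```python
import statistics

scores = [2.0, 4.0, 4.0, 5.0, 7.0]  # made-up data, for illustration only
n = len(scores)

mean = statistics.fmean(scores)            # identical under both views
sample_var = statistics.variance(scores)   # divides by n - 1 (sample)
ml_var = statistics.pvariance(scores)      # divides by n (population / ML)

print(mean, sample_var, ml_var)
# The two variances differ only by the factor (n - 1) / n.
print(abs(ml_var - sample_var * (n - 1) / n) < 1e-12)
```

With large n the two variances converge, so the difference matters mostly in small samples.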

How about regression?
• Y = X*B + C
• Regression speak: AutismQuotient = Sex*Beta1 + Age*Beta2 + Intercept
• BG speak: AutismQuotient = Sex Effect + Age Effect + Grand Mean

Spss…

regression.mx

Spss…

Run regression.mx

What does this mean?
• Age Beta = .549: for every 1 unit increase in Age the mean shifts .549
• Grand mean = Mean Age = 18.2
• So the mean for 20 year olds is predicted to be: = * .549

Sex effects?
• Sex Beta = -2.608
• Sex coded Male = 1, Female = 0
• Female Mean: = *
• Male Mean: = * -2.608
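The predicted means follow directly from the regression equation AutismQuotient = Grand Mean + Age*Beta + Sex*Beta. A minimal sketch, using the age and sex betas quoted on the slides; the grand mean did not survive in this transcript, so `GRAND_MEAN` below is a placeholder assumption and only the differences between predictions are meaningful.

```python
AGE_BETA = 0.549    # from the slides
SEX_BETA = -2.608   # from the slides (Male = 1, Female = 0)
GRAND_MEAN = 100.0  # placeholder assumption, NOT the estimate from the slides

def predicted_aq(age, sex, intercept=GRAND_MEAN):
    """Predicted Autism Quotient mean for a given age and sex code."""
    return intercept + AGE_BETA * age + SEX_BETA * sex

female_20 = predicted_aq(20, 0)
male_20 = predicted_aq(20, 1)
print(male_20 - female_20)                 # the sex effect
print(predicted_aq(21, 0) - female_20)     # the effect of one extra year of age
```

Whatever the intercept is, males are predicted to score 2.608 lower than females of the same age, and each additional year of age raises the predicted mean by .549.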

How do we get the p-values?

Set the elements to equal 0
• Do this one at a time!
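The slides do not spell out the test itself. A common approach with ML programs is a likelihood-ratio test: refit the model with one beta fixed to 0 and compare the -2 log-likelihoods, treating the difference as chi-square with 1 degree of freedom. The sketch below assumes that approach; the function name and the -2LL values are hypothetical.

```python
import math

def lrt_p_value_1df(minus2ll_full, minus2ll_reduced):
    """p-value for dropping ONE parameter, assuming the difference in
    -2 log-likelihood is chi-square distributed with 1 degree of freedom."""
    chi_sq = max(minus2ll_reduced - minus2ll_full, 0.0)
    # For 1 df, P(chi-square > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi_sq / 2.0))

# Hypothetical fit statistics: full model vs. model with one beta fixed to 0.
p = lrt_p_value_1df(1000.00, 1003.84)
print(round(p, 3))
```

Fixing one element at a time keeps each comparison at 1 degree of freedom, so each beta gets its own p-value.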

So…

Why bother with Mx?
• Because most stat packages can't handle non-independent data…
  Non-independence reduces the variance
  Biases t and F tests

Why bother with Mx?
• Because we want complete flexibility in the model specification…
  As you will see later today

Why bother with Mx?
• Because very few packages can handle ordinal data adequately…

Binary data
• File: two_cat.dat
• NI=5
• Labels Zyg twin1 twin2 Age Sex
• Trait: smoking initiation, Never Smoked/Ever Smoked (recoded from yesterday)
  Data is sorted to speed up the analysis

Twin 1 smoking initiation

Mean = .47, SD = .499, Non Smokers = 53%

Raw data distribution: Mean = .47, SD = .499, Non Smokers = 53%, Threshold = .53
Standard normal distribution: Mean = 0, SD = 1, Non Smokers = 53%, Threshold = .074

Threshold = .074 – Huh what?
• How can I work this out?
• Excel: =NORMSINV()
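The threshold is the z-value below which 53% of a standard normal distribution falls, which is what an inverse-normal function returns. A minimal sketch in Python, using the standard library's `NormalDist` in place of the spreadsheet function:

```python
from statistics import NormalDist

non_smokers = 0.53  # proportion of Never Smoked, from the slides

# z-value with 53% of the standard normal distribution to its left
threshold = NormalDist().inv_cdf(non_smokers)
print(threshold)

# Going back the other way recovers the proportion of non-smokers.
print(NormalDist().cdf(threshold))
```

The result is close to the .074 and .075 values quoted on the slides; small differences come from rounding the proportion.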

Why do we rescale the data this way?
• Convenience
  Variance is always 1
  Mean is always 0
  We can interpret the area under a curve between two z-values as a probability or percentage

Why do we rescale the data this way? You could use other distributions but you would have to specify the fit function

Threshold.mx

Threshold =.075 – Huh what?

How about age/sex correction?

What does this mean?
• Age Beta = .007: for every 1 unit increase in Age the threshold shifts .007

What does this mean?
• Beta = .007
• Threshold is
• 38 is SD from the mean age. The threshold for 38 year olds is: .1544 = * 38
• 22 is SD from the mean age. The threshold for 22 year olds is: .0422 = * 22

22 year olds: Threshold = .0422
38 year olds: Threshold = .1544
Is the age effect significant?

How to interpret this
• The threshold moves slightly to the right as age increases
• This means younger people were more likely to have tried smoking than older people
  But this was not significant

22 year olds Threshold = year olds Threshold = Is the age effect significant? If Beta =.03

How about the sex effect?
• Beta = -.05
• Threshold =
• Sex coded Male = 1, Female = 0
• So the Male threshold is: = * -.05
• The Female threshold is: = * -.05

Female Threshold = Male Threshold = Are males or females more likely to smoke?

Both effects together
• 38 year old Males: .1042 = * * 38
• 38 year old Females: .1542 = * * 38
• 22 year old Males: = * * 22
• 22 year old Females: .0422 = * * 22
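The arithmetic behind these thresholds can be sketched as baseline + age beta × age + sex beta × sex. The two betas (.007 and -.05) are from the slides; the baseline threshold was lost in this transcript, so the value below is an assumption, back-solved from the slide's .1542 for 38 year old females. The probability function is likewise an illustration of the liability-threshold idea (the area above the threshold is "Ever Smoked"), not output from Mx.

```python
from statistics import NormalDist

AGE_BETA = 0.007          # from the slides
SEX_BETA = -0.05          # from the slides (Male = 1, Female = 0)
BASE_THRESHOLD = -0.1118  # ASSUMPTION: back-solved from .1542 = base + .007 * 38

def threshold(age, sex):
    """Age- and sex-corrected threshold on the standard normal scale."""
    return BASE_THRESHOLD + AGE_BETA * age + SEX_BETA * sex

def prob_ever_smoked(age, sex):
    """Area above the threshold, i.e. predicted proportion of Ever Smoked."""
    return 1.0 - NormalDist().cdf(threshold(age, sex))

for age in (38, 22):
    for sex, label in ((1, "Males"), (0, "Females")):
        print(age, label, round(threshold(age, sex), 4),
              round(prob_ever_smoked(age, sex), 3))
```

A lower threshold leaves more of the distribution above it, so under these estimates males and younger people have a slightly higher predicted probability of having tried smoking.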

Mx Threshold Specification: 3+ Cat.
Threshold matrix: T Full 2 2 Free
  Columns: Twin 1, Twin 2
  Row 1: 1st threshold
  Row 2: increment
Mx Threshold Model: Thresholds L*T /
  L*T gives the 1st threshold and the 2nd threshold (1st threshold + increment)
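The product L*T can be checked by hand. The sketch below assumes L is a lower-triangular matrix of ones (the usual way to turn a first threshold plus increments into cumulative thresholds) and uses made-up values for T; it uses plain Python lists rather than Mx syntax.

```python
def matmul(a, b):
    """Plain matrix product of two lists-of-lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

# Assumed: L is lower triangular with ones, maxth x maxth (here 2 x 2).
L = [[1, 0],
     [1, 1]]

# Made-up values. Columns: Twin 1, Twin 2.
T = [[0.5, 0.25],   # row 1: 1st threshold
     [1.0, 1.5]]    # row 2: increment

cumulative = matmul(L, T)
for row in cumulative:
    print(row)
```

Row 1 of the result is the 1st threshold and row 2 is the 1st threshold plus the increment. Estimating an increment rather than the 2nd threshold directly makes it easy to keep the thresholds ordered, for example by bounding the increment to be positive.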

Adding a regression
• L*T +
• maxth = 2, ndef = 2, nsib = 2, nthr = 4

Adding a regression

Multivariate Threshold Models Specification in Mx Thanks Kate Morley for these slides

#define nsib 2 ! Number of siblings
#define maxth 2 ! Maximum number of thresholds
#define nvar 2 ! Number of variables
#define ndef 1 ! Number of definition variables
#define nthr 4 ! nsib x nvar
#NGROUPS 8
G1: MZ Females
Data NInput=8
Ordinal File=data.dat
Labels famID zyg covar_a covar_b var1_a var2_a var1_b var2_b
Select if zyg = 1 /
SELECT covar_a covar_b var1_a var2_a var1_b var2_b /
DEFINITION_VARIABLE covar_a covar_b /
BEGIN MATRICES;
X Lower nvar nvar Free ! Genetic paths
Y Lower nvar nvar Free ! Common environmental paths
Z Lower nvar nvar Free ! Unique environmental paths
H Full 1 1
T Full maxth nthr Free ! Thresholds
B Full nvar ndef Free ! Regression betas
L Lower maxth maxth ! For converting incremental to cumulative thresholds
G Full maxth 1 ! For duplicating regression betas across thresholds
K Full ndef nsib ! Contains definition variables
END MATRICES;

Threshold model for multivariate, multiple category data with definition variables. We will break the algebra into two parts, Part 1 (definition variables) and Part 2 (uncorrected thresholds), and go through it in detail.

Definition variables: Twin 1, Twin 2
Threshold corrections: Twin 1 Variable 1; Twin 1 Variable 2; Twin 2 Variable 1; Twin 2 Variable 2

Transpose:

Thresholds 1 & 2 for: Twin 1 Variable 1; Twin 1 Variable 2; Twin 2 Variable 1; Twin 2 Variable 2
