A Simulation and Cautionary Assessment


Performance of K-Fold Cross Validation for Binary Longitudinal Finite Mixture Models: A Simulation and Cautionary Assessment

Thom Taylor, PhD
Nicklaus Children’s Research Institute, Miami, FL
VA Palo Alto Health Care System, Palo Alto, CA

Introduction

Recent work (Grimm et al., 2017) proposed K-Fold Cross Validation (KFCV) for Longitudinal Finite Mixture Models (LFMMs). However, when the models are estimated with Expectation Maximization (EM), each KFCV fold gives every EM iteration less information to work with, which may result in incorrect model selection.

Simulation Methods

Separate 3-class and 4-class data sets were simulated for LFMMs with 6 time points and a binomial outcome (354 total simulated conditions), using a basic random intercept logit model:

$$\eta_{ij} = \beta_{0j} + \beta_1\,\mathrm{time}_{ij} + \beta_2\,\mathrm{time}_{ij}^2 + \varepsilon_{ij}$$
$$\beta_{0j} = \gamma_{00} + U_{0j}$$

Random intercept variance σ² ∈ {0.1, 0.3, 1.0}
Latent class n ∈ {200, 500, 800}

Results & Conclusions

70% of known 3-class simulations had the lowest AIC (best fit), while only 41% of known 4-class solutions had the lowest AIC (best fit), p = .06. KFCV may not always be appropriate in EM-based LFMMs, particularly if there is greater unobserved heterogeneity in the available data.

3 Class Model Simulation Parameters / 4 Class Model Simulation Parameters

Class β0j β1 β2 1 -3 3 0.05 -2 2 -0.5 -0.1 -0.01 -5 0.2 0.1 4 -8 0.6
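The data-generating step described under Simulation Methods can be sketched roughly as follows. This is a minimal illustration, not the poster's actual simulation code: the function names `simulate_lfmm` and `kfold_indices`, the time coding 0 to 5, the reading of "latent class n" as subjects per class, and the class coefficients in `example_params` are all assumptions (the poster's parameter table did not survive extraction). The sketch also leaves out the separate ε term and lets the Bernoulli draw supply the observation-level randomness. Fitting the mixture model by EM and scoring each held-out fold are not shown.

```python
import numpy as np

def simulate_lfmm(class_params, n_per_class, sigma2, n_times=6, seed=0):
    """Simulate binary longitudinal data from a latent-class random-intercept logit model.

    class_params: list of (gamma00, beta1, beta2) tuples, one per latent class.
    Returns (y, labels): y has shape (n_subjects, n_times), labels holds the true class.
    """
    rng = np.random.default_rng(seed)
    time = np.arange(n_times)  # assumed time coding: 0, 1, ..., n_times - 1
    ys, labels = [], []
    for k, (gamma00, beta1, beta2) in enumerate(class_params):
        # Random intercept: beta_0j = gamma_00 + U_0j, with U_0j ~ N(0, sigma2)
        beta0 = gamma00 + rng.normal(0.0, np.sqrt(sigma2), size=n_per_class)
        # Linear predictor on the logit scale, one row per subject
        eta = beta0[:, None] + beta1 * time + beta2 * time**2
        p = 1.0 / (1.0 + np.exp(-eta))  # inverse logit
        ys.append(rng.binomial(1, p))   # Bernoulli outcome at each time point
        labels.append(np.full(n_per_class, k))
    return np.vstack(ys), np.concatenate(labels)

def kfold_indices(n_subjects, k=5, seed=0):
    """Split subject indices into k disjoint folds (whole subjects, not single observations)."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n_subjects), k)

# Hypothetical class coefficients, chosen only for illustration
example_params = [(-3.0, 0.5, 0.0), (0.0, 0.0, 0.0), (2.0, -0.5, 0.0)]
y, labels = simulate_lfmm(example_params, n_per_class=200, sigma2=0.3)
folds = kfold_indices(y.shape[0], k=5)
```

Splitting by subject keeps each person's 6 repeated measures together in one fold, which seems the natural unit for longitudinal data. Each training set then holds only a fraction of the subjects, which is the reduced information the Introduction warns about.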
