Discrete-Time Survival Analysis PRESENTED BY CHIEN-TI LEE SEPTEMBER 12, 2014.


Purpose?
▪ To study the probability (or hazard) of experiencing an event.
▪ Unlike logistic regression, it takes into account the “time” until the event occurs.
▪ It also differs from continuous-time survival analysis (e.g., Cox regression) in the following ways:
  ▪ The data are collected only in time intervals (vs. the exact time an event occurred)
  ▪ It does not assume proportionality of hazards
  ▪ Proportionality of hazards: the shape of the hazard function over time is the same for all cases/groups
▪ It can be extended with time-varying covariates, mixture components, distal outcomes, etc.

Survival Probability vs. Hazard Probability
▪ T stands for the time interval in which the event occurs
▪ j = a given time interval
▪ Survival probability = S(j)
  ▪ $S(j) = P(T > j)$ is the probability of surviving beyond time interval j
▪ Hazard probability = h(j)
  ▪ $h(j) = P(T = j \mid T \ge j)$ is the probability that the event occurs in time interval j, provided it has not occurred prior to j
  ▪ It is the probability of the event occurring in interval j among those at risk in j
▪ $S(j) = P(T \ne j \mid T \ge j)\,P(T \ne j-1 \mid T \ge j-1) \cdots P(T \ne 2 \mid T \ge 2)\,P(T \ne 1 \mid T \ge 1) = \prod_{k=1}^{j} [1 - h(k)]$
▪ ∏ = product of all values in the range of the series
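The product relationship between hazard and survival can be checked with a small calculation. The sketch below is only an illustration: the hazard values are made up and the function name is hypothetical, not part of the presentation.

```python
from itertools import accumulate
from operator import mul


def survival_from_hazards(hazards):
    """Return [S(1), ..., S(J)], where S(j) is the product of (1 - h(k)) for k = 1..j."""
    return list(accumulate((1.0 - h for h in hazards), mul))


# Hypothetical hazard probabilities for three time intervals (not from the study).
hazards = [0.10, 0.20, 0.25]
survival = survival_from_hazards(hazards)

# S(1) = 0.90, S(2) = 0.90 * 0.80 = 0.72, S(3) = 0.72 * 0.75 = 0.54 (up to rounding)
for j, s in enumerate(survival, start=1):
    print(f"S({j}) = {s:.2f}")
```

Because each term is a probability of not experiencing the event among those still at risk, the survival probability can only stay flat or decrease from one interval to the next.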

First you must…Restructure the Dataset
Using the “DATA SURVIVAL” Command in Mplus to Save Time in Restructuring (See Mplus Manual 7, p. 379)
▪ To create variables for discrete-time survival modeling where a binary discrete-time survival variable represents whether or not a single *non-repeatable* event has occurred in a specific time period
▪ Here are the rules…
  1. If the original variable is missing, the new binary variable is missing
  2. If the value of the original variable is *greater than* the cutpoint value, the new binary value is “1”, indicating that the event has occurred
  3. If the value of the original variable is less than or equal to the cutpoint value, the new binary value is “0”, indicating that the event has not occurred
  4. After a discrete-time survival variable for an observation is assigned the value “1”, subsequent discrete-time survival variables for that observation are assigned the missing value flag “ * ”
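The four recoding rules can be sketched outside Mplus to see what the restructured rows look like. This is a minimal illustration, not the Mplus implementation: the function name, the use of None for a missing original value, and the example scores are all assumptions.

```python
MISSING = "*"  # stands in for the missing value flag described in the rules


def recode_discrete_survival(values, cutpoint):
    """Recode one observation's repeated measures into discrete-time survival indicators.

    values: original scores per wave, with None marking a missing score (assumption).
    """
    recoded = []
    event_seen = False
    for value in values:
        if event_seen or value is None:
            # Rule 1 (original missing) and rule 4 (event already occurred)
            recoded.append(MISSING)
        elif value > cutpoint:
            # Rule 2: above the cutpoint means the event occurs in this period
            recoded.append(1)
            event_seen = True
        else:
            # Rule 3: at or below the cutpoint means no event yet
            recoded.append(0)
    return recoded


# Hypothetical scores over five waves with a cutpoint of 2
print(recode_discrete_survival([0, 1, 3, 2, 0], cutpoint=2))     # [0, 0, 1, '*', '*']
print(recode_discrete_survival([None, 0, 0, 0, 3], cutpoint=2))  # ['*', 0, 0, 0, 1]
```

Note that a score exactly equal to the cutpoint is coded 0, since rule 2 requires a value strictly greater than the cutpoint.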

Transformation
[Tables: Dep_w1 to Dep_w5 before and after restructuring. After restructuring, each row holds 0 until the event, 1 in the period of the event, and * afterwards, e.g. "1 * * * *", "0 1 * * *", "0 0 0 0 *".]

What about…Missing Data
▪ In the context of survival analysis, missing data usually refers to event times that are unknown to the researcher for some subjects (aka: censoring).
▪ Right Censoring
  ▪ When a subject has not experienced the event by the end of the observation period
▪ Left Censoring
  ▪ When a subject has experienced the event before the observation period began
  ▪ It is a very rare phenomenon
  ▪ The focus of survival analysis is on what happens once risk exposure begins

Truncation
▪ Left truncation often arises when patient information, such as time of diagnosis, is gathered retrospectively.
  ▪ For example, in a study of disease mortality where the outcome of interest is survival from the time of diagnosis, many patients may not have been enrolled in the study until several months or years after their diagnosis. Those patients, by virtue of having survived to the time of enrollment, could not have had an event between diagnosis and study enrollment, and therefore they should be removed from the risk set between those two time points. To leave them in the risk set would bias the survival estimates.
▪ Right truncation happens when only individuals whose event times are less than some truncation threshold are observed.
  ▪ For example, suppose a study examines the effect of smoking before college. Asking a group of participants “When did you start smoking?” effectively truncates the participants who start smoking after going to college.
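To make the left-truncation point concrete, the sketch below counts the risk set in each interval with and without accounting for delayed entry. It is an illustration under stated assumptions: the (entry, last) representation, the function name, and the four subjects are hypothetical.

```python
def risk_set_sizes(subjects, n_intervals):
    """Count subjects at risk in each interval 1..n_intervals.

    subjects: (entry, last) pairs, where entry is the first interval in which the
    subject is under observation and last is the interval of the event or censoring.
    """
    return [
        sum(1 for entry, last in subjects if entry <= j <= last)
        for j in range(1, n_intervals + 1)
    ]


# Hypothetical data: two subjects enroll late (entry intervals 3 and 2).
subjects = [(1, 3), (1, 2), (3, 4), (2, 4)]

# Correct handling: late enrollees join the risk set only from their entry interval.
adjusted = risk_set_sizes(subjects, n_intervals=4)

# Naive handling: treat everyone as at risk from interval 1.
naive = risk_set_sizes([(1, last) for _, last in subjects], n_intervals=4)

print(adjusted)  # [2, 3, 3, 2]
print(naive)     # [4, 4, 3, 2]
```

The naive risk sets are too large in the early intervals, so the estimated hazards there would be too small and the survival estimates too optimistic.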

Missing Data…continued
▪ Mplus can deal with right-censored data.
▪ Left-censored data is not supported for survival analysis in Mplus.
  ▪ Left-censoring is a very rare phenomenon
  ▪ The focus of survival analysis is on what happens once risk exposure begins
  ▪ So it makes sense to use survival analysis for marijuana use in adolescence, but maybe not for potty training, which would be left-censored by adolescence.

Mplus Syntax – Example 6.20
TITLE:    this is an example of a *continuous-time* survival analysis
          using the Cox regression model
DATA:     FILE = ex6.20.dat;
VARIABLE: NAMES = t x tc;  ! x is the predictor
          SURVIVAL = t (ALL);  ! t is the variable that contains time-to-event information
          TIMECENSORED = tc (0 = NOT 1 = RIGHT);  ! Information about right censoring
ANALYSIS: BASEHAZARD = OFF;  ! Non-parametric baseline hazard function is used
MODEL:    t ON x;

Mplus Syntax – Example 6.19
TITLE:    this is an example of a discrete-time survival analysis
DATA:     FILE IS ex6.19.dat;
VARIABLE: NAMES ARE u1-u4 x;
          CATEGORICAL = u1-u4;
          MISSING = ALL (999);
ANALYSIS: ESTIMATOR = MLR;
MODEL:    f BY u1-u4@1;  ! This represents a proportional odds assumption where
                         ! the covariate x has the same influence on u1-u4
          f ON x;
          f@0;  ! Residual variance is fixed at zero (default)
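The proportional odds assumption in this model can be illustrated numerically. The sketch below assumes a logit link in which each time period has its own threshold and the covariate has a single common slope, so that logit h_j(x) = -threshold_j + slope * x; this parameterization is an assumption about how the thresholds enter, and the threshold and slope values are hypothetical, not estimates from any data.

```python
import math


def hazard(threshold, slope, x):
    """Hazard probability in one period, assuming logit h = -threshold + slope * x."""
    return 1.0 / (1.0 + math.exp(threshold - slope * x))


def survival_curve(thresholds, slope, x):
    """Survival probabilities after each period: running product of (1 - hazard)."""
    curve = []
    surviving = 1.0
    for threshold in thresholds:
        surviving *= 1.0 - hazard(threshold, slope, x)
        curve.append(surviving)
    return curve


# Hypothetical values: one threshold per period u1-u4, one common slope for x.
thresholds = [2.0, 1.8, 1.5, 1.6]
slope = 0.7

for threshold in thresholds:
    h0 = hazard(threshold, slope, x=0.0)
    h1 = hazard(threshold, slope, x=1.0)
    odds_ratio = (h1 / (1.0 - h1)) / (h0 / (1.0 - h0))
    # The odds ratio is exp(slope) in every period: the proportional odds assumption.
    print(f"h(x=0) = {h0:.3f}, h(x=1) = {h1:.3f}, odds ratio = {odds_ratio:.3f}")

print(survival_curve(thresholds, slope, x=0.0))
```

The baseline hazard is free to change from period to period through the thresholds, while the effect of x on the odds of the event is held constant across periods.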

Title:    YTP project
Data:     file =
  "C:\Users\cl396\Documents\Research Work\Taiwanese Work\USU\TYP\TEST_SURVIVALDAT.dat";
Variable: names are confirm indthk respect harmony ID urban SUBURBAN AGE gender
            income pedu relation ASSESSNE DEP1 DEP2 DEP3 DEP4 DEP5 p1-p5 class;
          missing are all (-9999);
          usevariables are gender income pedu urban relation confirm indthk
            respect harmony dep1-dep5;
          categorical are dep1-dep5;
          DSURVIVAL = dep1-dep5;
Analysis: estimator = MLR;
          starts = ;
          optseed=476295;
          processors=8 (starts);
Model:    f by
          f on gender income pedu urban relation confirm indthk respect harmony;
Output:   tech1 RESIDUAL;
Plot:     type is plot2;

Background
▪ The purpose of this study was to identify patterns of cultural values among Taiwanese youth and explore the relationship of these value affiliations with emerging depressive symptoms at later time points.
▪ Previous studies showed that for Taiwanese youth, “the well-being issue is of particular significance due to competitive educational environment” (Yi, Wu, Wu, Chang, & Chang, 2009, p. 399).
▪ However, others have shown that there was no difference in the prevalence of depression between a Chinese and a U.S. sample (Chen, Rubin, & Li, 1995; Stewart et al., 2002).
  ▪ Academic strain is a cultural norm and therefore does not necessarily lead to increased depressive symptoms

Cultural Values
▪ Social conformity: meeting parental (scholastic) expectations
  ▪ When adolescents fail to meet parental academic expectations, family tension, which has been linked with depressive symptoms, may increase.
▪ Interdependent thinking: receiving parental approval before making important decisions
  ▪ Western literature generally suggests that individuals are more likely to feel depressed when they perceive constraints restricting them from reaching a desired outcome or exerting undue influence on important personal decisions
▪ Vertical obedience: conforming to the wishes of parents and authority figures
  ▪ Chinese adolescents’ attitudes have shifted to the point that many believe that parents do not have absolute authority and power over them
  ▪ Chinese parents, compared to Western families, still exert more parental influence on their adolescent children
▪ Harmony maintenance: being socially sensitive and self-restraining to ensure peace in relationships
  ▪ From a Chinese cultural perspective, free emotional expression, particularly of negative emotions, may disrupt harmony within the family dynamic

Model Diagram Created by Mplus

Model Results
[Table: F ON estimates (columns: Est., S.E., Est./S.E., two-tailed p value) for GENDER, INCOME, PEDU, URBAN, RELATION, CONFIRM, INDTHK, RESPECT, and HARMONY, followed by the thresholds for DEP1 to DEP5.]

Estimated Baseline Survival Curve for Emerging Depressive Symptoms with Covariates

Estimated Survival Curve for Emerging Depressive Symptoms with Covariates

Model Diagram (Mplus Diagram Currently Does Not Support Mixture Analysis)
[Diagram elements: f; Dep_w1, Dep_w2, Dep_w3, Dep_w4, Dep_w6; c; Social Conformity, Vertical Obedience, Harmony, Independent Thinking; Control Variables.]

Estimated means of cultural values across class memberships (N = 2,458). The 5-class average latent class probabilities for the most likely latent class membership were .99, .88, .87, .82, and .80, respectively, indicating acceptable prediction of class membership.

TITLE:    YTP project
Data:     file = "C:\Users\cl396\Desktop\LPA_5class_n.dat";
variable: names are CONFIRM INDTHK RESPECT HARMONY ID1 URBAN SUBURBAN AGE GENDER
            INCOME PEDU RELATION ASSESSNE DEP1_D DEP2_D DEP3_D DEP4_D DEP6_D
            CPROB1 CPROB2 CPROB3 CPROB4 CPROB5 kc;
          missing are all (-9999);
          usevariables are PEDU URBAN GENDER INCOME ASSESSNE
            DEP1_D DEP2_D DEP3_D DEP4_D DEP6_D
            relation confirm indthk respect harmony
            CPROB1 CPROB2 CPROB3 CPROB4 CPROB5;
          Categorical are DEP1_D DEP2_D DEP3_D DEP4_D DEP6_D;
          DSURVIVAL = DEP1_D DEP2_D DEP3_D DEP4_D DEP6_D;
          knownclass= c1(kc=1 kc=2 kc=3 kc=4 kc=5);
          classes=c (5);
          training = CPROB1 CPROB2 CPROB3 CPROB4 CPROB5 (probabilities);
Analysis: Type=mixture;
          Estimator = MLR;
          ALGORITHM = INTEGRATION;
          starts = ;
          optseed= ;
          processors=8 (starts);
Model:    %overall%
          f by DEP1_D DEP2_D DEP3_D DEP4_D
          f on gender RELATION urban pedu INCOME ASSESSNE;
Output:   tech1 LOGRANK;
Plot:     type is plot2;

Logrank Outputs
LOGRANK OUTPUT
LOGRANK TEST FOR DEP1_D-DEP6_D
COMPARING CLASS 2 AGAINST CLASS 1
  Chi-Square Value
  Degrees of Freedom    1
  P-value               0.000

Baseline Survival Curve for Emerging Depressive Symptoms Stratified by Class Membership

Estimated Survival Curve for Emerging Depressive Symptoms with Covariates Stratified by Class Membership

Estimated survival rate of each class membership across time points (N = 2,310). Note: The decreased sample size in Fig 2 is due to missing values in the control variables (e.g., family income, parental education, family relationships).
[Table: Summary Table of Chi-square Tests for Estimated Survival Rate between Each Pair of Class Membership (columns: Chi-Square, p-value; one row per pair of classes; the first row, a Class 2 comparison, has p < .0001).]

Findings
▪ Class 1 (3% of the sample) consisted of 42.5% females with a mean age of (SD = .55) and had a medium/low mean score across the cultural values variables.
  ▪ It is the class most likely to develop depressive symptoms.
▪ Class 4 included 18% of the sample, 46.7% of whom were female, and the average age was (SD = .50). Adolescents in this group had the highest mean scores across domains of traditional cultural values, including interdependent thinking.
  ▪ This class most closely resembles traditional Eastern societies.
  ▪ Its members were also the least likely to develop depressive symptoms over time.
