Judith D. Singer & John B. Willett Harvard Graduate School of Education Extending the discrete-time hazard model ALDA, Chapter Twelve “Some departure from.

Presentation transcript:

Judith D. Singer & John B. Willett Harvard Graduate School of Education Extending the discrete-time hazard model ALDA, Chapter Twelve “Some departure from the norm will occur as time grows more open about it” John Ashbery

Chapter 12: Extending the discrete-time hazard model
Alternative specifications for TIME in the discrete-time hazard model (§12.1): must we always use the TIME indicators, or might a more parsimonious representation for TIME be nearly as good?
Including time-varying predictors (§12.3): as in growth modeling, the use of the person-period data set makes them easy to include (although be careful with interpretations).
Evaluating the assumptions of the discrete-time hazard model: like all statistical models, these invoke important assumptions that should be examined (and, if necessary, relaxed):
  Linear additivity assumption (§12.4): must all predictors operate only as “main effects,” or can there be interactions?
  Proportionality assumption (§12.5): must the effects of all predictors be constant over time?

Pros and cons of the dummy specification for the “main effect of TIME” (ALDA, Section 12.1)
PRO: The dummy specification for TIME is:
  Completely general, placing no constraints on the shape of the baseline (logit) hazard function.
  Easily interpretable: each associated parameter represents logit hazard in time period j for the baseline group.
  Consistent with life-table estimates.
CON: The dummy specification for TIME is also:
  Nothing more than an analytic decision, not a requirement of the discrete-time hazard model.
  Completely lacking in parsimony: if J is large, it requires the inclusion of many unknown parameters.
  A problem when it yields fitted functions that fluctuate erratically across time periods because of nothing more than sampling variation.
Three reasons for considering an alternative specification:
  1. Your study involves many discrete time periods (because data collection is long or time is less coarsely discretized).
  2. Hazard is expected to be near 0 in some time periods (causing convergence problems).
  3. Some time periods have small risk sets (because either the initial sample is small, or hazard and censoring dramatically diminish the risk set over time).
The variable PERIOD in the person-period data set can be treated as continuous TIME.
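
To make the dummy specification concrete, here is a minimal sketch of how the TIME indicators could be derived from PERIOD for one person-period record. The function name `time_dummies` and the choice of J = 5 are illustrative assumptions, not part of the slides.

```python
def time_dummies(period, n_periods):
    """Return the indicators [D1, ..., DJ] for one person-period record.

    Exactly one indicator equals 1: the one matching `period`.
    (Hypothetical helper for illustration only.)
    """
    if not 1 <= period <= n_periods:
        raise ValueError("period must lie between 1 and n_periods")
    return [1 if j == period else 0 for j in range(1, n_periods + 1)]


# A person observed through period 3 of a study with J = 5 periods
# contributes three records, one per period at risk.
rows = [time_dummies(p, 5) for p in (1, 2, 3)]
for row in rows:
    print(row)
```

With J indicators and no separate intercept, each coefficient is the baseline logit hazard in its own period, which is why the number of parameters grows with J.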

An ordered set of smooth polynomial representations for TIME (ALDA)
Not necessarily “the best,” but practically speaking a very good place to start.
The completely general specification is always the “best fitting” model (lowest deviance).
The constant specification is always the “worst fitting” model (highest deviance).
Use of ONE facilitates programming.
Polynomial specifications: as in growth modeling, a systematic set of choices. Choose the centering constant “c” to ease interpretation.
Because each lower-order model is nested within each higher-order model, deviance statistics can be directly compared to help make analytic decisions.
The 4th- and 5th-order polynomials are rarely adopted, but they give you a sense of whether you should stick with the completely general specification.
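
As a rough illustration of the polynomial specifications, the sketch below builds the predictors ONE, (TIME − c), (TIME − c)², and so on for a single record. The function name and the example values (period 7, c = 5) are assumptions chosen for illustration.

```python
def polynomial_time_terms(period, order, c=0):
    """Return [ONE, (TIME - c), (TIME - c)**2, ...] up to the given order.

    order=0 gives the constant specification (ONE only), order=1 the
    linear specification, order=2 the quadratic, and so on.
    (Hypothetical helper for illustration only.)
    """
    centered = period - c
    return [centered ** k for k in range(order + 1)]


# Quadratic specification for a record in period 7, centered at c = 5
print(polynomial_time_terms(7, order=2, c=5))
```

Centering at c makes the coefficient on ONE the logit hazard in period c, which is why the choice of c eases interpretation without changing the model's fit.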

Illustrative example: Time to tenure in colleges and universities
Data source: Beth Gamse and Dylan Conger (1997), Abt Associates Report (ALDA, p. 412)
Sample: 260 faculty members (who had received a National Academy of Education/Spencer Foundation Post-Doctoral Fellowship).
Research design: Each was tracked for up to 9 years after taking his/her first academic job. By the end of data collection, n=166 (63.8%) had received tenure; the other 36.2% were censored (because they might eventually receive tenure somewhere).
For simplicity, we won’t include any substantive predictors (although the study itself obviously did).

Examining alternative polynomial specifications for TIME: Deviance statistics and fitted logit hazard functions (ALDA)
As expected, deviance declines as the model becomes more general.
The quadratic looks reasonably good, but can we test whether it’s “good enough”?
[Figure: fitted logit hazard functions for the general, constant, linear, quadratic, and cubic specifications]

Testing alternative polynomial specifications for TIME: Comparing deviance statistics (and AIC and BIC statistics) across nested models (ALDA)
Two comparisons always worth making:
  Is the added polynomial term necessary?
  Is this polynomial as good as the general specification?
Annotations on the slide's comparison table, in order: “Lousy”; “Better, but not as good as general”; “As good as general, better than linear”; “No better than linear.”
Conclusion: clear preference for the quadratic (although the cubic has some appeal).
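
Both comparisons are deviance-difference tests referred to a chi-square distribution. The sketch below, which uses only the standard library, computes the upper-tail p-value for integer degrees of freedom; the function names are illustrative assumptions, and the two statistics in the usage lines are the ones quoted in this transcript for the depression-onset example.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square with integer df >= 1."""
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum over k < df/2 of (x/2)**k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: normal-tail term plus a finite series
    tail = math.erfc(math.sqrt(half))
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-half)
    total = 0.0
    for r in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * r + 1)
    return tail + total


def deviance_test(deviance_reduced, deviance_full, df_diff):
    """Return (deviance difference, p-value) for two nested models."""
    diff = deviance_reduced - deviance_full
    return diff, chi2_sf(diff, df_diff)


# Statistics quoted in this transcript for the depression-onset data:
p_cubic_vs_general = chi2_sf(34.51, 32)   # reported as p > .25
p_cubic_vs_quadratic = chi2_sf(5.83, 1)   # reported as p < .05
print(p_cubic_vs_general, p_cubic_vs_quadratic)
```

A large p-value in the comparison with the general specification says the polynomial loses little; a small p-value in the comparison with the next-lower polynomial says the added term earns its place.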

Including time-varying predictors: Age of onset of psychiatric disorder
Data source: Blair Wheaton and colleagues (1997), Stress & adversity across the life course (ALDA, Section 12.3, p. 428)
Sample: 1,393 adults ages 17 to 57 (drawn randomly through a phone survey in metropolitan Toronto).
Research design: Each was asked whether and, if so, at what age (in years) he or she had first experienced a depressive episode. n=387 (27.8%) reported a first onset between ages 4 and 39.
Time-varying question predictor: PD, first parental divorce.
  n=145 (10.4%) had experienced a parental divorce while still at risk of first depression onset.
  PD is time-varying, indicating whether the parents of individual i divorced during, or before, time period j.
  PDij = 0 in periods before the divorce.
  PDij = 1 in periods coincident with or subsequent to the divorce.
Additional time-invariant predictors:
  FEMALE, which we’ll use now.
  NSIBS (total number of siblings), which we’ll use in a few minutes.

Including a time-varying predictor in the person-period data set (ALDA, Section 12.3, p. 428)
Columns of the person-period data set: ID, PERIOD, PD, FEMALE, NSIBS, EVENT.
ID 40: reported first depression onset at age 23; first parental divorce at age 9. PD is time-varying: her parents divorced when she was 9.
Many periods per person (because the data are annual, from age 4 to the respondent’s current age, up to age 39). In fact, there are 36,997 records in this person-period data set and only 387 events. Would we really want to include 36 TIME dummies?
It turns out that a cubic function of TIME fits nearly as well as the completely general specification (χ² = 34.51, 32 df, p > .25) and measurably better than a quadratic (χ² = 5.83, 1 df, p < .05).
FEMALE and NSIBS are time-invariant predictors that we’ll soon use.
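
The records for ID 40 can be sketched as follows. This is a minimal illustration assuming one record per year of age from 4 through the year of onset (or censoring); the function name and argument names are hypothetical, and FEMALE and NSIBS are omitted because they simply repeat on every row.

```python
def person_period_rows(person_id, first_age, last_age,
                       event_age=None, divorce_age=None):
    """Build one record per year at risk, with time-varying PD and EVENT.

    last_age is the age at the event or at censoring; PD switches from
    0 to 1 in the year of the divorce and stays 1 afterwards.
    (Hypothetical helper for illustration only.)
    """
    rows = []
    for age in range(first_age, last_age + 1):
        rows.append({
            "ID": person_id,
            "PERIOD": age,
            "PD": int(divorce_age is not None and age >= divorce_age),
            "EVENT": int(event_age is not None and age == event_age),
        })
    return rows


# ID 40: parental divorce at age 9, first depression onset at age 23
rows = person_period_rows(40, first_age=4, last_age=23,
                          event_age=23, divorce_age=9)
for row in rows:
    print(row)
```

Because PD is stored period by period, it enters the model like any other column of the person-period data set; no special machinery is needed for a time-varying predictor.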

Including a time-varying predictor in the discrete-time hazard model (ALDA)
What does β1 tell us?
  It contrasts the population logit hazard for people who have experienced a parental divorce with those who have not.
  But because PDij is time-varying, membership in the parental divorce group changes over time, so we’re not always comparing the same people.
  The predictor effectively compares different groups of people at different times!
  But we’re still assuming that the effect of the time-varying predictor is constant over time.
[Figure panels: sample logit(proportions) of people experiencing first depression onset at each age, by PD status at that age; hypothesized population model (note constant effect of PD); implicit particular realization of the population model (for those whose parents divorce when they’re age 20)]
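
For reference, the hypothesized population model could be written out as below. This is a sketch under assumptions: the transcript names only the PD coefficient and, on the preceding slide, a cubic specification for TIME; the symbols α0 to α3, the centering constant c, and writing TIME as AGE are notational choices made here, not taken from the slide.

```latex
\operatorname{logit} h(t_{ij}) =
  \left[ \alpha_0 + \alpha_1 (AGE_j - c) + \alpha_2 (AGE_j - c)^2
         + \alpha_3 (AGE_j - c)^3 \right]
  + \beta_1 \, PD_{ij}
```

Because β1 carries no time subscript, the model shifts the logit hazard by the same amount in every period in which PDij = 1, which is the constant-effect assumption described on the slide.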

Interpreting a fitted DT hazard model that includes a TV predictor (ALDA)
Exponentiated coefficient for PD = 1.51: controlling for gender, at every age from 4 to 39, the estimated odds of first depression onset are about 50% higher for individuals who experienced a concurrent, or previous, parental divorce.
Exponentiated coefficient for FEMALE = 1.73: controlling for parental divorce, the estimated odds of first depression onset are 73% higher for women.
What about a woman whose parents divorced when she was 20?
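
One way to work through the closing question is to combine the two reported odds ratios. The sketch below assumes the main-effects model described on this slide (no interaction between FEMALE and PD) and takes a man whose parents have not divorced as the reference group; the function names are illustrative.

```python
OR_PD = 1.51      # reported odds ratio for parental divorce
OR_FEMALE = 1.73  # reported odds ratio for women


def odds_ratio_vs_reference(female, pd):
    """Odds ratio relative to a man with no parental divorce at the same age.

    Under a main-effects model, effects add on the logit scale, so the
    odds ratios multiply.
    """
    return (OR_FEMALE ** female) * (OR_PD ** pd)


def percent_higher(odds_ratio):
    """Express an odds ratio as a percentage increase in the odds."""
    return (odds_ratio - 1.0) * 100.0


# A woman whose parents divorced when she was 20:
before_20 = odds_ratio_vs_reference(female=1, pd=0)   # PD = 0 before age 20
from_20_on = odds_ratio_vs_reference(female=1, pd=1)  # PD = 1 from age 20 on
print(before_20, from_20_on)
```

Her odds ratio relative to the reference group is 1.73 before age 20 and 1.73 × 1.51, roughly 2.6, from age 20 onward: the time-varying predictor moves her from one fitted hazard function to the other.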

Using time-varying predictors to test competing hypotheses about a predictor’s effect: The long-term vs. short-term effects of parental death on first depression onset
Parental death treated as a long-term effect: odds of onset are 33% higher among people whose parents have died.
Parental death treated as a short-term effect: odds of onset are 462% higher in the year a parent dies.
Person-period columns: ID, PERIOD, PDEATH1, PDEATH2. PDEATH1 is the long-term effect; PDEATH2 is the short-term effect.
[Figure: two panels of fitted hazard plotted against age, one for each coding of parental death]
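
The two codings differ only in how long the indicator stays switched on. A minimal sketch, assuming PDEATH1 equals 1 from the year of the death onward and PDEATH2 equals 1 only in the year of the death; the function name and the example person (parent dies at age 12) are hypothetical.

```python
def parental_death_indicators(age, death_age):
    """Return (PDEATH1, PDEATH2) for one person-period record.

    PDEATH1 (long-term coding): 1 in the year of death and every later year.
    PDEATH2 (short-term coding): 1 only in the year of death.
    death_age is None when no parent died during the observation window.
    """
    if death_age is None:
        return 0, 0
    return int(age >= death_age), int(age == death_age)


# Hypothetical person observed at ages 10 to 14 whose parent died at age 12
for age in range(10, 15):
    pdeath1, pdeath2 = parental_death_indicators(age, death_age=12)
    print(age, pdeath1, pdeath2)
```

Fitting the same model with each column in turn is what lets the two hypotheses compete: the data, not the analyst, indicate which coding describes the risk profile better.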

The linear additivity assumption: Uncovering violations and simple solutions (ALDA, Section 12.4, p. 443)
Linear additivity assumption: unit differences in a predictor, whether time-invariant or time-varying, correspond to fixed differences in logit hazard.
Two ways the assumption can fail:
  Non-linear effects of substantive predictors.
  Interactions among substantive predictors.
Data source: Nina Martin & Margaret Keiley (2002)
Sample: 1,553 adolescents (n=887, 57.1%, had been abused as children).
Research design: Incarceration history from age 8 to 18. n=342 (22.0%) had been arrested.
RQs:
  What’s the effect of abuse on the risk of arrest?
  What’s the effect of race?
  Does the effect of abuse differ by race (or conversely, does the effect of race differ by abuse status)?

Evidence of an interaction between ABUSE and RACE (ALDA)
What is the shape of the logit hazard functions? For all groups, risk of 1st arrest is low during childhood, accelerates during the teen years, and then peaks.
How does the level differ across groups? Abused children appear to be consistently at greater risk of 1st arrest, but the differential is especially pronounced among Blacks.
As in regular regression, when the effect of one predictor differs by the levels of another, we need to include a statistical interaction.

Interpreting the interaction between ABUSE and RACE (ALDA)
Estimated odds ratios for the 4 possible prototypical individuals. In comparison to a White child who had not been abused, the odds of 1st arrest are:
  28% higher for Blacks who had not been abused (note: this is not statistically significant).
  43% higher for Whites who had been abused (this is statistically significant).
  Nearly 3 times higher for Blacks who had been abused.
This is not the only way to violate the linear additivity assumption…
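
A back-of-the-envelope check shows why these figures point to an interaction. The sketch below takes the reported odds ratios at face value and treats "nearly 3 times" as exactly 3.0, which is a rounding assumption, so the implied interaction term is only approximate; the function and variable names are illustrative.

```python
import math

OR_BLACK_NOT_ABUSED = 1.28  # reported: 28% higher
OR_WHITE_ABUSED = 1.43      # reported: 43% higher
OR_BLACK_ABUSED = 3.0       # "nearly 3 times": rounded, an assumption


def prototype_odds_ratios(b_black, b_abused, b_interaction):
    """Odds ratios for the four prototypes, relative to White, not abused."""
    return {
        "white_not_abused": 1.0,
        "black_not_abused": math.exp(b_black),
        "white_abused": math.exp(b_abused),
        "black_abused": math.exp(b_black + b_abused + b_interaction),
    }


# What a main-effects-only model would predict for Blacks who were abused
additive_prediction = OR_BLACK_NOT_ABUSED * OR_WHITE_ABUSED

# Extra multiplicative factor needed to reach the reported value
implied_interaction_or = OR_BLACK_ABUSED / additive_prediction

ratios = prototype_odds_ratios(
    math.log(OR_BLACK_NOT_ABUSED),
    math.log(OR_WHITE_ABUSED),
    math.log(implied_interaction_or),
)
print(additive_prediction, implied_interaction_or, ratios)
```

Without an interaction the two main effects would multiply to about 1.83; reaching an odds ratio near 3 requires an additional factor of roughly 1.6, which is the job of the ABUSE × RACE term.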

Checking the linear additivity assumption: Is the effect of NSIBS on depression onset linear? (ALDA)
Use all your usual strategies for checking non-linearity: transform the predictors, use polynomials, re-bin the predictor, …
All models include a cubic effect of TIME, and the main effects of FEMALE and PD.
Surviving entry from the slide's table of model comparisons: a test on 4 df is not significant (ns).

The proportionality assumption: Is a predictor’s effect constant over time or might it vary? (ALDA)
Four possibilities:
  Predictor’s effect is constant over time.
  Predictor’s effect increases over time.
  Predictor’s effect decreases over time.
  Predictor’s effect is particularly pronounced in certain time periods.

Discrete-time hazard models that do not invoke the proportionality assumption (ALDA)
A completely general representation: the predictor has a unique effect in each period.
A more parsimonious representation: the predictor’s effect changes linearly with time.
  β1 assesses the effect of X1 in time period c.
  β2 describes how this effect linearly increases (if positive) or decreases (if negative).
Another parsimonious representation: the predictor’s effect differs across epochs.
  β2 assesses the additional effect of X1 during those time periods declared to be “later” in time.
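
The slide's equations did not survive extraction; the forms below are a hedged reconstruction from the verbal descriptions. The time indicators D, their coefficients α, the per-period coefficients β1k, and the epoch indicator LATE are notational assumptions introduced here; only X1, β1, β2, and c come from the slide text.

```latex
% (a) Completely general: a separate effect of X_1 in each period k
\operatorname{logit} h(t_{ij}) =
  \left[ \alpha_1 D_{1ij} + \dots + \alpha_J D_{Jij} \right]
  + \sum_{k=1}^{J} \beta_{1k} \, X_{1ij} D_{kij}

% (b) Effect of X_1 changes linearly with time
\operatorname{logit} h(t_{ij}) =
  \left[ \alpha_1 D_{1ij} + \dots + \alpha_J D_{Jij} \right]
  + \beta_1 X_{1ij} + \beta_2 \, X_{1ij} (TIME_j - c)

% (c) Effect of X_1 differs across epochs (LATE_j = 1 in the later periods)
\operatorname{logit} h(t_{ij}) =
  \left[ \alpha_1 D_{1ij} + \dots + \alpha_J D_{Jij} \right]
  + \beta_1 X_{1ij} + \beta_2 \, X_{1ij} \, LATE_j
```

In (b), setting TIME equal to c removes the second term, so β1 is the effect of X1 in period c; in (c), β1 is the effect in the earlier epoch and β1 + β2 the effect in the later one.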

The proportionality assumption: Uncovering violations and simple solutions (ALDA, Section 12.4, p. 443)
Data source: Suzanne Graham (1997) dissertation
Sample: 3,790 high school students who participated in the Longitudinal Survey of American Youth (LSAY).
Research design: Tracked from 10th grade through the 3rd semester of college, a total of 5 periods. Only n=132 (3.5%) took a math class in all 5 periods!
RQs:
  When are students most at risk of dropping out of math?
  What’s the effect of gender?
  Does the gender differential vary over time?
Risk of dropping out zig-zags over time, peaking at 12th grade and the 2nd semester of college.
The magnitude of the gender differential varies over time: it is smallest in 11th grade and increases over time.
This suggests that the proportionality assumption is being violated.

Checking the proportionality assumption: Is the effect of FEMALE constant over time? (ALDA)
All models include a completely general specification for TIME using 5 time dummies: HS11, HS12, COLL1, COLL2, and COLL
Surviving entries from the slide's table of deviance-based tests: one comparison on 4 df is not significant (ns); another gives 6.50 on 1 df, p=0.0108.
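
The interaction predictors behind such comparisons can be sketched as follows. Several details are assumptions made for illustration: the fifth dummy's name is truncated in the source and is written here as COLL3, the linear interaction is centered at period 1, and the column names prefixed with F_ are hypothetical.

```python
# Period names from the slide; "COLL3" is an assumed completion of the
# truncated fifth name.
PERIODS = ["HS11", "HS12", "COLL1", "COLL2", "COLL3"]


def interaction_columns(female, period_index, c=1):
    """Return the FEMALE-by-TIME predictors for one person-period record.

    period_index runs from 1 (HS11) to 5.
    general: FEMALE times each time dummy (a unique effect per period).
    linear:  FEMALE times (TIME - c) (an effect that changes linearly).
    """
    if not 1 <= period_index <= len(PERIODS):
        raise ValueError("period_index out of range")
    general = {
        f"F_{name}": female * int(j == period_index)
        for j, name in enumerate(PERIODS, start=1)
    }
    linear = female * (period_index - c)
    return general, linear


# A female student's record in the first semester of college (period 3)
general, linear = interaction_columns(female=1, period_index=3)
print(general, linear)
```

The general coding spends several parameters beyond the main-effects model, while the linear coding adds only one, which is consistent with the 4 df and 1 df shown beside the surviving test results.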