Published by Sydney Hunter. Modified over 8 years ago.
1
DURATION ANALYSIS Eva Hromádková, 9.12.2010 Applied Econometrics JEM007, IES Lecture 9
2
Duration analysis: Introduction
Model the length of time spent in a given state before transition to another state (spell duration).
Examples: unemployment, life, insurance status.
Nice application of panel data.
Issues:
1. Distributional assumptions on the probability of transition (increasing or decreasing with time, etc.)
2. Sampling schemes
   Flow sampling: e.g. a sample of those who enter unemployment in a given period
   Stock sampling: e.g. people unemployed in a given period
   Population sampling: e.g. all people regardless of employment status
3
Duration analysis: Introduction
Issues (continued):
3. Information about duration can be censored: in some cases we do not observe the "end"
4. Possibility of multiple states (unemployed, employed, out of the labor force)
5. Possibility of multiple spells
6. Different applications have a different focus: biostatistics (survival time), physics (failure time), economics (recidivism, length of a match such as employment or marriage)
4
Duration analysis: Basic concepts
Duration of spell $T$: nonnegative random variable
Cumulative distribution function: $F(t) = P(T \le t)$, the probability that the spell lasts no longer than $t$ (picture)
Density function: $f(t) = dF(t)/dt$ (picture)
Survivor function: $S(t) = P(T > t) = 1 - F(t)$, the probability that the spell lasts longer than $t$
Hazard function: $\lambda(t) = f(t)/S(t)$, the instantaneous probability of leaving a state conditional on survival to time $t$
Cumulative hazard function: $\Lambda(t) = \int_0^t \lambda(s)\,ds = -\ln S(t)$
The definitions differ for discrete data (years, weeks)
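The relations between these functions can be checked numerically. The sketch below is only an illustration: it assumes an exponential duration with an arbitrary rate of 0.5 (not a value from the lecture) and verifies that the hazard equals density over survivor and that the integrated hazard matches $-\ln S(t)$.

```python
import math

RATE = 0.5  # assumed rate of an exponential duration, chosen only for illustration


def cdf(t, rate=RATE):
    """F(t) = P(T <= t) for an exponential duration."""
    return 1.0 - math.exp(-rate * t)


def density(t, rate=RATE):
    """f(t) = dF(t)/dt."""
    return rate * math.exp(-rate * t)


def survivor(t, rate=RATE):
    """S(t) = 1 - F(t)."""
    return 1.0 - cdf(t, rate)


def hazard(t, rate=RATE):
    """lambda(t) = f(t) / S(t)."""
    return density(t, rate) / survivor(t, rate)


def cumulative_hazard(t, rate=RATE, steps=10_000):
    """Integrate the hazard from 0 to t with the trapezoid rule."""
    h = t / steps
    total = 0.5 * (hazard(0.0, rate) + hazard(t, rate))
    for i in range(1, steps):
        total += hazard(i * h, rate)
    return total * h


if __name__ == "__main__":
    t = 3.0
    print("hazard:", hazard(t))
    print("cumulative hazard:", cumulative_hazard(t))
    print("-ln S(t):", -math.log(survivor(t)))
```

For the exponential case the hazard is flat, so the two last printed values should agree up to rounding.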
5
Duration analysis: Censoring I
Types:
Right: duration of spell is above a certain value, but we do not know by how much (e.g. we don't observe the end)
Left: duration of spell is below a certain value, but we do not know by how much (e.g. we don't observe the beginning)
Interval: duration of spell falls in a time interval
Type I censoring: all durations are censored above a fixed time $t_c$, e.g. 5000 x testing for light bulbs
Type II censoring: the study ceases after the $k$-th failure, so there are only $k$ complete spells and all others are censored
6
Duration analysis Censoring II
7
Duration analysis: Censoring III
Random censoring:
Each individual has a (potential) spell duration $T_i^*$ and a censoring time $C_i^*$
Observed duration: $T_i = \min(T_i^*, C_i^*)$
Indicator for non-censoring: $\delta_i = 1[T_i^* < C_i^*]$
Independent censoring: the determinants of censoring are not informative about duration
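A small simulation can make the random-censoring scheme concrete. This is a sketch under assumptions that are not in the slides: both the latent duration and the censoring time are drawn from independent exponential distributions, and the function name and rates are hypothetical.

```python
import random


def simulate_censored(n, duration_rate=1.0, censor_rate=0.5, seed=0):
    """Return a list of (observed duration, non-censoring indicator) pairs.

    Latent duration T* and censoring time C* are independent exponentials
    (an assumption made for this illustration only).
    """
    rng = random.Random(seed)
    sample = []
    for _ in range(n):
        t_star = rng.expovariate(duration_rate)  # potential spell duration T*
        c_star = rng.expovariate(censor_rate)    # censoring time C*
        t_obs = min(t_star, c_star)              # what we actually observe
        delta = 1 if t_star < c_star else 0      # 1 = spell end observed
        sample.append((t_obs, delta))
    return sample


if __name__ == "__main__":
    data = simulate_censored(1000)
    share_complete = sum(delta for _, delta in data) / len(data)
    print("share of uncensored spells:", share_complete)
```

With these rates the expected share of uncensored spells is 1.0 / (1.0 + 0.5), about two thirds.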
8
Duration analysis: Non-parametric estimates I.
Useful for descriptive statistics.
Without censoring: survivor function $S(t) = (\#\text{ spells of duration} > t) / N$
With censoring:
$t_1 < t_2 < \dots < t_j < \dots < t_k$: failure times
$d_j$: # spells ending at time $t_j$
$m_j$: # spells right-censored in $[t_j, t_{j+1})$
$r_j$: # spells at risk at time $t_j$
$r_j = (d_j + m_j) + \dots + (d_k + m_k) = \sum_{l \ge j} (d_l + m_l)$
9
Duration analysis: Non-parametric estimates II.
Hazard function: $\lambda_j = d_j / r_j$, the # spells ending at time $t_j$ out of all that were at risk at $t_j$
Survivor function, Kaplan-Meier estimator: $S(t) = \prod_{j \mid t_j \le t} (1 - \lambda_j) = \prod_{j \mid t_j \le t} (r_j - d_j)/r_j$
Notes: adjustment for grouping of data (censoring can occur progressively over the interval)
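The estimator can be sketched in a few lines of plain Python. The function name and the toy data are made up for illustration; spells censored exactly at a failure time are counted as still at risk at that time, which matches the definition of the risk set used here.

```python
def kaplan_meier(durations, completed):
    """Kaplan-Meier table: one row (t_j, d_j, r_j, hazard_j, S(t_j)) per failure time.

    durations: observed spell lengths
    completed: 1 if the spell end was observed, 0 if right-censored
    """
    failure_times = sorted({t for t, d in zip(durations, completed) if d})
    survival = 1.0
    table = []
    for t_j in failure_times:
        # d_j: spells that end exactly at t_j
        d_j = sum(1 for t, d in zip(durations, completed) if d and t == t_j)
        # r_j: spells still at risk at t_j (neither ended nor censored before t_j)
        r_j = sum(1 for t in durations if t >= t_j)
        hazard_j = d_j / r_j
        survival *= 1.0 - hazard_j
        table.append((t_j, d_j, r_j, hazard_j, survival))
    return table


if __name__ == "__main__":
    # Toy data (hypothetical): six spells, two of them right-censored.
    durations = [2, 3, 3, 5, 6, 8]
    completed = [1, 1, 0, 1, 0, 1]
    for row in kaplan_meier(durations, completed):
        print(row)
```

In the toy data the risk sets at the failure times 2, 3, 5 and 8 contain 6, 5, 3 and 1 spells, so the estimated survivor function steps through 5/6, 2/3, 4/9 and 0.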
10
Duration analysis: Parametric estimates I.
Exponential distribution: $h(t) = \lambda$, a constant probability of leaving the state (memoryless property)
   Survivor function: $S(t) = e^{-\lambda t}$
Weibull distribution (more general): $h(t) = \lambda \alpha t^{\alpha - 1}$; $S(t) = e^{-\lambda t^{\alpha}}$
   $\alpha > 1$: $h(t)$ is increasing; $\alpha < 1$: $h(t)$ is decreasing
Other: Gompertz (biostatistics); log-normal and log-logistic (hazard first increases and then decreases with $t$); gamma
Regressors are introduced by letting $\lambda = e^{x\beta}$, with $\alpha$ left constant
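The Weibull formulas with regressors can be written out directly. The following is a minimal sketch; the function names and the example values of x, beta and alpha are hypothetical and serve only to show how the shape parameter changes the slope of the hazard.

```python
import math


def _scale(x, beta):
    """lambda = exp(x * beta), with x and beta given as equal-length lists."""
    return math.exp(sum(xk * bk for xk, bk in zip(x, beta)))


def weibull_hazard(t, x, beta, alpha):
    """h(t) = lambda * alpha * t^(alpha - 1)."""
    return _scale(x, beta) * alpha * t ** (alpha - 1)


def weibull_survivor(t, x, beta, alpha):
    """S(t) = exp(-lambda * t^alpha)."""
    return math.exp(-_scale(x, beta) * t ** alpha)


if __name__ == "__main__":
    x, beta = [1.0, 0.5], [0.2, -0.4]  # hypothetical regressors and coefficients
    for alpha in (0.5, 1.0, 2.0):
        hazards = [weibull_hazard(t, x, beta, alpha) for t in (1.0, 2.0, 4.0)]
        print("alpha =", alpha, "hazards at t = 1, 2, 4:", hazards)
```

Setting alpha to 1 reproduces the exponential case, where the hazard no longer depends on t.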
11
Duration analysis: Maximum Likelihood Estimation
Assumption: time-invariant regressors $x$ (they vary over individuals, not over the length of the spell)
Uncensored observations contribute the density $f(t|x,\theta)$
Right-censored observations contribute $P(T > t) = S(t|x,\theta)$
Thus the contribution of each observation is $f(t_i|x_i,\theta)^{\delta_i} \times S(t_i|x_i,\theta)^{1-\delta_i}$, where $\delta_i = 1$ if there is no censoring
We look for the $\theta$ that maximizes the product of these contributions (equivalently, the sum of their logs), i.e. the likelihood of observing the actual realization:
$\ln L(\theta) = \sum_{i=1}^{N} [\delta_i \ln f(t_i|x_i,\theta) + (1-\delta_i) \ln S(t_i|x_i,\theta)]$
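As an illustration of the censored log-likelihood, the sketch below works through the simplest case: an exponential duration without regressors, so that $\ln f(t) = \ln\lambda - \lambda t$ and $\ln S(t) = -\lambda t$. In that special case the maximizer has the closed form (number of completed spells) / (total observed time). The function names and the toy data are assumptions made for the example.

```python
import math


def exp_loglik(lam, durations, completed):
    """Censored log-likelihood for an exponential duration with rate lam."""
    total = 0.0
    for t, delta in zip(durations, completed):
        log_f = math.log(lam) - lam * t  # ln f(t), used for completed spells
        log_s = -lam * t                 # ln S(t), used for censored spells
        total += delta * log_f + (1 - delta) * log_s
    return total


def exp_mle(durations, completed):
    """Closed-form maximizer: completed spells divided by total time at risk."""
    return sum(completed) / sum(durations)


if __name__ == "__main__":
    durations = [2, 3, 3, 5, 6, 8]  # toy data (hypothetical)
    completed = [1, 1, 0, 1, 0, 1]
    lam_hat = exp_mle(durations, completed)
    print("lambda_hat:", lam_hat)
    print("log-likelihood at lambda_hat:", exp_loglik(lam_hat, durations, completed))
```

Treating the censored spells as complete would add two more "failures" to the numerator and bias the estimated exit rate upward, which is why the survivor term is needed.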
12
Duration analysis: Maximum Likelihood Estimation II
Components of likelihood: each type of observation contributes to the likelihood
Complete durations: $f(t)$
Left truncated at $t_L$ ($t > t_L$): $f(t)/S(t_L)$
Left censored at $t_{CL}$: $1 - S(t_{CL})$
Right censored at $t_{CR}$: $S(t_{CR})$
Right truncated at $t_R$ ($t < t_R$): $f(t)/[1 - S(t_R)]$
Interval censored at $t_{CL}, t_{CR}$: $S(t_{CL}) - S(t_{CR})$
13
Duration analysis: Cox model I
Proportional hazard model: $\lambda(t|x,\beta) = \lambda_0(t)\,\Phi(x,\beta)$
Semi-parametric model:
   The functional form of the baseline hazard $\lambda_0(t)$ is unspecified
   The functional form of $\Phi(x,\beta)$ is fully specified, usually the exponential form $\exp(x\beta)$
Interpretation of coefficients (change in $x_j$):
   Discrete (unit increase in $x_j$): $\lambda(t|x_{new},\beta) = \lambda_0(t)\exp(x\beta + \beta_j) = \exp(\beta_j)\,\lambda(t|x,\beta)$
   Continuous: $\partial\lambda(t|x,\beta)/\partial x_j = \lambda_0(t)\exp(x\beta)\,\beta_j$
14
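Duration analysis: Cox model II
Set-up:
$t_1 < t_2 < \dots < t_j < \dots < t_k$: $k$ discrete failure times
$R(t_j) = \{l : t_l \ge t_j\}$: set of spells at risk at $t_j$
$D(t_j) = \{l : t_l = t_j\}$: set of spells completed at $t_j$
$d_j = \sum_l 1(t_l = t_j)$: # of spells completed at $t_j$
Probability that a particular spell $j$ at risk ends at time $t_j$: $\lambda_0(t_j)\Phi(x_j,\beta) / \sum_{l \in R(t_j)} \lambda_0(t_j)\Phi(x_l,\beta) = \Phi(x_j,\beta) / \sum_{l \in R(t_j)} \Phi(x_l,\beta)$
The baseline hazard drops out!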
15
Duration analysis: Cox model III
We must control for tied durations (i.e. more than one failure at a given time)
Product of individual probabilities within $R(t_j)$
Partial likelihood function: joint product of failure probabilities over failure times
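A log partial likelihood of this kind can be sketched as follows. The slide's own formula did not survive extraction, so the code assumes the exponential form exp(x * beta) and a common way of handling ties (the Breslow-type approximation, where every spell ending at $t_j$ is compared against the full risk set at $t_j$). This may differ from the exact expression shown in the lecture, and the function name and toy data are hypothetical.

```python
import math


def cox_partial_loglik(beta, durations, completed, X):
    """Log partial likelihood with exp(x * beta) and Breslow-type tie handling.

    beta: list of coefficients
    durations, completed: observed spell lengths and non-censoring indicators
    X: list of regressor lists, one per spell
    """
    scores = [sum(xk * bk for xk, bk in zip(x, beta)) for x in X]
    failure_times = sorted({t for t, d in zip(durations, completed) if d})
    loglik = 0.0
    for t_j in failure_times:
        # Risk set R(t_j): spells that have not ended or been censored before t_j
        risk = [i for i, t in enumerate(durations) if t >= t_j]
        # D(t_j): spells completed exactly at t_j
        done = [i for i in risk if completed[i] and durations[i] == t_j]
        log_denominator = math.log(sum(math.exp(scores[i]) for i in risk))
        # Each completed spell contributes x_i * beta minus the log of the risk-set sum
        loglik += sum(scores[i] for i in done) - len(done) * log_denominator
    return loglik


if __name__ == "__main__":
    durations = [2, 3, 3, 5, 6, 8]  # toy data (hypothetical)
    completed = [1, 1, 0, 1, 0, 1]
    X = [[0.0], [1.0], [0.0], [1.0], [0.0], [1.0]]
    for b in (-0.5, 0.0, 0.5):
        print("beta =", b, "log partial likelihood:", cox_partial_loglik([b], durations, completed, X))
```

Because only ratios of exp(x * beta) within a risk set enter, adding a constant regressor (an intercept) leaves the value unchanged, which is the computational counterpart of the baseline hazard dropping out.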
16
Duration analysis: Time varying coefficients
Problem: if the time variation is endogenous, there is feedback between the duration of unemployment and the job search strategy
Basic case: external time variation
Solution:
Very rough: replace the variable by its average value
Within the Cox approach: what matters at each failure time $t_j$ is the value of the regressor $x(t_j)$ for the observations in the risk set $R(t_j)$, which requires multiple observations for each subject
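One way to store "multiple observations for each subject" is to split every spell into episodes over which the regressor is constant. The sketch below assumes a hypothetical layout where each subject has a list of (start, stop, x) episodes covering the interval (start, stop], and looks up the regressor value that applies at a given failure time for everyone still at risk.

```python
def covariate_at(episodes, t):
    """Return the regressor value in force at time t, or None if not at risk.

    episodes: list of (start, stop, x) tuples, each covering (start, stop]
    """
    for start, stop, x in episodes:
        if start < t <= stop:
            return x
    return None


def risk_set_values(subjects, t_j):
    """Map subject id -> x(t_j) for all subjects still under observation at t_j."""
    values = {}
    for subject_id, episodes in subjects.items():
        x = covariate_at(episodes, t_j)
        if x is not None:
            values[subject_id] = x
    return values


if __name__ == "__main__":
    # Hypothetical data: subject "a" changes regressor value after 4 periods,
    # subject "b" leaves observation after 3 periods.
    subjects = {
        "a": [(0, 4, 0.0), (4, 10, 1.0)],
        "b": [(0, 3, 0.5)],
    }
    print(risk_set_values(subjects, 2))
    print(risk_set_values(subjects, 5))
```

The values returned for a failure time are the ones that would enter the risk-set sum of the partial likelihood at that time.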