
A Bayes Method of a Monotone Hazard Rate via S-paths. Man-Wai Ho, National University of Singapore. Cambridge, 9th August 2007.




1 A Bayes Method of a Monotone Hazard Rate via S-paths. Man-Wai Ho, National University of Singapore. Cambridge, 9th August 2007.

2 Agenda
– Overview of estimation of monotone hazard rates
– What is an S-path?
– A class of random monotone (non-decreasing or non-increasing) hazard rates: posterior analysis via S-paths
– A Markov chain Monte Carlo (MCMC) method
– Numerical examples

3 Hazard Rate / Hazard Function
The hazard rate at time t,
λ(t) = lim_{δ→0} Pr(t ≤ T < t + δ | T ≥ t) / δ,
where T denotes the lifetime, is interpreted as the instantaneous probability of failure of an object.
A wide variety of shapes:
– The simplest case of a constant hazard rate corresponds to an exponential lifetime distribution.
– Increasing or decreasing hazard rates correspond to lifetime distributions with a lighter or heavier tail, respectively, compared with the exponential distribution.
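As a quick numerical illustration of this definition (a sketch; the finite-difference step `delta`, the rate 2, and the function names are my choices), the hazard of an exponential lifetime is constant and equal to its rate:

```python
import math

def hazard(cdf, t, delta=1e-6):
    """Finite-difference approximation of lambda(t) = lim Pr(t <= T < t+delta | T >= t)/delta."""
    survival = 1.0 - cdf(t)                       # Pr(T >= t)
    return (cdf(t + delta) - cdf(t)) / (delta * survival)

# Exponential lifetime with rate 2: cdf(t) = 1 - exp(-2 t).
exp_cdf = lambda t: 1.0 - math.exp(-2.0 * t)

# The hazard is (numerically) constant and equal to 2 at every t, as the slide states.
print(hazard(exp_cdf, 0.5), hazard(exp_cdf, 1.5))
```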

4 Estimation of Monotone Hazard Rates: Bayesian Approach
A non-increasing (or "decreasing") hazard rate on the positive line R = (0, ∞) may be written in the mixture form
λ(t | μ) = ∫_R I(t < u) μ(du),
where I(A) is the indicator function of a set A. The unknown measure μ is modeled as a random measure / random process.
Replacing I(t < u) by I(t > u) gives a non-decreasing hazard rate.
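A tiny sketch of why the mixture form is non-increasing: for a discrete measure μ with atoms u_k and weights w_k (the values below are made up), λ(t | μ) = Σ_{u_k > t} w_k can only lose mass as t grows:

```python
# Hypothetical atoms (u_k, w_k) of a discrete measure mu on (0, inf).
atoms = [(0.5, 1.2), (1.0, 0.7), (2.5, 0.4)]

def lam(t):
    """lambda(t | mu) = integral of I(t < u) mu(du) = total weight of atoms beyond t."""
    return sum(w for u, w in atoms if t < u)

grid = [0.0, 0.25, 0.75, 1.5, 3.0]
values = [lam(t) for t in grid]
# Non-increasing in t, as required of a "decreasing" hazard rate.
assert all(a >= b for a, b in zip(values, values[1:]))
```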

5 Motivation of this Work
The Bayes estimate (posterior mean) of an increasing hazard rate can be written as a sum over e-vectors [Dykstra & Laud (1981)] or m-vectors [Lo & Weng (1989)] based on extended / weighted Gamma processes.
Lo & Weng (1989) characterized the posterior distribution of any random hazard rate
λ(t | μ) = ∫_R k(t, u) μ(du),
with μ a weighted Gamma process, in terms of random partitions p.

6 Motivation of this Work
Recently, James (2002, 2005) generalized the result of Lo & Weng (1989) to general hazard rates λ(t | μ) = ∫_R k(t, u) μ(du) by modeling μ as a completely random measure:
– Analogously obtained a characterization of the posterior distribution in terms of partitions p.
– Completely random measures [Kingman (1967, 1993)] include the Gamma process, weighted Gamma process, generalized Gamma process [Brix (1999)], stable process and many other random measures as special cases.

7 Motivation of this Work
– General hazard rates (kernel k(t, u)), μ a weighted Gamma or completely random measure: partitions structure.
– Monotone hazard rates (kernel I(t < u)), μ a weighted Gamma random measure: S-paths structure.
– Monotone hazard rates, μ a completely random measure: ?

8 What is an S-path?
An integer-valued vector S = (S_0, S_1, ..., S_{n−1}, S_n) satisfying
(i) S_0 = 0; (ii) S_n = n; (iii) S_j ≤ j; (iv) S_j ≤ S_{j+1}.
Denote the increments by m_j = S_j − S_{j−1}, for j = 1, ..., n, so that S determines (m_1, ..., m_n).
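Conditions (i)-(iv) translate directly into code; a minimal checker (the function names are mine):

```python
def is_s_path(S):
    """Check conditions (i)-(iv) for S = (S_0, ..., S_n) with n = len(S) - 1."""
    n = len(S) - 1
    return (S[0] == 0                                     # (i)   S_0 = 0
            and S[n] == n                                 # (ii)  S_n = n
            and all(S[j] <= j for j in range(n + 1))      # (iii) S_j <= j
            and all(S[j] <= S[j + 1] for j in range(n)))  # (iv)  non-decreasing

def increments(S):
    """m_j = S_j - S_{j-1}, for j = 1, ..., n."""
    return [S[j] - S[j - 1] for j in range(1, len(S))]

assert is_s_path((0, 0, 0, 2, 4))        # the path used on the next slide
assert not is_s_path((0, 2, 2, 2, 4))    # violates S_1 <= 1
assert increments((0, 0, 0, 2, 4)) == [0, 0, 2, 2]
```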

9 What is an S-path?
An S-path serves as an alternative to a partition for clustering integers when only (i) the number of elements and (ii) the maximum index in each cluster are of concern.
– Suppose n = 4. The path S = (0, 0, 0, 2, 4) conveys: one cluster of 2 integers with 3 as the maximum index, and another cluster of 2 integers with 4 as the maximum index.
– An S-path is a combinatorial reduction of p: for n = 4, both p_1 = {(1, 3), (2, 4)} and p_2 = {(2, 3), (1, 4)} correspond to S = (0, 0, 0, 2, 4), i.e., p_1, p_2 ∈ C_S.
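The reduction from a partition p to its S-path keeps only each cell's size and maximal element; a sketch (cells represented as Python sets, function name mine):

```python
def partition_to_spath(cells, n):
    """Map a partition of {1, ..., n} to its S-path by setting m_(max of cell) = |cell|."""
    m = [0] * (n + 1)
    for cell in cells:
        m[max(cell)] = len(cell)
    S = [0]                          # S_0 = 0
    for j in range(1, n + 1):
        S.append(S[-1] + m[j])       # S_j = S_{j-1} + m_j
    return tuple(S)

# Both partitions from the slide reduce to the same S-path.
p1 = [{1, 3}, {2, 4}]
p2 = [{2, 3}, {1, 4}]
assert partition_to_spath(p1, 4) == (0, 0, 0, 2, 4)
assert partition_to_spath(p2, 4) == (0, 0, 0, 2, 4)
```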

10 What is an S-path?
One advantage of using S-paths over partitions: for fixed n, the space of all S is much smaller than the space of all p (≈ n!).
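The size gap can be checked by brute force; a short dynamic program counting all S-paths for a given n (the comparison with n! mirrors the slide's claim about the size of the partition space):

```python
import math

def count_spaths(n):
    """Count vectors (S_0, ..., S_n) with S_0 = 0, S_n = n, S_j <= j, non-decreasing."""
    dp = {0: 1}                          # dp[s] = number of prefixes ending with S_j = s
    for j in range(1, n):
        ndp = {}
        for s, c in dp.items():
            for s2 in range(s, j + 1):   # next coordinate lies in [S_{j-1}, j]
                ndp[s2] = ndp.get(s2, 0) + c
        dp = ndp
    return sum(dp.values())              # the final step S_n = n is always admissible

# Far fewer S-paths than the ~n! partitions cited on the slide.
print(count_spaths(10), math.factorial(10))   # 16796 vs 3628800
```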

11 A Class of Random Decreasing Hazard Rates
Consider a class of random "decreasing" hazard rates on R = (0, ∞) defined by
λ(t | μ) = ∫_R I(t < u) μ(du), t ∈ R,
where μ is a completely random measure on R. This class contains the decreasing counterparts of the models considered by Dykstra & Laud (1981) based on extended / weighted Gamma processes.

12 The Completely Random Measure
An "independent increment" process uniquely characterized by the Laplace functional
L_μ(g | ρ, η) = exp[ −∫_R ∫_R (1 − e^{−g(u) z}) ρ(dz | u) η(du) ].
Alternatively, μ can be represented in a distributional sense as
μ(du) = ∫_R z N(dz, du),
where N(dz, du) is a Poisson random measure on R × R with intensity measure E[N(dz, du)] = ρ(dz | u) η(du).

13 The Data
The data are observed under a right-censorship scheme. Suppose we collect observations T = (T_1, ..., T_n, ..., T_N) from N items with monotone hazard rates until time τ, where T_1 < ··· < T_n < τ:
– T_1, ..., T_n are completely observed failure times, and
– T_{n+1} = ··· = T_N ≡ τ are the right-censored times.

14 Posterior Analysis
The likelihood is given by [Aalen (1975, 1978)]
[N! / (N − n)!] [ ∏_{i=1}^n ∫_R I(T_i < u_i) μ(du_i) ] exp[ −μ(g_N) ],
where
μ(g_N) = ∫_0^τ [ Σ_{i=1}^N I(T_i ≥ t) ] [ ∫_R I(t < w) μ(dw) ] dt.
The posterior distribution of μ is proportional to the product of the likelihood and the prior.

15 Posterior Analysis
A streamlined proof:
– Note that μ(du_i) = ∫_R z_i N(dz_i, du_i), i = 1, ..., n.
– Augment the latent variables (z_i, u_i), i = 1, ..., n, and work with the joint distribution of (z, u, N, T).
– Apply Proposition 2.3 in James (2005) to get a nice product form.
– Recognize from the posterior distribution that the information carried by a partition about the members of a cell other than its maximal element is irrelevant: when p ∈ C_S,
∏_{i=1}^{n(p)} ∏_{j ∈ C_i} I(T_j < v_i) = ∏_{i=1}^{n(p)} I(max_{j ∈ C_i} T_j < v_i) = ∏_{{j: m_j > 0}} I(T_j < y_j).

16 Posterior Analysis
Theorem 1: The posterior law of μ given the data T can be described by a three-step experiment:
1) An S-path S = (0, S_1, ..., S_{n−1}, n) has distribution Z(S) = φ(S) / Σ_S φ(S), where
φ(S) = ∏_{{j: m_j > 0}} C(j − 1 − S_{j−1}, j − S_j) ∫_{T_j}^∞ κ_{m_j}(e^{−f_N} ρ | y) η(dy),
with the binomial coefficient C(·, ·), and, for any integer i > 0,
κ_i(e^{−f_N} ρ | u) = ∫_R z^i e^{−g_N(u) z} ρ(dz | u) < ∞,
f_N(z, u) = g_N(u) z, and g_N(u) = ∫_0^τ [ Σ_{i=1}^N I(T_i ≥ t) ] I(t < u) dt.

17 Posterior Analysis
2) Given S, there exist Σ_{j=1}^n I(m_j > 0) independent pairs (y_j, Q_j), denoted by (y, Q) = {(y_j, Q_j): m_j > 0, j = 1, ..., n}, where y_j | S, T is distributed as
η_j(dy_j | S, T) ∝ I(T_j < y_j) κ_{m_j}(e^{−f_N} ρ | y_j) η(dy_j),
and
Pr{Q_j ∈ dz | S, y_j, T} ∝ z^{m_j} e^{−g_N(y_j) z} ρ(dz | y_j).
3) Given (S, y, Q), μ is distributed as
μ*_N + Σ_{{j: m_j > 0}} Q_j δ_{y_j},
where μ*_N is a completely random measure characterized by e^{−g_N(u) z} ρ(dz | u) η(du).

18 The Bayes Estimate
The Bayes estimate (posterior mean) of a decreasing hazard rate given T is
E[λ(t | μ) | T] = λ_0(t) + Σ_S Z(S) Σ_{j=1}^n λ_j(t | S),
where λ_0(t) = ∫_t^∞ κ_1(e^{−f_N} ρ | y) η(dy), and, if m_j > 0,
λ_j(t | S) = ∫_{max(t, T_j)}^∞ κ_{m_j + 1}(e^{−f_N} ρ | y) η(dy) / ∫_{T_j}^∞ κ_{m_j}(e^{−f_N} ρ | y) η(dy);
otherwise λ_j(t | S) = 0.

19 The Bayes Estimate
The posterior mean is a sum over all S-paths with n + 1 coordinates. The computation is formidable even when n = 50, even though the total number of S-paths is smaller than that of partitions p.
There is NO exact simulation method for S! Possible strategies:
– Develop Markov chain Monte Carlo (MCMC) algorithms or sequential importance sampling (SIS) methods for sampling S (not straightforward!).
– Use algorithms for sampling p or latent variables?

20 The Bayes Estimate
It is a consistent estimate, due to the weak consistency of the posterior shown by Drăghici & Ramamoorthi (2003).
It is always less variable than the partition-based estimate λ_0(t) + Σ_p W(p) h(p) obtained as a specialization of James (2002, 2005), by a Rao-Blackwellization argument:
Σ_{j=1}^n λ_j(t | S) = E[h(p) | S, T] = Σ_{p ∈ C_S} h(p) π(p | S, T),
using the discrete uniform conditional distribution of p | S, T and the constancy of h(p) for all p ∈ C_S.

21 An MCMC Method
Goal: draw a Markov chain of S-paths whose unique stationary distribution is Z(S) = φ(S) / Σ_S φ(S).
We generalize the accelerated path (AP) sampler [Ho (2002)], an efficient MCMC method for sampling S-paths in Bayesian nonparametric models based on the Dirichlet process and the Gamma process.
– An improvement over a naive Gibbs sampler [Ho (2002)] for S-paths.

22 The AP Sampler
A transition cycle contains n − 1 steps. At step r (= 1, ..., n − 1):
– Obtain q = min{i > r: m_i > 0} from the current path.

23 The AP Sampler
– Note that the current S is given by
(0, S_1, ..., S_{r−1}, c, ..., c, S_q, ..., S_{n−1}, n), where S_{r−1} ≤ c ≤ min(r, S_q − 1).
– Then, the new S will be
(0, S_1, ..., S_{r−1}, j, ..., j, S_q, ..., S_{n−1}, n), for j = S_{r−1}, S_{r−1} + 1, ..., min(r, S_q − 1),
with (transition) probability proportional to φ((0, S_1, ..., S_{r−1}, j, ..., j, S_q, ..., S_{n−1}, n)).
– Repeat for r = 1, ..., n − 1 to finish one cycle.
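One full AP transition cycle can be sketched as follows, with the path weight φ passed in as a black box; the flat `phi` in the demo is only a stand-in, not the φ(S) of Theorem 1, and the function names are mine:

```python
import random

def ap_cycle(S, phi, rng):
    """One AP cycle: for r = 1, ..., n-1, resample the flat stretch S_r = ... = S_{q-1}."""
    n = len(S) - 1
    S = list(S)
    for r in range(1, n):
        q = next(i for i in range(r + 1, n + 1) if S[i] > S[i - 1])  # q = min{i > r: m_i > 0}
        lo, hi = S[r - 1], min(r, S[q] - 1)                          # admissible values of j
        cands = [S[:r] + [j] * (q - r) + S[q:] for j in range(lo, hi + 1)]
        weights = [phi(tuple(c)) for c in cands]
        u, acc = rng.random() * sum(weights), 0.0
        for cand, w in zip(cands, weights):     # draw a candidate proportionally to phi
            acc += w
            if u <= acc:
                S = cand
                break
        else:
            S = cands[-1]                       # guard against floating-point round-off
    return tuple(S)

# Demo: cycles with a flat phi keep every iterate a valid S-path.
rng, S = random.Random(0), tuple(range(6))      # n = 5, start from (0, 1, ..., 5)
for _ in range(20):
    S = ap_cycle(S, lambda s: 1.0, rng)
    assert S[0] == 0 and S[-1] == 5
    assert all(S[j] <= j and (j == 0 or S[j - 1] <= S[j]) for j in range(6))
```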

24 [Figure: at step r, the coordinate value j on the stretch between r and q may take S_r = S_{r−1}, S_r = S_{r−1} + 1, ..., up to S_r = r, while S_q stays fixed.]

25 The AP Sampler: the Transition Probabilities
If j = S_{r−1}, the probability is proportional to
[(r − S_{r−1}) / (S_q − 1 − S_{r−1})] ∫_{T_q}^∞ κ_{S_q − S_{r−1}}(e^{−f_N} ρ | y) η(dy).
Otherwise, if j ∈ {S_{r−1} + 1, S_{r−1} + 2, ..., min(r, S_q − 1)}, the probability is proportional to
C(S_q − S_{r−1} − 2, S_q − j − 1) ∏_{i=r+1}^{q−1} [(i − j) / (i − S_{r−1})] × ∫_{T_r}^∞ κ_{j − S_{r−1}}(e^{−f_N} ρ | y) η(dy) × ∫_{T_q}^∞ κ_{S_q − j}(e^{−f_N} ρ | z) η(dz).

26 Evaluation of the Bayes Estimate
– Start with an arbitrary path S^(0).
– Repeat M cycles to yield a Markov chain S^(0), S^(1), ..., S^(M).
– Compute the ergodic average
(1/M) Σ_{i=1}^M Σ_{j=1}^n λ_j(t | S^(i))
to approximate the sum Σ_S Z(S) Σ_{j=1}^n λ_j(t | S) in the posterior mean of the monotone hazard rate.
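The averaging scheme above amounts to a few lines of driver code; a sketch with hypothetical callables `next_path` (one sampler cycle) and `path_summand` (computing Σ_j λ_j(t | S) for a fixed t):

```python
def ergodic_average(S0, next_path, path_summand, M):
    """(1/M) * sum over i = 1..M of path_summand(S_i), along a chain of M sampler cycles."""
    S, total = S0, 0.0
    for _ in range(M):
        S = next_path(S)
        total += path_summand(S)
    return total / M

# Toy check: a chain frozen at one path just returns that path's summand.
freeze = lambda S: S
assert ergodic_average((0, 1, 2, 3), freeze, lambda S: 2.5, M=100) == 2.5
```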

27 Rationale behind the AP Sampler
A Markov chain is defined by a transition kernel. A Markov chain has a unique stationary distribution when its transition kernel is irreducible: all states in the space communicate with each other within one cycle.
Construct n − 1 reducible kernels such that the stationary distribution of the resulting chain is the target distribution and the product of them is irreducible [Hastings (1970); Tierney (1994)].

28 Numerical Examples
– A Gamma process prior for μ (i.e., ρ(dz | u) = z^{−1} e^{−z} dz) with shape measure η(du) = (1/6) I(0 < u < 6) du.
– Lifetime data of items with hazard rate λ(t) = 1 for 0 ≤ t < 1, and λ(t) = 0.5 for t ≥ 1.
– Data are generated subject to censoring at τ = 3; the censoring rate is about 15%.
– Monte Carlo size is M = 1000.
– Initial path: S^(0) = (0, 1, ..., n − 1, n).
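The data-generating setup on this slide can be reproduced by inversion (a sketch; the sample size and seed are my choices): with λ(t) = 1 on [0, 1) and 0.5 afterwards, the cumulative hazard is H(t) = t for t < 1 and 1 + 0.5(t − 1) for t ≥ 1, so the censoring probability at τ = 3 is exp(−H(3)) = e^{−2} ≈ 0.135, consistent with the quoted ~15%:

```python
import math
import random

def sample_lifetime(rng):
    """Invert H(T) = E with E ~ Exp(1): T = E if E < 1, else 1 + 2 (E - 1)."""
    e = -math.log(1.0 - rng.random())
    return e if e < 1.0 else 1.0 + 2.0 * (e - 1.0)

rng, tau, n_items = random.Random(1), 3.0, 200_000
censored = sum(sample_lifetime(rng) >= tau for _ in range(n_items))
rate = censored / n_items
# Empirical censoring rate should sit near exp(-2) ~ 0.1353.
assert abs(rate - math.exp(-2.0)) < 0.01
```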

29 [Figure]

30 Comparisons between Different Methods
Compare with commonly-used partition-based methods [Lo, Brunner & Chan (1996); Ishwaran & James (2003); James (2005)].
Replicate 1000 independent hazard estimates by each of the three available methods: (i) the AP sampler, (ii) the naive Gibbs path (gP) sampler, and (iii) the gWCR sampler of Lo, Brunner and Chan (1996).

31 [Figure]

32 Comparisons between Different Methods
At different time points, the three averages are close to each other, yet the standard errors vary substantially.
– The standard error of the hazard rate estimates produced by the AP sampler is the smallest among the three methods.
– The AP sampler clearly outperforms the naive Gibbs path sampler and beats the closest competitor, the gWCR sampler, by a comfortable margin.

33 Conclusions
– A tractable posterior distribution and Bayes estimate in terms of S-paths.
– A Rao-Blackwellization result for S over p.
– An efficient numerical method for sampling S-paths.
– These results accentuate the importance of studying and using S-paths in models with monotonicity constraints (e.g., symmetric unimodal densities, monotone densities, unimodal densities, bathtub-shaped hazard rates, ...).

34 THANK YOU!

