Published by Sharyl Bridges. Modified over 9 years ago.
1
DISCRETE CHOICE MODELING IN HEALTH ECONOMICS
COPYRIGHT OF: ABHINAV ANAND, JYOTI ARORA, SHRADDHA RAMSWAMY
2
INTRODUCTION
Studies suggest that the self-rated health score is a reliable predictor of health status.
We investigate the impact of a host of personal and status characteristics, such as age and gender, on the health perception of US citizens.
3
DATA
Dataset: NHANES Epidemiological Followup Study, 1992.
Health status, represented by $Y_i$, is coded as follows:
  POOR        $Y_i = 1$
  FAIR        $Y_i = 2$
  GOOD        $Y_i = 3$
  VERY GOOD   $Y_i = 4$
  EXCELLENT   $Y_i = 5$
Age is measured in years, education is measured as the number of years of schooling completed, and dichotomous variables are created for gender (female = 1) and race (black = 1).
4
METHODOLOGY
1) Ordered logit model for the first part of our enquiry.
2) Sequential logit model for the second part of our enquiry.
5
ORDERED LOGIT MODEL SPECIFICATION
A multinomial choice model in which the values taken by the dependent variable have a natural order.
$Y_i^*$ is a latent variable such that $Y_i = j$ when $\alpha_{j-1} < Y_i^* < \alpha_j$, where $j = 1,2,3,4,5$, with $\alpha_0 = -\infty$ and $\alpha_5 = +\infty$, and
$Y_i^* = \beta' X_i + u_i$
where $u_i$ follows the logistic distribution.
[Figure: the four thresholds $\alpha_1, \alpha_2, \alpha_3, \alpha_4$]
6
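ORDERED LOGIT MODEL
$P(Y_i = j \mid X_i) = F(\alpha_j - \beta' X_i) - F(\alpha_{j-1} - \beta' X_i)$
where $F(\cdot)$ is a cdf, $j = 1,2,3,4,5$ and $i$ denotes the $i$-th individual.
We assume that $u$ follows the logistic distribution, so $F(z) = 1/(1 + e^{-z})$.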
7
CONTD.
$P(Y_i = 1 \mid X_i) = F[\alpha_1 - \beta' X_i]$
$P(Y_i = 2 \mid X_i) = F[\alpha_2 - \beta' X_i] - F[\alpha_1 - \beta' X_i]$
$P(Y_i = 3 \mid X_i) = F[\alpha_3 - \beta' X_i] - F[\alpha_2 - \beta' X_i]$
$P(Y_i = 4 \mid X_i) = F[\alpha_4 - \beta' X_i] - F[\alpha_3 - \beta' X_i]$
$P(Y_i = 5 \mid X_i) = 1 - F[\alpha_4 - \beta' X_i]$
where $F(\cdot)$ is defined as above.
For estimating the model we specify 5 dummy variables for the $i$-th individual with the following rule:
$Z_{ij} = 1$ if $Y_i = j$, where $j = 1,2,3,4,5$; $Z_{ij} = 0$ otherwise.
8
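ORDERED LOGIT MODEL ESTIMATION
Estimation is by MLE, using the Newton-Raphson formula.
Assuming independent observations, we get
$L = \prod_{i=1}^{n} \prod_{j=1}^{5} \left[ F(\alpha_j - \beta' X_i) - F(\alpha_{j-1} - \beta' X_i) \right]^{Z_{ij}}$
where $n$ is the number of individuals, $F(\alpha_0 - \beta' X_i) = 0$ and $F(\alpha_5 - \beta' X_i) = 1$.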
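The likelihood under independent observations can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the toy data and the parameter values are hypothetical, and no optimizer (Newton-Raphson or otherwise) is included. It only evaluates the log-likelihood that such an optimizer would maximize.

```python
import math

def logistic_cdf(x):
    """Standard logistic cdf, F(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def category_probs(xb, cutpoints):
    """P(Y = j | X) for j = 1..J, given the index xb = beta'X and
    increasing cutpoints alpha_1 < ... < alpha_{J-1}."""
    cdf = [logistic_cdf(a - xb) for a in cutpoints] + [1.0]
    probs, prev = [], 0.0
    for c in cdf:
        probs.append(c - prev)  # F(alpha_j - xb) - F(alpha_{j-1} - xb)
        prev = c
    return probs

def log_likelihood(y, X, beta, cutpoints):
    """Sum over individuals of log P(Y_i = y_i | X_i), with y_i in 1..J.
    Picking out entry y_i plays the role of the dummies Z_ij."""
    total = 0.0
    for yi, xi in zip(y, X):
        xb = sum(b * v for b, v in zip(beta, xi))
        total += math.log(category_probs(xb, cutpoints)[yi - 1])
    return total

# Toy data (hypothetical, not from NHANES): one regressor, three individuals.
y = [1, 3, 5]
X = [[-1.0], [0.0], [1.0]]
cuts = [-2.0, -1.0, 1.0, 2.0]
ll = log_likelihood(y, X, beta=[0.5], cutpoints=cuts)
print(ll)
```

Because the toy ratings rise with the regressor, a positive slope gives a higher log-likelihood here than a negative one, which is the direction a maximizer would move in.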
9
RESULTS
Analysis of Maximum Likelihood Estimates

Parameter     DF   Estimate   Standard Error   Chi-Square   Pr > ChiSq
Intercept 5    1   -1.446     0.2473            34.1904     <.0001
Intercept 4    1    0.1255    0.2463             0.2598     0.6103
Intercept 3    1    1.6139    0.2479            42.3953     <.0001
Intercept 2    1    3.138     0.2539           152.7003     <.0001
Age            1   -0.0313    0.00262          143.3251     <.0001
gender         1    0.00989   0.0605             0.0267     0.8701
race           1   -0.2122    0.0669            10.0676     0.0015
edu            1    0.1553    0.0114           184.097      <.0001
south          1   -0.7989    0.1072            55.5218     <.0001

COMMAND:
proc logistic data = sasuser.nhanes descending;
  model health = age gender race edu south;
run;
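The odds ratios and 95% Wald limits reported for this model can be cross-checked from the estimates and standard errors above. The sketch below assumes the usual Wald construction, exp(estimate ± z × standard error); the dictionary and function names are illustrative and not part of the original analysis.

```python
import math

# (estimate, standard error) pairs copied from the table above.
estimates = {
    "age": (-0.0313, 0.00262),
    "gender": (0.00989, 0.0605),
    "race": (-0.2122, 0.0669),
    "edu": (0.1553, 0.0114),
    "south": (-0.7989, 0.1072),
}

Z_975 = 1.959964  # 97.5th percentile of the standard normal distribution

def odds_ratio_ci(beta, se, z=Z_975):
    """Return (odds ratio, lower limit, upper limit) for a 95% Wald interval."""
    return (math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se))

results = {name: odds_ratio_ci(b, se) for name, (b, se) in estimates.items()}
for name, (ratio, lower, upper) in results.items():
    print(f"{name:<7} {ratio:.3f}  [{lower:.3f}, {upper:.3f}]")
```

An interval that contains 1, as the one for gender does, corresponds to a coefficient that is not significant at the 5% level.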
10
CONTD.
Odds Ratio Estimates

Effect    Point Estimate   95% Wald Confidence Limits
Age       0.969            0.964    0.974
gender    1.01             0.897    1.137
race      0.809            0.709    0.922
edu       1.168            1.142    1.194
south     0.45             0.365    0.555

Association of Predicted Probabilities and Observed Responses

Percent Concordant   65.8      Somers' D   0.322
Percent Discordant   33.6      Gamma       0.324
Percent Tied          0.6      Tau-a       0.244
Pairs                522179    c           0.661
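Three of the association measures follow directly from the concordant, discordant and tied percentages. The sketch below uses the standard definitions (Somers' D as the concordant minus discordant share, Gamma as the same difference over the untied pairs, and c as the concordant share plus half the ties). The variable names are illustrative. Tau-a is left out because it also needs the sample size, which the slides do not state.

```python
# Percentages copied from the association table above.
pct_concordant = 65.8
pct_discordant = 33.6
pct_tied = 0.6

# Somers' D: (concordant - discordant) as a share of all pairs.
somers_d = (pct_concordant - pct_discordant) / 100.0

# Gamma: the same difference, as a share of untied pairs only.
gamma = (pct_concordant - pct_discordant) / (pct_concordant + pct_discordant)

# c statistic: concordant share plus half of the tied share.
c_stat = (pct_concordant + 0.5 * pct_tied) / 100.0

print(round(somers_d, 3), round(gamma, 3), round(c_stat, 3))
```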
11
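Probability estimate for the $i$-th individual
With the descending option, each fitted intercept enters as $P(Y_i \ge j \mid X_i) = F(\text{Intercept}_j + \beta' X_i)$, so the estimated probabilities are:
$P(Y_i = 1 \mid X_i) = 1 - F(3.138 + \beta' X_i)$
$P(Y_i = 2 \mid X_i) = F(3.138 + \beta' X_i) - F(1.6139 + \beta' X_i)$
$P(Y_i = 3 \mid X_i) = F(1.6139 + \beta' X_i) - F(0.1255 + \beta' X_i)$
$P(Y_i = 4 \mid X_i) = F(0.1255 + \beta' X_i) - F(-1.4460 + \beta' X_i)$
$P(Y_i = 5 \mid X_i) = F(-1.4460 + \beta' X_i)$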
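As a worked illustration, the fitted intercepts and coefficients can be turned into the five category probabilities for one individual. This sketch assumes the SAS descending parameterization, in which each intercept belongs to the cumulative probability of rating at least j. The individual is hypothetical and the function and variable names are illustrative.

```python
import math

def logistic_cdf(x):
    """Standard logistic cdf."""
    return 1.0 / (1.0 + math.exp(-x))

# Fitted intercepts for P(Y >= j), j = 5, 4, 3, 2 (from the results table).
intercepts = {5: -1.446, 4: 0.1255, 3: 1.6139, 2: 3.138}
coefs = {"age": -0.0313, "gender": 0.00989, "race": -0.2122,
         "edu": 0.1553, "south": -0.7989}

def predicted_probs(person):
    """Return {j: P(Y = j | X)} for j = 1..5."""
    xb = sum(coefs[k] * person[k] for k in coefs)
    # Cumulative probabilities P(Y >= j); by definition P(Y >= 1) = 1
    # and P(Y >= 6) = 0.
    at_least = {1: 1.0, 6: 0.0}
    for j, a in intercepts.items():
        at_least[j] = logistic_cdf(a + xb)
    return {j: at_least[j] - at_least[j + 1] for j in range(1, 6)}

# Hypothetical individual: 50-year-old white woman with 12 years of
# schooling, living in the North.
person = {"age": 50, "gender": 1, "race": 0, "edu": 12, "south": 0}
probs = predicted_probs(person)
for j in sorted(probs):
    print(j, round(probs[j], 3))
```

Setting south to 1 for the same person shifts probability mass toward the lower ratings, in line with the negative coefficient on south.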
12
INFERENCE (ORDERED LOGIT)
One additional year of age results in a decrease of about 3.1% in the odds of a higher self-rating.
The impact of gender is almost negligible.
Blacks have 19.12% lower odds than whites of rating their health at higher response values.
An additional year of schooling leads to a 16.80% increase in the odds of a higher self-rating.
Southern residents in each district have 55% lower odds than northern residents of rating their health at higher response values.
There are 522179 pairs of observations. Of these, 65.8% are concordant pairs while 33.6% are discordant pairs.
13
SEQUENTIAL LOGIT MODEL
Choices/responses follow a sequence, so we need (m-1) latent variables to characterize m unordered choices.
The self-rated health measure can be considered as a purely cardinal variable following a sequence instead of some natural ordering.
This allows us to perform discrete choice analysis using the (non-ordered) sequential logit model.
14
SEQUENTIAL LOGIT MODEL
Root (Sample)
  Poor (1)
  Fair+++ (2 or 3 or 4 or 5)
    Fair (2)
    Good++ (3 or 4 or 5)
      Good (3)
      VeryGood+ (4 or 5)
        Very Good (4)
        Excellent (5)
Framework
Five choices, and hence we have 4 latent variables to describe the choices.
Choices in each step are independent of the previous step.
Probability Computation Example
$P(Y_i = 2) = P[Y_i \ne 1 \text{ and } Y_i = 2] = P[Y_i \ne 1] \, P[Y_i = 2 \mid Y_i \ne 1]$
Therefore, for an individual $i$, the conditional probability that his self-rated health measure takes a value $j \in \{1,2,3,4,5\}$ is given by $P_{ij} = P(Y_i = j \mid X_i)$, and so on till $j = 5$.
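The chaining of conditional probabilities down the tree can be sketched as follows. The four conditional probabilities are made-up numbers for illustration, not estimates from the data, and the function name is hypothetical.

```python
def sequential_probs(stop_probs):
    """Turn step-wise conditional probabilities into category probabilities.

    stop_probs[k] is P(Y = k+1 | Y >= k+1), the chance of stopping at the
    left-hand leaf of step k+1 given that the step was reached.
    """
    probs, reach = [], 1.0
    for p in stop_probs:
        probs.append(reach * p)  # reach this step, then stop here
        reach *= 1.0 - p         # otherwise continue down the tree
    probs.append(reach)          # last category: never stopped earlier
    return probs

# Hypothetical conditional probabilities for the four steps
# (Poor, Fair, Good, Very Good).
probs = sequential_probs([0.1, 0.2, 0.4, 0.5])
print(probs)
```

The second entry is the worked example from the slide: the probability of not reporting poor (0.9) times the probability of reporting fair given not poor (0.2).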
15
ESTIMATION IN SEQUENTIAL LOGIT MODEL
Maximum Likelihood Estimation
One-shot joint optimization with independent examples
Repeated optimization
Thus, the parameter $\beta_1$ can be estimated by dividing the entire sample into two groups: Poor versus Fair OR Good OR Very Good OR Excellent.
$\beta_2$ can be estimated by taking the sub-sample of those who did not report poor and dividing it into two groups: Fair versus Good OR Very Good OR Excellent.
$\beta_3$ can be estimated by taking the sub-sample of those who did not report poor or fair and dividing it into two groups: Good versus Very Good OR Excellent.
$\beta_4$ can be estimated by taking the sub-sample of those who did not report poor, fair or good and dividing it into two groups: Very Good versus Excellent.
In each case the binary model can be estimated by logit using MLE.
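The repeated-optimization route needs one binary data set per step. A minimal sketch of that bookkeeping, with a hypothetical list of ratings and an illustrative function name, is shown here. Fitting the binary logit on each sub-sample is left to whatever routine is at hand.

```python
def sequential_subsamples(ratings, n_levels=5):
    """Build the nested binary samples used to estimate beta_1..beta_{m-1}.

    For step k, keep individuals whose rating is at least k and code the
    outcome as 1 if the rating equals k, else 0. Each entry is a pair
    (index of the individual, binary outcome).
    """
    steps = {}
    for k in range(1, n_levels):
        steps[k] = [(i, int(r == k)) for i, r in enumerate(ratings) if r >= k]
    return steps

# Hypothetical self-rated health scores for seven individuals.
ratings = [1, 2, 2, 3, 4, 5, 5]
steps = sequential_subsamples(ratings)
for k, sample in steps.items():
    print(k, sample)
```

Each step drops the individuals who stopped at an earlier category, so the sub-samples shrink as k grows.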
16
SEQUENTIAL LOGIT MODEL
Implementation in SAS

/* Step 2 of the sequence: keep those who did not report poor */
data seqlogit;
  set seqlogit;
  fairplus = (shm>1);   /* 1 if fair or better */
  fair = (shm=2);       /* 1 if fair, 0 if good or better */
  if fairplus = 1;      /* subsetting IF: drop the poor group */
run;

/* Labels for the categorical variables */
proc format;
  value shm 1='poor' 2-5='fair+++';
  value gender 0='male' 1='female';
  value race 0='white' 1='black';
  value resid 0='north' 1='south';
run;

/* Binary logit for fair versus good++ */
proc qlim data=seqlogit; *covest=qml;
  class race resid gender;
  endogenous fair ~ discrete(dist=logistic order=formatted);
  model fair = age gender race edu resid;
  format gender gender. race race. resid resid.;
run;

Among those who report fair, good, very good or excellent health, the odds of reporting fair (rather than good++) are 64% lower among residents south of baseline than among residents north of baseline of the same age, gender, education and race.
17
CONCLUSION
Ordered Logit Model
Age, race, education (in terms of number of years of schooling) and residence in the southern part of the district have a significant impact on self-rated health. Gender doesn't have a significant impact.
Sequential Logit Model
Age, education (in terms of schooling) and residence in the southern part of the district have a significant impact on self-rated health. Gender and race don't have a significant impact.