COPYRIGHT OF: ABHINAV ANAND, JYOTI ARORA, SHRADDHA RAMSWAMY. DISCRETE CHOICE MODELING IN HEALTH ECONOMICS.



INTRODUCTION
Studies suggest that the self-rated health score is a reliable predictor of health status. We investigate the impact of personal and status characteristics, such as age, gender, race, education and region of residence, on the health perception of US citizens.

DATA
Dataset: NHANES Epidemiological Followup Study, 1992.
Health status, represented by $Y_i$, is coded as follows:
    POOR         $Y_i = 1$
    FAIR         $Y_i = 2$
    GOOD         $Y_i = 3$
    VERY GOOD    $Y_i = 4$
    EXCELLENT    $Y_i = 5$
Age is measured in years, education is measured as the number of years of schooling completed, and dichotomous variables are created for gender (female = 1) and race (black = 1).

METHODOLOGY
1) Ordered logit model for the first part of our enquiry.
2) Sequential logit model for the second part of our enquiry.

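ORDERED LOGIT MODEL SPECIFICATION
A multinomial choice model in which the values taken by the dependent variable have a natural order.
$Y_i^*$ is a latent variable such that $Y_i = j$ when $\alpha_{j-1} < Y_i^* < \alpha_j$, where $j = 1,2,3,4,5$ (with $\alpha_0 = -\infty$ and $\alpha_5 = +\infty$), and
$$Y_i^* = \beta' X_i + u_i$$
where $u_i$ follows a logistic distribution. The four cut points $\alpha_1 < \alpha_2 < \alpha_3 < \alpha_4$ divide the latent scale into the 5 response categories.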
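The mapping from the latent variable to the observed category can be sketched in a few lines of Python. The cut points below are hypothetical illustration values, not the fitted ones, and a latent value that falls exactly on a cut point (a probability-zero event under a continuous distribution) is assigned to the lower category by assumption.

```python
from bisect import bisect_left

# Hypothetical cut points alpha_1 < alpha_2 < alpha_3 < alpha_4 (not fitted values).
CUTS = [-2.0, -0.5, 1.0, 2.5]
LABELS = ["POOR", "FAIR", "GOOD", "VERY GOOD", "EXCELLENT"]

def category(y_star, cuts=CUTS):
    """Return j in 1..5 with alpha_{j-1} < y_star <= alpha_j,
    taking alpha_0 = -infinity and alpha_5 = +infinity."""
    return bisect_left(cuts, y_star) + 1

for y_star in (-3.1, -0.5, 0.2, 4.0):
    j = category(y_star)
    print(y_star, j, LABELS[j - 1])
```

The binary search is only a convenience; with four cut points a chain of comparisons would do the same job.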

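ORDERED LOGIT MODEL
$$P(Y_i = j \mid X_i) = F(\alpha_j - \beta' X_i) - F(\alpha_{j-1} - \beta' X_i)$$
where $F(\cdot)$ is a cdf, $j = 1,2,3,4,5$, $i$ denotes the $i$-th individual, and by convention $F(\alpha_0 - \beta' X_i) = 0$ and $F(\alpha_5 - \beta' X_i) = 1$. We assume that $u$ follows a logistic distribution, so $F(z) = e^{z}/(1 + e^{z})$.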

CONTD.
$$
\begin{aligned}
P(Y_i = 1 \mid X_i) &= F(\alpha_1 - \beta' X_i) \\
P(Y_i = 2 \mid X_i) &= F(\alpha_2 - \beta' X_i) - F(\alpha_1 - \beta' X_i) \\
P(Y_i = 3 \mid X_i) &= F(\alpha_3 - \beta' X_i) - F(\alpha_2 - \beta' X_i) \\
P(Y_i = 4 \mid X_i) &= F(\alpha_4 - \beta' X_i) - F(\alpha_3 - \beta' X_i) \\
P(Y_i = 5 \mid X_i) &= 1 - F(\alpha_4 - \beta' X_i)
\end{aligned}
$$
where $F(\cdot)$ is defined as above. To estimate the model we specify 5 dummy variables for the $i$-th individual by the following rule: $Z_{ij} = 1$ if $Y_i = j$, and $Z_{ij} = 0$ otherwise, where $j = 1,2,3,4,5$.
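As a minimal sketch, the five category probabilities above can be computed directly from the logistic cdf. The cut points and the value of the linear index β'X used here are hypothetical, chosen only to show that the probabilities are differences of adjacent cdf values and sum to one.

```python
import math

def logistic_cdf(z):
    """Logistic cdf F(z) = e^z / (1 + e^z), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def ordered_logit_probs(xb, cuts):
    """P(Y = j | X) for j = 1..len(cuts)+1, given xb = beta'X and
    increasing cut points alpha_1 < ... < alpha_4."""
    cdf_values = [logistic_cdf(a - xb) for a in cuts] + [1.0]
    probs, previous = [], 0.0
    for value in cdf_values:
        probs.append(value - previous)  # F(alpha_j - xb) - F(alpha_{j-1} - xb)
        previous = value
    return probs

# Hypothetical inputs, for illustration only.
probs = ordered_logit_probs(xb=0.3, cuts=[-2.0, -0.5, 1.0, 2.5])
print([round(p, 4) for p in probs], sum(probs))
```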

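ORDERED LOGIT MODEL ESTIMATION
The model is estimated by maximum likelihood (MLE), using the Newton-Raphson formula. Assuming independent observations, we get
$$L = \prod_{i} \prod_{j=1}^{5} \left[ F(\alpha_j - \beta' X_i) - F(\alpha_{j-1} - \beta' X_i) \right]^{Z_{ij}}$$
where $Z_{ij} = 1$ if $Y_i = j$ and $0$ otherwise, with $F(\alpha_0 - \beta' X_i) = 0$ and $F(\alpha_5 - \beta' X_i) = 1$.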
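The objective that maximum likelihood works on can be sketched as a log-likelihood function. This is an illustration under assumptions: the toy data, coefficients and cut points are invented, and the sketch only evaluates the log-likelihood; it does not implement the Newton-Raphson iterations themselves.

```python
import math

def logistic_cdf(z):
    """Logistic cdf, safe for z = +/- infinity."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def log_likelihood(beta, cuts, X, y):
    """Sum over individuals of log P(Y_i = y_i | X_i); equivalent to
    sum_i sum_j Z_ij * log P(Y_i = j | X_i) with Z_ij = 1 if y_i == j."""
    bounds = [-math.inf] + list(cuts) + [math.inf]  # alpha_0 .. alpha_5
    total = 0.0
    for x_i, y_i in zip(X, y):
        xb = sum(b * x for b, x in zip(beta, x_i))
        upper = logistic_cdf(bounds[y_i] - xb)
        lower = logistic_cdf(bounds[y_i - 1] - xb)
        total += math.log(upper - lower)
    return total

# Hypothetical toy data: columns are (age, years of schooling).
X = [[50.0, 12.0], [30.0, 16.0], [70.0, 8.0]]
y = [3, 5, 1]
print(log_likelihood(beta=[-0.03, 0.15], cuts=[-2.0, -0.5, 1.0, 2.5], X=X, y=y))
```

An optimizer would repeatedly call a function like this (or its derivatives) while updating β and the cut points.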

RESULTS
Analysis of Maximum Likelihood Estimates
Columns: Parameter, DF, Estimate, Standard Error, Chi-Square, Pr > ChiSq
[Only the parameter names and some p-values are legible; the remaining cells are missing.]
    Parameter    Pr > ChiSq
    Intercept    <.0001
    Intercept
    Intercept    <.0001
    Intercept    <.0001
    Age          <.0001
    gender
    race
    edu          <.0001
    south        <.0001
COMMAND:
    proc logistic data = sasuser.nhanes descending;
      model health = age gender race edu south;
    run;

CONTD.
Odds Ratio Estimates
Columns: Effect, Point Estimate, 95% Wald Confidence Limits
Effects: Age, gender, race, edu, south [the numeric values are missing]
Association of Predicted Probabilities and Observed Responses
    Percent Concordant    65.8     Somers' D    0.322
    Percent Discordant    33.6     Gamma        0.324
    Percent Tied           0.6     Tau-a        0.244
    Pairs                          c            0.661
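Three of the association measures in this table follow directly from the concordant, discordant and tied percentages. The sketch below uses the standard rank-correlation definitions, Somers' D = (C − D), Gamma = (C − D)/(C + D) and c = C + T/2 with C, D, T expressed as proportions of all pairs; Tau-a is omitted because it also needs the sample size, which is not available here.

```python
def association_measures(pct_concordant, pct_discordant, pct_tied):
    """Somers' D, Gamma and c from percentages of concordant,
    discordant and tied pairs (the three should sum to 100)."""
    c = pct_concordant / 100.0
    d = pct_discordant / 100.0
    t = pct_tied / 100.0
    return {
        "somers_d": c - d,
        "gamma": (c - d) / (c + d),
        "c": c + 0.5 * t,
    }

measures = association_measures(65.8, 33.6, 0.6)
print({name: round(value, 3) for name, value in measures.items()})
# Rounded to three decimals this reproduces the reported 0.322, 0.324 and 0.661.
```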

Probability estimate for the $i$-th individual
[The fitted probability expressions are not recoverable. The surviving fragments are terms of the form (intercept + β'X_i), one of which is (3.138 + β'X_i).]

INFERENCE (ORDERED LOGIT)
Each additional year of age is associated with a 3.13% decrease in the odds of a higher self-rating.
The impact of gender is almost negligible.
The odds of rating health at a higher response value are 19.12% lower for blacks than for whites.
Each additional year of schooling is associated with a 16.80% increase in the odds of a higher self-rating.
The odds of rating health at a higher response value are 55% lower for residents of the southern part of each district than for residents of the northern part.
Of the pairs of observations with different responses, 65.8% are concordant and 33.6% are discordant.
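The percentages quoted here are the usual transformation of a logit coefficient b into a percent change in odds, 100 × (e^b − 1). The sketch below shows the conversion in both directions; the coefficients it prints are back-calculated from the percentages in the text and are not values reported in the results table.

```python
import math

def pct_change_in_odds(coef):
    """Percent change in the odds for a one-unit increase in a regressor."""
    return 100.0 * (math.exp(coef) - 1.0)

def coef_from_pct_change(pct):
    """Logit coefficient implied by a given percent change in the odds."""
    return math.log(1.0 + pct / 100.0)

# Percent changes quoted in the text; the implied coefficients are derived, not reported.
quoted = {"age": -3.13, "race": -19.12, "edu": 16.80, "south": -55.0}
for name, pct in quoted.items():
    b = coef_from_pct_change(pct)
    print(name, "odds ratio:", round(1.0 + pct / 100.0, 4), "implied coefficient:", round(b, 4))
```

Note that a negative percent change corresponds to an odds ratio below 1 and a negative coefficient, which is why age, race and south lower the odds of a higher rating while schooling raises them.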

SEQUENTIAL LOGIT MODEL
The choices/responses follow a sequence, so we need (m-1) latent variables to characterize m unordered choices.
The self-rated health measure can be considered a purely cardinal variable that follows a sequence rather than some natural ordering.
This allows us to perform discrete choice analysis using the (non-ordered) sequential logit model.

SEQUENTIAL LOGIT MODEL
Root (Sample)
    Poor (1)
    Fair+++ (2 or 3 or 4 or 5)
        Fair (2)
        Good++ (3 or 4 or 5)
            Good (3)
            Very Good+ (4 or 5)
                Very Good (4)
                Excellent (5)
Framework
There are five choices, and hence we have 4 latent variables to describe the choices. The choices in each step are independent of the previous step.
Probability Computation Example
$$P(Y_i = 2) = P(Y_i \neq 1 \text{ and } Y_i = 2) = P(Y_i \neq 1)\, P(Y_i = 2 \mid Y_i \neq 1)$$
Therefore, for an individual $i$, the conditional probability that the self-rated health measure takes the value $j \in \{1,2,3,4,5\}$ is given by $P_{ij} = P(Y_i = j \mid X_i)$, and so on up to $j = 5$.
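The tree above implies that each category probability is a product of "continue" probabilities followed by one "stop" probability. The sketch below assumes, purely for illustration, that step k is a binary logit whose index β_k'X gives the probability of stopping at category k among those who reached that step; the sign convention and the index values are assumptions, not the authors' fitted model.

```python
import math

def logistic_cdf(z):
    """Logistic cdf F(z) = e^z / (1 + e^z), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def sequential_logit_probs(step_indices):
    """Category probabilities P(Y = 1), ..., P(Y = 5) from the four step
    indices beta_1'X, ..., beta_4'X (assumed: F(index) = P(stop at this step))."""
    probs, reached = [], 1.0
    for index in step_indices:
        stop = logistic_cdf(index)
        probs.append(reached * stop)   # reached this step and stopped here
        reached *= 1.0 - stop          # reached this step and moved on
    probs.append(reached)              # last category: passed every step
    return probs

print(sequential_logit_probs([0.0, 0.0, 0.0, 0.0]))
# → [0.5, 0.25, 0.125, 0.0625, 0.0625]
```

With all indices at zero every step is a coin flip, which makes the multiplication along the tree easy to check by hand.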

ESTIMATION IN SEQUENTIAL LOGIT MODEL
Maximum Likelihood Estimation: one-shot joint optimization with independent examples, or repeated optimization.
Thus, the parameter $\beta_1$ can be estimated by dividing the entire sample into two groups: Poor vs. Fair OR Good OR Very Good OR Excellent.
$\beta_2$ can be estimated by taking the sub-sample of those who did not report poor and dividing it into two groups: Fair vs. Good OR Very Good OR Excellent.
$\beta_3$ can be estimated by taking the sub-sample of those who did not report poor or fair and dividing it into two groups: Good vs. Very Good OR Excellent.
$\beta_4$ can be estimated by taking the sub-sample of those who did not report poor, fair or good and dividing it into two groups: Very Good vs. Excellent.
In each case the binary model can be estimated by logit using MLE.
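The repeated-optimization scheme amounts to building four nested binary data sets. A minimal sketch of that construction follows, with an invented response vector; the binary outcome is coded 1 for the lowest remaining category, which is an assumed coding chosen for illustration.

```python
def sequential_subsamples(y, n_categories=5):
    """For each step k = 1..n_categories-1, return (row indices, binary outcome):
    the sub-sample keeps respondents with y >= k, and the outcome is 1 when
    y == k (stopped at this category) and 0 when y > k (moved on)."""
    steps = []
    for k in range(1, n_categories):
        rows = [i for i, value in enumerate(y) if value >= k]
        outcome = [1 if y[i] == k else 0 for i in rows]
        steps.append((rows, outcome))
    return steps

# Hypothetical self-rated health responses coded 1 (poor) to 5 (excellent).
y = [1, 2, 2, 3, 5, 4, 1, 3]
for k, (rows, outcome) in enumerate(sequential_subsamples(y), start=1):
    print("step", k, "sub-sample size:", len(rows), "stopped here:", sum(outcome))
```

Each (rows, outcome) pair would then be passed, together with the matching rows of the regressors, to an ordinary binary logit routine to estimate the corresponding β_k.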

SEQUENTIAL LOGIT MODEL
Implementation in SAS

    /* Second step: keep respondents who did not report poor; */
    /* fair = 1 if fair, 0 if good, very good or excellent.   */
    data seqlogit;
      set seqlogit;
      fairplus = (shm>1);
      fair = (shm=2);
      if fairplus = 1;
    run;

    proc format;
      value shm 1='poor' 2-5='fair+++';
      value gender 0='male' 1='female';
      value race 0='white' 1='black';
      value resid 0='north' 1='south';
    run;

    proc qlim data=seqlogit; *covest=qml;
      class race resid gender;
      endogenous fair ~ discrete(dist=logistic order=formatted);
      model fair = age gender race edu resid;
      format gender gender. race race. resid resid.;
    run;

Among those who report fair, good, very good or excellent health, the odds of reporting fair (rather than good++) are 64% lower among residents south of baseline than among residents north of baseline of the same age, gender, education and race.

CONCLUSION
Ordered Logit Model: Age, race, education (in terms of number of years of schooling) and residence in the southern part of the district have a significant impact on self-rated health. Gender does not have a significant impact.
Sequential Logit Model: Age, education (in terms of schooling) and residence in the southern part of the district have a significant impact on self-rated health. Gender and race do not have a significant impact.

REFERENCES
Agresti A. Categorical Data Analysis, Second edition. New York: John Wiley & Sons; 2002.
Gardiner J C., Luo Z. Logit Models in Practice: B, C, E, G, M, N, O… SAS Institute Inc.; 2011.