Discrete Choice Modeling William Greene Stern School of Business IFS at UCL February 11-13, 2004

Part 3 Modeling Binary Choice

A Model for Binary Choice
Yes or No decision (Buy / Not buy). Example: choose to fly or not to fly to a destination when there are alternatives.
Model: net utility of flying U_fly = α + β₁·Cost + β₂·Time + γ·Income + ε. Choose to fly if net utility is positive.
Data: X = [1, cost, terminal time], Z = [income], y = 1 if choose fly (U_fly > 0), 0 if not.
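To make the setup concrete, here is a minimal simulation sketch of the latent-utility model just described. It is not part of the original slides; the coefficient values and variable ranges are hypothetical, chosen only to illustrate how the utility index and the observed 0/1 choice are related.

```python
# Minimal sketch (not from the slides): simulate the latent-utility model
# U_fly = a + b1*Cost + b2*TTime + g*Income + e, choose to fly if U_fly > 0.
# All parameter values and variable ranges below are hypothetical.
import numpy as np

rng = np.random.default_rng(0)
n = 210                                   # same sample size as the application
cost = rng.uniform(30.0, 120.0, n)        # generalized cost
ttime = rng.uniform(10.0, 90.0, n)        # terminal (waiting) time
income = rng.uniform(20.0, 70.0, n)       # household income

a, b1, b2, g = -1.0, -0.02, -0.05, 0.08   # hypothetical coefficients
eps = rng.normal(size=n)                  # standard normal errors (probit-type)

u_fly = a + b1 * cost + b2 * ttime + g * income + eps
y = (u_fly > 0).astype(int)               # y = 1 if the traveler chooses air
print("share choosing air:", y.mean())
```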

What Can Be Learned from the Data? (A Sample of Consumers, i = 1,…,N)
Are the attributes "relevant"?
Predicting behavior: individual and aggregate.
Analyzing changes in behavior when attributes change.

Application
210 commuters between Sydney and Melbourne. Available modes: Air, Train, Bus, Car.
Observed: choice; attributes (cost, terminal time, other); characteristics (household income).
First application: fly or other.

Binary Choice Data (sample listing with columns: Choose Air, Gen. Cost, Terminal Time, Income)

An Econometric Model
Choose to fly iff U_fly > 0, where U_fly = α + β₁·Cost + β₂·Time + γ·Income + ε.
U_fly > 0 ⇔ ε > -(α + β₁·Cost + β₂·Time + γ·Income).
Probability model: for any person observed by the analyst, Prob(fly) = Prob[ε > -(α + β₁·Cost + β₂·Time + γ·Income)].
Note the relationship between the unobserved ε and the observed outcome.

 +  1Cost +  2TTime +  Income

Econometrics
How to estimate α, β₁, β₂, γ? It is not regression; the technique is maximum likelihood.
Prob[y=1] = Prob[ε > -(α + β₁·Cost + β₂·Time + γ·Income)]
Prob[y=0] = 1 - Prob[y=1]
This requires a model for the probability.
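As a sketch of what "the technique of maximum likelihood" means here, the code below builds the binary-choice log likelihood under a normal error distribution (the probit case discussed on the next slide) and maximizes it numerically. It reuses the simulated cost, ttime, income, and y arrays from the earlier sketch; the setup is illustrative, not the estimator that produced the slide output.

```python
# Sketch: maximum likelihood for a binary choice model with normal errors.
# Assumes the arrays cost, ttime, income, y from the simulation sketch above.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

X = np.column_stack([np.ones_like(cost), cost, ttime, income])

def neg_loglik(theta):
    p = norm.cdf(X @ theta)                  # Prob[y = 1] = F(index)
    p = np.clip(p, 1e-12, 1 - 1e-12)         # guard against log(0)
    return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

res = minimize(neg_loglik, x0=np.zeros(X.shape[1]), method="BFGS")
print("estimates (a, b1, b2, g):", res.x)
print("log likelihood:", -res.fun)
```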

Completing the Model: F(ε)
The distribution:
Normal: PROBIT, natural for behavior.
Logistic: LOGIT, allows "thicker tails."
Gompertz: EXTREME VALUE, asymmetric; underlies the basic logit model for multiple choice.
Does it matter? Yes, there can be large differences in the estimates; not much for the quantities of interest, which are more stable.
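The sketch below simply evaluates three candidate CDFs at the same index values to show that the choice of F(·) matters mostly in the tails. Which extreme-value convention (minimum vs. maximum, complementary log-log vs. Gumbel) a given package uses varies; the Gumbel CDF here is one illustrative choice, not necessarily the one behind the slide's estimates.

```python
# Sketch: compare candidate CDFs F(index) for the probit, logit, and an
# extreme-value (Gumbel) specification at a few index values.
import numpy as np
from scipy.stats import norm, logistic, gumbel_r

idx = np.linspace(-3, 3, 7)
print("index :", idx)
print("probit:", np.round(norm.cdf(idx), 3))
print("logit :", np.round(logistic.cdf(idx), 3))
print("gumbel:", np.round(gumbel_r.cdf(idx), 3))  # one asymmetric alternative
```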

Estimated Binary Choice Model
Binomial probit model, maximum likelihood estimates. Dependent variable: MODE; weighting variable: none; number of observations: 210; iterations completed: 6; chi-squared test with 3 degrees of freedom; Hosmer-Lemeshow chi-squared with 8 degrees of freedom.
Index function for probability: Constant, GC, TTME, HINC (the output reports coefficient, standard error, b/St.Er., P[|Z|>z], and mean of X for each).

Estimated Binary Choice Models
Comparison table for the LOGIT, PROBIT, and EXTREME VALUE models: rows are Constant, GC, TTME, HINC, Log-L, and Log-L(0); columns give the estimate and t-ratio for each model.

 +  1Cost +  2Time +  (Income+1) Effect on predicted probability of an increase in income (  is positive)

How Well Does the Model Fit?
There is no R squared.
"Fit measures" computed from log L: pseudo R squared = 1 - logL/logL0, and others; these do not measure fit.
Alternative: direct assessment of the effectiveness of the model at predicting the outcome.

Fit Measures for Binary Choice
Likelihood Ratio Index: bounded by 0 and 1; rises when the model is expanded.
Cramer's measure (and others).
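A sketch of the likelihood ratio index (McFadden's pseudo R-squared) using the quantities from the MLE sketch above: logL0 is the log likelihood of a constant-only model, which is fit exactly by the sample proportion.

```python
# Sketch: likelihood ratio index, 1 - logL/logL0, using logL from the MLE
# sketch above; logL0 comes from a constant-only model (the sample proportion).
import numpy as np

logL = -res.fun
p1 = y.mean()
logL0 = y.sum() * np.log(p1) + (len(y) - y.sum()) * np.log(1 - p1)
print("LRI:", 1 - logL / logL0)                    # between 0 and 1
```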

Fit Measures for the Logit Model
Output: Fit Measures for Binomial Choice Model; probit model for variable MODE. Proportions P0, P1; N = 210, N0 = 152, N1 = 58; LogL and LogL0.
Pseudo R-squared measures reported: Estrella = 1-(L/L0)^(-2L0/n), Efron, McFadden, Ben./Lerman, Cramer, Veall/Zimmermann, Rsqrd_ML.
Information criteria: Akaike I.C., Schwarz I.C.

Predicting the Outcome
Predicted probabilities: P = F(a + b1·Cost + b2·Time + c·Income).
Predicting outcomes: predict y = 1 if P is large; use 0.5 for "large" (more likely than not).
Count successes and failures.
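The sketch below carries out exactly this recipe with the estimates from the earlier MLE sketch: compute P for each observation, predict y = 1 when P exceeds 0.5, and cross-tabulate actual against predicted outcomes.

```python
# Sketch: predicted probabilities, 0/1 predictions at the 0.5 threshold, and a
# 2x2 table of actual vs. predicted outcomes, using X, y, res.x from above.
import numpy as np
from scipy.stats import norm

p_hat = norm.cdf(X @ res.x)
y_hat = (p_hat > 0.5).astype(int)

table = np.zeros((2, 2), dtype=int)
for actual, pred in zip(y, y_hat):
    table[actual, pred] += 1
print(table)                    # rows: actual 0/1, columns: predicted 0/1
```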

Individual Predictions from a Logit Model
Listing with columns: Observation, Observed Y, Predicted Y, Residual, x(i)b, Pr[Y=1].
Note two types of errors and two types of successes.

Predictions in Binary Choice
Predict y = 1 if P > P*. Success depends on the assumed P*.

ROC Curve
Plot the % of actual 1s correctly predicted against the % of actual 0s incorrectly predicted as 1s, as the threshold varies.
The 45° line is no fit; curvature above it implies fit.
The area under the curve compares models.
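A sketch of how the ROC curve and its area can be traced out by sweeping the threshold P*, using p_hat and y from the prediction sketch above; the trapezoidal area calculation is one simple way to compare models.

```python
# Sketch: ROC points from sweeping the threshold P*, and a trapezoidal AUC,
# using p_hat and y from the prediction sketch above.
import numpy as np

thresholds = np.linspace(1.0, 0.0, 101)
tpr = np.array([((p_hat > t) & (y == 1)).sum() / (y == 1).sum() for t in thresholds])
fpr = np.array([((p_hat > t) & (y == 0)).sum() / (y == 0).sum() for t in thresholds])

auc = np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0)   # trapezoid rule
print("AUC:", auc)
```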

Aggregate Predictions
Frequencies of actual & predicted outcomes. Predicted outcome has maximum probability; threshold value for predicting Y = 1 is 0.5000.
(2x2 table of Actual 0/1 by Predicted 0/1, with row, column, and grand totals; N = 210.)

Analyzing Predictions
Frequencies of actual & predicted outcomes. Predicted outcome has maximum probability; threshold value for predicting Y = 1 is P* (this table can be computed with any P*).

              Predicted 0   Predicted 1 | Total
  Actual 0    N(a0,p0)      N(a0,p1)    | N(a0)
  Actual 1    N(a1,p0)      N(a1,p1)    | N(a1)
  Total       N(p0)         N(p1)       | N

Analyzing Predictions - Success
Sensitivity = % of actual 1s correctly predicted = 100·N(a1,p1)/N(a1) [100·(38/58) = 65.5%]
Specificity = % of actual 0s correctly predicted = 100·N(a0,p0)/N(a0) [100·(151/152) = 99.3%]
Positive predictive value = % of predicted 1s that were actual 1s = 100·N(a1,p1)/N(p1) [100·(38/39) = 97.4%]
Negative predictive value = % of predicted 0s that were actual 0s = 100·N(a0,p0)/N(p0) [100·(151/171) = 88.3%]
Correct prediction = % of actual 1s and 0s correctly predicted = 100·[N(a1,p1)+N(a0,p0)]/N [100·(151+38)/210 = 90.0%]

Analyzing Predictions - Failures
False positive for true negative = % of actual 0s predicted as 1s = 100·N(a0,p1)/N(a0) [100·(1/152) = 0.66%]
False negative for true positive = % of actual 1s predicted as 0s = 100·N(a1,p0)/N(a1) [100·(20/58) = 34.5%]
False positive for predicted positive = % of predicted 1s that were actual 0s = 100·N(a0,p1)/N(p1) [100·(1/39) = 2.56%]
False negative for predicted negative = % of predicted 0s that were actual 1s = 100·N(a1,p0)/N(p0) [100·(20/171) = 11.7%]
False predictions = % of actual 1s and 0s incorrectly predicted = 100·[N(a0,p1)+N(a1,p0)]/N [100·(1+20)/210 = 10.0%]
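A sketch that computes the success and failure rates defined on the last two slides from the 2x2 table built in the prediction sketch (rows index the actual outcome, columns the predicted outcome).

```python
# Sketch: the rates defined above, computed from the 2x2 table built earlier
# (table[actual, predicted]).
n00, n01 = table[0, 0], table[0, 1]     # actual 0: predicted 0 / predicted 1
n10, n11 = table[1, 0], table[1, 1]     # actual 1: predicted 0 / predicted 1

sensitivity = n11 / (n10 + n11)         # % of actual 1s correctly predicted
specificity = n00 / (n00 + n01)         # % of actual 0s correctly predicted
ppv = n11 / (n01 + n11)                 # predicted 1s that were actual 1s
npv = n00 / (n00 + n10)                 # predicted 0s that were actual 0s
correct = (n00 + n11) / table.sum()     # overall correct prediction rate
print(sensitivity, specificity, ppv, npv, correct)
```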

Aggregate Prediction is a Useful Way to Assess the Importance of a Variable
Frequencies of actual & predicted outcomes; predicted outcome has maximum probability; threshold value for predicting Y = 1 is 0.5000.
(Two 2x2 tables of Actual by Predicted outcomes, each with N = 210: model fit without TTME vs. model fit with TTME.)