MODELS OF QUALITATIVE CHOICE by Bambang Juanda

• Models in which the dependent variable involves two or more qualitative choices.
• Valuable for the analysis of survey data in which the behavioral responses are qualitative: one uses either the subway, the bus, or the automobile; one is either in the labor force or out of it; and so on.
• Binary-choice models assume that individuals face a choice between two alternatives and that their choice depends on their characteristics.

• Although it is reasonable to expect a direct relationship between their characteristics and their choices, we cannot be sure how each and every individual makes a choice.
• One purpose of a qualitative choice model is to determine the probability that an individual with a given set of attributes will make one choice rather than the alternative.
• We assume that the probability of an individual making a given choice is a linear function of individual attributes.

Overview
• Continuous response → linear regression analysis.
• Categorical response → linear probability model, probit model.

1. Linear Probability Model

Y_i = α + β X_i + ε_i   (10.1)

where
X_i = value of the attribute, e.g. income of the i-th individual,
Y_i = 1 if the first option is chosen (buy a car), 0 if the second option is chosen (do not buy a car),
ε_i = independently distributed random variable with zero mean.

To interpret eq. (10.1), we take the expected value of each dependent variable observation Y_i:

E(Y_i) = α + β X_i   (10.2)

Since Y_i can take on only the two values 1 and 0, we can describe the probability distribution of Y by letting P_i = P(Y_i = 1) and 1 − P_i = P(Y_i = 0). Then

E(Y_i) = 1·P_i + 0·(1 − P_i) = P_i   (10.3)

Model (10.1) therefore gives the probability that the i-th individual buys a car, given information about her income. The slope β measures the effect of a unit change in income on the probability of buying a car.
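A minimal sketch of estimating eq. (10.1) by ordinary least squares; the income and purchase values below are hypothetical, and the numpy and statsmodels packages are assumed to be available:

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical data: income (in $1,000s) and a 0/1 purchase indicator.
income = np.array([12, 18, 22, 27, 31, 35, 40, 46, 52, 60], dtype=float)
buy    = np.array([ 0,  0,  0,  1,  0,  1,  1,  1,  1,  1])

# Linear probability model: Y_i = alpha + beta * X_i + eps_i
X = sm.add_constant(income)     # adds the intercept column
lpm = sm.OLS(buy, X).fit()

print(lpm.params)               # [alpha_hat, beta_hat]
print(lpm.predict(X))           # fitted "probabilities" (may fall outside [0, 1])
```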

Estimated Linear Probability Model

P̂_i = α̂ + β̂ X_i,  if 0 < α̂ + β̂ X_i < 1
P̂_i = 1,           if α̂ + β̂ X_i ≥ 1
P̂_i = 0,           if α̂ + β̂ X_i ≤ 0   (10.4)
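A small helper sketching the truncation rule (10.4); the function name and the numeric estimates are illustrative only, not from the slides:

```python
import numpy as np

def lpm_probability(alpha_hat: float, beta_hat: float, x: np.ndarray) -> np.ndarray:
    """Predicted probability from an estimated linear probability model,
    truncated to the unit interval as in eq. (10.4)."""
    return np.clip(alpha_hat + beta_hat * x, 0.0, 1.0)

# Example with hypothetical estimates alpha_hat = -0.30, beta_hat = 0.025:
print(lpm_probability(-0.30, 0.025, np.array([5.0, 30.0, 60.0])))
```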

Probability Distribution of ε_i

Y_i    ε_i               Probability
1      1 − α − β X_i     P_i
0      −α − β X_i        1 − P_i

Setting E(ε_i) = (1 − α − β X_i) P_i + (−α − β X_i)(1 − P_i) = 0 gives

P_i = α + β X_i  and  1 − P_i = 1 − α − β X_i.

Variance of the error term:

Var(ε_i) = (1 − α − β X_i)² P_i + (α + β X_i)² (1 − P_i) = P_i (1 − P_i).

Thus the variable Y follows a Bernoulli probability distribution, and the error term is heteroscedastic.
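A quick numeric illustration of the heteroscedasticity: under the model, Var(ε_i) = P_i(1 − P_i) changes with X_i. The α̂ and β̂ values below are hypothetical; numpy is assumed:

```python
import numpy as np

alpha_hat, beta_hat = -0.30, 0.025            # hypothetical estimates
x = np.array([15.0, 30.0, 45.0])
p = np.clip(alpha_hat + beta_hat * x, 0.0, 1.0)

var_eps = p * (1.0 - p)                        # Bernoulli variance, depends on x
print(var_eps)                                 # not constant -> heteroscedastic
```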

Difficulties with the linear probability model mean we need to transform the original model so that predictions lie in the (0, 1) interval for all X. One such transformation is the cumulative probability function F.[1] The resulting probability model can be represented as

P_i = F(α + β X_i) = F(Z_i)

While numerous alternative cumulative probability functions are possible, we shall consider only two: the normal and the logistic cumulative probability functions.

[1] The cumulative probability function is F(x_i) = Prob(X ≤ x_i).
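Both candidate transformations squeeze any index value Z = α + βX into the (0, 1) interval; a brief sketch using the normal and logistic CDFs (scipy is assumed to be available):

```python
import numpy as np
from scipy.stats import norm, logistic

z = np.linspace(-4, 4, 9)
print(norm.cdf(z))       # normal CDF     -> probit model
print(logistic.cdf(z))   # logistic CDF   -> logit model
```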

2. Probit Model

P_i = F(α + β X_i) = F(Z_i)

Assume there is a theoretical continuous index Z_i that is determined by an explanatory variable X, so that we can write Z_i = α + β X_i. Assume further that Z is a normally distributed random variable, so that the probability that Z is less than (or equal to) Z_i can be computed from the cumulative normal probability function. The standardized cumulative normal function is written

P_i = F(Z_i) = (1/√(2π)) ∫ from −∞ to Z_i of e^(−s²/2) ds   (10.9)

where s is a random variable that is normally distributed with mean zero and unit variance. By construction, P_i lies in the (0, 1) interval. P_i represents the probability that an individual with income X_i makes a choice (buys a car). Since this probability is measured by the area under the standard normal curve from −∞ to Z_i, the event (buying a car) is more likely to occur the larger the value of the index Z_i. To obtain an estimate of the index Z_i, we apply the inverse of the cumulative normal function to eq. (10.9):

Z_i = F⁻¹(P_i) = α + β X_i
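A sketch of probit estimation by maximum likelihood using statsmodels, on the same hypothetical income/purchase data as before (statsmodels and scipy assumed):

```python
import numpy as np
import statsmodels.api as sm
from scipy.stats import norm

income = np.array([12, 18, 22, 27, 31, 35, 40, 46, 52, 60], dtype=float)
buy    = np.array([ 0,  0,  0,  1,  0,  1,  1,  1,  1,  1])

X = sm.add_constant(income)
probit = sm.Probit(buy, X).fit()

alpha_hat, beta_hat = probit.params
z = alpha_hat + beta_hat * income     # estimated index Z_i
print(norm.cdf(z))                    # P_i = F(Z_i), always inside (0, 1)
print(norm.ppf(norm.cdf(z)))          # inverse CDF recovers Z_i = F^{-1}(P_i)
```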

The Relation of the Index Z and the Cumulative Normal Probability Function
[Figure: F(Z) plotted against Z.]

Linear Probability Model vs. Probit Model
[Figure: comparison of the linear model and the probit model.]

• While the probit model is more appealing than the linear probability model, it involves nonlinear maximum likelihood estimation.
• The theoretical justification for employing the probit model is somewhat limited.
• We shall therefore consider a somewhat more appealing model specification, the logit model.

Logistic Regression Model (Logit Model)

The logit model is based on the cumulative logistic probability function. The simple logit model is specified as

P_i = E(Y = 1 | X_i) = 1 / (1 + e^(−(α + β X_i)))

The logistic distribution curve is S-shaped, so its interpretation is intuitive and 0 ≤ E(Y | X) ≤ 1. Interpretation: P_i is the probability that the i-th individual makes a choice (e.g. buys a car), given information about her income X_i.
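A minimal sketch of the simple logit model fitted by maximum likelihood on the same hypothetical income/purchase data (statsmodels assumed):

```python
import numpy as np
import statsmodels.api as sm

income = np.array([12, 18, 22, 27, 31, 35, 40, 46, 52, 60], dtype=float)
buy    = np.array([ 0,  0,  0,  1,  0,  1,  1,  1,  1,  1])

X = sm.add_constant(income)
logit_fit = sm.Logit(buy, X).fit()

alpha_hat, beta_hat = logit_fit.params
p_hat = 1.0 / (1.0 + np.exp(-(alpha_hat + beta_hat * income)))
print(p_hat)                 # S-shaped in income, always inside (0, 1)
```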

Logit Transformation

The probability of an event, p_i, is transformed as

g(x_i) = logit(p_i) = ln[ p_i / (1 − p_i) ] = β_0 + β_1 x_i

where
i = index of the observations (1, 2, ..., n),
p_i = probability that the event (e.g. buying a car) occurs for the i-th observation,
ln = natural logarithm (base e).

The function g(x) is linear in the parameters and −∞ < g(x) < ∞, so that, when p_i is an observed proportion (grouped data), it can be estimated by OLS.
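When the data are grouped and each p_i is an observed proportion strictly between 0 and 1, the logit transform really can be fitted by OLS; a sketch with hypothetical grouped data (statsmodels assumed):

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical grouped data: mean income of each group and the observed
# proportion of buyers in that group (all strictly between 0 and 1).
group_income = np.array([15.0, 25.0, 35.0, 45.0, 55.0])
prop_buy     = np.array([0.10, 0.25, 0.50, 0.70, 0.85])

g = np.log(prop_buy / (1.0 - prop_buy))    # logit transform, unbounded
ols = sm.OLS(g, sm.add_constant(group_income)).fit()
print(ols.params)                           # [beta0_hat, beta1_hat]
```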

Assumption: the variable X has an interval scale.
[Figure: P_i vs. the predictor X, and the logit transformation vs. the predictor X.]

Interpretation of Logit Model Coefficients

For a binary independent variable, e.g. sex (X = 1, X = 0):

          X = 1       X = 0
Y = 1     P(1)        P(0)
Y = 0     1 − P(1)    1 − P(0)
Total     1           1

P(1): probability of buying a car for a male consumer.
P(0): probability of buying a car for a female consumer.

Interpretation of the Coefficient

β_1 = g(X + 1) − g(X). For binary X: β_1 = g(1) − g(0).

Odds ratio: OR = exp(β_1) = [P(1)/(1 − P(1))] / [P(0)/(1 − P(0))] — how much more likely a male consumer is to buy than a female consumer. The odds ratio is a measure of association.

The relative-probability interpretation P(1)/P(0) is applicable only when P(x) is small.

For continuous X, exp(β_1) tells how much more likely the purchase becomes (in odds terms) when X increases by 1 unit.
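A sketch of the odds-ratio calculation for a binary regressor, using hypothetical sex/purchase data (sex = 1 for male, 0 for female; statsmodels assumed). With a single binary predictor, exp(β̂_1) reproduces the sample odds ratio:

```python
import numpy as np
import statsmodels.api as sm

sex = np.array([1, 1, 1, 1, 1, 0, 0, 0, 0, 0])
buy = np.array([1, 1, 1, 0, 1, 0, 1, 0, 0, 0])

fit = sm.Logit(buy, sm.add_constant(sex)).fit()
beta1 = fit.params[1]

odds_ratio = np.exp(beta1)     # odds of buying for men / odds for women
print(odds_ratio)
```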

Properties of the Odds Ratio

OR = 1 indicates no association; the odds ratio compares the odds at X = x + 1 with the odds at X = x.

A (1 − α)·100% confidence interval for the odds ratio is exp(c β̂_1 ± z_(α/2) · c · SE(β̂_1)), where c is the size of the change in X.
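A small helper sketching the confidence-interval formula above; the function name and the numeric inputs are illustrative, not from the slides (scipy assumed):

```python
import numpy as np
from scipy.stats import norm

def odds_ratio_ci(beta_hat: float, se_beta: float, c: float = 1.0, alpha: float = 0.05):
    """(1 - alpha)*100% CI for the odds ratio of a c-unit change in X:
    exp(c*beta_hat +/- z_{alpha/2} * c * se_beta)."""
    z = norm.ppf(1.0 - alpha / 2.0)
    lo = np.exp(c * beta_hat - z * c * se_beta)
    hi = np.exp(c * beta_hat + z * c * se_beta)
    return lo, hi

# Hypothetical estimates: beta_hat = 0.12, SE = 0.04, change of c = 10 units.
print(odds_ratio_ci(0.12, 0.04, c=10))
```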

Multiple Logistic Regression

Illustration: a model to study the effect of sex (X_1), age (X_2), and income (X_3) on buying a car:

logit(p_i) = β_0 + β_1 X_1i + β_2 X_2i + β_3 X_3i

For an independent continuous variable X, occasionally a 1-unit change is too small or too large to consider. For a change of c units,

g(x + c) − g(x) = c β_1,

and the corresponding odds ratio is exp(c β_1).
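A sketch of this three-regressor logit on simulated, hypothetical data, including the odds ratio for a c = 10 change in income (statsmodels assumed; the coefficients in the simulation are made up):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 200
sex    = rng.integers(0, 2, n)                 # hypothetical covariates
age    = rng.uniform(20, 60, n)
income = rng.uniform(10, 80, n)

# Hypothetical "true" index, used only to simulate purchase outcomes.
eta = -4.0 + 0.8 * sex + 0.03 * age + 0.06 * income
buy = rng.binomial(1, 1.0 / (1.0 + np.exp(-eta)))

X = sm.add_constant(np.column_stack([sex, age, income]))
fit = sm.Logit(buy, X).fit()

c = 10.0                                       # a 10-unit change in income
print(np.exp(c * fit.params[3]))               # odds ratio for +10 income
```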

Testing a Model with p Independent Variables

Testing the significance of the model as a whole:
H_0: β_1 = β_2 = … = β_p = 0
H_1: at least one β_j ≠ 0
→ likelihood ratio test statistic G ~ χ²(p).

Testing a coefficient partially:
H_0: β_j = 0
H_1: β_j ≠ 0
→ Wald test statistic W ~ Z (standard normal).
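A sketch of both tests computed from statsmodels logit fits on simulated, hypothetical data: G compares the full model with an intercept-only model, and W is a coefficient divided by its standard error (statsmodels and scipy assumed):

```python
import numpy as np
import statsmodels.api as sm
from scipy.stats import chi2, norm

rng = np.random.default_rng(0)
n = 200
x1 = rng.uniform(0, 1, n)
x2 = rng.uniform(0, 1, n)
eta = -1.0 + 2.0 * x1 + 0.5 * x2               # hypothetical data-generating index
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-eta)))

X_full = sm.add_constant(np.column_stack([x1, x2]))
X_null = np.ones((n, 1))                        # intercept-only model

full = sm.Logit(y, X_full).fit(disp=0)
null = sm.Logit(y, X_null).fit(disp=0)

# Likelihood-ratio test of H0: beta_1 = beta_2 = 0
G = 2.0 * (full.llf - null.llf)
p_value_G = chi2.sf(G, df=2)                    # G ~ chi-square with p = 2 df under H0

# Wald test of H0: beta_1 = 0
W = full.params[1] / full.bse[1]
p_value_W = 2.0 * norm.sf(abs(W))               # W ~ N(0, 1) under H0

print(G, p_value_G, W, p_value_W)
```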

Adjusted Odds Ratio: the odds ratio for one variable while controlling for the other variables in the model.

Types of Logistic Regression, by response variable:
• Two categories (binary) → binary logistic regression.
• Three or more categories, nominal → nominal logistic regression.
• Three or more categories, ordinal → ordinal logistic regression.