Presentation transcript:

Regression Models with Binary Response

Regression: "Regression is a process in which we estimate one variable on the basis of one or more other variables." For example:

- We can estimate the production of wheat on the basis of the amount of rainfall and the fertilizer used.
- We can estimate a student's marks on the basis of hours of study, IQ level, and quality of teaching.
- We can estimate whether or not a family owns a house. Whether a family owns a house depends on many factors, such as income level and social status.

Introduction

There are situations where the response variable is qualitative; in such situations binary response models are applied. In a model where the response is quantitative, our objective is to estimate the expected value of the response for given values of the explanatory variables, i.e. E(Y_i | X_{1i}, X_{2i}, ..., X_{pi}). In models where the response Y is qualitative, we are interested in the probability of something happening (such as voting for a particular candidate or owning a car) given the values of the explanatory variables, i.e. P(Y_i = 1 | X_{1i}, X_{2i}, ..., X_{pi}). So binary response models are also called probability models. Some basic binary response models are:

1. Linear probability model
2. Logit regression model
3. Probit regression model
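As a brief aside (this derivation is not on the original slide), the reason a model for a binary Y is automatically a probability model is that the conditional mean of a 0/1 variable equals the conditional probability:

```latex
\operatorname{E}(Y_i \mid X_{1i},\dots,X_{pi})
  = 1 \cdot P(Y_i = 1 \mid X_{1i},\dots,X_{pi})
  + 0 \cdot P(Y_i = 0 \mid X_{1i},\dots,X_{pi})
  = P(Y_i = 1 \mid X_{1i},\dots,X_{pi}).
```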

Linear Probability Model (LPM)

It is the simplest binary response model: the OLS method is simply used to regress the dichotomous response on the independent variable(s). The linear probability model can be presented in the following form:

Y_i = β_0 + β_1 X_i + u_i,

where Y = 1 for one category of the response and Y = 0 for the other. E(Y_i | X_i) = β_0 + β_1 X_i = P_i can be interpreted as the conditional probability that the event will occur given the level of X_i (in this model the explanatory variable X_i may be continuous or categorical, but Y must be a dichotomous random variable).
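The original slides contain no code; the following minimal sketch (Python with statsmodels, on simulated data that is purely an assumption for illustration) shows how an LPM is fitted by plain OLS:

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical simulated data: does a family own a house (Y = 1) or not (Y = 0),
# given its income?  The data-generating process below is an assumption made
# purely for illustration.
rng = np.random.default_rng(0)
income = rng.uniform(10, 100, size=500)                 # X_i, income in thousands
p_true = 1 / (1 + np.exp(-(income - 55) / 10))          # assumed true P(Y = 1 | X_i)
owns_house = rng.binomial(1, p_true)                    # dichotomous response Y_i

# Linear probability model: regress the 0/1 response on income by OLS
X = sm.add_constant(income)
lpm = sm.OLS(owns_house, X).fit()
print(lpm.params)             # estimates of beta_0 and beta_1
print(lpm.fittedvalues[:5])   # fitted values, interpreted as P(Y_i = 1 | X_i)
```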

Problems with LPM

The linear probability model is used in many fields, but it has several disadvantages. These problems are discussed here.

1. Non-normality of the disturbances. As the response Y is binary, it follows a Bernoulli probability distribution, so the error terms are not normally distributed; normality of the error terms is an important assumption for inference based on the ordinary least squares (OLS) method.

2. Heteroscedasticity of the disturbances. For the application of OLS it is assumed that the variances of the disturbances are homoscedastic, i.e. constant. But in the LPM, Y_i follows a Bernoulli distribution, for which Var(u_i) = P_i(1 − P_i). Since P_i is a function of X_i, the variance changes as X_i changes; it depends on X_i through its influence on P_i. Hence the variance is not constant, the assumption of homoscedasticity is violated, and ordinary OLS is no longer appropriate (a short derivation is given below).

3. Low value of R². In dichotomous response models, the computed R² is of limited value. In the LPM, for a given level of X the response Y takes the value either 0 or 1, so the computed R² is likely to be much lower than unity for models of this nature (binary response models).
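For completeness (this derivation is not on the original slide), Var(u_i) = P_i(1 − P_i) follows directly from the two values the disturbance can take:

```latex
u_i = Y_i - \operatorname{E}(Y_i \mid X_i) =
\begin{cases}
1 - P_i & \text{with probability } P_i,\\
-P_i    & \text{with probability } 1 - P_i,
\end{cases}
\qquad\Longrightarrow\qquad
\operatorname{Var}(u_i) = P_i(1-P_i)^2 + (1-P_i)P_i^2 = P_i(1-P_i).
```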

4. A logical problem. Since we have interpreted the LPM as a model that gives the conditional probability of occurrence of an event (a category of the response) at a given level of the explanatory variable X_i, E(Y_i | X_i), being a probability, must fall in the [0, 1] interval. But in the LPM this is not guaranteed: E(Y_i | X_i) = β_0 + β_1 X_i, and (β_0 + β_1 X_i) can take any value on the entire real line, i.e. it can lie outside [0, 1].
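A short self-contained sketch (again on assumed simulated data, not from the slides) makes the logical problem visible: the OLS fitted values of an LPM can fall outside the [0, 1] interval.

```python
import numpy as np
import statsmodels.api as sm

# Simulated binary outcomes whose true probability follows an S-shaped curve;
# a straight-line LPM fit then overshoots 1 and undershoots 0 at the extremes.
rng = np.random.default_rng(1)
income = rng.uniform(10, 100, size=500)
y = rng.binomial(1, 1 / (1 + np.exp(-(income - 55) / 10)))

lpm = sm.OLS(y, sm.add_constant(income)).fit()
fitted = lpm.fittedvalues
n_bad = int(((fitted < 0) | (fitted > 1)).sum())
print(f"{n_bad} of {len(fitted)} fitted 'probabilities' lie outside [0, 1]")
```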

Logit Regression Model

The logit model gives a linear relationship between the logit of the observed probabilities (not the probabilities themselves) and the unknown parameters of the model. In contrast to the LPM, the logit model relation is

P_i = F(β_0 + β_1 X_i) = 1 / (1 + e^{-(β_0 + β_1 X_i)}),

or

P_i = e^{Z_i} / (1 + e^{Z_i}), where Z_i = β_0 + β_1 X_i,

and F is the cumulative distribution function of the logistic distribution. Here P_i / (1 − P_i) = e^{Z_i}. To apply the OLS method we use the "logit" of the observed probabilities as the response, which is defined as

L_i = ln(P_i / (1 − P_i)) = β_0 + β_1 X_i.

The logit of the observed probabilities is linear in X and also linear in the parameters, so OLS can be applied to obtain the parameters of the model easily.
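As an illustration only (not from the slides), here is a minimal sketch of the grouped-data approach the slide describes: compute the observed proportion at each level of X, transform it to a logit, and regress the logits on X by OLS. The data, grouping, and variable names are assumptions.

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical grouped data: at each income level we observe the proportion of
# house-owning families, i.e. an "observed probability" P_i.
income_level = np.array([20, 30, 40, 50, 60, 70, 80, 90])        # X_i
n_families   = np.array([120, 130, 110, 140, 125, 135, 115, 105])
n_owners     = np.array([10, 22, 35, 66, 82, 108, 101, 98])

p_obs = n_owners / n_families                 # observed P_i at each X_i
logit = np.log(p_obs / (1 - p_obs))           # L_i = ln(P_i / (1 - P_i))

# OLS of the observed logits on X estimates beta_0 and beta_1
fit = sm.OLS(logit, sm.add_constant(income_level)).fit()
print(fit.params)

# Predicted probability at a new income level, via the inverse transformation
z = fit.params[0] + fit.params[1] * 55
print(1 / (1 + np.exp(-z)))
```

In practice the parameters are more commonly estimated by maximum likelihood (e.g. statsmodels' Logit), and the grouped-logit regression is usually weighted by group size, but the unweighted OLS version above follows the slide's description most directly.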

Probit Regression

Another alternative to the linear probability model is probit regression. Probit regression is based on the cumulative distribution function of the normal distribution. The normal distribution is a good representation of many kinds of natural phenomena, so probit regression is a better alternative to the LPM as compared to logit regression. The probit is a non-linear function of the probability, defined as

probit(P_i) = N.E.D.(P_i) = F^{-1}(P_i),

where N.E.D. stands for the normal equivalent deviate (the value Z_i for which F(Z_i) = P_i) and F is the cumulative distribution function of the standard normal distribution. In contrast to the probability itself (which takes values from 0 to 1), the values of the probit corresponding to P_i range from −∞ to +∞. This gives the probit model

probit(P_i) = F^{-1}(P_i) = β_0 + β_1 X_i.

So in probit regression:

- The logical problem of the LPM is solved.
- Normality of the error term is achieved due to the use of the cumulative distribution function of the standard normal distribution.
- The problem of R² can be addressed by a suitable transformation of the explanatory variable, so that the relation between the probit and the explanatory variable becomes linear.

So probit regression can be a better choice in the class of binary response models.
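Mirroring the logit sketch above, here is a minimal, illustration-only version of the probit transform-then-OLS approach the slide describes; the grouped data are the same hypothetical numbers as before, not from the lecture.

```python
import numpy as np
import statsmodels.api as sm
from scipy.stats import norm

# Same hypothetical grouped data as in the logit sketch (assumed for illustration).
income_level = np.array([20, 30, 40, 50, 60, 70, 80, 90])
n_families   = np.array([120, 130, 110, 140, 125, 135, 115, 105])
n_owners     = np.array([10, 22, 35, 66, 82, 108, 101, 98])

p_obs  = n_owners / n_families        # observed P_i at each X_i
probit = norm.ppf(p_obs)              # normal equivalent deviate F^{-1}(P_i)

# OLS of the observed probits on X, as described on the slide
fit = sm.OLS(probit, sm.add_constant(income_level)).fit()
print(fit.params)

# Predicted probability at a new income level, via the standard normal CDF
print(norm.cdf(fit.params[0] + fit.params[1] * 55))
```

As with the logit model, maximum-likelihood probit (e.g. statsmodels' Probit) is the standard estimator in practice; the transform-then-OLS version is shown here because it follows the slide's N.E.D. description.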
