Chapter 8 Logistic Regression



Introduction

Logistic regression extends the ideas of linear regression to the situation where the dependent variable, Y, is categorical. A categorical variable divides the observations into classes.
– If Y denotes a recommendation on holding/selling/buying a stock, then we have a categorical variable with three categories.
– Each stock in the dataset (each observation) belongs to one of three classes: the "hold" class, the "sell" class, and the "buy" class.
Logistic regression can be used for classifying a new observation into one of the classes, based on the values of its predictor variables (called "classification"). It can also be used on data where the class is known to find similarities between observations within each class in terms of the predictor variables (called "profiling").

Introduction

Logistic regression is used in applications such as:
– 1. Classifying customers as returning or non-returning (classification)
– 2. Finding factors that differentiate between male and female top executives (profiling)
– 3. Predicting the approval or disapproval of a loan based on information such as credit scores (classification)
In this chapter we focus on the use of logistic regression for classification. We deal only with a binary dependent variable, having two possible classes. The results can be extended to the case where Y assumes more than two possible outcomes. Popular examples of binary response outcomes are:
– success/failure
– yes/no
– buy/don't buy
– default/don't default
– survive/die
We code the values of a binary response Y as 0 and 1.

Introduction

We may choose to convert continuous data or data with multiple outcomes into binary data for purposes of simplification, reflecting the fact that decision-making may be binary:
– approve the loan / don't approve
– make an offer / don't make an offer
Like MLR, the independent variables X1, X2, …, Xk may be categorical or continuous variables or a mixture of these two types. In MLR the aim is to predict the value of the continuous Y for a new observation. In logistic regression the goal is to predict which class a new observation will belong to, or simply to classify the observation into one of the classes. In the stock example, we would want to classify a new stock into one of the three recommendation classes: sell, hold, or buy.

Logistic Regression

In logistic regression we take two steps:
– The first step yields estimates of the probabilities of belonging to each class. In the binary case we get an estimate of P(Y = 1), the probability of belonging to class 1 (which also tells us the probability of belonging to class 0).
– In the next step we use a cutoff value on these probabilities in order to classify each case into one of the classes. In the binary case, a cutoff of 0.5 means that cases with an estimated probability P(Y = 1) > 0.5 are classified as belonging to class 1, whereas cases with P(Y = 1) < 0.5 are classified as belonging to class 0. The cutoff need not be set at 0.5.
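The second step above can be sketched in a few lines. The probabilities here are hypothetical values standing in for the estimates a fitted model would produce:

```python
# A minimal sketch of the cutoff step, using hypothetical
# estimated probabilities P(Y = 1) for five cases
p_hat = [0.12, 0.47, 0.55, 0.83, 0.50]

cutoff = 0.5
# Cases with P(Y = 1) > 0.5 go to class 1; the rest to class 0
predicted_class = [1 if p > cutoff else 0 for p in p_hat]
print(predicted_class)  # [0, 0, 1, 1, 0]
```

Raising the cutoff above 0.5 makes the classifier more conservative about assigning class 1, which is often done when the two kinds of misclassification carry different costs.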

Logistic Regression

Unlike ordinary linear regression, logistic regression does not assume that the relationship between the independent variables and the dependent variable is linear. Nor does it assume that the dependent variable or the error terms are normally distributed.

Logistic Regression

The form of the model is:

log(p/(1−p)) = b0 + b1X1 + b2X2 + … + bkXk

where p is the probability that Y = 1 and X1, X2, …, Xk are the independent variables (predictors). b0, b1, b2, …, bk are known as the regression coefficients, which have to be estimated from the data. Logistic regression estimates the probability of a certain event occurring.
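The model form can be applied to a new observation as follows. The coefficient values and the observation here are hypothetical, chosen only to illustrate the arithmetic:

```python
import math

# Hypothetical coefficients for a model with two predictors
b0, b1, b2 = -3.0, 0.05, 1.2
x1, x2 = 40, 1  # a hypothetical new observation

log_odds = b0 + b1 * x1 + b2 * x2   # log(p/(1-p)): the linear combination
p = 1 / (1 + math.exp(-log_odds))   # invert to recover P(Y = 1)
print(round(p, 3))
```

Because the linear combination lives on the log-odds scale, it can take any real value, yet the recovered probability p always lands between 0 and 1.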

Logistic Regression

Logistic regression thus forms a predictor variable, log(p/(1−p)), which is a linear combination of the explanatory variables. The values of this predictor variable are then transformed into probabilities by a logistic function. Such a function has the shape of an S.
– See the graph on the next slide.
On the horizontal axis we have the values of the predictor variable, and on the vertical axis we have the probabilities. Logistic regression also produces odds ratios (O.R.) associated with each predictor value.
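The S-shape of the transformation is easy to verify numerically. A small sketch of the logistic function evaluated at a few points:

```python
import math

def logistic(z):
    """The S-shaped logistic function, mapping any real z into (0, 1)."""
    return 1 / (1 + math.exp(-z))

# Probabilities approach 0 and 1 at the extremes and equal 0.5 at z = 0
for z in (-6, -2, 0, 2, 6):
    print(z, round(logistic(z), 3))
```

The flat tails near 0 and 1 and the steep middle around z = 0 are what give the curve on the next slide its characteristic S shape.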

[Graph: the S-shaped logistic function, with values of the predictor variable on the horizontal axis and probabilities on the vertical axis]

The "odds" of an event is defined as the probability of the outcome event occurring divided by the probability of the event not occurring. In general, the "odds ratio" is one set of odds divided by another. The odds ratio for a predictor is defined as the relative amount by which the odds of the outcome increase (O.R. greater than 1.0) or decrease (O.R. less than 1.0) when the value of the predictor variable is increased by 1.0 units. In other words, O.R. = (odds for PV+1)/(odds for PV), where PV is the value of the predictor variable.
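For a single predictor with coefficient b1, this definition works out to exp(b1), no matter what value of PV we start from. A sketch with hypothetical coefficients:

```python
import math

# Hypothetical coefficients for a one-predictor model (illustration only)
b0, b1 = -1.0, 0.4

def odds(x):
    """Odds of the outcome at predictor value x."""
    return math.exp(b0 + b1 * x)

pv = 3.0
odds_ratio = odds(pv + 1) / odds(pv)   # (odds for PV+1) / (odds for PV)
# The ratio equals exp(b1) regardless of the starting value of PV
print(round(odds_ratio, 4))
```

This is why software commonly reports exp(coefficient) alongside each fitted coefficient: it is the odds ratio for a one-unit increase in that predictor.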

Logistic Regression

The logit as a function of the predictors:

logit = log(p/(1−p)) = b0 + b1X1 + b2X2 + … + bkXk

The odds as a function of the predictors:

odds = p/(1−p) = e^(b0 + b1X1 + b2X2 + … + bkXk)

The probability as a function of the predictors:

p = 1 / (1 + e^−(b0 + b1X1 + b2X2 + … + bkXk))
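The three representations above are equivalent ways of expressing the same fitted value. A short numerical check, starting from a hypothetical logit value for one observation:

```python
import math

# A hypothetical logit value for one observation (illustration only)
logit = 0.9                    # log(p/(1-p)) = b0 + b1*X1 + ... + bk*Xk
odds = math.exp(logit)         # p/(1-p)
p = odds / (1 + odds)          # equivalently 1 / (1 + exp(-logit))
print(round(odds, 3), round(p, 3))

# Going back from the probability recovers the same logit
assert abs(math.log(p / (1 - p)) - logit) < 1e-12
```

Moving between the three scales is routine in practice: estimation happens on the logit scale, interpretation of coefficients on the odds scale, and classification on the probability scale.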

The Logistic Regression Model

Example: Charles Book Club

Problems
– Financial Conditions of Banks
– Identifying Good Systems Administrators