Week 3. Logistic Regression Overview and applications Additional issues Select Inputs Optimize complexity Transforming Inputs.


Logistic Regression: overview and applications; additional issues (select inputs, optimize complexity, transform inputs)

Estimating the relation between variables: dependent variables (DV) and independent variables (IV); the change in the DV for any change in an IV. Applications: forecasting, healthcare, economics, finance.

Will someone respond to an advertisement or not? Is someone a high default risk on a loan? Will someone buy or not buy? Will the patient respond to treatment or not? Will a machine fail next week?

Regression analysis where the DV is binary (0/1), the most common case. Classify a new observation into a class based on its predictors. Predictors can be categorical or continuous.

Probability, odds, logit function, logistic function
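The four quantities listed above can be related in a few lines of Python. This is a minimal sketch using the standard definitions (odds = p / (1 - p), logit = log of the odds, logistic = inverse of the logit). The function names are illustrative and do not come from the slides.

```python
import math

def odds(p):
    """Odds of an event with probability p (0 < p < 1)."""
    return p / (1.0 - p)

def logit(p):
    """Log-odds: maps a probability in (0, 1) to the whole real line."""
    return math.log(odds(p))

def logistic(z):
    """Inverse of the logit: maps any real number back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

p = 0.8
print(odds(p))             # odds of 4 to 1
print(logit(p))            # log-odds
print(logistic(logit(p)))  # recovers the original probability
```

The logistic function is what keeps the model's predictions between 0 and 1, whatever value the linear predictor takes.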

Specification

Specify the logistic function. Estimate the parameters β. Substitute the estimated βs into the model to estimate the odds ratio.

$\log\left(\frac{\hat{p}}{1 - \hat{p}}\right) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots$
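As a rough illustration of the specify, estimate, substitute steps, the sketch below fits a one-input logistic model by plain gradient ascent on the log-likelihood. The slides do not say which estimation routine is used, so the optimizer, learning rate, iteration count, and the toy data are all assumptions made for the example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.1, n_iter=5000):
    """Estimate b0, b1 in log(p / (1 - p)) = b0 + b1 * x by gradient ascent."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(n_iter):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            err = y - sigmoid(b0 + b1 * x)  # observed minus predicted
            g0 += err
            g1 += err * x
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1

# Toy data (made up): one input, binary outcome, classes overlap.
xs = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
ys = [0, 0, 0, 1, 0, 1, 1, 1]

b0, b1 = fit_logistic(xs, ys)
p_hat = sigmoid(b0 + b1 * 2.25)  # substitute the estimates to score a new case
print(b0, b1, p_hat)
```

In practice a statistics package would estimate the parameters by maximum likelihood with a faster solver. The loop here only makes the three steps visible.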

Odds ratio: the amount the odds change with a unit change in an input.

$\log\left(\frac{\hat{p}}{1 - \hat{p}}\right) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots$

Consequence: $\Delta x_i = 1 \Rightarrow \text{odds} \times \exp(\beta_i)$
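A short numeric check of the odds-ratio interpretation: raising an input by one unit multiplies the odds by exp(β). The coefficient values below are hypothetical and chosen only for illustration.

```python
import math

def odds_at(x, b0, b1):
    """Odds implied by log(p / (1 - p)) = b0 + b1 * x."""
    return math.exp(b0 + b1 * x)

b0, b1 = -1.0, 0.7  # hypothetical coefficients

odds_ratio = math.exp(b1)
ratio_check = odds_at(3.0, b0, b1) / odds_at(2.0, b0, b1)

# Input change needed to double the odds: log(2) / b1.
doubling_amount = math.log(2.0) / b1

print(odds_ratio, ratio_check, doubling_amount)
```

The ratio does not depend on the starting value of x, which is why a single number per input summarizes its effect on the odds.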

Can the categories be correctly predicted given a set of predictors? What is the relative importance of each predictor? Which predictors have a ‘statistically significant effect’?

[Figure, repeated across several slides: input p-values compared against the entry cutoff.]

[Figure, repeated across several slides: input p-values compared against the stay cutoff.]

[Figure, repeated across several slides: input p-values compared against both the entry cutoff and the stay cutoff.]
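One way to read the entry and stay cutoffs is as the two thresholds of a stepwise selection loop. The sketch below assumes that interpretation: an input enters when its p-value is below the entry cutoff and is dropped when its p-value rises above the stay cutoff. The `p_value` callback and the toy p-values are hypothetical stand-ins for refitting the model at each step.

```python
def stepwise_select(candidates, p_value, entry_cutoff=0.05,
                    stay_cutoff=0.05, max_steps=100):
    """p_value(model_inputs, name) -> p-value of `name` in that model."""
    selected = []
    for _ in range(max_steps):
        changed = False
        # Entry step: add the most significant remaining input, if any qualifies.
        remaining = [c for c in candidates if c not in selected]
        if remaining:
            best_p, best = min(
                (p_value(frozenset(selected + [c]), c), c) for c in remaining
            )
            if best_p < entry_cutoff:
                selected.append(best)
                changed = True
        # Stay step: drop the least significant selected input, if it fails.
        if selected:
            worst_p, worst = max(
                (p_value(frozenset(selected), s), s) for s in selected
            )
            if worst_p > stay_cutoff:
                selected.remove(worst)
                changed = True
        if not changed:
            break
    return selected

def toy_p_value(model_inputs, name):
    """Made-up p-values; 'tenure' loses significance once 'age' is present."""
    base = {"tenure": 0.01, "age": 0.02, "income": 0.03, "region": 0.30}
    if name == "tenure" and "age" in model_inputs:
        return 0.40
    return base[name]

chosen = stepwise_select(["tenure", "age", "income", "region"], toy_p_value)
print(chosen)
```

Using only the entry step gives a forward-style search, and starting from all inputs with only the stay step gives a backward-style search.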

[Figure: model fit statistic on training and validation data, evaluated at each step of the selection sequence.]
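Evaluating each sequence step can be sketched as picking the step whose validation fit statistic is best. The example assumes the statistic is an error-type measure where lower is better. The numbers are invented to show the usual pattern of training error falling while validation error turns back up.

```python
def best_step(validation_stats, lower_is_better=True):
    """Index of the selection step with the best validation fit statistic."""
    pick = min if lower_is_better else max
    return pick(range(len(validation_stats)), key=lambda i: validation_stats[i])

# Made-up fit statistics, one value per step of the selection sequence.
training_error = [0.40, 0.33, 0.29, 0.27, 0.26, 0.255]
validation_error = [0.41, 0.35, 0.31, 0.30, 0.32, 0.34]

chosen_step = best_step(validation_error)
print(chosen_step, validation_error[chosen_step])
```

Choosing on validation data rather than training data guards against selecting a model that is more complex than the data support.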

[Figure: original input scale (skewed input distribution, high leverage points; standard regression compared with the true association) next to a regularized scale (more symmetric distribution; regularized estimate compared with the true association).]
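The effect of moving an input to a more symmetric scale can be checked numerically. The sketch below assumes a log transformation as the regularizing transform. The slides do not name the transform, and the data are made up.

```python
import math

def skewness(values):
    """Moment-based sample skewness (0 for a symmetric distribution)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Made-up right-skewed input with one extreme, high-leverage value.
raw = [1, 2, 2, 3, 3, 4, 5, 8, 20, 150]
logged = [math.log(v) for v in raw]

print(skewness(raw), skewness(logged))
```

On the log scale the extreme value sits much closer to the rest of the data, so it has less pull on the fitted coefficients.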