Week 3
Logistic Regression: overview and applications. Additional issues: selecting inputs, optimizing complexity, transforming inputs.
Regression: estimating the relation between dependent variables (DV) and independent variables (IV), that is, the change in the DV for any change in an IV. Applications: forecasting, healthcare, economics, finance.
Whether someone will respond or not to advertisements? Whether someone is a high default risk on a loan? Whether someone will buy or not buy? Whether the patient will respond to treatment or not? Whether a machine will fail next week?
Regression analysis where the DV is binary (0/1), the most common case. Classify a new observation into a class based on its predictors. Predictors can be categorical or continuous.
Probability, odds, logit function, logistic function
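The four quantities above are tied together by short formulas: the odds are p / (1 - p), the logit is the log of the odds, and the logistic function inverts the logit. A minimal Python sketch follows; the function names are illustrative and do not come from the slides.

```python
import math

def odds(p: float) -> float:
    """Odds of an event with probability p, for 0 < p < 1."""
    return p / (1.0 - p)

def logit(p: float) -> float:
    """Log-odds: maps a probability in (0, 1) onto the whole real line."""
    return math.log(odds(p))

def logistic(z: float) -> float:
    """Inverse of the logit: maps any real number back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

p = 0.8
print(odds(p))             # about 4: the event is four times as likely as not
print(logit(p))            # the log of those odds
print(logistic(logit(p)))  # recovers p, up to floating-point rounding
```

Because the logit is unbounded while a probability is not, a linear combination of inputs can be modeled on the logit scale and mapped back to a probability with the logistic function.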
Specification
Specify the logistic function. Estimate the parameters βs. Substitute the values of the βs into the model to estimate the odds ratio.
$$\log\left(\frac{\hat{p}}{1-\hat{p}}\right) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots$$
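The slides do not say how the βs are estimated. One common choice is maximum likelihood, and the sketch below assumes it, using plain gradient ascent on the log-likelihood for a single input. The data, the learning rate, and the step count are made up for illustration; statistical packages typically use faster iterative methods.

```python
import math

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def log_likelihood(beta, xs, ys):
    """Bernoulli log-likelihood of the model logit(p) = beta[0] + beta[1] * x."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = logistic(beta[0] + beta[1] * x)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total

def fit(xs, ys, rate=0.1, steps=5000):
    """Estimate beta by gradient ascent on the average log-likelihood."""
    beta = [0.0, 0.0]
    n = len(xs)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            err = y - logistic(beta[0] + beta[1] * x)  # observed minus predicted
            g0 += err        # gradient with respect to the intercept
            g1 += err * x    # gradient with respect to the slope
        beta[0] += rate * g0 / n
        beta[1] += rate * g1 / n
    return beta

# Made-up toy data: larger x tends to go with y = 1, but not perfectly.
xs = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
ys = [0, 0, 1, 0, 1, 0, 1, 1]

beta = fit(xs, ys)
print(beta)
print(log_likelihood([0.0, 0.0], xs, ys), log_likelihood(beta, xs, ys))
```

The toy classes overlap on purpose: with perfectly separable data the maximum likelihood estimates do not exist as finite values, and the slope would keep growing as the number of steps increases.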
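Odds ratio: the amount by which the odds change with a unit change in an input.
$$\log\left(\frac{\hat{p}}{1-\hat{p}}\right) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots$$
For $\Delta x_i = 1$, the consequence is $\text{odds} \times \exp(\beta_i)$.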
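The odds-ratio statement can be checked numerically: raising one input by one unit while holding the others fixed multiplies the predicted odds by exp(β_i). The coefficients below are hypothetical values chosen only for illustration, not estimates from any data.

```python
import math

# Hypothetical coefficients (illustration only): beta_0, beta_1, beta_2.
beta = [-1.0, 0.5, -0.25]

def predicted_odds(x1, x2):
    """Odds p / (1 - p) implied by log(p / (1 - p)) = b0 + b1*x1 + b2*x2."""
    return math.exp(beta[0] + beta[1] * x1 + beta[2] * x2)

def predicted_probability(x1, x2):
    """Convert the predicted odds back to a probability."""
    o = predicted_odds(x1, x2)
    return o / (1.0 + o)

before = predicted_odds(2.0, 1.0)
after = predicted_odds(3.0, 1.0)   # x1 raised by one unit, x2 unchanged

print(after / before)      # ratio of the two odds
print(math.exp(beta[1]))   # exp(beta_1): the same number, up to rounding
```

The ratio does not depend on the starting values of x1 and x2, which is why exp(β_i) can be reported as a single odds ratio per input.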
Can the categories be correctly predicted given a set of predictors? What is the relative importance of each predictor? Which predictors have a ‘statistically significant effect’?
[Figure sequence: input p-values compared against an entry cutoff.]
[Figure sequence: input p-values compared against a stay cutoff.]
[Figure sequence: input p-values compared against both an entry cutoff and a stay cutoff.]
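One reading of the entry and stay cutoffs above is a sequential selection procedure: an input enters the model when its p-value falls below the entry cutoff and is dropped when its p-value rises above the stay cutoff. The sketch below assumes that reading. `stepwise_select` and `toy_p_value` are hypothetical names, and the p-values are made up; in practice they would come from refitting the regression at each step.

```python
from typing import Callable, List, Sequence

def stepwise_select(
    candidates: Sequence[str],
    p_value: Callable[[str, Sequence[str]], float],
    entry_cutoff: float = 0.05,
    stay_cutoff: float = 0.05,
    max_steps: int = 100,
) -> List[str]:
    """p_value(inp, others) is the p-value of `inp` in a model with `others` + inp."""
    selected: List[str] = []
    for _ in range(max_steps):
        changed = False
        # Entry: add the most significant remaining input if it passes the cutoff.
        remaining = [c for c in candidates if c not in selected]
        if remaining:
            best = min(remaining, key=lambda c: p_value(c, selected))
            if p_value(best, selected) < entry_cutoff:
                selected.append(best)
                changed = True
        # Stay: drop the least significant selected input if it fails the cutoff.
        if selected:
            def others(c: str) -> List[str]:
                return [s for s in selected if s != c]
            worst = max(selected, key=lambda c: p_value(c, others(c)))
            if p_value(worst, others(worst)) > stay_cutoff:
                selected.remove(worst)
                changed = True
        if not changed:
            break
    return selected

def toy_p_value(inp: str, others: Sequence[str]) -> float:
    """Made-up p-values: x2 looks useful alone but is redundant once x1 is in."""
    base = {"x1": 0.001, "x2": 0.03, "x3": 0.40}
    if inp == "x2" and "x1" in others:
        return 0.20
    return base[inp]

result = stepwise_select(["x1", "x2", "x3"], toy_p_value)
print(result)  # ['x1']
```

Setting `stay_cutoff=1.0` disables removal, so only the entry rule applies. The `max_steps` bound guards against an input being added and removed repeatedly when the two cutoffs are set inconsistently.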
[Figure: model fit statistic on training and validation data across the selection sequence.] Evaluate each sequence step.
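Evaluating each sequence step can be sketched as scoring every candidate model on validation data and keeping the step with the best fit statistic. The slides do not name the statistic; average squared error is assumed here, and the validation targets and predictions are made up for illustration.

```python
def average_squared_error(y_true, p_pred):
    """Mean squared difference between 0/1 targets and predicted probabilities."""
    return sum((y - p) ** 2 for y, p in zip(y_true, p_pred)) / len(y_true)

def best_step(y_valid, step_predictions):
    """Return the index of the step with the lowest validation error, plus all scores."""
    scores = [average_squared_error(y_valid, preds) for preds in step_predictions]
    best = min(range(len(scores)), key=scores.__getitem__)
    return best, scores

# Made-up validation targets and predictions from three steps of a selection sequence.
y_valid = [0, 1, 1, 0]
step_predictions = [
    [0.5, 0.5, 0.5, 0.5],    # step 0: intercept only
    [0.2, 0.8, 0.7, 0.3],    # step 1: one input added
    [0.1, 0.6, 0.4, 0.45],   # step 2: a further input that fits validation data worse
]

best, scores = best_step(y_valid, step_predictions)
print(best, scores)  # step 1 has the lowest validation error
```

Scoring on validation data rather than training data matters because the training fit statistic generally keeps improving as inputs are added, so it cannot by itself indicate when the model has become too complex.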
[Figure: two panels. Original Input Scale: skewed input distribution, high leverage points; standard regression versus true association. Regularized Scale: more symmetric distribution; standard regression, regularized estimate, and true association.]
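The slides do not say which transformation produces the regularized scale. A log transformation is a common choice for right-skewed inputs and is assumed in the sketch below, which compares skewness before and after on made-up values containing one extreme point.

```python
import math

def skewness(values):
    """Population skewness: third central moment divided by variance ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    third = sum((v - mean) ** 3 for v in values) / n
    return third / var ** 1.5

# Made-up right-skewed input with one extreme value (a potential high leverage point).
raw = [1, 2, 2, 3, 3, 4, 5, 8, 20, 150]

# log1p(v) = log(1 + v); it is defined at zero and compresses large values.
transformed = [math.log1p(v) for v in raw]

print(skewness(raw))
print(skewness(transformed))  # smaller: the long right tail is pulled in
```

On the transformed scale the extreme value sits much closer to the rest of the data, so it has less pull on the fitted coefficients.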