CLASSIFICATION: LOGISTIC REGRESSION Instructor: Dr. Chun Yu School of Statistics Jiangxi University of Finance and Economics Fall 2015
Dependent Variable Y
In many regression applications the dependent variable y is categorical. We can describe it in binary fashion, so that every outcome is either a success or a failure (arbitrarily defined). For example, when tossing a coin we get either heads (success) or tails (failure):
y = 1 if the outcome is a success
y = 0 if the outcome is a failure
Each individual y is a Bernoulli trial; if we repeat the experiment n times, the total number of successes follows a binomial distribution.
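As a quick illustration (a minimal R sketch, not taken from the slides): coding each coin toss as 1/0 and summing over n tosses gives a binomial count.

```r
set.seed(1)                            # for reproducibility
n <- 10                                # number of tosses
y <- rbinom(n, size = 1, prob = 0.5)   # each toss: 1 = success, 0 = failure
y                                      # a vector of 0s and 1s
sum(y)                                 # the count of successes is Binomial(n, 0.5)
```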
Logistic Regression Model
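For reference, the standard logistic regression model writes the success probability P = P(y = 1) in terms of predictors x1, ..., xk:

```latex
P \;=\; \frac{e^{\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k}}
             {1 + e^{\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k}},
\qquad
\log\frac{P}{1-P} \;=\; \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k .
```

The second form shows that the model is linear on the log-odds (logit) scale, which is why the coefficients are interpreted as changes in log-odds.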
Estimating P
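The coefficients (and hence P) are estimated by maximum likelihood: for n independent binary observations the log-likelihood is

```latex
\ell(\beta) \;=\; \sum_{i=1}^{n} \Bigl[\, y_i \log P_i + (1 - y_i)\log(1 - P_i) \,\Bigr],
\qquad
P_i \;=\; \frac{e^{x_i^\top \beta}}{1 + e^{x_i^\top \beta}} ,
```

which has no closed-form maximizer and is maximized numerically; R's glm() does this via iteratively reweighted least squares.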
Example
ID  Home Owner  Marital Status  Taxable Income  Defaulted Borrower
1   Yes         Single          125k            No
2   No          Married         100k            No
3   No          Single          70k             No
4   Yes         Married         120k            No
5   No          Divorced        95k             Yes
6   No          Married         60k             No
7   Yes         Divorced        220k            No
8   No          Single          85k             Yes
9   No          Married         75k             No
10  No          Single          90k             Yes
Inputted data
HO = 1 if "Yes"; HO = 0 if "No"
MS = 1 if "Single" or "Divorced"; MS = 0 if "Married"
Y = 1 if "No"; Y = 0 if "Yes"

HO  MS  TI   Y
1   1   125  1
0   0   100  1
0   1   70   1
1   0   120  1
0   1   95   0
0   0   60   1
1   1   220  1
0   1   85   0
0   0   75   1
0   1   90   0
R Results for Logistic Regression
> HO <- c(1,0,0,1,0,0,1,0,0,0)
> MS <- c(1,0,1,0,1,0,1,1,0,1)
> TI <- c(125,100,70,120,95,60,220,85,75,90)
> y <- c(1,1,1,1,0,1,1,0,1,0)
> mylogit <- glm(y ~ HO + MS + TI, family = "binomial")
> mylogit$coef
(Intercept)          HO          MS          TI
 318.634434  443.966839  -89.995028   -2.946977
The extremely large coefficients are a symptom of complete separation in this tiny data set (the predictors classify y perfectly); R typically signals this with the warning "fitted probabilities numerically 0 or 1 occurred".
Prediction
New applicant: Home Owner = No, Marital Status = Single, Taxable Income = 100k, Defaulted Borrower = ?
In coded form: HO = 0, MS = 1, TI = 100, P = ?
The predicted P is 2.220446e-16.
Classification: since P < 0.5 we classify y = 0, i.e., Defaulted Borrower = "Yes" (the borrower is predicted to default).
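The prediction can be reproduced by hand from the fitted coefficients on the previous slide (coefficient values copied from mylogit$coef); plogis() is the logistic CDF, so it maps the linear predictor to a probability:

```r
# Coefficients from mylogit$coef on the previous slide
b0 <- 318.634434   # (Intercept)
b1 <- 443.966839   # HO
b2 <- -89.995028   # MS
b3 <-  -2.946977   # TI

# New borrower: not a home owner (HO = 0), single (MS = 1), income 100k
eta <- b0 + b1 * 0 + b2 * 1 + b3 * 100   # linear predictor (log-odds)
p   <- plogis(eta)                        # P(y = 1) = 1 / (1 + exp(-eta))
p                                         # essentially 0, so classify y = 0
```

Because of the complete separation noted earlier, the probability is numerically indistinguishable from 0.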
Classification Error Rate
## Prediction on training data and test data
Because of the complete separation, the fitted probabilities are essentially exact 0s and 1s, so the correct classification rate can be computed by comparing them directly to y. Note the abs(): without it, any prediction smaller than y would also be counted as correct.
> ## correct classification rate
> sum(abs(newdata3$PredictedProb - trainData$y) < 0.0001) / nrow(trainData)
[1] 1
> sum(abs(newdata3$PredictedProb - testData$y) < 0.0001) / nrow(testData)
[1] 1
Decision Tree Classification
> library(party)   # ctree() comes from the party package
> myFormula <- y ~ HO + MS + TI
> loan_ctree <- ctree(myFormula, data = trainData)
> # check the prediction
> testPred <- predict(loan_ctree, newdata = testData)
> table(testPred, testData$y)
testPred 0 1 misclassification 0.111111 7 1 0.125 1 0 23 0
Thank You!