Published by Achmad Syaiful. Modified over 6 years ago.
1
Application of Weight of Evidence to a Case Study: Discretization Data
By Achmad Syaiful (G152170321)
Coursework for Classification Modeling, Department of Statistics, Institut Pertanian Bogor
2
Outline
1. Background
2. Data and Structure
3. Method and Analysis
4. Results and Discussion
5. Conclusion
3
Background
Continuous predictor -> discretized into categories -> categorical predictor
Weight of Evidence (WOE) is used with the aim of obtaining a better model.
It simplifies the data so that the model is not affected by outliers.
4
Data and Structure
Secondary data

> head(data)
   x class
1 51     0
2 19     1
3 66     1
4 35     0
5 64     1
6 48     1

> summary(data)
       x             class
 Min.   :18.00   Min.   :0.0000
 1st Qu.:31.00   1st Qu.:0.0000
 Median :44.00   Median :1.0000
 Mean   :42.81   Mean   :0.5186
 3rd Qu.:55.00   3rd Qu.:1.0000
 Max.   :67.00   Max.   :1.0000
5
Method and Analysis
Weight of Evidence
Analysis steps:
1) Split the data into training and testing sets in a 3:1 ratio.
2) Fit a logistic regression model to the original data and evaluate it.
3) Compute the Weight of Evidence.
4) Fit and evaluate the model again on the WOE-transformed data.
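Step 3 can be written out as follows. This is the usual per-bin definition of Weight of Evidence and Information Value. The symbols G_i and B_i (the number of class 1 "good" and class 0 "bad" observations in bin i) are notation introduced here, and the factor 100 is an assumption chosen to match the scale of the WOE column in the output reported on a later slide.

```latex
\[
g_i = \frac{G_i}{\sum_j G_j}, \qquad
b_i = \frac{B_i}{\sum_j B_j}, \qquad
\mathrm{WOE}_i = 100 \cdot \ln\frac{g_i}{b_i}, \qquad
\mathrm{IV}_i = (g_i - b_i)\,\ln\frac{g_i}{b_i}
\]
```

A positive WOE means the bin holds a larger share of class 1 than of class 0, and a negative WOE means the opposite.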
6
Results and Discussion
Syntax:
# Draw 1146 of the rows at random for training; the rest are for testing
acak <- sample(1:nrow(data), 1146)
data.training <- data[acak,]
data.testing <- data[-acak,]
# Logistic regression on the original continuous predictor x
model.logistik <- glm(class ~ x, data = data.training, family = "binomial")
summary(model.logistik)
# Predicted probabilities on the test set, classified at a 0.5 cutoff
prob.prediksi <- predict(model.logistik, data.testing, type = "response")
prediksi <- ifelse(prob.prediksi > 0.5, 1, 0)
library(caret)
# Recent caret versions expect factors with the same levels for both arguments
confusionMatrix(prediksi, data.testing$class)

Output:
> confusionMatrix(prediksi, data.testing$class)
Confusion Matrix and Statistics

          Reference
Prediction   0   1
         0  88  78
         1 102 115

               Accuracy : 0.53
                 95% CI : (0.4787, 0.5809)
    No Information Rate : 0.5039
    P-Value [Acc > NIR] : 0.16581
                  Kappa : 0.0591
 Mcnemar's Test P-Value : 0.08647
            Sensitivity : 0.4632
            Specificity : 0.5959
         Pos Pred Value : 0.5301
         Neg Pred Value : 0.5300
             Prevalence : 0.4961
         Detection Rate : 0.2298
   Detection Prevalence : 0.4334
      Balanced Accuracy : 0.5295
       'Positive' Class : 0
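To make the reported statistics easier to check, the headline figures can be recomputed by hand from the four confusion-matrix counts. The sketch below uses only base R; the variable names are illustrative and not part of the original slides.

```r
# Confusion matrix counts copied from the output above
# (rows = prediction, columns = reference; the positive class is 0)
cm <- matrix(c(88, 102, 78, 115), nrow = 2,
             dimnames = list(Prediction = c("0", "1"),
                             Reference  = c("0", "1")))

accuracy    <- sum(diag(cm)) / sum(cm)             # correct / all
sensitivity <- cm["0", "0"] / sum(cm[, "0"])       # true 0s predicted as 0
specificity <- cm["1", "1"] / sum(cm[, "1"])       # true 1s predicted as 1
nir         <- max(colSums(cm)) / sum(cm)          # share of the majority class
balanced    <- (sensitivity + specificity) / 2

print(round(c(accuracy = accuracy, sensitivity = sensitivity,
              specificity = specificity, nir = nir,
              balanced = balanced), 4))
```

An accuracy of about 0.53 against a no-information rate of about 0.504 shows that the model on the raw predictor does little better than always predicting the majority class.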
7
Results and Discussion (Cont.)
Syntax:
library(woe)
# Bin the continuous predictor x and compute WOE per bin (Bad = class 0, Good = class 1)
woe <- woe(Data = data.training, "x", TRUE, "class", 7, Bad = 0, Good = 1)
woe
# Replace x with the WOE value of its bin, in both training and testing data
data.training$x_bin <- ifelse(data.training$x < 27, 110.7,
                       ifelse(data.training$x < 35, -62.3,
                       ifelse(data.training$x < 44, -189.7,
                       ifelse(data.training$x < 51, -97.7,
                       ifelse(data.training$x < 59, 49.5, 238.4)))))
data.testing$x_bin <- ifelse(data.testing$x < 27, 110.7,
                      ifelse(data.testing$x < 35, -62.3,
                      ifelse(data.testing$x < 44, -189.7,
                      ifelse(data.testing$x < 51, -97.7,
                      ifelse(data.testing$x < 59, 49.5, 238.4)))))

Output:
> woe
  BIN MIN MAX BAD GOOD TOTAL  BAD% GOOD% TOTAL%    WOE    IV BAD_SPLIT GOOD_SPLIT
1   1  18  27  44  147   191 0.081 0.245  0.167  110.7 0.182     0.230      0.770
2   2  27  35 120   71   191 0.220 0.118  0.167  -62.3 0.064     0.628      0.372
3   3  35  44 164   27   191 0.300 0.045  0.167 -189.7 0.484     0.859      0.141
4   4  44  51 135   56   191 0.247 0.093  0.167  -97.7 0.150     0.707      0.293
5   5  51  59  68  123   191 0.125 0.205  0.167   49.5 0.040     0.356      0.644
6   6  59  67  15  176   191 0.027 0.293  0.167  238.4 0.634     0.079      0.921
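As a cross-check of the table above, the WOE and IV columns can be recomputed from the BAD and GOOD counts in base R. This is a sketch under two assumptions that are inferred from the printed numbers rather than from the package documentation: WOE is expressed as 100 times the natural log of GOOD% over BAD%, and the shares are rounded to three decimals before the log is taken. The names `woe_manual`, `iv_manual`, `breaks`, `woe_lookup` and `to_woe` are hypothetical. The last part shows a more compact alternative to the nested `ifelse` calls, using `findInterval` with the same cut points.

```r
# Counts copied from the BAD and GOOD columns of the woe output
bad  <- c(44, 120, 164, 135, 68, 15)
good <- c(147, 71, 27, 56, 123, 176)

# Class shares per bin, rounded to three decimals like BAD% and GOOD%
# (assumption: the table's WOE is computed from these rounded shares)
bad_pct  <- round(bad / sum(bad), 3)
good_pct <- round(good / sum(good), 3)

# WOE in percent units and the per-bin contribution to Information Value
woe_manual <- 100 * log(good_pct / bad_pct)
iv_manual  <- (good_pct - bad_pct) * woe_manual / 100

print(round(woe_manual, 1))
print(round(iv_manual, 3))

# Hypothetical helper: map x to the WOE of its bin.
# findInterval() counts how many cut points are <= x, so adding 1 gives
# the bin index; the intervals are left-closed, like the nested ifelse.
breaks     <- c(27, 35, 44, 51, 59)
woe_lookup <- c(110.7, -62.3, -189.7, -97.7, 49.5, 238.4)
to_woe <- function(x) woe_lookup[findInterval(x, breaks) + 1]

print(to_woe(c(18, 27, 43, 44, 58, 59, 67)))
```

With the lookup in place, `data.training$x_bin <- to_woe(data.training$x)` would give the same values as the nested `ifelse` expression, and the same function can be reused for the testing data.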
8
Results and Discussion (Cont.)
Syntax:
# Logistic regression on the WOE-transformed predictor x_bin
model.logistik <- glm(class ~ x_bin, data = data.training, family = "binomial")
summary(model.logistik)
# Predicted probabilities on the test set, classified at a 0.5 cutoff
prob.prediksi <- predict(model.logistik, data.testing, type = "response")
prediksi <- ifelse(prob.prediksi > 0.5, 1, 0)
library(caret)
# Recent caret versions expect factors with the same levels for both arguments
confusionMatrix(prediksi, data.testing$class)

Output:
> confusionMatrix(prediksi, data.testing$class)
Confusion Matrix and Statistics

          Reference
Prediction   0   1
         0 140  50
         1  50 143

               Accuracy : 0.7389
                 95% CI : (0.6919, 0.7822)
    No Information Rate : 0.5039
    P-Value [Acc > NIR] : <2e-16
                  Kappa : 0.4778
 Mcnemar's Test P-Value : 1
            Sensitivity : 0.7368
            Specificity : 0.7409
         Pos Pred Value : 0.7368
         Neg Pred Value : 0.7409
             Prevalence : 0.4961
         Detection Rate : 0.3655
   Detection Prevalence : 0.4961
      Balanced Accuracy : 0.7389
       'Positive' Class : 0
9
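Conclusion
Using WOE in the logistic regression model improved classification accuracy on the test data, from 0.53 with the original predictor to 0.7389 with the WOE-transformed predictor.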
10
Thank You