Published by Rafe Smith. Modified over 6 years ago.
1
Can-CSC-GBE: Developing Cost-sensitive Classifier with Gentleboost Ensemble for breast cancer classification using protein amino acids and imbalanced data
Source: Computers in Biology and Medicine, 2016, 73:38-46
Authors: Ali S, Majid A, Javed SG
Speaker: Jiefan Tan
Date: 2017/4/6
Affiliation: Pakistan Institute of Engineering and Applied Sciences
2
Outline
Introduction
Related Work
Proposed method
Experiment
Conclusion
3
Introduction(1/3)
Traditional machine learning techniques assume that all classification errors have the same cost, so they try to minimize the number of errors rather than the total cost. In real-world applications, however, different errors often carry very different costs, and many practical datasets are imbalanced.
4
Introduction(2/3)- Different cost
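[Figure: traditional classification treats both kinds of error alike, but their costs differ. A healthy person classified as having cancer suffers mental pressure; a cancer patient classified as healthy may die.]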
5
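Introduction(3/3)
Dataset = 1 cancer patient + 99 healthy people (100 instances)
If all 100 instances are classified as healthy:
overall accuracy rate = 99%
accuracy of the minority class = 0
The misclassified cancer patient may die.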
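The arithmetic on this slide can be checked with a few lines of Python. This is a minimal sketch that assumes the single minority instance is the positive (cancer) class, labelled 1, and that the classifier predicts the majority class for everything. The helper names `accuracy` and `recall` are illustrative, not taken from the paper.

```python
def accuracy(y_true, y_pred):
    """Fraction of instances whose predicted label matches the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of actual positives that were predicted positive."""
    actual_pos = [p for t, p in zip(y_true, y_pred) if t == positive]
    if not actual_pos:
        return 0.0
    return sum(1 for p in actual_pos if p == positive) / len(actual_pos)


# Assumed toy dataset: 1 positive (minority) and 99 negatives (majority).
y_true = [1] + [0] * 99
# A classifier that always predicts the majority class.
y_pred = [0] * 100

overall_accuracy = accuracy(y_true, y_pred)  # 99 of 100 correct
minority_recall = recall(y_true, y_pred)     # 0 of 1 positives found
```

The high overall accuracy hides the fact that the one case that matters most is missed, which is why accuracy alone is a poor criterion on imbalanced data.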
6
Related Work

Technology                 Methods                          Representative algorithms
Rescaling                  Thresholding                     ETA
                           Sampling                         BFKO, ADSNNHRS
                           Weighting                        C4.5CS
Cost-Sensitive Algorithm   Cost-Sensitive Decision Tree     C4.5CS, CBDSDT
                           Cost-Sensitive Neural Networks   CSBNN, SDAE
                           Cost-Sensitive SVM               CSSVM, CISVM
Ensemble Algorithm         Bagging
                           Boosting                         AdaBoost, gentleBoost
Evaluation Criteria        Based on Cost Matrix             Precision, Recall, F-value, G-mean
                           Curve, Chart                     ROC, AUC, Cost-Curve
7
Proposed method(1/6)-Cost matrix
The cost matrix for a two-class problem is shown in the table below. The notation C(i,j) represents the cost of classifying an instance whose actual class is j into the predicted class i.

                   Actual negative   Actual positive
Predict negative   C(0,0), or TN     C(0,1), or FN
Predict positive   C(1,0), or FP     C(1,1), or TP
8
Proposed method(2/6)-CSC(1/2)
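The expected cost R(i|x) of classifying an instance x into class i (by a classifier) can be expressed as:
$$R(i \mid x) = \sum_{j} P(j \mid x)\, C(i, j)$$
where P(j|x) is the estimated probability that x belongs to class j. A traditional (cost-insensitive) classifier ignores the costs and uses the rule: if P(j|x) > 0.5 then x -> class j.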
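To make the expected cost concrete, the sketch below computes R(i|x) as the probability-weighted sum of C(i,j) over the actual classes j (the standard definition, assumed here to be what the slide showed) and picks the class with the lowest expected cost. The cost values and the posterior are hypothetical numbers chosen for illustration; they do not come from the paper.

```python
def expected_cost(i, posterior, cost):
    """R(i|x): expected cost of predicting class i.

    posterior[j] is P(j|x); cost[i][j] is C(i,j), the cost of predicting
    class i when the actual class is j.
    """
    return sum(posterior[j] * cost[i][j] for j in range(len(posterior)))


def min_cost_class(posterior, cost):
    """Class with the smallest expected cost."""
    return min(range(len(cost)), key=lambda i: expected_cost(i, posterior, cost))


# Hypothetical cost matrix: a false negative (C(0,1)) costs 10,
# a false positive (C(1,0)) costs 1, correct decisions cost 0.
cost = [[0, 10],
        [1, 0]]

# Hypothetical posterior for one instance: P(0|x) = 0.8, P(1|x) = 0.2.
posterior = [0.8, 0.2]

risk_negative = expected_cost(0, posterior, cost)  # 0.8*0 + 0.2*10
risk_positive = expected_cost(1, posterior, cost)  # 0.8*1 + 0.2*0
decision = min_cost_class(posterior, cost)
```

With these numbers the 0.5 rule would predict negative, because P(0|x) = 0.8, while the minimum-expected-cost rule predicts positive because a missed positive is ten times as costly.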
9
Proposed method(3/6)-CSC(2/2)
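The classifier will classify an instance x into the positive class if and only if the expected cost of predicting positive does not exceed that of predicting negative:
$$R(1 \mid x) \le R(0 \mid x)$$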
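The decision rule can be turned into a probability threshold. The derivation below is a sketch under two assumptions that the slide does not state: the expected cost is the probability-weighted sum of the entries C(i,j), and each error costs more than the corresponding correct decision, that is, C(1,0) > C(0,0) and C(0,1) > C(1,1). The symbols p and p* are introduced here for convenience.

```latex
\begin{align*}
R(1 \mid x) &\le R(0 \mid x) \\
P(0 \mid x)\,C(1,0) + P(1 \mid x)\,C(1,1)
  &\le P(0 \mid x)\,C(0,0) + P(1 \mid x)\,C(0,1) \\
\intertext{Writing $p = P(1 \mid x)$, so that $P(0 \mid x) = 1 - p$:}
(1 - p)\,\bigl(C(1,0) - C(0,0)\bigr) &\le p\,\bigl(C(0,1) - C(1,1)\bigr) \\
p &\ge p^{*} = \frac{C(1,0) - C(0,0)}{C(1,0) - C(0,0) + C(0,1) - C(1,1)}
\end{align*}
```

When correct decisions cost nothing, this reduces to p* = C(1,0) / (C(1,0) + C(0,1)). Equal error costs give p* = 0.5, which recovers the traditional rule. A false negative ten times as costly as a false positive gives p* = 1/11, so an instance is classified as positive even at a fairly low estimated probability.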
10
Proposed method(4/6)-Boosting
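[Diagram: the training set yields Boosting Sample 1, ..., Boosting Sample K, ..., Boosting Sample T. Each sample trains one classifier (Classifier 1, ..., Classifier K, ..., Classifier T). The classifiers are combined into a Boosting Ensemble Classifier, which outputs the voting results.]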
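For readers unfamiliar with GentleBoost, the sketch below shows the generic algorithm in plain Python. It rests on several assumptions: labels are -1/+1, the base learners are regression stumps (depth-1 trees) fitted by weighted least squares, and instances are reweighted each round rather than drawn into separate boosting samples as in the diagram. It omits the cost-sensitive part of Can-CSC-GBE and is not the authors' implementation. All function names are hypothetical.

```python
import math


def weighted_mean(y, w, idx):
    """Weighted mean of y over the indices in idx."""
    total = sum(w[i] for i in idx)
    return sum(w[i] * y[i] for i in idx) / total


def fit_stump(X, y, w):
    """Fit f(x) = a if x[d] > t else b by weighted least squares."""
    everyone = range(len(X))
    mean = weighted_mean(y, w, everyone)
    # Fallback: a constant stump, kept if no split lowers the error.
    best_err = sum(w[i] * (y[i] - mean) ** 2 for i in everyone)
    best = (0, math.inf, mean, mean)
    for d in range(len(X[0])):
        values = sorted({row[d] for row in X})
        # Candidate thresholds: midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            right = [i for i in everyone if X[i][d] > t]
            left = [i for i in everyone if X[i][d] <= t]
            a = weighted_mean(y, w, right)
            b = weighted_mean(y, w, left)
            err = (sum(w[i] * (y[i] - a) ** 2 for i in right)
                   + sum(w[i] * (y[i] - b) ** 2 for i in left))
            if err < best_err:
                best_err, best = err, (d, t, a, b)
    return best


def stump_predict(stump, x):
    """Real-valued output of one stump for instance x."""
    d, t, a, b = stump
    return a if x[d] > t else b


def gentleboost(X, y, rounds=10):
    """Train an additive model F(x) = sum of stumps; y must be -1 or +1."""
    n = len(X)
    w = [1.0 / n] * n
    stumps = []
    for _ in range(rounds):
        stump = fit_stump(X, y, w)
        stumps.append(stump)
        # Increase the weight of instances the new stump gets wrong.
        w = [wi * math.exp(-yi * stump_predict(stump, xi))
             for wi, yi, xi in zip(w, y, X)]
        total = sum(w)
        w = [wi / total for wi in w]
    return stumps


def predict(stumps, x):
    """Sign of the summed stump outputs."""
    score = sum(stump_predict(s, x) for s in stumps)
    return 1 if score > 0 else -1


# Toy data (illustrative only): the first feature separates the classes.
X_toy = [[0.0, 3.0], [1.0, 1.0], [2.0, 2.0], [3.0, 0.0], [4.0, 1.0], [5.0, 3.0]]
y_toy = [-1, -1, -1, 1, 1, 1]
model = gentleboost(X_toy, y_toy, rounds=5)
predictions = [predict(model, x) for x in X_toy]
```

Each base learner outputs a real value rather than a hard vote, and the final prediction is the sign of their sum.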
11
Proposed method(5/6)
[Flowchart: Extract Features -> Decision tree as base learners -> Ensemble Classifier]
12
Proposed method(6/6)-Evaluation Criteria
Measure                    Formula
Precision                  TP/(TP+FP)
Recall (Sensitivity, Sn)   TP/(TP+FN)
Specificity (Sp)           TN/(TN+FP)
G-mean                     sqrt(Sn × Sp)
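These measures can be computed directly from the four confusion-matrix counts. The sketch below assumes the usual definition of G-mean as the geometric mean of sensitivity and specificity. The counts in the example are hypothetical, not results from the paper, and the function name `evaluation_measures` is illustrative.

```python
import math


def evaluation_measures(tp, fp, tn, fn):
    """Precision, recall (sensitivity), specificity and G-mean from counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # Assumed definition: geometric mean of sensitivity and specificity.
    g_mean = math.sqrt(sensitivity * specificity)
    return {
        "precision": precision,
        "recall": sensitivity,
        "specificity": specificity,
        "g_mean": g_mean,
    }


# Hypothetical counts: 10 actual positives, 90 actual negatives.
measures = evaluation_measures(tp=8, fp=5, tn=85, fn=2)

# A classifier that predicts everything negative on a 1-vs-99 dataset.
all_negative = evaluation_measures(tp=0, fp=0, tn=99, fn=1)
```

G-mean falls to 0 whenever either class is missed entirely, so unlike overall accuracy it penalizes a classifier that ignores the minority class.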
13
Experiment
14
Conclusion
Classifying imbalanced data has long been an important research problem in machine learning. The proposed Can-CSC-GBE system effectively reduces misclassification costs and thereby improves overall classification performance.
15
Thank you!