
1 Intelligent Database Systems Lab, 國立雲林科技大學 National Yunlin University of Science and Technology
An Empirical Study of Learning from Imbalanced Data Using Random Forest
Presenter: Ai-Chen Liao
Authors: Taghi M. Khoshgoftaar, Moiz Golawala, and Jason Van Hulse
ICTAI 2007, pages 310-317

2 Outline
• Motivation
• Objective
• Method
• Experiment
• Experimental Result
• Conclusion
• Comments

3 Motivation
[Figure: a tree; a forest]

4 Motivation
• RF is a relatively new learner; previous work has reported only preliminary experimentation on constructing random forest classifiers in the context of imbalanced data.
• What should be the recommended default number of trees in the ensemble?
• What should be the recommended value for the number of attributes?
• How does the RF learner perform on imbalanced data compared with other commonly used learners such as NB, SVM, KNN, and C4.5?

5 Objective
• This work is the first to conduct comprehensive experimentation with the RF learner in Weka and to recommend empirically supported default values for the numTrees and numFeatures parameters.

6 Method ─ RF
Dataset: sampling with replacement (取後放回)
[Figure labels: 1 … 2; 1 2 3 4 5]
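Sampling with replacement is the step that gives each tree in the forest its own training set. A minimal Python sketch of that step follows. The function names `bootstrap_sample` and `build_bags` and the five-item dataset are illustrative assumptions, not taken from the paper or from Weka.

```python
import random


def bootstrap_sample(dataset, rng):
    """Draw len(dataset) items with replacement (hypothetical helper)."""
    n = len(dataset)
    return [dataset[rng.randrange(n)] for _ in range(n)]


def build_bags(dataset, num_trees, seed=0):
    """Return one bootstrap sample per tree (hypothetical helper)."""
    rng = random.Random(seed)
    return [bootstrap_sample(dataset, rng) for _ in range(num_trees)]


if __name__ == "__main__":
    data = [1, 2, 3, 4, 5]  # toy dataset, assumed for illustration
    for i, bag in enumerate(build_bags(data, num_trees=3), start=1):
        # A bag may repeat some items and omit others.
        print(f"sample {i}: {bag}")
```

Because items are drawn with replacement, each sample has the same size as the dataset but typically repeats some instances and leaves others out, which is what makes the trees differ from one another.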

7 Method ─ Experimental Datasets
Metrics:
• The area under the ROC curve (AUC)
• The Kolmogorov-Smirnov (KS) statistic
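Both metrics can be computed directly from a classifier's scores and the true labels. The sketch below is one plain-Python way to do so, assuming binary labels with 1 for the positive (minority) class and 0 for the negative class. The function names and the toy scores are assumptions for illustration, not the paper's implementation.

```python
from bisect import bisect_left, bisect_right


def auc(scores, labels):
    """AUC as the probability that a random positive outscores a random
    negative, counting ties as one half (rank-based formulation)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = sorted(s for s, y in zip(scores, labels) if y == 0)
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        below = bisect_left(neg, p)          # negatives scored lower
        ties = bisect_right(neg, p) - below  # negatives scored equal
        wins += below + 0.5 * ties
    return wins / (len(pos) * len(neg))


def ks_statistic(scores, labels):
    """Largest gap between the empirical score CDFs of the two classes."""
    pos = sorted(s for s, y in zip(scores, labels) if y == 1)
    neg = sorted(s for s, y in zip(scores, labels) if y == 0)
    if not pos or not neg:
        raise ValueError("both classes must be present")
    best = 0.0
    for t in sorted(set(scores)):
        cdf_pos = bisect_right(pos, t) / len(pos)
        cdf_neg = bisect_right(neg, t) / len(neg)
        best = max(best, abs(cdf_pos - cdf_neg))
    return best


if __name__ == "__main__":
    scores = [0.1, 0.2, 0.3, 0.4, 0.8, 0.9]  # toy scores, assumed
    labels = [0, 0, 0, 1, 0, 1]
    print(auc(scores, labels), ks_statistic(scores, labels))
```

Neither metric depends on a fixed decision threshold, which is why both are common choices when the classes are imbalanced.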

8 Experimental Results
• Phase 1: Selecting an Appropriate RF Learner
  • numFeatures
  • numTrees
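To make the role of numFeatures concrete: at each split, a random forest considers only a random subset of the attributes, and numFeatures sets the size of that subset. The Python sketch below illustrates the idea. The fallback `int(log2(M) + 1)` reflects my understanding of Weka's documented behaviour when numFeatures is left unset, and is stated as an assumption; it is not presented as the value the paper recommends. Both function names are hypothetical.

```python
import math
import random


def fallback_num_features(num_attributes):
    """Assumed Weka-style fallback: int(log2(M) + 1) for M attributes."""
    return int(math.log2(num_attributes) + 1)


def candidate_features(num_attributes, num_features, rng):
    """Pick the attribute indices examined at one split (hypothetical)."""
    return sorted(rng.sample(range(num_attributes), num_features))


if __name__ == "__main__":
    m = 16  # number of attributes in a toy dataset, assumed
    k = fallback_num_features(m)
    rng = random.Random(0)
    # Each split draws a fresh subset, so different splits see different attributes.
    for split in range(3):
        print(f"split {split}: attributes {candidate_features(m, k, rng)}")
```

numTrees then controls how many such randomized trees are built and combined into the ensemble.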

9 Experimental Results
• Phase 2: Comparison of RF-100 to Other Learners
Good!

10 Conclusion
• The contribution of this study is to provide an extensive empirical evaluation of RF learners built from imbalanced data.
• The parameters for the RF learners were chosen to ensure good performance in many different circumstances and to be reasonable for the imbalanced datasets.

11 Comments
Advantage
• The large number of learners built in these experiments gives me confidence in the reliability of the experimental results.
Drawback
• Due to space restrictions, many experimental results are not included here.
Application
• Handling imbalanced data

