Credit Card Applicants’ Credibility Prediction with Decision Tree
Dan Xiao, Jerry Yang


1 Credit Card Applicants’ Credibility Prediction with Decision Tree
- Dan Xiao
- Jerry Yang

2 Agenda
- Project Goal
- Project Domains
- Project Method
- Implementation
- Conclusions
- Experience

3 Project Goal
- The objective of this project is to build a decision tree that classifies credit card applicants as good or bad.

4 Project Domains
- There are around 4000 records
- There are 15 attributes
  – Credit_Card_Debt
  – Highest_Credit_Card_APR
  – Monthly_Car_Pmt
  – Monthly_Income
  – Monthly_Mortage
  – Martial_Status

5 Project Domains
- Attributes continued
  – No_of_Credit_Cards
  – Year_of_Employment
  – Citizenship
  – Home_Ownership
  – Accounts
  – Sex
  – Race
  – Results

6 Project Method
- Classification and Prediction
  – Construct a model (Decision Tree) with the training data
  – Apply the testing data to the model (Decision Tree) to predict the applicants’ credibility
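The two steps on this slide (build the model from training data, then apply it to testing data) can be illustrated with a minimal sketch. It builds only a one-level tree chosen by information gain, which is a simplification and not the project's actual model. The attribute values ("own", "rent", "yes", "no") and all records below are hypothetical toy data, not the project's dataset.

```python
import math
from collections import Counter, defaultdict

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def information_gain(rows, labels, attr):
    """Entropy reduction from splitting the rows on one nominal attribute."""
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[row[attr]].append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def fit_stump(rows, labels):
    """Build a one-level tree: pick the attribute with the highest gain,
    then predict the majority class within each of its values."""
    best = max(rows[0].keys(), key=lambda a: information_gain(rows, labels, a))
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[row[best]].append(label)
    leaves = {v: Counter(g).most_common(1)[0][0] for v, g in groups.items()}
    default = Counter(labels).most_common(1)[0][0]  # for unseen values
    return best, leaves, default

def predict(model, row):
    """Apply the one-level tree to a single record."""
    attr, leaves, default = model
    return leaves.get(row[attr], default)

# Hypothetical toy records (training set).
train_rows = [
    {"Home_Ownership": "own", "Citizenship": "yes"},
    {"Home_Ownership": "own", "Citizenship": "no"},
    {"Home_Ownership": "rent", "Citizenship": "yes"},
    {"Home_Ownership": "rent", "Citizenship": "no"},
]
train_labels = ["good", "good", "bad", "bad"]

# Hypothetical toy records (testing set).
test_rows = [
    {"Home_Ownership": "own", "Citizenship": "no"},
    {"Home_Ownership": "rent", "Citizenship": "yes"},
]
test_labels = ["good", "bad"]

model = fit_stump(train_rows, train_labels)
predictions = [predict(model, r) for r in test_rows]
accuracy = sum(p == t for p, t in zip(predictions, test_labels)) / len(test_labels)
```

A full decision tree repeats the attribute selection recursively inside each branch; the sketch stops after the first split to keep the train-then-test flow visible.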

7 Implementation
- Software
  – WEKA
  – Classifier
  – Filter
  – Learning Schemes: ZeroR, OneR, M5, J48
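Two of the listed learning schemes are simple enough to sketch directly. ZeroR always predicts the majority class, and OneR keeps the single attribute whose value-to-class rule makes the fewest training errors. The following is a plain-Python illustration of those two ideas, not WEKA's implementation; the function names and the toy records are hypothetical.

```python
from collections import Counter, defaultdict

def zero_r(labels):
    """ZeroR: ignore all attributes and always predict the majority class."""
    return Counter(labels).most_common(1)[0][0]

def one_r(rows, labels):
    """OneR: build one rule per attribute (value -> majority class) and keep
    the attribute whose rule makes the fewest training errors."""
    best_attr, best_rule, best_errors = None, None, None
    for attr in rows[0]:
        by_value = defaultdict(Counter)
        for row, label in zip(rows, labels):
            by_value[row[attr]][label] += 1
        rule = {v: c.most_common(1)[0][0] for v, c in by_value.items()}
        # Errors are the records that do not match their value's majority class.
        errors = sum(sum(c.values()) - c[rule[v]] for v, c in by_value.items())
        if best_errors is None or errors < best_errors:
            best_attr, best_rule, best_errors = attr, rule, errors
    return best_attr, best_rule, best_errors

# Hypothetical toy records, not the project's data.
rows = [
    {"Home_Ownership": "own", "Citizenship": "yes"},
    {"Home_Ownership": "own", "Citizenship": "no"},
    {"Home_Ownership": "rent", "Citizenship": "yes"},
    {"Home_Ownership": "rent", "Citizenship": "no"},
    {"Home_Ownership": "own", "Citizenship": "yes"},
    {"Home_Ownership": "rent", "Citizenship": "yes"},
]
labels = ["good", "good", "bad", "bad", "good", "good"]

baseline = zero_r(labels)
attr, rule, errors = one_r(rows, labels)
```

Baselines like these give a floor against which a decision tree's accuracy can be judged.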

8 Implementation (Continued)
- Relevance Analysis
- Data Cleaning
  – File conversion
  – Missing data
  – Outlier
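The slide does not say how missing data and outliers were handled. As one possible illustration only, the sketch below fills missing numeric values with the median and flags outliers with the 1.5 × IQR rule. Both choices, the function names, and the income figures are assumptions made for this example.

```python
import statistics

def impute_median(values):
    """Replace missing entries (None) with the median of the known values."""
    known = [v for v in values if v is not None]
    fill = statistics.median(known)
    return [fill if v is None else v for v in values]

def iqr_outliers(values, k=1.5):
    """Return values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]

# Hypothetical Monthly_Income column with one missing and one extreme value.
monthly_income = [3000, 3200, 2800, 3100, None, 2900, 45000]

cleaned = impute_median(monthly_income)
outliers = iqr_outliers(cleaned)
```

The median is used here rather than the mean because a single extreme value would otherwise distort the fill value.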

9 Implementation (Continued)
- Testing Dataset
  – File conversion
  – Data testing
    Percentage split
    Supplied test set
    Cross-validation
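Two of the testing modes above, percentage split and cross-validation, differ only in how record indices are divided. The sketch below shows one way to generate those divisions in plain Python. The function names, the seed, and the reading of the ratio as the training fraction are assumptions, and the folds are not stratified by class.

```python
import random

def percentage_split(n_records, train_fraction, seed=0):
    """Shuffle record indices and cut them into one train/test pair."""
    idx = list(range(n_records))
    random.Random(seed).shuffle(idx)
    cut = round(n_records * train_fraction)
    return idx[:cut], idx[cut:]

def k_fold_indices(n_records, k, seed=0):
    """Yield (train, test) index lists; each record is tested exactly once."""
    idx = list(range(n_records))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for n, fold in enumerate(folds) if n != i for j in fold]
        yield train, test

# Roughly 4000 records, as stated on the Project Domains slide.
n = 4000
train_idx, test_idx = percentage_split(n, 0.25)
fold_sizes = [(len(tr), len(te)) for tr, te in k_fold_indices(n, 10)]
```

With cross-validation, the reported accuracy is the average over the k test folds, so every record contributes to both training and testing.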

10 Percentage split
Split Ratio  Confidence  Number of Leaves  Number of Nodes  Correct Classification
25%          0.25        30                54               94.77%
25%          0.15        3                 5                94.77%

11 Supplied Data test
Confidence  Number of Leaves  Number of Nodes  Correct Classification
0.15        3                 5                94.67%
0.25        30                54               96.08%

12 Cross-validation
Fold Number  Confidence  Number of Leaves  Number of Nodes  Correct Classification
5            0.15        3                 5                94.28%
10           0.15        3                 5                94.21%
15           0.15        3                 5                94.46%

13 Cross-validation (Continued)
Fold Number  Confidence  Number of Leaves  Number of Nodes  Correct Classification
5            0.25        30                54               94.28%
10           0.25        30                54               94.21%
15           0.25        30                54               94.46%

14 Conclusions
- We are satisfied with the accuracy (correct classification).
- Cross-validation is well suited to small datasets.
- The pruned trees differ greatly in size between confidence 0.15 and confidence 0.25.

15 Future Work
- Entropy-based discretization can reduce overfitting.
- Accuracy is not satisfactory on large datasets.
- Data bias cannot be avoided completely.
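The core step of entropy-based discretization is choosing the cut point on a numeric attribute that yields the largest information gain. The sketch below shows a single such binary cut; full methods apply it recursively with a stopping criterion, which is omitted here. The function names and the income values are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def best_cut_point(values, labels):
    """Return (cut, gain) for the binary split with the highest information
    gain; candidate cuts are midpoints between adjacent distinct values."""
    pairs = sorted(zip(values, labels))
    base = entropy(labels)
    best_gain, best_cut = 0.0, None
    for i in range(1, len(pairs)):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # no boundary between equal values
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        remainder = (len(left) * entropy(left)
                     + len(right) * entropy(right)) / len(pairs)
        gain = base - remainder
        if gain > best_gain:
            best_gain, best_cut = gain, (lo + hi) / 2
    return best_cut, best_gain

# Hypothetical Monthly_Income values and class labels.
incomes = [1500, 2000, 2500, 4000, 4500, 5000]
classes = ["bad", "bad", "bad", "good", "good", "good"]

cut, gain = best_cut_point(incomes, classes)
```

Replacing the raw numeric values with the resulting intervals (here, below and above the cut) gives the tree fewer, coarser split candidates.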

16 Experience
- Original dataset
- New dataset
- Results

17 Thank You

