Credit Card Applicants’ Credibility Prediction with Decision Tree
Dan Xiao
Jerry Yang

Agenda
• Project Goal
• Project Domains
• Project Method
• Implementation
• Conclusions
• Experience

Project Goal
• The objective of this project is to build a decision tree that predicts the classification (good or bad) of credit card applicants.

Project Domains
• There are around 4000 records
• There are 15 attributes
  – Credit_Card_Debt
  – Highest_Credit_Card_APR
  – Monthly_Car_Pmt
  – Monthly_Income
  – Monthly_Mortage
  – Martial_Status

Project Domains
• Attributes continued
  – No_of_Credit_Cards
  – Year_of_Employment
  – Citizenship
  – Home_Ownership
  – Accounts
  – Sex
  – Race
  – Results

Project Method
• Classification and Prediction
  – Construct a model (decision tree) from the training data
  – Apply the testing data to the model (decision tree) to predict the applicants’ credibility
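
The two steps above can be illustrated with a minimal sketch in Python. It is not the project's WEKA workflow: it trains only a one-level tree (a stump) chosen by information gain, whereas J48 uses gain ratio, handles numeric attributes, and prunes. The records and the helper names (train_stump, predict) are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(rows, labels, attr):
    """Entropy reduction obtained by splitting on a nominal attribute."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def train_stump(rows, labels):
    """Step 1: pick the highest-gain attribute; map each value to its majority class."""
    best = max(rows[0].keys(), key=lambda a: information_gain(rows, labels, a))
    branches = {}
    for row, label in zip(rows, labels):
        branches.setdefault(row[best], []).append(label)
    rule = {value: Counter(ls).most_common(1)[0][0] for value, ls in branches.items()}
    default = Counter(labels).most_common(1)[0][0]  # used for unseen attribute values
    return best, rule, default

def predict(model, row):
    """Step 2: apply the trained model to a new record."""
    attr, rule, default = model
    return rule.get(row[attr], default)

# Hypothetical training records (not the project's data).
train_rows = [
    {"Home_Ownership": "own", "Citizenship": "yes"},
    {"Home_Ownership": "own", "Citizenship": "no"},
    {"Home_Ownership": "rent", "Citizenship": "yes"},
    {"Home_Ownership": "rent", "Citizenship": "no"},
]
train_labels = ["good", "good", "bad", "bad"]

model = train_stump(train_rows, train_labels)
prediction = predict(model, {"Home_Ownership": "rent", "Citizenship": "yes"})
```

In this toy data Home_Ownership separates the classes perfectly, so it is selected as the split attribute.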

Implementation
• Software: WEKA
  – Classifier
  – Filter
  – Learning schemes: ZeroR, OneR, M5, J48
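
To give a feel for the two simplest learning schemes listed above, here is a rough Python sketch of the ideas behind ZeroR (always predict the majority class) and OneR (keep the single attribute whose one-level rule makes the fewest training errors). It is an illustration only, not WEKA's code; the function names and the toy records are assumptions. J48, the scheme used for the reported results, is WEKA's implementation of C4.5.

```python
from collections import Counter

def zero_r(labels):
    """ZeroR idea: ignore all attributes and always predict the majority class."""
    return Counter(labels).most_common(1)[0][0]

def one_r(rows, labels):
    """OneR idea: build one rule per attribute, keep the one with the fewest training errors."""
    best_attr, best_rule, best_errors = None, None, len(labels) + 1
    for attr in rows[0]:
        counts = {}
        for row, label in zip(rows, labels):
            counts.setdefault(row[attr], Counter())[label] += 1
        # Each attribute value predicts the class seen most often with it.
        rule = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        errors = sum(rule[row[attr]] != label for row, label in zip(rows, labels))
        if errors < best_errors:
            best_attr, best_rule, best_errors = attr, rule, errors
    return best_attr, best_rule, best_errors

# Hypothetical records (not the project's data).
rows = [
    {"Sex": "F", "Home_Ownership": "own"},
    {"Sex": "M", "Home_Ownership": "own"},
    {"Sex": "F", "Home_Ownership": "rent"},
    {"Sex": "M", "Home_Ownership": "own"},
    {"Sex": "F", "Home_Ownership": "rent"},
]
labels = ["good", "good", "bad", "good", "bad"]

baseline = zero_r(labels)
attr, rule, errors = one_r(rows, labels)
```

Baselines like these are useful mainly as a reference point: a decision tree is only worth its complexity if it beats them.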

Implementation (Continued)
• Relevance Analysis
• Data Cleaning
  – File conversion
  – Missing data
  – Outliers
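
The slides do not say how missing data and outliers were handled. As one common possibility, the sketch below fills missing numeric values with the median and flags values outside 1.5 times the interquartile range. The function names, the 1.5 factor, and the sample column are assumptions for illustration.

```python
from statistics import median, quantiles

def impute_missing(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]

def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = quantiles(values, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [v for v in values if v < low or v > high]

# Hypothetical Monthly_Income column with one missing entry and one extreme value.
monthly_income = [3200, 4100, None, 3900, 4500, 3700, 98000]
cleaned = impute_missing(monthly_income)
flagged = iqr_outliers(cleaned)
```

Whether a flagged value is removed, capped, or kept is a judgment call that depends on the attribute.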

Implementation (Continued)
• Testing Dataset
  – File conversion
  – Data testing
      Percentage split
      Supplied test set
      Cross-validation
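
The difference between a percentage split and cross-validation can be sketched as follows. This is a plain Python illustration, not WEKA's procedure: it does not stratify by class, the seed and function names are arbitrary, and it assumes the split ratio refers to the training portion. The record count of 4000 is the approximate figure from the Project Domains slide.

```python
import random

def percentage_split(n_records, train_fraction, seed=1):
    """Shuffle record indices once and cut them into a training and a test part."""
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)
    cut = round(n_records * train_fraction)
    return indices[:cut], indices[cut:]

def k_fold_indices(n_records, k, seed=1):
    """Yield (train, test) index lists; every record is tested exactly once."""
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

train_idx, test_idx = percentage_split(4000, 0.25)
folds = list(k_fold_indices(4000, 10))
```

With a supplied test set, no splitting is needed: the model is trained on one file and evaluated on another.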

Percentage split
Split Ratio   Confidence   Number of Leaves   Number of Nodes   Correct Classification
25%           0.25         30                 54                94.77%
25%           0.15         3                  5                 94.77%

Supplied test set
Confidence   Number of Leaves   Number of Nodes   Correct Classification
0.15         3                  5                 94.67%
0.25         30                 54                96.08%

Cross-validation
Fold Number   Confidence   Number of Leaves   Number of Nodes   Correct Classification
5             0.15         3                  5                 94.28%
10            0.15         3                  5                 94.21%
15            0.15         3                  5                 94.46%

Cross-validation (Continued)
Fold Number   Confidence   Number of Leaves   Number of Nodes   Correct Classification
5             0.25         30                 54                94.28%
10            0.25         30                 54                94.21%
15            0.25         30                 54                94.46%
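
One plausible reason the tree shrinks when the confidence is lowered from 0.25 to 0.15: C4.5-style pruning compares pessimistic (upper-bound) error estimates, and a lower confidence value makes the estimate more pessimistic, so more subtrees are replaced by leaves. The sketch below uses the textbook normal-approximation form of that bound; J48's actual computation may differ in detail, and the leaf counts (2 errors out of 6 instances) are made up.

```python
from math import sqrt
from statistics import NormalDist

def pessimistic_error(errors, n, confidence):
    """Upper confidence bound on the true error rate of a leaf.

    errors: misclassified training instances at the leaf
    n: training instances reaching the leaf
    confidence: pruning confidence factor, e.g. 0.25
    """
    z = NormalDist().inv_cdf(1 - confidence)  # standard normal quantile
    f = errors / n                            # observed error rate
    numerator = f + z * z / (2 * n) + z * sqrt(f / n - f * f / n + z * z / (4 * n * n))
    return numerator / (1 + z * z / n)

# Hypothetical leaf: 2 errors among 6 training instances.
e25 = pessimistic_error(2, 6, 0.25)
e15 = pessimistic_error(2, 6, 0.15)
```

The estimate at 0.15 is larger than at 0.25 for the same leaf, which is the direction consistent with heavier pruning at the lower setting.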

Conclusions
• We are satisfied with the accuracy (correct classification rate)
• Cross-validation is well suited to small datasets
• The pruned trees differ greatly in size between confidence 0.15 (3 leaves, 5 nodes) and confidence 0.25 (30 leaves, 54 nodes)

Future Work
• Entropy-based discretization can reduce overfitting
• Accuracy is not satisfactory on large datasets
• Data bias cannot be avoided completely
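
As a rough idea of what entropy-based discretization does, the sketch below finds the single cut point of a numeric attribute that minimizes the weighted class entropy of the two resulting intervals. Full methods apply this step recursively with a stopping criterion, which is omitted here. The function name and the sample values are assumptions.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def best_cut_point(values, labels):
    """Return (threshold, weighted_entropy) for the best binary split of a numeric attribute."""
    pairs = sorted(zip(values, labels))
    best = (None, float("inf"))
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no boundary between equal values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        score = (len(left) * entropy(left) + len(right) * entropy(right)) / len(pairs)
        if score < best[1]:
            best = (threshold, score)
    return best

# Hypothetical Credit_Card_Debt values and class labels (not the project's data).
debt = [500, 1200, 1500, 7000, 8200, 9500]
result = ["good", "good", "good", "bad", "bad", "bad"]
threshold, score = best_cut_point(debt, result)
```

Replacing a numeric attribute with a few such intervals gives the tree fewer candidate thresholds to fit, which is how discretization can limit overfitting.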

Experience
• Original dataset
• New dataset
• Results

Thank You
