Predicting Mortgage Prepayment Risk
Introduction
Definition: Prepayment occurs when a borrower pays off the loan before the end of the contracted term. The lender loses the remaining part of the income stream associated with the loan.
Inspiration & Previous Work: Oded Netzer’s talk on text-mining techniques for business applications, including loan default prediction.
Breakdown of the Problem
Target: To pinpoint which borrower profiles are likely to prepay their mortgage.
Data:
UCI Machine Learning Repository, Credit Approval Dataset: 690 observations, 15 attributes.
UCI Machine Learning Repository, German Credit Dataset: 1000 observations, 20 attributes.
Give Me Some Credit (Kaggle credit-scoring competition): very large.
Prospective Solutions
Techniques for categorizing the data:
Decision Tree: recursively splits observations into branches to build a tree that improves prediction accuracy, choosing splits with measures such as the information gain ratio or the Gini index.
Naïve Bayes Classifier: calculates a set of probabilities by counting the frequencies and combinations of values in the dataset.
Logistic Regression: models the relationship between one binary dependent variable and one or more independent variables by estimating probabilities with a logistic function. Appropriate when the dependent variable is binary.
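As a rough illustration of two of the techniques listed above, the sketch below computes the Gini index for candidate decision-tree splits and fits a one-feature logistic regression by gradient descent. It is a minimal sketch in plain Python, not the method used in this project. The data, the feature name rate_gap, and all function names are hypothetical and invented for the example. None of them come from the datasets named earlier.

```python
import math

# Hypothetical toy data, invented for illustration only:
# one feature per borrower and a binary label (1 = prepaid, 0 = did not).
rate_gap = [-1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
prepaid = [0, 0, 0, 0, 1, 0, 1, 1]


def gini(labels):
    """Gini impurity of a list of 0/1 labels (0.0 means a pure group)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def split_gini(xs, ys, threshold):
    """Size-weighted Gini impurity after splitting on x <= threshold."""
    left = [y for x, y in zip(xs, ys) if x <= threshold]
    right = [y for x, y in zip(xs, ys) if x > threshold]
    n = len(ys)
    return len(left) / n * gini(left) + len(right) / n * gini(right)


def best_threshold(xs, ys):
    """Try midpoints between sorted distinct values; keep the purest split."""
    values = sorted(set(xs))
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    return min(candidates, key=lambda t: split_gini(xs, ys, t))


def sigmoid(z):
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.1, steps=5000):
    """Fit P(y=1 | x) = sigmoid(w*x + b) by batch gradient descent on log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        w -= lr * sum(e * x for e, x in zip(errors, xs)) / n
        b -= lr * sum(errors) / n
    return w, b


threshold = best_threshold(rate_gap, prepaid)
w, b = fit_logistic(rate_gap, prepaid)
print("best split threshold:", threshold)
print("estimated P(prepay | rate_gap = 2.0):", round(sigmoid(w * 2.0 + b), 3))
```

A full decision tree would apply best_threshold recursively to each branch, and a realistic model would use several borrower attributes rather than one. A library implementation would normally be preferred over hand-written code like this.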
Evaluation of Results
Measure of success:
How clear the proposed categorization scheme is.
How well the scheme helps lenders manage prepayment risk by providing a useful structure for targeting early mitigation.