Piecewise Logistic Regression: An application in credit scoring


1 Piecewise Logistic Regression: An application in credit scoring
By Raymond Anderson, Standard Bank of South Africa
Presented at the Credit Scoring and Control IV conference, Edinburgh, 26-28 August 2015
10/09/2018

2 Aircraft analogy
Variable selection is like landing a plane in a cross-wind:
- Airplane: always adjusting to ensure a safe landing
- Regression: always adjusting to ensure the best fit
- Standard WOE regression: like a fixed-wing aircraft
- Piecewise WOE regression: more like a bird, adjusting each wing independently as required, which gives greater maneuverability

3 Definitions
- With respect to a number of discrete intervals, sets, or pieces <piecewise continuous functions> [Merriam Webster]
- Denoting that a function has a specified property [such] as smoothness or continuity, on each of a finite number of pieces into which its domain is divided [source missing]
- In mathematics, a piecewise-defined function (also called a piecewise or hybrid function) is a function which is defined by multiple sub-functions, each applying to a certain interval of the main function's domain [Wikipedia]
- Piecewise regression, also known as segmented or “broken-stick” regression, is a method in regression analysis in which the independent variable is partitioned into intervals and a separate line segment is fit to each interval [Wikipedia]
- Mostly associated with linear regression
- Few or no references found for logistic regression
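
The “broken-stick” definition above can be illustrated with a minimal sketch. It assumes a single breakpoint chosen by the user and fits an ordinary least-squares line on each side. The function names `fit_line` and `fit_broken_stick` are hypothetical and are not taken from the presentation, which applies the idea to logistic rather than linear regression.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_broken_stick(xs, ys, breakpoint):
    """Fit a separate line segment on each side of an assumed breakpoint.

    The two segments are fitted independently, so they are not forced
    to meet at the breakpoint.
    """
    left = [(x, y) for x, y in zip(xs, ys) if x < breakpoint]
    right = [(x, y) for x, y in zip(xs, ys) if x >= breakpoint]
    return fit_line(*zip(*left)), fit_line(*zip(*right))


# Illustrative data only: rising up to x = 2, flat from x = 3 onwards.
left_fit, right_fit = fit_broken_stick(
    [0, 1, 2, 3, 4, 5], [0, 1, 2, 2, 2, 2], breakpoint=3
)
```

Each interval gets its own slope and intercept, which is the property the piecewise approach carries over to WOE variables in a logistic model.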

4 Piece Assignment
Split into high- and low-risk pieces, and treat discontinuities (e.g. at 0 or 100%) separately.
- High- vs. low-risk pieces address different ends of the risk spectrum
- Discontinuities arise where data is conflicted (e.g. what do 0 and 100 really mean?)
A maximum of 4 pieces was used, usually only 2 or 3.
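
One way to picture the piece assignment is the sketch below. It rests on several assumptions that the slides do not confirm: WOE is taken as ln(good share / bad share), so positive values mean lower risk; the high/low split is placed at WOE = 0; and raw values of 0 and 100 are the discontinuities. The names `SPECIAL_VALUES`, `assign_piece` and `piecewise_variables` are hypothetical.

```python
# Assumed discontinuities, e.g. a percentage characteristic at 0 or 100%.
SPECIAL_VALUES = {0.0, 100.0}


def assign_piece(raw_value, woe_value):
    """Return the name of the piece that a record falls into."""
    if raw_value in SPECIAL_VALUES:
        return "special"
    # Assumed convention: non-negative WOE means lower risk.
    return "low_risk" if woe_value >= 0 else "high_risk"


def piecewise_variables(raw_value, woe_value):
    """Spread one WOE value across three variables, one per piece.

    Only the variable for the assigned piece carries the WOE; the
    others are zero, so each piece can receive its own coefficient.
    """
    pieces = {"low_risk": 0.0, "high_risk": 0.0, "special": 0.0}
    pieces[assign_piece(raw_value, woe_value)] = woe_value
    return pieces
```

For example, `piecewise_variables(35.0, 0.4)` places 0.4 in the low-risk variable, while a raw value of 100.0 routes its WOE to the separate discontinuity variable.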

5 Low-income loan portfolio
- High-default portfolio
- Loan volumes peaked in H and H1 2012
- With reduced risk appetite, loan volumes fell heavily
- Policy decline on Gross Income increased from $100 to $300 p.m.
- Loans for terms of less than 12 months curtailed
- Through-the-door profile is now much lower risk

6 Bought performance!
- Reject performance was “bought”, i.e. no reject inference was done
- Through-the-door population is now better than the development accepts
- Old score was implemented during the out-of-time period

7 Types
Transformation type:
- Base case, using one variable per characteristic;
- Piecewise, using multiple user-defined variables per characteristic;
- Dummy, using multiple variables per characteristic, one per coarse class.
Regression type:
- Positive/negative, the normal stepwise logistic results;
- Positive only, which removed any variables with negative beta coefficients; and
- Limited, where the number of variables is limited to further avoid overfitting.
Time:
- Development, second half of 2012;
- First out-of-time, first half of 2013; and
- Second out-of-time, second half of 2013.
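
The three transformation types can be compared side by side in a small sketch. The coarse classes and WOE values in `CLASS_WOE` are purely illustrative, not figures from the presentation, and the piecewise split at WOE = 0 is an assumption (the slides only say the pieces are user-defined).

```python
# Hypothetical coarse classes for one characteristic, with illustrative WOE values.
CLASS_WOE = {"<1y": -0.8, "1-3y": -0.2, "3-7y": 0.3, "7y+": 0.9}


def base_case(cls):
    """Base case: one WOE variable per characteristic."""
    return [CLASS_WOE[cls]]


def piecewise(cls):
    """Piecewise: [high-risk piece, low-risk piece], split at WOE = 0 (assumed)."""
    w = CLASS_WOE[cls]
    return [min(w, 0.0), max(w, 0.0)]


def dummy(cls):
    """Dummy: one 0/1 indicator per coarse class."""
    return [1.0 if cls == c else 0.0 for c in CLASS_WOE]
```

With four coarse classes, the base case yields one model variable, piecewise two, and dummy four, which is why piecewise sits between the two extremes in flexibility. In practice a dummy model would usually drop one class as the reference; the sketch keeps all four for clarity.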

8 Gini results
- Gini reduces over time, due to risk homogenisation of the portfolio and scorecard deterioration.
- Gini reduces, but only marginally, as steps are taken to ensure model robustness.
- Piecewise almost always performs better (the exception is Dev/Stepped/Dummy).
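
For readers unfamiliar with the measure, the Gini coefficient of a scorecard equals 2 × AUC − 1. The sketch below computes it by comparing every good/bad pair, assuming that a higher score means lower risk; the function name `gini` and that score orientation are assumptions, and the slides do not state how their Gini values were calculated.

```python
def gini(scores, is_bad):
    """Gini coefficient (2 * AUC - 1) from scores and bad flags.

    Assumes higher scores indicate lower risk, so a good account
    should outscore a bad one. Ties count as half a concordant pair.
    """
    goods = [s for s, bad in zip(scores, is_bad) if not bad]
    bads = [s for s, bad in zip(scores, is_bad) if bad]
    if not goods or not bads:
        raise ValueError("need at least one good and one bad account")
    concordant = 0.0
    for g in goods:
        for b in bads:
            if g > b:
                concordant += 1.0
            elif g == b:
                concordant += 0.5
    auc = concordant / (len(goods) * len(bads))
    return 2.0 * auc - 1.0
```

The pairwise loop is quadratic in sample size, which is fine for illustration; a rank-based calculation would be preferable on a full portfolio.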

9 Power Progression
- Used the 25 variables that appeared in any of the “Stepped” models
- Starting point was Piecewise, then moved towards the extreme ends
- Modified characteristics in alphabetical (near random) order
- Gini coefficient shows a curvilinear relationship
- Gini peaks in the middle, around Piecewise, and then deteriorates marginally

10 Why????
Hypothesis:
Base Case vs. Piecewise
- Better reflects the non-linear relationships within the data
- Provided many common-sense insights
- Emphasis on “Time with Bank” for up to 7 years (higher beta); “Age of customer < 45” and “Time at Employment < 7 years” not penalised, but higher values rewarded.
- Lack of negative bureau information rewarded, but penalties for recent credit appetite (enquiries, new loans) and lack of credit experience (number of profiles).
Dummy WOE vs. Piecewise
- Dummy model focused on the tails of the distribution
- Insufficient focus on the mid-range
- Model too simple
- Einstein: “Everything must be as simple as possible… but not simpler”.

11 Model complexity
- Base case has one variable per characteristic, dummy has one per attribute, and piecewise varies.
- As variable rationalisation progresses, complexity reduces.
- Attributes decrease, variables increase, and characteristics vary.

12 Caveats
- Development was done on a unique dataset, rich in bad accounts, over a period that provided multiple out-of-time samples.
- The process of eliminating unstable characteristics by referring to out-of-time samples is not standard, which may raise expectations for out-of-time benefits.
- Where reject inference is done, the inference results will heavily influence the results in the higher-risk spectrum.
- Extra care needs to be taken where there are low good or bad volumes in any of the risk buckets.
- There may be some issues with monitoring and validation, as processes/calculations may need to be modified.
- Benefits may reduce if staging is done, or if variables are forced into the model.

13 Conclusion
Piecewise Benefits
- Better represents non-linear relationships, without being too simple
- Marginally more predictive on development, but more robust out of time
- Relative 0.9% lift on development, but up to 4.4% out-of-time
- Model interpretation provided greater portfolio insights
- Potentially easier implementation
- Less need for multiple models where segmentation is done on risk
Caveats
- Unusual situation, with many bads and multiple out-of-time samples
- High out-of-time results were raised by eliminating unstable characteristics
- For application scoring, reject inference will have a greater effect on the high-risk region
- Validation and monitoring may be affected: a) marginally more characteristics to monitor; b) some calculations may need reviewing (e.g. characteristic contribution)
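
The lift figures quoted above are relative rather than absolute. A minimal sketch of that arithmetic follows, assuming the lift is measured on Gini coefficients with the base case as the benchmark. The helper `relative_lift` is hypothetical, and the slides do not give the underlying Gini values, so the usage note uses made-up inputs.

```python
def relative_lift(new_gini, benchmark_gini):
    """Relative improvement of new_gini over benchmark_gini, as a fraction."""
    return (new_gini - benchmark_gini) / benchmark_gini
```

For instance, moving from a Gini of 0.50 to 0.55 is an absolute gain of five points but a relative lift of about 10%, so a relative 4.4% corresponds to a much smaller absolute change.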

14 QUESTIONS???

