Predicting Loan Delinquency at 1M Transactions per Second

Presentation on theme: "Predicting Loan Delinquency at 1M Transactions per Second"— Presentation transcript:

1 Predicting Loan Delinquency at 1M Transactions per Second
David Smith, R Community Lead, Microsoft

2 It looks like you’ve created a predictive model…
NOW WHAT?

3 TRAINING A MODEL IS EASY. OPERATIONALIZING IT IS HARDER.

4 Generating Predictions
Batch Mode
  Create many (millions!) predictions at once
  Time required is proportional to the number of predictions
Real Time
  Only a few (maybe only one!) data points available to predict
  There may be multiple requests in a short timeframe
  Latency is the key metric here
  Many applications require sub-second latency at the endpoint
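The difference between the two modes can be seen with a small local experiment. The sketch below is illustrative only: it uses base R's glm on synthetic data (the coefficients and sample size are made up, and only the column names echo the Lending Club fields), and it measures in-process time, not the end-to-end latency of a deployed endpoint.

```r
# Synthetic data: values are simulated, not taken from the Lending Club data set.
set.seed(42)
n <- 100000
loans <- data.frame(
  int_rate   = runif(n, 5, 25),
  revol_util = runif(n, 0, 100)
)
loans$is_bad <- rbinom(
  n, 1, plogis(-4 + 0.1 * loans$int_rate + 0.01 * loans$revol_util)
)

model <- glm(is_bad ~ int_rate + revol_util, data = loans, family = binomial())

# Batch mode: score every row in a single call.
batch_time <- system.time(
  batch_scores <- predict(model, newdata = loans, type = "response")
)[["elapsed"]]

# Real time: score one row per request and record each request's latency.
one_row <- loans[1, c("int_rate", "revol_util")]
latencies_ms <- replicate(200, {
  system.time(
    predict(model, newdata = one_row, type = "response")
  )[["elapsed"]] * 1000
})

cat("Batch:", n, "rows in", batch_time, "seconds\n")
cat("Single-row median latency:", median(latencies_ms), "ms\n")
```

Batch throughput is summarized by total time, whereas the real-time case is better described by the distribution of per-request latencies.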

5 Real-Time Operationalization Options
Rewrite prediction code in some other language
  PMML / C++ / Java / …
OR, use your R code:
  Deploy as a web service with Microsoft R Server
  Deploy as a stored procedure in SQL Server

6 Lending Club Loan Performance Data
Feature selection and generation: aka.ms/lendingclub
LoanStatNew             Description
all_util                Balance to credit limit on all trades
annual_inc_joint        The combined self-reported annual income provided by the co-borrowers during registration
dti_joint               A ratio calculated using the co-borrowers' total monthly payments on the total debt obligations, excluding mortgages and the requested LC loan, divided by the co-borrowers' combined self-reported monthly income
int_rate                Interest Rate on the loan
mths_since_last_record  The number of months since the last public record.
revol_util              Revolving line utilization rate, or the amount of credit the borrower is using relative to all available revolving credit.
total_rec_prncp         Principal received to date
is_bad (generated)      Late > 16 days, Default, or Charged Off
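A generated label such as is_bad could be derived along the following lines. This is a minimal sketch, not the code behind aka.ms/lendingclub: the loan_status column name and its labels are assumptions about how the raw data is coded, and treating every status beginning with "Late" as bad is a simplification of the "Late > 16 days" rule.

```r
# Hypothetical status labels; the exact strings in the raw export may differ.
loans <- data.frame(
  loan_status = c("Current", "Fully Paid", "Late (16-30 days)",
                  "Late (31-120 days)", "Default", "Charged Off",
                  "In Grace Period"),
  stringsAsFactors = FALSE
)

# Flag late, defaulted, or charged-off loans as bad.
bad_pattern <- "^Late|^Default$|^Charged Off$"
loans$is_bad <- grepl(bad_pattern, loans$loan_status)

print(loans)
```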

7 Operationalization with Microsoft R Server
Deployment (Data Scientist): publish R functions as web services
  Microsoft R Client (mrsdeploy package): publishService
Consumption (Quant): explore and consume services in R directly
  Microsoft R Client (mrsdeploy package): getService
Integration (Developer): Swagger-based APIs, consume with any programming language
  Apps: REST API calls
Configuration (IT Administrator): Microsoft R Server configured for operationalizing R analytics
  In-cloud or on-prem
  Add nodes to scale out
  High availability & load balancing
Services / Sessions

8 Flexible vs Real-Time Deployment
Flexible Deployment
  Publish any R script or function as a web service
  R interpreter runs the script on demand via REST API
  library(mrsdeploy)
  publishService( serviceType='Script', Code=<<R script or function>>)
Real-Time Deployment
  Publish an R model object (RevoScaleR or MicrosoftML)
  Prediction engine generates scores from data via REST API
  library(mrsdeploy)
  publishService( serviceType='RealTime', model=<<R object>>)
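To make the "R script or function" placeholder concrete, here is a sketch of the kind of scoring function a flexible deployment might wrap. It uses only base R and synthetic data; score_loan and the training set are hypothetical, and the actual publishing step (which needs a configured Microsoft R Server) is deliberately left out.

```r
# Toy training data; the values are simulated for illustration only.
set.seed(1)
train <- data.frame(
  int_rate   = runif(500, 5, 25),
  revol_util = runif(500, 0, 100)
)
train$is_bad <- rbinom(500, 1, plogis(-3 + 0.1 * train$int_rate))

model <- glm(is_bad ~ int_rate + revol_util, data = train, family = binomial())

# Hypothetical scoring function: scalar inputs in, one probability out.
# In a flexible deployment the R interpreter would run a function like this
# for each request.
score_loan <- function(int_rate, revol_util) {
  newdata <- data.frame(int_rate = int_rate, revol_util = revol_util)
  as.numeric(predict(model, newdata = newdata, type = "response"))
}

p <- score_loan(int_rate = 18.5, revol_util = 72)
print(p)
```

A real-time deployment, by contrast, is given the model object itself, which is why it is limited to the supported RevoScaleR and MicrosoftML model types.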

9 Real-Time Deployment Models
Linear Regression (rxLinMod, rxFastLinear)
Logistic Regression (rxLogit, rxLogisticRegression)
Classification / Regression trees (rxDTree, rxFastTrees)
Classification / Regression forests (rxDForest, rxFastForest)
Stochastic gradient-boosted decision trees (rxBTrees)
One-class Support Vector Machines (rxOneClassSvm)
Convolutional Neural Networks (rxNeuralNet)
Also: pre-trained models for text sentiment and image featurization
Source: Have a model object that was created with one of the following supported functions:
  From the RevoScaleR package, these specific functions: rxLogit, rxLinMod, rxBTrees, rxDTree, and rxDForest
  From the MicrosoftML package, only the machine learning task and transform task functions, which include rxFastTrees, rxFastForest, rxLogisticRegression, rxOneClassSvm, rxNeuralNet, rxFastLinear, featurizeText, concat, categorical, categoricalHash, selectFeatures, featurizeImage, getSentiment, loadImage, resizeImage, extractPixels, selectColumns, and dropColumns

10 Flexible and real-time scoring with Microsoft R Server
Demonstration
Server: Azure Data Science Virtual Machine, Azure GS5 instance (32 cores, 448 GB memory)
Client: SurfaceBook

11

12

13

14 Remote client, 10 threads with a payload of 100 predictions

15 Remote client, 10 threads with a payload of 100 predictions

16 Flexible vs Real-Time Performance Comparison
Server: Standard_D3_v2 (4 CPU cores, 14 GB RAM), Windows
Algos                          Real time (ms)   Flexible (ms)
RxLogit (model size 2K)        3.5              39.2
RxNeuralNet (model size 8K)    2.5              122.0
Model Size                     Real time (ms)   Flexible (ms)
2 MB (RxLogisticRegression)    5.0              9215.7
43 MB                          5.4
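The size of the gap is easier to read as a ratio. The short sketch below simply divides the flexible latency by the real-time latency for the three rows that report both figures; the data frame and its column names are introduced here for illustration.

```r
# Latencies copied from the table above (rows with both measurements only).
perf <- data.frame(
  model       = c("RxLogit (2K)", "RxNeuralNet (8K)",
                  "RxLogisticRegression (2 MB)"),
  realtime_ms = c(3.5, 2.5, 5.0),
  flexible_ms = c(39.2, 122.0, 9215.7)
)

# How many times slower the flexible path is than the real-time path.
perf$speedup <- round(perf$flexible_ms / perf$realtime_ms, 1)

print(perf)
```

On these numbers the real-time path is roughly 11 times faster for the small logit model, about 49 times faster for the neural net, and over 1,800 times faster for the 2 MB model.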

17 Deployment in SQL Server 2016
SQL SERVER 2016
Flexible: Apps call sp_execute_external_script
Real-Time: Microsoft R Client (RevoScaleR package) rxSerializeObject; Apps call sp_rxPredict
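Both SQL Server paths depend on turning a fitted model into bytes that can be stored in a table and restored at scoring time. The sketch below shows that round trip with base R's serialize and unserialize on a toy lm model; this is an assumption about the general pattern, not the RevoScaleR serializer named on the slide, whose binary format for sp_rxPredict is different and is not reproduced here.

```r
# Toy model on a built-in data set, standing in for a trained loan model.
model <- lm(mpg ~ wt + hp, data = mtcars)

# Serialize to a raw vector, the kind of value a varbinary column can hold.
model_raw <- serialize(model, connection = NULL)

# Later, at scoring time: restore the model and predict.
restored <- unserialize(model_raw)
new_car <- data.frame(wt = 2.8, hp = 120)

original_pred <- predict(model, newdata = new_car)
restored_pred <- predict(restored, newdata = new_car)

cat("Serialized size:", length(model_raw), "bytes\n")
cat("Predictions match:", isTRUE(all.equal(original_pred, restored_pred)), "\n")
```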

18 SQL Server: R Script Operationalization

19 SQL Server: Real-Time Operationalization

20 1M predictions/sec
Same benchmark, one-sixth the resources
SQL Server 2017
8 sockets, 192 cores, 6 TB RAM
Flexible operationalization: blog.revolutionanalytics.com/2016/09/fraud-detection.html

21 Operationalization Overview
Platform: SQL Server
  Flexible Operationalization (any R function / package):
    EXEC sp_execute_external_script @language = N'R', @script = N'<<R script>>'
  Real-Time Operationalization (specific RevoScaleR / MicrosoftML models):
    EXEC sp_rxPredict @model=<<serialized R object>>, @inputData=<<SQL query>>
Platform: Microsoft R Server
  Flexible Operationalization (any R function / package):
    library(mrsdeploy)
    publishService( serviceType='Script', Code=<<R script or function>>)
  Real-Time Operationalization (specific RevoScaleR / MicrosoftML models):
    publishService( serviceType='RealTime', model=<<R object>>)
Use Microsoft R Server 9+ or SQL Server as the deployment server
Flexible Operationalization supports any R code / package
Real-Time Operationalization supports Microsoft R models with improved latency

22 Thank You!
David Smith (@revodavid), R Community Lead, Microsoft
Special thanks: Pratik Palnitkar, Microsoft; Arun Gurunathan, Microsoft
Download Microsoft R Client: aka.ms/rclient
Data Science Virtual Machine: aka.ms/dsvm

