Predicting Loan Delinquency at 1M Transactions per Second

David Smith (@revodavid)
R Community Lead, Microsoft

It looks like you’ve created a predictive model… NOW WHAT?

TRAINING A MODEL IS EASY
OPERATIONALIZING IT IS HARDER
http://hamiltonmusical.wikia.com/wiki/Right_Hand_Man

Generating Predictions
Batch Mode
  Create many (millions!) predictions at once
  Time required is proportional to the number of predictions
Real Time
  Only a few data points (maybe only one!) are available to predict
  There may be multiple requests in a short timeframe
  Latency is the key metric here
  Many applications require sub-second latency at the endpoint
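The batch versus real-time distinction can be sketched in a few lines of base R. This is only an illustration under stated assumptions: the data are synthetic, glm() stands in for whatever model is actually deployed, and the coefficients used to simulate is_bad are made up. It times one batch call against many single-row calls, which is the per-request latency a real-time endpoint cares about.

```r
# Illustrative only: synthetic data and base R glm() stand in for a real model.
set.seed(42)
n <- 10000
train <- data.frame(int_rate = runif(n, 5, 25), revol_util = runif(n, 0, 100))
# Made-up relationship, used only to generate a plausible 0/1 label.
train$is_bad <- rbinom(n, 1, plogis(-4 + 0.12 * train$int_rate + 0.01 * train$revol_util))
model <- glm(is_bad ~ int_rate + revol_util, data = train, family = binomial())

# Batch mode: score every row in one call; cost grows with the number of rows.
batch_time <- system.time(
  batch_scores <- predict(model, newdata = train, type = "response")
)[["elapsed"]]

# Real time: score one row per request and record the latency of each request.
requests <- train[1:50, ]
latency_ms <- vapply(seq_len(nrow(requests)), function(i) {
  1000 * system.time(
    predict(model, newdata = requests[i, , drop = FALSE], type = "response")
  )[["elapsed"]]
}, numeric(1))

cat("batch:", n, "rows scored in", batch_time, "s\n")
cat("real time: median latency", median(latency_ms), "ms per request\n")
```

The absolute numbers depend entirely on the machine; the point is only that the two modes are measured differently (total time versus per-request latency).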

Real-Time Operationalization Options
Rewrite prediction code in some other language
  PMML / C++ / Java / …
OR, use your R code:
  Deploy as a web service with Microsoft R Server
  Deploy as a stored procedure in SQL Server

Lending Club Loan Performance Data
www.lendingclub.com/info/download-data.action
Feature selection and generation: aka.ms/lendingclub
LoanStatNew: Description
  all_util: Balance to credit limit on all trades
  annual_inc_joint: The combined self-reported annual income provided by the co-borrowers during registration
  dti_joint: A ratio calculated using the co-borrowers' total monthly payments on the total debt obligations, excluding mortgages and the requested LC loan, divided by the co-borrowers' combined self-reported monthly income
  int_rate: Interest rate on the loan
  mths_since_last_record: The number of months since the last public record
  revol_util: Revolving line utilization rate, or the amount of credit the borrower is using relative to all available revolving credit
  total_rec_prncp: Principal received to date
  is_bad (generated): Late > 16 days, Default, or Charged Off
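A minimal sketch of how the generated is_bad label could be derived in base R. The helper name make_is_bad and the exact status strings are assumptions (they are meant to match the loan_status values in the Lending Club download); the actual feature code is the one linked at aka.ms/lendingclub.

```r
# Hypothetical helper, not the author's code. The status strings are assumed
# to match the loan_status column of the Lending Club download.
bad_statuses <- c("Late (16-30 days)", "Late (31-120 days)", "Default", "Charged Off")

make_is_bad <- function(loan_status) {
  # 1 when the loan is late by more than 16 days, in default, or charged off.
  as.integer(loan_status %in% bad_statuses)
}

loans <- data.frame(
  loan_status = c("Current", "Fully Paid", "Late (16-30 days)",
                  "Charged Off", "In Grace Period", "Default"),
  stringsAsFactors = FALSE
)
loans$is_bad <- make_is_bad(loans$loan_status)
print(loans)
```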

Operationalization with Microsoft R Server
Microsoft R Server configured for operationalizing R analytics (Services / Sessions)
Deployment (Data Scientist)
  Publish R functions as web services
  Microsoft R Client (mrsdeploy package): publishService
Consumption (Quant)
  Explore and consume services in R directly
  Microsoft R Client (mrsdeploy package): getService
Integration (Developer)
  Swagger-based APIs: consume with any programming language
  Apps: REST API calls
Configuration (IT Administrator)
  In-cloud or on-prem
  Add nodes to scale out
  High availability & load balancing

Flexible vs Real-Time Deployment
Flexible Deployment
  Publish any R script or function as a web service
  R interpreter runs the script on demand via REST API
    library(mrsdeploy)
    publishService(
      serviceType='Script',
      code=<<R script or function>>)
Real-Time Deployment
  Publish an R model object (RevoScaleR or MicrosoftML)
  Prediction engine generates scores from data via REST API
    library(mrsdeploy)
    publishService(
      serviceType='Realtime',
      model=<<R object>>)

Real-Time Deployment Models
  Linear Regression (rxLinMod, rxFastLinear)
  Logistic Regression (rxLogit, rxLogisticRegression)
  Classification / Regression trees (rxDTree, rxFastTrees)
  Classification / Regression forests (rxDForest, rxFastForest)
  Stochastic gradient-boosted decision trees (rxBTrees)
  One-class Support Vector Machines (rxOneClassSvm)
  Convolutional Neural Networks (rxNeuralNet)
  Also: pre-trained models for text sentiment and image featurization
Source: https://msdn.microsoft.com/en-us/microsoft-r/operationalize/data-scientist-manage-services#publish-web-services
Real-time deployment requires a model object created with one of the following supported functions:
  From the RevoScaleR package, these specific functions: rxLogit, rxLinMod, rxBTrees, rxDTree, and rxDForest
  From the MicrosoftML package, only the machine learning task and transform task functions, which include rxFastTrees, rxFastForest, rxLogisticRegression, rxOneClassSvm, rxNeuralNet, rxFastLinear, featurizeText, concat, categorical, categoricalHash, selectFeatures, featurizeImage, getSentiment, loadImage, resizeImage, extractPixels, selectColumns, and dropColumns
https://docs.microsoft.com/en-us/azure/machine-learning/machine-learning-azure-ml-netsharp-reference-guide

Demonstration: Flexible and real-time scoring with Microsoft R Server
Server: Azure Data Science Virtual Machine, Azure GS5 instance (32 cores, 448 GB memory)
Client: Surface Book

Remote client, 10 threads with a payload of 100 predictions

Flexible vs Real-Time Performance Comparison
Server: Standard_D3_v2 (4 CPU cores, 14 GB RAM), Windows
Algos                        Real time (ms)  Flexible (ms)
RxLogit (model size 2K)      3.5             39.2
RxNeuralNet (model size 8K)  2.5             122.0
Model Size                   Real time (ms)  Flexible (ms)
2 MB (RxLogisticRegression)  5.0             9215.7
43 MB                        5.4             20255.6
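To make the gap in the table concrete, the speedup of real-time over flexible scoring can be computed directly from the reported latencies. This is simple arithmetic on the slide's numbers; the data frame and its column names are introduced here only for illustration.

```r
# Latencies copied from the comparison table; column names are illustrative.
perf <- data.frame(
  model = c("RxLogit (2K)", "RxNeuralNet (8K)",
            "RxLogisticRegression (2 MB)", "43 MB model"),
  realtime_ms = c(3.5, 2.5, 5.0, 5.4),
  flexible_ms = c(39.2, 122.0, 9215.7, 20255.6)
)

# How many times faster real-time scoring is than flexible scoring.
perf$speedup <- perf$flexible_ms / perf$realtime_ms
print(perf, digits = 5)
```

The ratios range from roughly 11x for the small rxLogit model to several thousand times for the multi-megabyte models, where the flexible path is dominated by the cost of handling the large model on each request.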

Deployment in SQL Server 2016
Flexible
  Apps -> sp_execute_external_script
Real-Time
  Microsoft R Client (RevoScaleR package) -> rxSerializeObject
  Apps -> sp_rxPredict
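Both SQL Server paths rely on storing a trained model as bytes and restoring it at scoring time. The round trip can be sketched with base R's serialize() and unserialize(). This is an assumption-laden stand-in: the real-time path uses RevoScaleR's own serialization format rather than base R's, and the synthetic data and glm() model here are only placeholders.

```r
# Illustrative only: base R serialization stands in for the RevoScaleR format.
set.seed(1)
df <- data.frame(int_rate = runif(200, 5, 25))
# Made-up relationship, used only to generate a 0/1 label.
df$is_bad <- rbinom(200, 1, plogis(-3 + 0.1 * df$int_rate))
model <- glm(is_bad ~ int_rate, data = df, family = binomial())

# Turn the model into a raw vector, e.g. to store in a binary table column.
model_bytes <- serialize(model, connection = NULL)

# Later, at scoring time: restore the model and score new rows.
restored <- unserialize(model_bytes)
new_loans <- data.frame(int_rate = c(7.5, 19.9))
scores_original <- predict(model, newdata = new_loans, type = "response")
scores_restored <- predict(restored, newdata = new_loans, type = "response")

cat("serialized size:", length(model_bytes), "bytes\n")
cat("scores match after round trip:",
    isTRUE(all.equal(scores_original, scores_restored)), "\n")
```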

SQL Server: R Script Operationalization

SQL Server: Real-Time Operationalization

1M predictions/sec
Same benchmark, one-sixth the resources
SQL Server 2017
8 sockets, 192 cores, 6 TB RAM
Flexible operationalization
blog.revolutionanalytics.com/2016/09/fraud-detection.html

Operationalization Overview
Flexible Operationalization: any R function / package
  SQL Server
    EXEC sp_execute_external_script
      @language = N'R',
      @script = N'<<R script>>'
  Microsoft R Server
    library(mrsdeploy)
    publishService(
      serviceType='Script',
      code=<<R script or function>>)
Real-Time Operationalization: specific RevoScaleR / MicrosoftML models
  SQL Server
    EXEC sp_rxPredict
      @model = <<serialized R object>>,
      @inputData = <<SQL query>>
  Microsoft R Server
    library(mrsdeploy)
    publishService(
      serviceType='Realtime',
      model=<<R object>>)
Use Microsoft R Server 9+ or SQL Server 2016+ as the deployment server
Flexible Operationalization supports any R code / package
Real-Time Operationalization supports Microsoft R models with improved latency

Thank You!
David Smith (@revodavid)
R Community Lead, Microsoft
Special thanks: Pratik Palnitkar, Microsoft; Arun Gurunathan, Microsoft
Download Microsoft R Client: aka.ms/rclient
Data Science Virtual Machine: aka.ms/dsvm