Blazingly Fast Machine Learning Inference Vish Abrams Architect, Cloud Development Machine Learning Team, Oracle Cloud Infrastructure October 22, 2018
Safe Harbor Statement The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, timing, and pricing of any features or functionality described for Oracle’s products may change and remains at the sole discretion of Oracle Corporation.
Program Agenda
1 Machine Learning Model Serving
2 What is GraphPipe?
3 Advantages
4 Protocol Deep Dive
5 Real World Demo
6 More Info
Machine Learning Inference (Model Serving)
Building machine learning models has become much easier thanks to open source frameworks like TensorFlow and PyTorch.
Serving a machine learning model means putting your trained model onto a server so that client applications can access it.
This involves two components: the ML client and the ML server.
The client talks to the server using some kind of communication protocol, often JSON over HTTP.
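To make the cost of the protocol choice concrete, here is a minimal sketch that encodes the same input tensor two ways: as a JSON request body and as packed 32-bit floats. The `"inputs"` key and the helper names are hypothetical, chosen for illustration only; they are not taken from any particular model server.

```python
import json
import struct


def encode_json(values):
    """Encode a flat list of floats as a JSON request body (hypothetical schema)."""
    return json.dumps({"inputs": values}).encode("utf-8")


def encode_binary(values):
    """Pack the same floats as little-endian float32, 4 bytes per value."""
    return struct.pack(f"<{len(values)}f", *values)


def decode_binary(payload):
    """Unpack little-endian float32 values back into a Python list."""
    count = len(payload) // 4
    return list(struct.unpack(f"<{count}f", payload))


values = [i / 7 for i in range(1000)]
json_body = encode_json(values)
binary_body = encode_binary(values)
print(len(json_body), len(binary_body))
```

The binary body is a fixed 4 bytes per value, while the JSON body spends many characters on each number and has to be parsed as text on the server. Note that float32 packing is lossy for Python's 64-bit floats, which is usually acceptable for model inputs.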
ML Client
ML Server
What is GraphPipe? GraphPipe is an open source protocol and collection of software designed to simplify machine learning model deployment and decouple it from framework-specific model implementations.
What is GraphPipe?
In other words, it turns this:
[Diagram: mxnet server, tensorflow serving, and custom server, reached over standard JSON, a custom protocol, and protocol buffers, from custom and autogenerated clients]
What is GraphPipe?
Into this:
[Diagram: graphpipe-onnx, graphpipe-tf]
GraphPipe Features
A minimalist machine learning transport specification based on FlatBuffers.
Simple reference model servers for TensorFlow, Caffe2, and ONNX.
Efficient client implementations in Go, Python, and Java.
Why Did We Make It?
Production deployments of AI agents are around the corner.
Model serving is an important part of production solutions.
Existing solutions suffer from various problems: they are inconsistent, they are inefficient, and they require custom clients.
A standard, along with simple implementations, moves the industry forward.
Ease of Development
Model servers are written in Go, a very accessible language.
FlatBuffers code generation makes it easy to produce new clients.
An open spec makes it possible to integrate with existing servers.
Protocol Performance
Serving Performance
FlatBuffers
Extensible protocol
Small code footprint
Near-zero deserialization overhead
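The idea behind near-zero deserialization overhead is that fields are read in place from the received bytes instead of being parsed into an intermediate object first. The sketch below illustrates that idea with Python's `struct` module. The layout (two uint32 dimensions followed by float32 values) is a hypothetical one invented for this example; it is not the FlatBuffers wire format or the GraphPipe schema.

```python
import struct

# Hypothetical layout: rows and cols as little-endian uint32, then float32 data.
HEADER = struct.Struct("<II")


def build_message(rows, cols, values):
    """Serialize a rows x cols tensor into one contiguous byte string."""
    if len(values) != rows * cols:
        raise ValueError("values do not match the declared shape")
    return HEADER.pack(rows, cols) + struct.pack(f"<{len(values)}f", *values)


def read_element(buf, row, col):
    """Read a single float straight from the buffer at its computed offset."""
    rows, cols = HEADER.unpack_from(buf, 0)
    if not (0 <= row < rows and 0 <= col < cols):
        raise IndexError("element out of range")
    offset = HEADER.size + 4 * (row * cols + col)
    return struct.unpack_from("<f", buf, offset)[0]


message = build_message(2, 3, [0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
print(read_element(message, 1, 2))
```

Reading one element touches only the header and four payload bytes, regardless of how large the tensor is. A text format such as JSON would need the whole body parsed before any value could be used.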
Protocol Summary
AlphaZero Timeline
Oct-15: AlphaGo beats Fan Hui
Jan-16: Paper published in Nature
Mar-16: AlphaGo beats Lee Sedol
May-17: AlphaGo beats Ke Jie
Oct-17: AlphaGo Zero published
Dec-17: AlphaZero published
AlphaZero
Algorithm for training a machine to play any game*
*Any game that can be represented as a Markov process.
Trained without human information, through self-play.
Needs a structured representation of the game state.
Needs rules for transitioning from one state to the next.
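As an illustration of the last two requirements, here is a minimal sketch of a structured state and transition rules for Connect Four. The representation (a 6x7 tuple of tuples with 0 for empty and 1 or -1 for the two players) and the function names are assumptions made for this example, not the representation used in the talk's project.

```python
ROWS, COLS = 6, 7


def new_board():
    """Empty board: row 0 is the top row, 0 marks an empty cell."""
    return tuple(tuple(0 for _ in range(COLS)) for _ in range(ROWS))


def legal_moves(board):
    """A column is playable while its top cell is still empty."""
    return [col for col in range(COLS) if board[0][col] == 0]


def play(board, col, player):
    """Return the board after `player` (1 or -1) drops a piece into `col`."""
    for row in range(ROWS - 1, -1, -1):  # scan from the bottom row upward
        if board[row][col] == 0:
            grid = [list(r) for r in board]
            grid[row][col] = player
            return tuple(tuple(r) for r in grid)
    raise ValueError("column is full")


def winner(board):
    """Return 1 or -1 if that player has four in a row, else 0."""
    directions = ((0, 1), (1, 0), (1, 1), (1, -1))
    for r in range(ROWS):
        for c in range(COLS):
            p = board[r][c]
            if p == 0:
                continue
            for dr, dc in directions:
                end_r, end_c = r + 3 * dr, c + 3 * dc
                if 0 <= end_r < ROWS and 0 <= end_c < COLS and all(
                    board[r + i * dr][c + i * dc] == p for i in range(1, 4)
                ):
                    return p
    return 0
```

Because `play` returns a new immutable board instead of mutating the old one, earlier positions stay valid, which is convenient for a tree search that revisits them.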
The Game Playing Black Box Neural Network Position Move
Training the Network Training Labeled Data Neural Network
Generating Data Self-Play (MCTS) Neural Network Labeled Data
AlphaZero In a nutshell Neural Network Self-Play Training Labeled Data
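The cycle in this diagram can be sketched as a short loop. The `self_play` and `train` callables below are stand-ins so that the sketch runs; they are hypothetical placeholders, not a real MCTS self-play or a real training step.

```python
def run_cycles(network, self_play, train, cycles):
    """Alternate self-play and training, returning the final network."""
    for _ in range(cycles):
        labeled_data = self_play(network)       # current network generates games
        network = train(network, labeled_data)  # next network learns from them
    return network


# Stand-ins so the sketch runs; neither does any real search or learning.
def fake_self_play(network):
    return [("position", "move", network["version"])] * 3


def fake_train(network, labeled_data):
    return {
        "version": network["version"] + 1,
        "examples_seen": network["examples_seen"] + len(labeled_data),
    }


start = {"version": 0, "examples_seen": 0}
final = run_cycles(start, fake_self_play, fake_train, cycles=150)
print(final)
```

The point of the structure is that each cycle's training data comes from the network produced by the previous cycle, so data quality and network strength improve together.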
AlphaZero for Connect Four
We trained a network to play Connect Four using 150 cycles of this process, playing about 1,000,000 games during self-play.
The network finds the correct move in about 99% of positions.
We used GraphPipe as part of the training process because we were generating games across a cluster of 5 machines with GPUs.
But GraphPipe is even more useful for deploying this model so that people can use it.
How do we deploy our model for use in an application? GraphPipe!
Serving the AlphaZero Trained Network Position Web Frontend Neural Network GraphPipe GraphPipe Move
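On the frontend side, the flow in this diagram reduces to sending a position to the served network and picking a move from the scores that come back. The sketch below assumes the network returns one score per column and takes the remote call as an injected `predict` callable; in a deployment that callable would wrap a GraphPipe client request. The function names and the score format are hypothetical.

```python
def choose_move(position, legal_columns, predict):
    """Ask the network for per-column scores and pick the best legal column."""
    if not legal_columns:
        raise ValueError("no legal moves in this position")
    scores = predict(position)  # in production: a request to the model server
    return max(legal_columns, key=lambda col: scores[col])


# Stand-in for the remote call so the sketch runs without a server.
def fake_predict(position):
    return [0.05, 0.60, 0.20, 0.10, 0.02, 0.02, 0.01]


print(choose_move("some position", [0, 2, 3], fake_predict))
```

Masking by legal columns on the client keeps the served model a pure function from position to scores, so the same server can back both the web frontend and self-play workers.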
Live Demo!
Actual Architecture
GraphPipe
https://oracle.github.io/graphpipe/
https://github.com/oracle/graphpipe
https://github.com/oracle/graphpipe-go
https://github.com/oracle/graphpipe-py
https://github.com/oracle/graphpipe-tf-py
https://hub.docker.com/r/sleepsonthefloor/
https://hackernoon.com/machine-learning-model-pipelines-part-i-e138b7a7c1ef
AlphaZero
https://azfour.com/
https://medium.com/oracledevs/lessons-from-implementing-alphazero-7e36e9054191
https://medium.com/@sleepsonthefloor/azfour-a-connect-four-webapp-powered-by-the-alphazero-algorithm-d0c82d6f3ae9
https://medium.com/applied-data-science/alphago-zero-explained-in-one-diagram-365f5abf67e0
https://deepmind.com/documents/119/agz_unformatted_nature.pdf
https://arxiv.org/abs/1712.01815
Questions and Answers