Deep Dive with Microsoft Cognitive Toolkit (CNTK)
Cha Zhang, Principal Applied Science Manager, Microsoft AI & Research
Introduction
Microsoft Cognitive Toolkit (CNTK)
- Microsoft's open-source deep-learning toolkit
- Created by Microsoft speech researchers in 2012 as the "Computational Network Toolkit"
- Open-sourced on GitHub since January 2016 (MIT license)
- 12,000+ GitHub stars; third among all DL toolkits
- 140+ contributors from MIT, Stanford, NVIDIA, Intel, and many others
Microsoft Cognitive Toolkit (CNTK)
Runs over 80% of Microsoft's internal deep-learning workload:
- Cortana
- HoloLens
- Bing and Bing Ads
- Skype Translator
- Windows
- Office
- Microsoft Cognitive Services
- Microsoft Research
Internal == External (the same toolkit ships inside and outside Microsoft)
Microsoft Cognitive Toolkit (CNTK)
- First-class on Linux and Windows; Docker support
- Rich API support: mostly implemented in C++ (train and eval)
- Low-level + high-level Python API
- R and C# API for train and eval; UWP, Java and Spark support
- Built-in readers for distributed learning
- Keras backend support (beta)
- Model compression (fast binarized evaluation)
- Latest version: v2.2 (Sep. 15, 2017)
Toolkit Benchmark
http://dlbench.comp.hkbu.edu.hk/
Benchmarking by HKBU (Version 8). Single Tesla K80 GPU, CUDA 8.0, cuDNN v5.1.
Versions: Caffe 1.0rc5 (39f28e4), CNTK 2.0 Beta10 (1ae666d), MXNet 0.93 (32dc3a2), TensorFlow 1.0 (4ac9c09), Torch 7 (748f5e3).

                 Caffe      Cognitive Toolkit     MXNet      TensorFlow   Torch
FCN5 (1024)      55.329ms   51.038ms              60.448ms   62.044ms     52.154ms
AlexNet (256)    36.815ms   27.215ms              28.994ms   n/a          37.462ms
ResNet (32)      n/a        81.470ms              84.545ms   n/a          90.935ms
LSTM (256)       n/a        43.581ms (44.917ms)   n/a        n/a          n/a

(For the LSTM row, the value in parentheses is from the v7 benchmark.)
Production-Ready Scalability
Toolkit Delivering Near-Linear Multi-GPU Scaling
Microsoft Cognitive Toolkit: the first deep-learning framework fully optimized for Pascal. AlexNet performance up to 170x faster than a CPU server (images/sec).
Setup: AlexNet training, batch size 128, Grad Bit = 32; dual-socket E5-2699v4 CPUs (44 cores total); CNTK 2.0b3 (to be released) including cuDNN 5.1.8, NCCL 1.6.1, NVLink enabled.
Scale with NCCL 2.0 (GTC, May 2017); released in CNTK v2.2 (Sep. 2017).
Scale to 1,000 GPUs
CNTK Overview
Installation

$ pip install <url>
Tutorials
Tutorials on Azure (Free Pre-Hosted)
https://github.com/Microsoft/CNTK
CNTK expresses (nearly) arbitrary neural networks by composing simple building blocks into complex computational networks, supporting relevant network types and applications.
What is CNTK?
Example: 2-hidden-layer feed-forward NN

h1 = σ(W1 x + b1)              →  h1 = sigmoid(x @ W1 + b1)
h2 = σ(W2 h1 + b2)             →  h2 = sigmoid(h1 @ W2 + b2)
P  = softmax(Wout h2 + bout)   →  P  = softmax(h2 @ Wout + bout)

with input x ∈ R^M, one-hot label y ∈ R^J, and cross-entropy training criterion

ce = y^T log P                 →  ce = cross_entropy(P, y)
Σ_corpus ce → max
What is CNTK?

h1 = sigmoid(x @ W1 + b1)
h2 = sigmoid(h1 @ W2 + b2)
P  = softmax(h2 @ Wout + bout)
ce = cross_entropy(P, y)

[Diagram: the same four lines as a computation graph; nodes are the operations (•, +, sigmoid, softmax, cross_entropy) and edges carry the values x, W1, b1, h1, W2, b2, h2, Wout, bout, y, P, ce]
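Put together as a minimal runnable sketch in the CNTK Python API (the 784-dim input, 10 classes, and layer widths are assumptions for illustration; in the actual API the softmax is folded into the loss via cross_entropy_with_softmax):

import cntk as C
from cntk.layers import Dense, Sequential

x = C.input_variable(784)   # assumed input dimension (e.g. MNIST-like data)
y = C.input_variable(10)    # assumed number of classes

model = Sequential([Dense(400, activation=C.sigmoid),
                    Dense(200, activation=C.sigmoid),
                    Dense(10, activation=None)])   # linear output layer
z = model(x)

ce = C.cross_entropy_with_softmax(z, y)   # softmax + cross-entropy in one op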
What is CNTK?
Nodes: functions (primitives), which can be composed into reusable composites.
Edges: values, including tensors and sparse values.
Automatic differentiation: ∂F/∂in = ∂F/∂out ∙ ∂out/∂in.
Deferred computation: execution engine.
Editable, clonable: LEGO-like composability allows CNTK to support a wide range of networks and applications.
CNTK Workflow
A script configures and executes everything through the CNTK Python APIs:
- reader: minibatch source, task-specific deserializer, automatic randomization, distributed reading of the corpus
- network: model function, criterion function, CPU/GPU execution engine, packing/padding
- trainer: SGD (momentum, Adam, …), minibatching
As Easy as 1-2-3

from cntk import *

# reader
def create_reader(path, is_training): ...

# network
def create_model_function(): ...
def create_criterion_function(model): ...

# trainer (and evaluator)
def train(reader, model): ...
def evaluate(reader, model): ...

# main function
model = create_model_function()
reader = create_reader(..., is_training=True)
train(reader, model)
reader = create_reader(..., is_training=False)
evaluate(reader, model)
Workflow
1. Prepare data
2. Configure reader, network, learner (Python)
3. Train: python my_cntk_script.py
Prepare Data: Reader

def create_reader(map_file, mean_file, is_training):
    # image preprocessing pipeline
    transforms = [
        ImageDeserializer.crop(crop_type='Random', ratio=0.8, jitter_type='uniRatio'),
        ImageDeserializer.scale(width=image_width, height=image_height,
                                channels=num_channels, interpolations='linear'),
        ImageDeserializer.mean(mean_file)
    ]
    # deserializer
    return MinibatchSource(ImageDeserializer(map_file, StreamDefs(
        features = StreamDef(field='image', transforms=transforms),
        labels   = StreamDef(field='label', shape=num_classes)
    )), randomize=is_training,
        epoch_size=INFINITELY_REPEAT if is_training else FULL_DATA_SWEEP)
Two things to note about this reader: automatic on-the-fly randomization is important for large data sets, and readers compose, e.g. image + text caption (see the sketch below).
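As a sketch of how readers compose, two deserializers can feed one MinibatchSource; the file names and stream names here are hypothetical:

from cntk.io import MinibatchSource, ImageDeserializer, CTFDeserializer, StreamDefs, StreamDef

# hypothetical files and stream names, for illustration only
image_source = ImageDeserializer('map.txt', StreamDefs(
    features = StreamDef(field='image', transforms=transforms),
    labels   = StreamDef(field='label', shape=num_classes)))
text_source = CTFDeserializer('captions.ctf', StreamDefs(
    caption = StreamDef(field='caption', shape=vocab_size, is_sparse=True)))

# one MinibatchSource draws from both deserializers together
reader = MinibatchSource([image_source, text_source], randomize=True)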
Prepare Network: Multi-Layer Perceptron

Model:
def model(x):
    h1 = Dense(400, activation=relu)(x)
    h2 = Dense(200, activation=relu)(h1)
    r  = Dense(10, activation=None)(h2)
    return r

Loss: cross_entropy_with_softmax(z, Y)
Prepare Network: Convolutional Neural Network

Input: 28-pixel images.

Model:
def model(x):
    h = Convolution2D((5,5), num_filters=8, ...)(x)
    h = MaxPooling(...)(h)
    h = Convolution2D((5,5), num_filters=16, ...)(h)
    r = Dense(output_classes, activation=None)(h)
    return r
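Filled in as a runnable sketch (the 28x28x1 input shape, padding, pooling windows, and 10 classes are assumptions for illustration):

import cntk as C
from cntk.layers import Convolution2D, MaxPooling, Dense

x = C.input_variable((1, 28, 28))   # assumed: 1 channel, 28x28 pixels

def create_model(features):
    h = Convolution2D((5,5), num_filters=8,  activation=C.relu, pad=True)(features)
    h = MaxPooling((2,2), strides=(2,2))(h)
    h = Convolution2D((5,5), num_filters=16, activation=C.relu, pad=True)(h)
    return Dense(10, activation=None)(h)    # assumed 10 output classes

z = create_model(x)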
A Sequence Example (many-to-many, 1:1)

Problem: tagging entities in Air Travel Information Services (ATIS) data.

show     → o
burbank  → From_city
to       → o
seattle  → To_city
flights  → o
tomorrow → Date

(Each token feeds one step of a recurrent network, which emits one tag per token.)
Prepare Network: Recurrent Neural Network

[Diagram: one-hot text token x(t) (1 x 943) → Embedding (in: 943, out: 150) → LSTM (in: 150, out: 300, recurrence h(t-1) → h(t), a = sigmoid) → Dense (in: 300, out: 129) → class label y(t)]

def model():
    return Sequential([
        Embedding(emb_dim=150),
        Recurrence(LSTM(hidden_dim=300), go_backwards=False),
        Dense(num_labels=129)
    ])
Prepare Learner

Many built-in learners: SGD, SGD with momentum, Adagrad, RMSProp, Adam, Adamax, AdaDelta, etc.
Specify a learning-rate schedule and a momentum schedule; if wanted, a minibatch-size schedule:

lr_schedule = C.learning_rate_schedule([0.05]*3 + [0.025]*2 + [0.0125],
                                       C.UnitType.minibatch, epoch_size=100)
sgd_learner = C.sgd(z.parameters, lr_schedule)
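The same pattern works for the other learners; for instance, a momentum learner (a sketch, numbers illustrative):

mm_schedule = C.momentum_schedule(0.9)
momentum_learner = C.momentum_sgd(z.parameters, lr_schedule, mm_schedule)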
Overall Train Workflow

def model():
    return Sequential([
        Embedding(emb_dim=150),
        Recurrence(LSTM(hidden_dim=300), go_backwards=False),
        Dense(num_labels=129)
    ])

[Diagram: an ATIS training minibatch of 96 samples (sequences t1, t9, t12, t15, t23, …, t96) supplying the input features (96 x x(t)) and one-hot encoded labels (Y: 96 x 129 per sample, or per word in sequence)]

Loss:  cross_entropy_with_softmax(z, Y)
Error: classification_error(z, Y)

Choose a learner (SGD, Adam, Adagrad, etc.), then:
trainer = Trainer(model, (loss, error), learner)
trainer.train_minibatch({X, Y})

A fleshed-out loop is sketched below.
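Tied together as a minimal training-loop sketch (the reader and lr_schedule come from the earlier slides; num_minibatches is an assumed placeholder):

import cntk as C

z     = model()(x)                              # network from above
loss  = C.cross_entropy_with_softmax(z, y)
error = C.classification_error(z, y)

learner = C.sgd(z.parameters, lr_schedule)      # or Adam, Adagrad, ...
trainer = C.Trainer(z, (loss, error), [learner])

for _ in range(num_minibatches):                # num_minibatches: assumed
    data = reader.next_minibatch(96, input_map={
        x: reader.streams.features, y: reader.streams.labels})
    trainer.train_minibatch(data)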
Distributed Training

Prepare data; configure reader, network, learner (Python); then train, distributed:

mpiexec --np 16 --hosts server1,server2,server3,server4 \
    python my_cntk_script.py
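Inside the script, only the learner changes: it is wrapped in a distributed learner (a sketch; num_quantization_bits=32 means plain data-parallel SGD, while 1 enables the 1-bit SGD discussed later):

import cntk as C

local_learner = C.sgd(z.parameters, lr_schedule)
distributed_learner = C.train.distributed.data_parallel_distributed_learner(
    local_learner, num_quantization_bits=32)    # set to 1 for 1-bit SGD

trainer = C.Trainer(z, (loss, error), [distributed_learner])
# ... training loop as before; each MPI rank reads its own data shard ...
C.train.distributed.Communicator.finalize()     # required before the MPI job exits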
CNTK Deep Dive
CNTK Unique Features
- Symbolic loops over sequences with dynamic scheduling
- Turning the graph into a parallel program through minibatching
- Unique parallel training algorithms (1-bit SGD, Block Momentum)
Symbolic Loops over Sequential Data

Extend our example to a recurrent network (RNN):

h1(t) = σ(W1 x(t) + R1 h1(t-1) + b1)   →  h1 = sigmoid(x @ W1 + past_value(h1) @ R1 + b1)
h2(t) = σ(W2 h1(t) + R2 h2(t-1) + b2)  →  h2 = sigmoid(h1 @ W2 + past_value(h2) @ R2 + b2)
P(t)  = softmax(Wout h2(t) + bout)     →  P  = softmax(h2 @ Wout + bout)
ce(t) = y(t)^T log P(t)                →  ce = cross_entropy(P, y)
Σ_corpus ce(t) → max

past_value() accesses the previous time step; the script itself has no explicit notion of time.
[Diagram: the feed-forward graph from before, extended with delay nodes (z⁻¹) that feed h1 and h2 back through R1 and R2, forming cycles]

CNTK automatically unrolls the cycles at execution time; cycles are detected with Tarjan's algorithm. Efficient and composable.
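Written out by hand, the recurrence uses a placeholder that is later tied back to the layer's own output, forming the cycle (a sketch; the 943/150 dimensions are the assumed ATIS sizes from earlier):

import cntk as C

x  = C.sequence.input_variable(943)
W1 = C.parameter((943, 150), init=C.glorot_uniform())
R1 = C.parameter((150, 150), init=C.glorot_uniform())
b1 = C.parameter(150)

ph = C.placeholder(150)     # stands in for h1 until the loop is closed
h1 = C.sigmoid(C.times(x, W1) + C.times(C.past_value(ph), R1) + b1)
h1.replace_placeholders({ph: h1.output})   # close the cycle; CNTK detects it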
Batch-Scheduling of Variable-Length Sequences

Minibatches containing sequences of different lengths are automatically packed and padded. CNTK handles the special cases:
- the past_value operation correctly resets state and gradient at sequence boundaries
- non-recurrent operations just pretend there is no padding ("garbage-in/garbage-out")
- sequence reductions

[Diagram: parallel sequence slots, time steps computed in parallel; sequences 1-3 and 5-7 packed into slots, gray cells are padding]
Moreover, a short sequence (e.g. sequence 4) can be scheduled into an already-padded slot of the same minibatch, so its computation may come for free.
In short, batching is fully transparent:
- recurrent operations: CNTK unrolls the loop and handles sequence boundaries
- non-recurrent operations: computed in parallel across slots
- sequence reductions: padding is masked out

A small sketch of feeding such a minibatch follows.
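The effect is visible directly in the API: sequences of different lengths go into one minibatch with no manual padding (a toy sketch with assumed values):

import numpy as np
import cntk as C

x = C.sequence.input_variable(3)     # sequences of 3-dim vectors
s = C.sequence.reduce_sum(x)         # a sequence reduction; padding is masked out

# one minibatch holding sequences of length 2, 4 and 1
batch = [np.ones((2, 3), np.float32),
         np.ones((4, 3), np.float32),
         np.ones((1, 3), np.float32)]
print(s.eval({x: batch}))            # per-sequence sums: [2,2,2], [4,4,4], [1,1,1]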
Data-Parallel Training

Data-parallelism: distribute each minibatch over workers, then all-reduce the partial gradients.

[Diagram: a minibatch of sample frames is split into sub-minibatches across nodes 1-3; each node computes the sub-gradient of one layer, and the sub-gradients are summed (Σ) across nodes]

Ring all-reduce:
- step 1: each node owns aggregating one 1/K-th stripe of the gradient: (K-1) concurrent transfers of 1/K-th of the data
- step 2: the aggregated stripes are redistributed
- total communication is O(2 (K-1)/K · M), i.e. O(1) with respect to the number of nodes K
Data-Parallel Training

Is O(1) communication enough? Example: a DNN with 160M model parameters, minibatch size 1024:
- compute per minibatch: 1/7 second
- communication per minibatch: 1/9 second (640 MB over 6 GB/s)
One can't even parallelize to 2 GPUs: the communication cost already dominates!
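The arithmetic behind those numbers, as a quick check (assuming 4-byte gradient values):

params     = 160e6              # model parameters
grad_bytes = params * 4         # = 640 MB of gradients per exchange
bandwidth  = 6e9                # 6 GB/s
print(grad_bytes / bandwidth)   # ~0.107 s, i.e. about 1/9 second per minibatch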
Data-Parallel Training

How to reduce communication cost:

Communicate less each time:
- 1-bit SGD [F. Seide, H. Fu, J. Droppo, G. Li, D. Yu: "1-Bit Stochastic Gradient Descent ... Distributed Training of Speech DNNs", Interspeech 2014]
- quantize gradients to 1 bit per value; the trick is to carry the quantization error over to the next minibatch (sketched below)

Communicate less often:
- automatic minibatch sizing [F. Seide, H. Fu, J. Droppo, G. Li, D. Yu: "On Parallelizability of Stochastic Gradient Descent ...", ICASSP 2014]
- Block Momentum [K. Chen, Q. Huo: "Scalable training of deep learning machines by incremental block training ...", ICASSP 2016]: a very recent, very effective parallelization method that combines model averaging with the error-residual idea
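The error-residual idea behind 1-bit SGD, as a toy numpy sketch (an illustration of the principle only, not CNTK's actual implementation):

import numpy as np

def one_bit_quantize(grad, residual):
    g = grad + residual                  # carry over the previous quantization error
    scale = np.mean(np.abs(g))           # one reconstruction value per tensor (simplified)
    q = np.where(g >= 0, scale, -scale)  # 1 bit per value: only signs are communicated
    return q, g - q                      # new residual = what quantization lost

residual = np.zeros(4, np.float32)
grad = np.array([0.5, -0.2, 0.1, -0.8], np.float32)
q, residual = one_bit_quantize(grad, residual)   # q is all-reduced; residual stays local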
CNTK C# and R Bindings
C# API Overview

A basic C# training API was added in the CNTK 2.2 release:
- supports building DNNs
- supports DNN training
- supports DNN evaluation
- end-to-end examples in C#
CNTK C# API Highlights

Over 100 basic operations to build a computation network: Sigmoid, Tanh, ReLU, Plus, Minus, Convolution, Pooling, BatchNormalization, …

For example, a logistic-regression loss function:
Function z = CNTKLib.Times(weightParam, input) + biasParam;
Function loss = CNTKLib.CrossEntropyWithSoftmax(z, labelVariable);

A CNTK Function is the primitive element for building a DNN through basic-operation composition. For example, to build a ResNet node:
Function conv = CNTKLib.Pooling(CNTKLib.Convolution(convParam, input), PoolingType.Average, poolingWindowShape);
Function resNetNode = CNTKLib.ReLU(CNTKLib.Plus(conv, input));
CNTK C# API Highlights (cont.)

Batching support: provides MinibatchSource and MinibatchData to help with data loading and batching.

Training support: supports the SGD optimizers commonly seen in the DNN literature: MomentumSGDLearner, AdamLearner, AdaGradLearner, etc. For example, to train a model with the Adam stochastic optimizer:

var parameterLearners = new List<Learner>() {
    Learner.AdamLearner(classifierOutput.Parameters(), learningRate, momentum) };
var trainer = Trainer.CreateTrainer(classifierOutput, trainingLoss, prediction, parameterLearners);

Evaluation support.
Handwriting Data Classification (MNIST)

Build a convolution model:
Function conv = CNTKLib.ReLU(CNTKLib.Convolution(convParams, features, strides));
Function pooling = CNTKLib.Pooling(conv, PoolingType.Max, poolingWindow, stride, padding);
Function classifier = TestHelper.Dense(pooling, numClasses, device, Activation.None);

Prepare data:
var minibatchSource = MinibatchSource.TextFormatMinibatchSource("Train_cntk_text.txt",
    streamConfigurations, MinibatchSource.InfinitelyRepeat);
Handwriting Data Classification (MNIST), cont.

Training (one iteration):
var minibatchData = minibatchSource.GetNextMinibatch(minibatchSize, device);
var arguments = new Dictionary<Variable, MinibatchData>
{
    { input, minibatchData[featureStreamInfo] },
    { labels, minibatchData[labelStreamInfo] }
};
trainer.TrainMinibatch(arguments, device);
CNTK C# API Examples

More examples:
- CifarResNetClassifier: shows how to do image classification using ResNet
- TransferLearning: demonstrates transfer learning using a pretrained ResNet model
- LSTMSequenceClassifier: shows how to build and train a recurrent neural network model
CNTK R Binding

devtools::install_github("Microsoft/CNTK-R")

Comprehensive coverage of CNTK functionality:
- readers (CTF, Image, HTK)
- layers library (fully-connected, convolution, pooling, dropout, LSTM, GRU, …)
- loss functions, normalizers
- optimizers (including 1-bit SGD)
- model saving and loading

Exposed via idiomatic R functional APIs:
- interop with R objects and packages like ggplot
- fluent syntax for constructing layers based on magrittr

High performance, including support for multi-node and multi-GPU.
Implementation

The R bindings are based on the reticulate package, which enables importing and calling Python functions from R:
- ensures full parity with the functionality of the Python bindings (such as the layers library)
- enables interop between R objects and NumPy
- however, requires a full installation of the Python bindings
- converting back and forth between R and Python objects is expensive, but not an issue if only done at the beginning and end of training, or when using the native Python readers and deserializers
CIFAR-10 Training Example

1. Import libraries and define the reader function.
2. Define feature and label variables in the graph. Low-level operations start with an "op" prefix; functional notation is used instead of operator overloading. Normalize the inputs.
3. Define the network with four convolution and two max-pooling layers using the Layers library; note the use of CNTK's "For" loop construct.
4. Define the loss function and initialize the learning algorithm; the z$parameters notation accesses properties of the underlying Python object.
5. Run the training algorithm until convergence; note the fluent syntax using the magrittr pipe notation, "%>%".
6. Plot convergence using ggplot.
Future Work

More examples and vignettes, particularly for text and NLP.
Tighter R integration:
- formula objects
- data frames and matrices
- integration of trained models with the summary and predict generic functions
Integration with TensorBoard.
66
More Information Source code Vignettes and documentation
Vignettes and documentation Issues and feedback:
CNTK Moving Forward
Open Neural Network Exchange (ONNX)

An open-source intermediate representation (IR) of the computation graph, with defined common ops and their semantics:
- released on Sep. 7, 2017
- a collaboration between Microsoft and Facebook
- a shared library, with a Caffe2 example as reference
- permissive MIT license and no patents
ONNX Motivation

- Allow interoperability between frameworks, starting with CNTK, Caffe2 and PyTorch
- Allow hardware vendors to focus on one IR in their backend optimizations
- Allow training in one toolkit and deploying in another (sketched below)
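In CNTK this later surfaced as an ONNX model format for save and load (a sketch based on the API added after this talk, in CNTK 2.3.1+):

import cntk as C

z.save('model.onnx', format=C.ModelFormat.ONNX)                 # export a trained model
z2 = C.Function.load('model.onnx', format=C.ModelFormat.ONNX)   # import it back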
ONNX Vision
Current Status

V1 released on GitHub, focused on the basics:
- supports only inference; no loop and no RNN yet
- Caffe2 and PyTorch support now; CNTK support in early October

In V2, we expect to add:
- RNN support
- loop and control flow
- a model zoo

We look forward to more support from other frameworks.
CNTK Upcoming New Features

Model optimization and codegen for fast inference:
- model compression via SVD, quantization, etc.
- fast inference via codegen

C# API for training:
- low-level C# API released in v2.2 (Sep. 2017)
- high-level API design in progress
CNTK Upcoming New Features (cont.)

CrossTalk: porting models created in other toolkits (Caffe, Caffe2 and TensorFlow) to CNTK, using ONNX as the intermediate representation.

Dynamic graph support: data-dependent graphs, ease of debugging.

More examples, better documentation.
Conclusions
Conclusions

Performance:
- speed: faster than other toolkits, 5-10x faster on recurrent networks
- scalability: a few lines of change to scale to thousands of GPUs
- built-in readers: efficient distributed reading

Programmability:
- powerful C++ library for enterprise users
- intuitive and performant Python APIs
- supports C#/.NET, UWP, R, Java, etc.
- extensible via custom layers, learners, readers, etc.
Thank You!
https://github.com/Microsoft/CNTK