MXNet Internals Cyrus M. Vahid, Principal Solutions Architect,


MXNet Internals Cyrus M. Vahid, Principal Solutions Architect @ AWS Deep Learning cyrusmv@amazon.com June 2017

Computational Dependency [Figure: a computation graph over inputs x, y, a, b, constant inputs 1, and scalar λ.] z = x · y, k = a · b, t = λz + k.
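A minimal NDArray sketch of the same graph (the values and λ = 0.5 are made up for illustration): z and k have no read/write dependency on each other, so MXNet's dependency engine is free to execute them in parallel, while t must wait for both.

import mxnet as mx

# Leaf inputs of the graph; the values are chosen only for illustration.
x, y = mx.nd.array([2.0]), mx.nd.array([3.0])
a, b = mx.nd.array([4.0]), mx.nd.array([5.0])
lam = 0.5

# z and k do not depend on each other, so the runtime dependency engine
# may schedule these two (asynchronous) operations in parallel.
z = x * y          # z = x . y
k = a * b          # k = a . b

# t reads both z and k, so it is scheduled only after both have finished.
t = lam * z + k
print(t.asnumpy())  # [23.]  (0.5 * 6 + 20)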

Execution Dependency in Matrix Operations For n × n matrices X = (x_{i,j}) and Y = (y_{i,j}), each element of the product is (XY)_{i,j} = Σ_k x_{i,k} · y_{k,j}, so every output element depends on a full row of X and a full column of Y.
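A small hypothetical sketch of the same point using NDArray: the (i, j) element of the product is a reduction over a whole row of X and a whole column of Y, so it is ready only once all of those inputs are.

import mxnet as mx

n = 3
X = mx.nd.arange(n * n).reshape((n, n))
Y = mx.nd.arange(n * n).reshape((n, n)) + 1.0

# Full product: every output element depends on a whole row of X
# and a whole column of Y.
Z = mx.nd.dot(X, Y)

# Element (i, j) written out explicitly as the reduction sum_k x[i,k] * y[k,j].
i, j = 1, 2
z_ij = mx.nd.sum(X[i] * mx.nd.transpose(Y)[j])
print(Z.asnumpy()[i, j], z_ij.asscalar())  # the two values agree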

MXNet Architecture

Modules and Components
Runtime Dependency Engine: schedules and executes operations according to their read/write dependencies.
Storage Allocator: efficiently allocates and recycles memory blocks on the host (CPU) and devices (GPUs).
Resource Manager: manages global resources, such as the random number generator and temporary space.
NDArray: dynamic, asynchronous n-dimensional arrays, which provide flexible imperative programming in MXNet.
Symbolic Execution: static symbolic graph executor, which provides efficient symbolic graph execution and optimization.
Operator: operators that define static forward and gradient calculation (backprop).
SimpleOp: operators that extend NDArray operators and symbolic operators in a unified fashion.
Symbol Construction: provides a way to construct a computation graph (net configuration).
KVStore: key-value store interface for efficient parameter synchronization.
Data Loading (IO): efficient distributed data loading and augmentation.

MXNet Basics
NDArray: manipulate multi-dimensional arrays in an imperative, command-by-command paradigm.
Symbol: symbolic expressions for neural networks (declarative).
Module: intermediate-level and high-level interfaces for neural network training and inference.
Loading Data: feeding data into training/inference programs.
Mixed Programming: training algorithms developed using NDArrays in concert with Symbols.

NDArray The intention is to replicate NumPy's API, optimized for execution on GPUs. It provides matrix and tensor operations. API docs are located here. Tutorials are here.
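A minimal imperative sketch (the array contents are arbitrary); passing ctx=mx.gpu(0) instead of mx.cpu() places the arrays on a GPU.

import mxnet as mx

# Imperative, NumPy-like usage; use ctx=mx.gpu(0) to place arrays on a GPU.
a = mx.nd.ones((2, 3), ctx=mx.cpu())
b = mx.nd.full((2, 3), 2.0)

c = a * b + 1                          # element-wise operations
d = mx.nd.dot(c, mx.nd.transpose(c))   # matrix product

print(c.shape, d.asnumpy())            # .asnumpy() copies the result back as a NumPy array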

Symbols A symbol represents a multi-output symbolic expression. Symbols are composed from operators, such as simple matrix operations or a neural network layer. An operator can take several input variables, produce more than one output variable, and maintain internal state variables. A variable can be either free, in which case we bind it to a value later, or the output of another symbol. Tutorial is here.
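A minimal declarative sketch (the layer names and sizes are arbitrary): the free variable data is only bound to concrete values later, e.g. by an executor or a module.

import mxnet as mx

# A free variable; concrete values are bound to it later.
data = mx.sym.Variable('data')

# Compose operators into a small multi-layer perceptron.
fc1  = mx.sym.FullyConnected(data=data, num_hidden=64, name='fc1')
act1 = mx.sym.Activation(data=fc1, act_type='relu', name='relu1')
fc2  = mx.sym.FullyConnected(data=act1, num_hidden=10, name='fc2')
net  = mx.sym.SoftmaxOutput(data=fc2, name='softmax')

# list_arguments shows every free variable (data, weights, biases, label)
# that must be bound before the graph can execute.
print(net.list_arguments())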

Symbols vs NDArray Both are made to deliver multi-dimensional array operators.
Symbol: declarative; hard to debug; complicated to work with; ANN-related plus tensor operations; automatic differentiation; easy to build complex computations; easy to save, load, and visualize; back-end optimization.
NDArray: imperative; easy to debug; easy to work with; provides tensor operations; no pre-defined differentiation; complex computations must be developed by hand; no back-end optimization.

Modules Commonly used code for training and inference is modularized in the module API. We first create a network using the symbol API, then create a module by passing the symbol, context, data variable names, and label variable names. Then, using module.fit, we can train a model as in the tutorial.
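A hedged end-to-end sketch of that flow on toy random data (all sizes, names, and hyperparameters are purely illustrative):

import numpy as np
import mxnet as mx

# Toy dataset: 100 examples with 20 features and 10 classes.
X = np.random.rand(100, 20).astype('float32')
y = np.random.randint(0, 10, (100,))
train_iter = mx.io.NDArrayIter(X, y, batch_size=10, shuffle=True)

# 1. Create a network with the symbol API (a small MLP ending in SoftmaxOutput).
data = mx.sym.Variable('data')
fc1  = mx.sym.FullyConnected(data=data, num_hidden=64)
act1 = mx.sym.Activation(data=fc1, act_type='relu')
fc2  = mx.sym.FullyConnected(data=act1, num_hidden=10)
net  = mx.sym.SoftmaxOutput(data=fc2, name='softmax')

# 2. Create a module, passing the symbol, context, data names, and label names.
mod = mx.mod.Module(symbol=net, context=mx.cpu(),
                    data_names=['data'], label_names=['softmax_label'])

# 3. Train the model with module.fit.
mod.fit(train_iter,
        optimizer='sgd',
        optimizer_params={'learning_rate': 0.1},
        eval_metric='acc',
        num_epoch=2)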

Modules and checkpoints After each epoch, mini-batch, or evaluation we can save the outcome of training in checkpoints using callbacks. This lets us stop training if a model stops converging and simply pick the outcome of the best epoch. We can load a model from a saved checkpoint using load_checkpoint.
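A sketch of checkpointing, reusing the module and iterator from the previous example (the prefix 'mymodel' and the epoch number are arbitrary):

import mxnet as mx

# Save the symbol and parameters at the end of every epoch under the prefix 'mymodel'.
checkpoint = mx.callback.do_checkpoint('mymodel', period=1)
mod.fit(train_iter, num_epoch=5, epoch_end_callback=checkpoint)

# Later: pick the best epoch (say epoch 3) and load it back.
sym, arg_params, aux_params = mx.model.load_checkpoint('mymodel', 3)

new_mod = mx.mod.Module(symbol=sym, context=mx.cpu(),
                        data_names=['data'], label_names=['softmax_label'])
new_mod.bind(data_shapes=train_iter.provide_data,
             label_shapes=train_iter.provide_label)
new_mod.set_params(arg_params, aux_params)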

Loading Data with Data Iterators Training and inference modules in MXNet accept data iterators, which simplify this procedure, especially when reading large datasets from the filesystem. A data iterator reads data batch by batch and is used to load data into symbols. MXNet data iterators are similar to Python iterators: each call to next returns a batch of data, and when next is called at the end of the array a StopIteration exception is raised.
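A small sketch of that behaviour with an in-memory NDArrayIter (toy data, batch size 3):

import numpy as np
import mxnet as mx

data  = np.arange(12).reshape((6, 2)).astype('float32')
label = np.zeros((6,))
it = mx.io.NDArrayIter(data, label, batch_size=3)

# Like a Python iterator, each next() call returns one DataBatch ...
batch = it.next()
print(batch.data[0].shape)   # (3, 2)

# ... and StopIteration is raised once the data is exhausted.
try:
    while True:
        it.next()
except StopIteration:
    it.reset()               # rewind so the iterator can be read again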

Data Batch Iterators operate on data. data is a list of NDArray, each of which has a first dimension of length n (the batch size). Example: a batch of RGB images of size 224 x 224 has array shape (n, 3, 224, 224). label is a list of NDArray, each of which is often one-dimensional with shape (n,). pad is an integer showing how many examples in the batch are merely used for padding and should be ignored in the results.
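A sketch inspecting those three fields, using 10 fake RGB images with batch size 4 so that the final batch is padded (all sizes are illustrative):

import numpy as np
import mxnet as mx

# 10 fake RGB "images" of size 224 x 224; with batch size 4 the last batch
# is short by 2 examples, so it is padded and its pad field equals 2.
images = np.zeros((10, 3, 224, 224), dtype='float32')
labels = np.arange(10)
it = mx.io.NDArrayIter(images, labels, batch_size=4)

for batch in it:
    print([d.shape for d in batch.data],    # [(4, 3, 224, 224)]  i.e. (n, 3, 224, 224)
          [l.shape for l in batch.label],   # [(4,)]
          batch.pad)                        # 0, 0, 2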

MXNet Data Iterators
io.NDArrayIter: iterating on either mx.nd.NDArray or numpy.ndarray.
io.CSVIter: iterating on CSV files.
io.ImageRecordIter: iterating on image RecordIO files.
io.ImageRecordUInt8Iter: iterator for a dataset packed in RecordIO.
io.MNISTIter: iterating on the MNIST dataset.
recordio.MXRecordIO: read/write RecordIO-format data.
recordio.MXIndexedRecordIO: read/write RecordIO-format data supporting random access.
image.ImageIter: image data iterator with a large number of augmentation choices.
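For example, a hedged CSVIter sketch (the two CSV files are written here only so the snippet is self-contained; names and shapes are arbitrary):

import numpy as np
import mxnet as mx

# Write two tiny CSV files so the example can run on its own.
np.savetxt('data.csv',  np.random.rand(100, 3), delimiter=',')
np.savetxt('label.csv', np.random.randint(0, 2, (100, 1)), delimiter=',')

# data_shape / label_shape describe a single example; the iterator streams batches.
csv_iter = mx.io.CSVIter(data_csv='data.csv', data_shape=(3,),
                         label_csv='label.csv', label_shape=(1,),
                         batch_size=25)

for batch in csv_iter:
    print(batch.data[0].shape)   # (25, 3)
    break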

Building a Data Iterator An iterator should:
Return a data batch, or raise a StopIteration exception when reaching the end.
Have a reset() method to restart reading from the beginning.
Have provide_data and provide_label attributes: provide_data returns a list of (str, tuple) pairs storing each data variable's name and shape, and provide_label does the same for the input labels.

Building a Data Iterator
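A minimal sketch of such a custom iterator over random toy data (the names, sizes, and use of random data are all illustrative):

import numpy as np
import mxnet as mx

class SimpleIter(mx.io.DataIter):
    """Toy iterator that follows the three rules above."""

    def __init__(self, num_batches=10, batch_size=4, feature_dim=20):
        super(SimpleIter, self).__init__()
        self.num_batches = num_batches
        self.batch_size = batch_size
        self.feature_dim = feature_dim
        self.cur_batch = 0

    @property
    def provide_data(self):
        # (name, shape) pairs for the data variables
        return [('data', (self.batch_size, self.feature_dim))]

    @property
    def provide_label(self):
        # (name, shape) pairs for the label variables
        return [('softmax_label', (self.batch_size,))]

    def reset(self):
        # restart reading from the beginning
        self.cur_batch = 0

    def next(self):
        # return one DataBatch per call, or signal the end with StopIteration
        if self.cur_batch >= self.num_batches:
            raise StopIteration
        self.cur_batch += 1
        data = [mx.nd.array(np.random.rand(self.batch_size, self.feature_dim))]
        label = [mx.nd.array(np.random.randint(0, 10, (self.batch_size,)))]
        return mx.io.DataBatch(data=data, label=label, pad=0)

it = SimpleIter()
print(it.next().data[0].shape)   # (4, 20)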

Imperative vs Symbolic Programming
Imperative: execution flow is the same as the flow of the code. Flexible but inefficient. Memory: 4 * 10 * 8 = 320 bytes. Interim values are available. No operation folding. Familiar coding paradigm.
Symbolic: abstract functions are defined and compiled first; data binding happens next. Efficient. Memory: 2 * 10 * 8 = 160 bytes. Interim values are not available. Operation folding: multiple operations are folded into one, so we run one op instead of many on the GPU. This is possible because we have access to the whole computation graph.
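A side-by-side sketch of the two styles computing the same expression (the values are arbitrary); in the symbolic version nothing runs until data is bound and forward is called, which is what gives the backend room to fold operations and reuse memory:

import mxnet as mx

# Imperative: every statement executes immediately; the interim value c
# is available for inspection, at the cost of keeping it in memory.
a = mx.nd.ones((10,))
b = mx.nd.ones((10,)) * 2
c = b * a               # interim value, can be printed right away
d = c + 1
print(d.asnumpy())

# Symbolic: declare the whole computation first, bind data afterwards.
# Because the full graph is known, the backend can fold b*a + 1 into one op
# and reuse the memory of intermediate results.
A = mx.sym.Variable('A')
B = mx.sym.Variable('B')
D = B * A + 1           # nothing is computed yet

executor = D.bind(ctx=mx.cpu(), args={'A': a, 'B': b})
executor.forward()
print(executor.outputs[0].asnumpy())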

Mixed Programming MXNet permits mixing both styles in your code. Module abstracts away the need for many of the symbolic operations and provides functions that simply run within your imperative flow.
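A hedged sketch of that mix: the network is declared symbolically, wrapped in a Module, and the inputs and outputs are handled imperatively as NDArrays (all names and sizes are illustrative):

import numpy as np
import mxnet as mx

# Declarative part: the network is a symbol.
data = mx.sym.Variable('data')
fc   = mx.sym.FullyConnected(data=data, num_hidden=10)
net  = mx.sym.SoftmaxOutput(data=fc, name='softmax')

mod = mx.mod.Module(symbol=net, context=mx.cpu(),
                    data_names=['data'], label_names=None)
mod.bind(data_shapes=[('data', (4, 20))], for_training=False)
mod.init_params()

# Imperative part: feed an NDArray through the module and keep working on the
# result with ordinary NDArray operations.
x = mx.nd.array(np.random.rand(4, 20))
mod.forward(mx.io.DataBatch(data=[x]), is_train=False)
probs = mod.get_outputs()[0]          # NDArray of class probabilities
pred  = mx.nd.argmax(probs, axis=1)   # imperative op applied to the symbolic output
print(pred.asnumpy())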

Cyrus M. Vahid cyrusmv@amazon.com