Deep Learning (some slides are from Prof. Andrew Ng of Stanford)

Training set: the feature extraction problem

Object detection

Raw image

Convolution A 3x3 or 5x5 filter is slid over the raw image to produce a feature map.

Activation map or feature map

Learning filters A convolutional neural network learns the values of these filters on its own during the training process. The practitioner specifies hyperparameters such as the number of filters, the filter size, and the architecture of the network. The number of filters is called the depth. The more filters we have, the more image features get extracted and the better our network becomes at recognizing patterns in unseen images.
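To make the filter and feature-map idea concrete, here is a minimal sketch, assuming NumPy, of sliding a hand-made 3x3 filter over a toy image; in a real CNN the filter values would be learned during training rather than set by hand.

```python
# Minimal sketch (NumPy assumed) of sliding a 3x3 filter over a grayscale
# image to produce a feature map; in a CNN the filter values are learned.
import numpy as np

def convolve2d(image, kernel):
    """Valid-mode 2-D 'convolution' (cross-correlation, as CNNs compute it)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    feature_map = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i:i + kh, j:j + kw]
            feature_map[i, j] = np.sum(patch * kernel)
    return feature_map

image = np.random.rand(28, 28)                 # toy "raw image"
edge_filter = np.array([[-1., 0., 1.],
                        [-1., 0., 1.],
                        [-1., 0., 1.]])        # hand-made vertical-edge filter
print(convolve2d(image, edge_filter).shape)    # (26, 26) activation / feature map
```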

Subsampling, down-sampling, or pooling Pooling achieves dimensionality reduction. Stride is the number of pixels by which we slide our filter matrix over the input matrix.
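A pooling sketch under the same NumPy assumption: a 2x2 max-pooling window moved with stride 2 halves each spatial dimension of the feature map.

```python
# Sketch (NumPy assumed) of 2x2 max pooling with stride 2 over a feature map.
import numpy as np

def max_pool(feature_map, size=2, stride=2):
    out_h = (feature_map.shape[0] - size) // stride + 1
    out_w = (feature_map.shape[1] - size) // stride + 1
    pooled = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            window = feature_map[i * stride:i * stride + size,
                                 j * stride:j * stride + size]
            pooled[i, j] = window.max()    # keep only the strongest response
    return pooled

fmap = np.random.rand(26, 26)
print(max_pool(fmap).shape)   # (13, 13): dimensionality reduced by pooling
```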

Convolutional Neural Networks

Advantages of CNNs Well suited to character recognition and natural images: they find edges, corners, endpoints, and other local 2-D structures, and they provide translation invariance. Convolution and sub-sampling layers are interleaved, and sub-sampling smooths the data. The exact position of a detected feature is not important, but the relative positions can be, and these are captured by later layers. Filter and pooling window sizes such as 5x5 and 4x4 are common.
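A minimal sketch of the interleaved convolution and sub-sampling layers described above, assuming TensorFlow/Keras is available; the layer sizes and the 28x28 character-recognition input are illustrative choices, not taken from the slides.

```python
# Sketch (TensorFlow/Keras assumed) of interleaved convolution and
# sub-sampling layers for character recognition on 28x28 images.
import tensorflow as tf
from tensorflow.keras import layers

model = tf.keras.Sequential([
    layers.Conv2D(8, 5, activation='relu', input_shape=(28, 28, 1)),  # 5x5 filters
    layers.MaxPooling2D(2),                    # sub-sampling smooths / shrinks
    layers.Conv2D(16, 5, activation='relu'),
    layers.MaxPooling2D(2),
    layers.Flatten(),
    layers.Dense(10, activation='softmax'),    # e.g. 10 digit classes
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')
```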

Computer vision: Identify coffee mug

Why is computer vision hard?

Learning from tagged data (supervised)

Why deep learning? Deep Learning uses a neural network with several layers. The sequence of layers identifies features in stages like our brains seem to. On image and speech data, it often performs better than other methods

Building huge neural networks

AlexNet Alex Krizhevsky of the University of Toronto won the 2012 ImageNet computer image recognition competition. The network had 5 convolutional layers, 60 million parameters, and 650,000 neurons. It was trained on 1 million training images for a week on two NVIDIA GPUs, and used hidden-unit dropout to reduce overfitting.
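A rough Keras sketch of an AlexNet-style network, assuming TensorFlow is installed; the layer sizes follow the commonly cited AlexNet configuration, while details such as local response normalization and the original two-GPU split are omitted, so the parameter count only roughly matches the 60 million figure.

```python
# Rough sketch (TensorFlow/Keras assumed) of an AlexNet-style network.
import tensorflow as tf
from tensorflow.keras import layers

alexnet_like = tf.keras.Sequential([
    layers.Conv2D(96, 11, strides=4, activation='relu',
                  input_shape=(227, 227, 3)),                  # conv layer 1
    layers.MaxPooling2D(3, strides=2),
    layers.Conv2D(256, 5, padding='same', activation='relu'),  # conv layer 2
    layers.MaxPooling2D(3, strides=2),
    layers.Conv2D(384, 3, padding='same', activation='relu'),  # conv layer 3
    layers.Conv2D(384, 3, padding='same', activation='relu'),  # conv layer 4
    layers.Conv2D(256, 3, padding='same', activation='relu'),  # conv layer 5
    layers.MaxPooling2D(3, strides=2),
    layers.Flatten(),
    layers.Dense(4096, activation='relu'),
    layers.Dropout(0.5),        # hidden-unit dropout to reduce overfitting
    layers.Dense(4096, activation='relu'),
    layers.Dropout(0.5),
    layers.Dense(1000, activation='softmax'),   # 1000 ImageNet classes
])
alexnet_like.summary()          # prints a parameter count around 60 million
```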

DNN 2015 Using deep learning, Google and Microsoft both beat the best human score in the ImageNet challenge. Microsoft and the University of Science and Technology of China announced a DNN that achieved IQ test scores at the college post-graduate level. Baidu announced that a deep learning system called Deep Speech 2 had learned both English and Mandarin. Deep learning had achieved superhuman levels of perception for this challenge.

Deep Learning Overview Train networks with many layers (vs. shallow nets with just a couple of layers). Multiple layers work together to build an improved feature space: the first layer learns 1st-order features (e.g. edges), and the 2nd layer learns higher-order features (combinations of first-layer features, combinations of edges, etc.). In many models the layers first learn in an unsupervised mode and discover general features of the input space, which can serve multiple tasks related to the unsupervised instances (image recognition, etc.). The final-layer features are then fed into supervised layer(s), and the entire network is often subsequently fine-tuned with supervised training, starting from the weights learned in the unsupervised phase. Fully supervised versions are also possible (as in early backpropagation attempts).

Learning from tagged data

AI will transform the internet

Deep network training We have always had good algorithms for learning the weights in networks with one hidden layer, but these algorithms are not good at learning the weights for networks with many hidden layers. What's new: algorithms for training many-layer networks.

Handwritten digits

What is this unit doing?

Hidden layer units become self-organised feature detectors. (Diagram: the weight from each input pixel into one hidden unit; a few weights are strongly positive and the rest are low or zero.)

What does this unit detect? Its strong positive weights lie on the pixels of the top row, with low/zero weights everywhere else, so it will send a strong signal for a horizontal line in the top row and ignore everything else.

What does this unit detect? (Same kind of weight diagram, with the strong positive weights in a different place.)

What does this unit detect? Its strong positive weights are clustered on the top-left pixels, so it gives a strong signal for a dark area in the top left corner.
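A toy numeric illustration of such a unit, assuming NumPy; the 8x8 image size is an assumption for illustration. Strong positive weights on the top row respond to a line there and ignore the same line placed elsewhere.

```python
# Toy sketch (NumPy assumed) of a hidden unit whose only strong positive
# weights cover the top row of an 8x8 input image.
import numpy as np

weights = np.zeros((8, 8))
weights[0, :] = 1.0                                  # strong +ve weights, top row

top_line = np.zeros((8, 8)); top_line[0, :] = 1.0    # horizontal line in top row
mid_line = np.zeros((8, 8)); mid_line[4, :] = 1.0    # same line lower down

print(np.sum(weights * top_line))   # 8.0 -> strong signal
print(np.sum(weights * mid_line))   # 0.0 -> ignored everywhere else
```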

What features might you expect a good NN to learn, when trained with data like this?

Vertical lines

Horizontal lines

Small circles

But what about position invariance? Our example unit detectors were tied to specific parts of the image.

Successive layers can learn higher-level features. 1st layer: detect lines in specific positions. 2nd layer: horizontal line, vertical line, upper loop, etc.

What does this unit detect? (The same diagram as above, with the question now asked of a higher-layer unit.)

Layers in brain

New way to train MLP

Train the network one layer at a time: train the first layer first, then the next layer, then the next, then the next, and finally the last layer.

EACH of the (non-output) layers is trained to be an auto-encoder. Basically, it is forced to learn good features that describe what comes from the previous layer

Auto-encoding Unsupervised training in which the target output equals the input (an identity mapping). By making this happen with fewer hidden units, the hidden layer units are forced to become good feature detectors. The Restricted Boltzmann Machine is a closely related model used in the same role.
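A minimal auto-encoder sketch, assuming TensorFlow/Keras; the 784-pixel input and 64-unit bottleneck are illustrative choices, not taken from the slides.

```python
# Sketch (TensorFlow/Keras assumed) of an auto-encoder: the target output is
# the input itself, and the smaller hidden layer must learn good features.
import tensorflow as tf
from tensorflow.keras import layers

inputs = tf.keras.Input(shape=(784,))                     # e.g. flattened 28x28 digit
code = layers.Dense(64, activation='relu')(inputs)        # fewer units: bottleneck
decoded = layers.Dense(784, activation='sigmoid')(code)   # reconstruct the input

autoencoder = tf.keras.Model(inputs, decoded)
autoencoder.compile(optimizer='adam', loss='mse')
# autoencoder.fit(x_train, x_train, ...)   # input = output (identity mapping)
```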

Deep auto-encoding A deep auto-encoder often performs dimensionality reduction better than principal component analysis.

Stacked Auto-Encoders Stack many (sparse) auto-encoders in succession and train them using greedy layer-wise training, dropping the decoder output layer each time.
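A hedged sketch of greedy layer-wise training of stacked auto-encoders, again assuming TensorFlow/Keras; the layer sizes and the random stand-in data are illustrative only.

```python
# Sketch (TensorFlow/Keras assumed) of greedy layer-wise training: train one
# auto-encoder, drop its decoder, encode the data, train the next on the codes.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

def train_autoencoder(data, n_hidden, epochs=5):
    inp = tf.keras.Input(shape=(data.shape[1],))
    code = layers.Dense(n_hidden, activation='relu')(inp)
    out = layers.Dense(data.shape[1], activation='sigmoid')(code)
    ae = tf.keras.Model(inp, out)
    ae.compile(optimizer='adam', loss='mse')
    ae.fit(data, data, epochs=epochs, verbose=0)   # input = output
    return tf.keras.Model(inp, code)               # keep encoder, drop decoder

x = np.random.rand(1000, 784).astype('float32')    # stand-in for real data
enc1 = train_autoencoder(x, 256)                   # first layer
enc2 = train_autoencoder(enc1.predict(x), 64)      # next layer, trained on codes
```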

Face recognition

Dropout: overfitting avoidance Very common with current deep networks. The network cannot overfit to one particular structure, which forces it to regularize. For each training instance, drop a node (hidden or input) and its connections with probability p, and train; the final net keeps all the weights, averaged (actually scaled by 1-p), as if ensembling 2^n different network substructures. Variants: DropConnect randomly drops connections instead of units; Shakeout, instead of randomly discarding units as dropout does during training, randomly chooses to enhance or reverse each unit's contribution to the next layer; others include DropIn, Standout, etc.
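A toy sketch of the dropout bookkeeping described above, assuming NumPy; the drop probability p = 0.5 and the layer size are illustrative.

```python
# Toy sketch (NumPy assumed) of dropout: during training each unit is dropped
# with probability p; at test time the full net is used, scaled by 1-p.
import numpy as np

p = 0.5
h = np.random.rand(10)                       # activations of one hidden layer

# training: drop each unit (and its connections) with probability p
mask = (np.random.rand(10) >= p).astype(float)
h_train = h * mask

# test: keep all units but scale by 1-p, approximating an average over the
# 2**n thinned sub-networks sampled during training
h_test = h * (1 - p)
```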

Weaknesses of plain CNNs Plain nets built by simply stacking 3x3 convolution layers stop improving with depth: a 56-layer net has higher training error and higher test error than a 20-layer net.

Google's Artificial Brain A neural network of 16,000 computer processors with one billion connections and 20,000 output neurons was trained on 10 million randomly selected YouTube video thumbnails over the course of three days. It reached 81.7% accuracy in detecting human faces, 76.7% accuracy in identifying human body parts, 74.8% accuracy in identifying cats, and 15.8% accuracy in recognizing 20,000 object categories.

Residual Network A residual is like the difference between an original image and a changed image: the base information is preserved, and the network only has to deal with the residual perturbation.

Residual Network Deeper ResNets have lower training error
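A minimal sketch of one residual block, assuming TensorFlow/Keras; the filter count and input shape are illustrative.

```python
# Sketch (TensorFlow/Keras assumed) of a basic residual block: the identity
# shortcut preserves the base information, and the stacked 3x3 convolutions
# only have to model the residual.
import tensorflow as tf
from tensorflow.keras import layers

def residual_block(x, filters=64):
    shortcut = x                                   # preserved base information
    y = layers.Conv2D(filters, 3, padding='same', activation='relu')(x)
    y = layers.Conv2D(filters, 3, padding='same')(y)
    y = layers.Add()([shortcut, y])                # output = input + residual
    return layers.Activation('relu')(y)

inputs = tf.keras.Input(shape=(32, 32, 64))
outputs = residual_block(inputs)
block = tf.keras.Model(inputs, outputs)
```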

Results Deep ResNets can be trained without difficulty. Deeper ResNets have lower training error, and also lower test error.

Results 1st place in all five main tracks of the ILSVRC & COCO 2015 competitions: ImageNet Classification, ImageNet Detection, ImageNet Localization, COCO Detection, and COCO Segmentation.

Deep net tools

user interface (UI)

Google's TensorFlow Nodes represent operations, edges represent the flow of data, and the data are tensors. A tensor of rank n is represented by an n-dimensional array. TensorFlow is the flow of tensors (arrays) through a computational graph.
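A tiny sketch, assuming TensorFlow 2.x, in which tf.function traces Python code into a computational graph of operations through which tensors flow.

```python
# Sketch (TensorFlow 2.x assumed): operations are graph nodes, tensors flow
# along the edges; tf.function traces the Python code into such a graph.
import tensorflow as tf

@tf.function
def graph(a, b):
    c = tf.matmul(a, b)      # node: matrix-multiply operation
    return tf.nn.relu(c)     # node: ReLU; the edges carry the tensors

a = tf.constant([[1.0, -2.0], [3.0, 4.0]])   # rank-2 tensor = 2-D array
b = tf.constant([[1.0], [2.0]])
print(graph(a, b))           # result of the flow through the graph
```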

Deep learning libraries

Object detection

Summary Residual nets can be trained to a depth of 200 layers. Deep networks naturally integrate low-, mid-, and high-level features and classifiers in an end-to-end, multi-layer fashion.