1
Introduction to Convolutional Neural Networks
2
Acknowledgments
This course is heavily based on LeCun's, Ng's, and Bengio's tutorials … and many other presentations, blogs, and papers.
3
Agenda
Course overview
Introduction to Deep Learning
Classical Computer Vision vs. Deep Learning
Basic CNN Architecture
Large-Scale Image Classification
How deep should ConvNets be?
Deep Learning Applications
4
Course overview
Introduction
  Intro to Deep Learning
  Network topology, layer definitions, forward propagation
  Caffe: getting started and MNIST
CNN Training
  Backward propagation
  Optimization for Deep Learning: SGD with momentum, rate adaptation, Adagrad and Nesterov
  Saddle point problem
  Caffe: CIFAR-10
5
Course overview-2
GPGPU programming with CUDA
Advanced CNN topics
  Regularization: Dropout, stochastic pooling
  Tricks of the trade: data augmentation
  ImageNet training
  Localization and detection with ConvNets: Overfeat, R-CNN, Spatial Pyramid Pooling with CNN
CPU parallelization and performance optimization
  OpenMP and BLAS/MKL
  VTune
6
Course overview-3
Seminars:
  Training of MNIST, CIFAR-10, and ImageNet
  Re-implementation of the convolutional layer
Projects:
  New layers and algorithms for Caffe
  New datasets
Additional topics:
  Unsupervised training with auto-encoders, Siamese networks, recurrent NN and LSTM
  Language processing and speech recognition with DL, reinforcement learning and games
7
Introduction to Deep Learning
8
Buzz…
9
Deep Learning – from Research to Technology
Deep Learning: a breakthrough in computer vision, speech recognition, and language processing
10
Classical Computer Vision Pipeline
11
Classical Computer Vision Pipeline.
CV experts:
  Select / develop features: SURF, HoG, SIFT, RIFT, …
  Add machine learning on top of this for multi-class recognition and train a classifier
Pipeline stages: feature extraction (SIFT, HoG, ...), then detection, classification, recognition
Classical CV feature definition is domain-specific and time-consuming.
12
Deep Learning-based Vision Pipeline.
DL experts: define the NN topology and train the NN
  Build features automatically based on training data
  Joint training of feature extraction and classification
Pipeline stages: deep NN, then classification
“The battle between SIFT/HOG vs Convolutional NN based features for recognition is over. CNN have won” Prof. Malik, Berkeley
13
Computer Vision + Deep Learning + Machine Learning
We want to combine Deep Learning + CV + ML:
  Use deep learning for feature extraction
  Use classical CV for region detection, pyramid pooling, etc.
  Use the best ML methods for multi-class recognition
Pipeline stages: deep NN, Spatial Pyramid Pooling, ML (AdaBoost, …)
The Deep Learning promise: train good features automatically, using the same method for different domains.
14
Deep Learning Basics
Deep Learning is a set of machine learning algorithms based on multi-layer networks.
Figure labels: INPUTS, HIDDEN NODES, OUTPUTS (CAT, DOG)
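The slide names three groups of nodes: inputs, hidden nodes, and outputs (CAT, DOG). As a rough illustration of how such a multi-layer network turns inputs into class probabilities, here is a minimal forward-pass sketch. The layer sizes (4 inputs, 3 hidden nodes), the sigmoid hidden activation, the softmax output, and the random weights are all assumptions made for the example; none of them come from the slide.

```python
import math
import random

LABELS = ["CAT", "DOG"]  # output classes named on the slide


def dense(inputs, weights, biases):
    """Fully connected layer: out[j] = sum_i weights[j][i] * inputs[i] + biases[j]."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def sigmoid(values):
    """Element-wise logistic non-linearity (an assumed choice for the hidden layer)."""
    return [1.0 / (1.0 + math.exp(-v)) for v in values]


def softmax(values):
    """Turn raw scores into probabilities that sum to 1."""
    shift = max(values)  # subtract the max for numerical stability
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def random_params(n_in, n_hidden, n_out, seed=0):
    """Hypothetical helper: untrained random weights, for illustration only."""
    rng = random.Random(seed)

    def matrix(rows, cols):
        return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

    return {
        "w_hidden": matrix(n_hidden, n_in), "b_hidden": [0.0] * n_hidden,
        "w_out": matrix(n_out, n_hidden), "b_out": [0.0] * n_out,
    }


def forward(inputs, params):
    """Forward propagation: inputs -> hidden nodes -> output probabilities."""
    hidden = sigmoid(dense(inputs, params["w_hidden"], params["b_hidden"]))
    scores = dense(hidden, params["w_out"], params["b_out"])
    return softmax(scores)


params = random_params(n_in=4, n_hidden=3, n_out=len(LABELS))
probs = forward([0.2, 0.7, 0.1, 0.9], params)
prediction = LABELS[probs.index(max(probs))]
print(dict(zip(LABELS, probs)), "->", prediction)
```

With untrained weights the prediction is arbitrary; training would adjust the weights so that the probabilities match labeled examples.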
15
Deep Learning Basics
Deep Learning is a set of machine learning algorithms based on multi-layer networks.
Figure labels: CAT, DOG, Training
16
Deep Learning Basics
Deep Learning is a set of machine learning algorithms based on multi-layer networks.
Figure labels: CAT, DOG
17
Deep Learning Basics
Deep Learning is a set of machine learning algorithms based on multi-layer networks.
Figure labels: CAT, DOG
18
Deep Learning Taxonomy
Supervised:
  Convolutional NN (LeCun)
  Recurrent NN (Schmidhuber)
Unsupervised:
  Deep Belief Nets / Stacked RBMs (Hinton)
  Autoencoders (Bengio, LeCun, A. Ng)
19
Convolutional Networks
20
Convolutional NN
Modern Convolutional Neural Networks are an extension of the traditional Multi-layer Perceptron, based on 3 basic ideas:
  Local receptive fields with shared weights (“convolutional filter”)
  Spatial / temporal sub-sampling (“pooling”)
  New type of non-linear activation function: ReLU
LeCun paper (1998) on text recognition:
21
What is Convolutional NN ?
CNN: a multi-layer NN architecture
  Convolutional + non-linear layer
  Sub-sampling layer
  Convolutional + non-linear layer
  Fully connected layers
Supervised
Figure labels: Feature Extraction, Classification
22
What is Convolutional NN ?
Figure labels: 2x2, Convolution + NL, Sub-sampling, Convolution + NL
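The three stages named on these slides (convolution, non-linearity, sub-sampling) can be sketched in a few lines of plain Python. This is only an illustrative sketch under stated assumptions: a single-channel image, one hand-picked 2x2 filter, "valid" borders, ReLU as the non-linearity, and max pooling as the sub-sampling step. As in most CNN libraries, the "convolution" here is computed without flipping the kernel (cross-correlation).

```python
def conv2d_valid(image, kernel):
    """Slide one shared kernel over the image; each output sees a local receptive field."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]


def relu(feature_map):
    """Non-linear layer: keep positive responses, zero out the rest."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def max_pool(feature_map, size=2):
    """Sub-sampling: keep the largest value in each non-overlapping size x size block."""
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    return [[max(feature_map[r * size + i][c * size + j]
                 for i in range(size) for j in range(size))
             for c in range(out_w)]
            for r in range(out_h)]


# 5x5 toy image: dark on the left (0.0), bright on the right (1.0).
image = [[0.0, 0.0, 1.0, 1.0, 1.0] for _ in range(5)]

# Hand-picked 2x2 filter that responds to a dark-to-bright vertical edge.
kernel = [[-1.0, 1.0],
          [-1.0, 1.0]]

features = relu(conv2d_valid(image, kernel))  # 4x4 feature map
pooled = max_pool(features, size=2)           # 2x2 after sub-sampling
print(pooled)  # [[2.0, 0.0], [2.0, 0.0]]
```

The same four kernel weights are reused at every image position, which is what "shared weights" means; in a trained network these weights are learned rather than chosen by hand.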
23
CNN story: MNIST
LeNet-5 (1996): core of the NCR check reading system, used by US banks.
24
CNN story: ILSVRC
ImageNet database: 14 million labeled images, 20K categories
25
ILSVRC: Classification
26
ImageNet Classification 2012
27
ILSVRC 2012: top rankers
N | Error-5 | Algorithm                                     | Team               | Authors
1 | 0.153   | Deep Conv. Neural Network                     | Univ. of Toronto   | Krizhevsky et al.
2 | 0.262   | Features + Fisher Vectors + Linear classifier | ISI                | Gunji et al.
3 | 0.270   | Features + FV + SVM                           | OXFORD_VGG         | Simonyan et al.
4 | 0.271   | SIFT + FV + PQ + SVM                          | XRCE/INRIA         | Perronnin et al.
5 | 0.300   | Color desc. + SVM                             | Univ. of Amsterdam | van de Sande et al.
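The Error-5 column is the top-5 error: the fraction of test images whose true label is not among the model's five highest-scoring classes. A minimal sketch of that metric follows; the function name, the 7-class toy scores, and the labels are invented for illustration and are not ILSVRC data.

```python
def top_k_error(scores_per_image, true_labels, k=5):
    """Fraction of images whose true class is missing from the k best-scored classes."""
    misses = 0
    for scores, label in zip(scores_per_image, true_labels):
        # Class indices sorted from highest to lowest score.
        ranked = sorted(range(len(scores)), key=lambda c: scores[c], reverse=True)
        if label not in ranked[:k]:
            misses += 1
    return misses / len(true_labels)


# Toy example with 7 classes and 4 images (made-up scores).
rising = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7]   # top-5 classes: 6, 5, 4, 3, 2
falling = [0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1]  # top-5 classes: 0, 1, 2, 3, 4

scores = [rising, rising, falling, falling]
labels = [0, 6, 4, 5]  # images 0 and 3 are misses, images 1 and 2 are hits

error_5 = top_k_error(scores, labels, k=5)
print(error_5)  # 0.5
```

Read this way, an Error-5 of 0.153 means that for about 15% of test images the correct class was not in the top five guesses.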
28
ImageNet 2013: top rankers
N | Error-5 | Algorithm                          | Team               | Authors
1 | 0.117   | Deep Convolutional Neural Network  | Clarifai           | Zeiler
2 | 0.129   | Deep Convolutional Neural Networks | Nat.Univ Singapore | Min LIN
3 | 0.135   |                                    | NYU                | Fergus
4 |         |                                    |                    | Andrew Howard
5 | 0.137   |                                    | Overfeat           | Pierre Sermanet et al.
29
ImageNet Classification 2013
30
Conv Net Topology
5 convolutional layers
3 fully connected layers + soft-max
650K neurons, 60 million weights
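The "60 million weights" figure can be sanity-checked with simple arithmetic. The slide gives only the layer counts, so the per-layer shapes below are an assumption: they follow the network described in the 2012 paper by Krizhevsky et al. (including its two-GPU split, which halves the input channels seen by some filters). Treat this as a back-of-the-envelope sketch, not a specification of the network on the slide.

```python
# ASSUMPTION: layer shapes from the Krizhevsky et al. (2012) network, not from the slide.
# (name, number of filters, kernel height, kernel width, input channels per filter)
CONV_LAYERS = [
    ("conv1", 96, 11, 11, 3),
    ("conv2", 256, 5, 5, 48),    # each filter sees half of conv1's 96 maps
    ("conv3", 384, 3, 3, 256),
    ("conv4", 384, 3, 3, 192),   # each filter sees half of conv3's 384 maps
    ("conv5", 256, 3, 3, 192),
]

# (name, number of inputs, number of outputs)
FC_LAYERS = [
    ("fc6", 6 * 6 * 256, 4096),
    ("fc7", 4096, 4096),
    ("fc8", 4096, 1000),         # 1000 output classes feed the soft-max
]


def conv_params(filters, kernel_h, kernel_w, in_channels):
    """Weights are shared across positions, so the count ignores image size."""
    return filters * kernel_h * kernel_w * in_channels + filters  # + one bias per filter


def fc_params(n_in, n_out):
    """Every input connects to every output, plus one bias per output."""
    return n_in * n_out + n_out


conv_total = sum(conv_params(f, kh, kw, c) for _, f, kh, kw, c in CONV_LAYERS)
fc_total = sum(fc_params(n_in, n_out) for _, n_in, n_out in FC_LAYERS)
total = conv_total + fc_total

print(f"convolutional layers:   {conv_total:,}")
print(f"fully connected layers: {fc_total:,}")
print(f"total:                  {total:,}")
```

Under these assumed shapes the total comes to roughly 61 million parameters, consistent with the slide, and nearly all of them sit in the three fully connected layers because convolutional weights are shared across image positions.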
31
Why should ConvNets be deep?
Rob Fergus, NIPS 2013
32
Why should ConvNets be deep?
33
Why should ConvNets be deep?
34
Why should ConvNets be deep?
35
Why should ConvNets be deep?
36
Deep Learning Applications
37
Machine Learning Workflow
38
Traditional Machine Learning Carrier flow
39
Deep Learning Carrier Flow
Use pre-trained CNN for similar problem or re-train networks
40
CNN applications
CNN is a big hammer
Plenty of low-hanging fruit
You just need the right nail!
41
Conv NN: Detection Sermanet, CVPR 2014
42
Conv NN: Scene parsing Farabet, PAMI 2013
43
CNN: indoor semantic labeling RGBD
Farabet, 2013
44
Conv NN: Action Detection
Taylor, ECCV 2010
45
Conv NN: Image Processing
Eigen, ICCV 2010
46
Baidu Deep Speech: Scaling up end-to-end speech recognition
ASR system developed using end-to-end (e2e) deep learning. The Baidu system is significantly simpler than traditional systems, which rely on laboriously engineered processing pipelines. Deep Speech does not need hand-designed components to model background noise, speaker variation, etc.; instead, it learns them directly.
47
RNN-based Language Models
48
Playing games
Existing Go-playing computer programs are still not competitive with Go professionals on 19×19 boards.