ECE 599/692 – Deep Learning Lecture 5 – CNN: The Representative Power

Hairong Qi, Gonzalez Family Professor
Electrical Engineering and Computer Science
University of Tennessee, Knoxville
http://www.eecs.utk.edu/faculty/qi
Email: hqi@utk.edu
AICIP stands for Advanced Imaging and Collaborative Information Processing.

Outline
Lecture 3: Core ideas of CNN
- Receptive field
- Pooling
- Shared weights
- Derivation of BP in CNN
Lecture 4: Practical issues
- The learning slowdown problem: quadratic cost function, cross-entropy + sigmoid, log-likelihood + softmax
- Overfitting and regularization: L2 vs. L1 regularization, dropout, artificially expanding the training set
- Weight initialization
- How to choose hyper-parameters: learning rate, early stopping, learning schedule, regularization parameter, mini-batch size; grid search
- Others: momentum-based GD
Lecture 5: The representative power of NN
Lecture 6: Variants of CNN: from LeNet to AlexNet to GoogLeNet to VGG to ResNet
Lecture 7: Implementation
Lecture 8: Applications of CNN

The universality theorem
A neural network with a single hidden layer can approximate any continuous function (on a compact input domain) to any desired precision.
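As a quick empirical check (an illustration added here, not from the slides), a single hidden layer can be fit closely to a smooth 1-D target such as sin(x). A minimal sketch using scikit-learn's MLPRegressor; the width, activation, and solver below are illustrative choices rather than lecture values:

import numpy as np
from sklearn.neural_network import MLPRegressor

# Target: a continuous function on a compact interval
X = np.linspace(-np.pi, np.pi, 400).reshape(-1, 1)
y = np.sin(X).ravel()

# A single hidden layer of 50 tanh units is typically enough for a close fit
net = MLPRegressor(hidden_layer_sizes=(50,), activation='tanh',
                   solver='lbfgs', max_iter=5000, random_state=0)
net.fit(X, y)
print("max abs error:", np.max(np.abs(net.predict(X) - y)))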

Visual proof
One input, one hidden layer:
- Weight selection (first layer) and the step function
- Bias selection and the location of the step
- Weight selection (second layer) and the rectangular function ("bump")
Two inputs, two hidden layers:
- From "bump" to "tower"
Accumulating the "bumps" or "towers" (see the numerical sketch below)
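To make the weight and bias choices concrete, here is a minimal numpy sketch (added as an illustration, not from the slides): a sigmoid neuron with a very large input weight approximates a step located at s = -b/w, and the difference of two such steps, weighted +h and -h in the second layer, gives a rectangular "bump" of height h:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.linspace(0, 1, 1000)
w = 1000.0  # large first-layer weight -> near-step behavior

def step(x, s):
    # bias b = -w * s places the step at x = s
    return sigmoid(w * x - w * s)

def bump(x, s1, s2, h):
    # second-layer weights +h and -h on two hidden neurons
    return h * step(x, s1) - h * step(x, s2)

y = bump(x, 0.3, 0.6, 0.8)
print(y[450], y[900])  # roughly 0.8 inside [0.3, 0.6], roughly 0 outside

Summing many such bumps at different locations approximates an arbitrary continuous 1-D function; the two-input "tower" construction extends the same idea to higher dimensions.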

Beyond the sigmoid neuron
The visual proof only requires the activation function to approach well-defined, distinct limits as z goes to positive and negative infinity, so that it can be scaled into a step.
What about ReLU? What about a linear neuron?
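For the linear neuron the limitation is immediate (a standard observation, added here for completeness): composing linear layers gives another affine map,

W_2 (W_1 x + b_1) + b_2 = (W_2 W_1) x + (W_2 b_1 + b_2),

so extra layers add no representational power and non-linear targets cannot be approximated. A ReLU, by contrast, is unbounded as z goes to positive infinity, so it cannot be rescaled into the step functions used in the visual proof, although ReLU networks are still universal approximators by other arguments.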

Why deep networks?
If a network with at most two hidden layers can compute any function, why use more layers, i.e., deep networks?
Shallow networks can require exponentially more elements (neurons) than deep networks to compute the same functions.
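A classic circuit example (a standard illustration, not on the slide) makes the tradeoff concrete: the parity of n bits can be computed by a tree of n - 1 two-input XOR gates of depth ceil(log2 n), whereas any two-layer OR-of-ANDs expression for parity needs 2^(n-1) product terms, because every odd-parity input must be covered by its own full-width term. Depth buys an exponential saving in size, and the same qualitative tradeoff motivates deep networks.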

Why are deep networks hard to train?
The unstable gradient problem:
- Vanishing gradients
- Exploding gradients
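A worked sketch of where the instability comes from, following the one-neuron-per-layer chain analyzed in Nielsen's Chapter 5: with weights w_2, w_3, w_4 and weighted inputs z_j, the gradient of the cost with respect to the first bias is

∂C/∂b_1 = σ'(z_1) · w_2 σ'(z_2) · w_3 σ'(z_3) · w_4 σ'(z_4) · ∂C/∂a_4.

Because σ'(z) ≤ 1/4 and typical initial weights satisfy |w_j| < 1, each factor |w_j σ'(z_j)| is below 1/4, so the product, and hence the gradient, shrinks exponentially with depth (vanishing gradients); if the weights are large instead, the factors can exceed 1 and the gradient grows exponentially (exploding gradients).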

Acknowledgement
All figures in this presentation are based on Michael Nielsen's book Neural Networks and Deep Learning, Chapters 4 and 5.