ECE 599/692 – Deep Learning
Lecture 5 – CNN: The Representative Power
Hairong Qi, Gonzalez Family Professor
Electrical Engineering and Computer Science
University of Tennessee, Knoxville
http://www.eecs.utk.edu/faculty/qi
Email: hqi@utk.edu
AICIP stands for Advanced Imaging and Collaborative Information Processing
Outline
  Lecture 3: Core ideas of CNN
    Receptive field
    Pooling
    Shared weight
    Derivation of BP in CNN
  Lecture 4: Practical issues
    The learning slowdown problem
      Quadratic cost function
      Cross-entropy + sigmoid
      Log-likelihood + softmax
    Overfitting and regularization
      L2 vs. L1 regularization
      Dropout
      Artificially expanding the training set
    Weight initialization
    How to choose hyper-parameters
      Learning rate, early stopping, learning schedule, regularization parameter, mini-batch size, grid search
    Others
      Momentum-based GD
  Lecture 5: The representative power of NN
  Lecture 6: Variants of CNN
    From LeNet to AlexNet to GoogLeNet to VGG to ResNet
  Lecture 7: Implementation
  Lecture 8: Applications of CNN
The universality theorem
  Neural networks with a single hidden layer can approximate any continuous function to any desired precision
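A formal statement in the spirit of Cybenko (1989) and Hornik (1991); the precise function class and notation below are the standard textbook ones, not taken from the slides:

For every continuous function $f : [0,1]^n \to \mathbb{R}$ and every $\varepsilon > 0$, there exist a hidden-layer size $m$, input weights $w_j \in \mathbb{R}^n$, biases $b_j \in \mathbb{R}$, and output weights $v_j \in \mathbb{R}$ such that
$$F(x) \;=\; \sum_{j=1}^{m} v_j \,\sigma(w_j \cdot x + b_j) \qquad \text{satisfies} \qquad \max_{x \in [0,1]^n} \bigl|F(x) - f(x)\bigr| \;<\; \varepsilon,$$
where $\sigma$ is the sigmoid; versions of the theorem cover broad classes of non-polynomial activations.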
Visual proof
  One input and one hidden layer
    Weight selection (first layer) and the step function
    Bias selection and the location of the step function
    Weight selection (second layer) and the rectangular function (“bump”)
  Two inputs and two hidden layers
    From “bump” to “tower”
  Accumulating the “bumps” or “towers”
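A minimal NumPy sketch of the one-input construction (the target function, grid, and the weight value 500 are illustrative choices, not from the slides): a sigmoid with a large input weight behaves like a step located at s = -b/w, two subtracted steps give a bump, and a sum of bumps approximates the target.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def step(x, s, w=500.0):
    # A sigmoid with a large input weight w approximates a step at x = s.
    # The step location is s = -b / w, i.e. the bias is b = -w * s.
    return sigmoid(w * (x - s))

def bump(x, s1, s2, h):
    # Subtracting two steps gives a rectangular "bump" of height h on [s1, s2].
    return h * (step(x, s1) - step(x, s2))

x = np.linspace(0.0, 1.0, 1001)
f = np.sin(2 * np.pi * x)                  # illustrative target function

edges = np.linspace(0.0, 1.0, 21)          # 20 sub-intervals
approx = sum(bump(x, a, b, np.sin(2 * np.pi * (a + b) / 2))
             for a, b in zip(edges[:-1], edges[1:]))

# The error shrinks as more, narrower bumps are used.
print("max |f - approx| =", np.max(np.abs(f - approx)))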
Beyond the sigmoid neuron
  The construction needs an activation function with well-defined limits as z goes to both positive and negative infinity
  What about ReLU?
  What about linear neurons?
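Two quick NumPy checks (illustrative, not from the slides): a stack of purely linear neurons collapses to one linear map, so linear activations add no representational power, while subtracting two shifted ReLUs still yields a bounded ramp that can play the role of the step in the visual proof.

import numpy as np
rng = np.random.default_rng(0)

# 1) Linear neurons: two linear layers equal one linear layer.
W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=4)
W2, b2 = rng.normal(size=(2, 4)), rng.normal(size=2)
x = rng.normal(size=3)
two_layers = W2 @ (W1 @ x + b1) + b2
one_layer  = (W2 @ W1) @ x + (W2 @ b1 + b2)
print(np.allclose(two_layers, one_layer))   # True: no gain from depth without nonlinearity

# 2) ReLU: the difference of two shifted ReLUs is a bounded ramp (a usable "step").
relu = lambda z: np.maximum(z, 0.0)
t = np.linspace(-1.0, 2.0, 7)
ramp = relu(t) - relu(t - 1.0)              # 0 for t <= 0, t on [0, 1], 1 for t >= 1
print(ramp)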
Why deep networks?
  If one or two hidden layers can compute any function, why use many layers or deep networks?
  For some functions, shallow networks require exponentially more units than deep networks do to compute them
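A standard illustration of this depth/width trade-off, in the spirit of Telgarsky's construction (the specific depth and grid below are illustrative assumptions, not from the slides): composing a two-unit ReLU "tent" layer k times produces a sawtooth with 2^k linear pieces using only 2k hidden units, while a single-hidden-layer ReLU network needs at least 2^k - 1 hidden units to represent that sawtooth exactly, since each unit contributes at most one breakpoint.

import numpy as np

def tent(x):
    # Two ReLU units implement the tent map: 2x on [0, 0.5], 2(1 - x) on [0.5, 1].
    relu = lambda z: np.maximum(z, 0.0)
    return 2.0 * relu(x) - 4.0 * relu(x - 0.5)

x = np.linspace(0.0, 1.0, 100001)
y = x.copy()
for k in range(1, 6):
    y = tent(y)                      # a depth-k network with only 2 units per layer
    # Count the linear pieces by counting slope-sign changes of the piecewise-linear output.
    slopes = np.sign(np.diff(y))
    pieces = 1 + np.count_nonzero(slopes[1:] != slopes[:-1])
    print(f"depth {k}: {pieces} linear pieces using {2 * k} hidden units")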
Why are deep networks hard to train?
  The unstable gradient problem
    Vanishing gradients
    Exploding gradients
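A minimal NumPy sketch of the vanishing-gradient effect (the architecture, initialization, and depth are illustrative assumptions): backpropagation multiplies each layer's error by the transposed weights and by sigma'(z), and since sigma'(z) <= 0.25 the gradient norm typically shrinks layer by layer toward the input.

import numpy as np
rng = np.random.default_rng(0)

sigmoid  = lambda z: 1.0 / (1.0 + np.exp(-z))
dsigmoid = lambda z: sigmoid(z) * (1.0 - sigmoid(z))   # <= 0.25 everywhere

layers, width = 10, 30
Ws = [rng.normal(0.0, 1.0 / np.sqrt(width), size=(width, width)) for _ in range(layers)]
bs = [np.zeros(width) for _ in range(layers)]

# Forward pass, saving pre-activations z for the backward pass.
a, zs = rng.normal(size=width), []
for W, b in zip(Ws, bs):
    z = W @ a + b
    zs.append(z)
    a = sigmoid(z)

# Backward pass with an arbitrary unit error at the output:
# delta_l = (W_{l+1}^T delta_{l+1}) * sigma'(z_l)
delta = np.ones(width) * dsigmoid(zs[-1])
print(f"layer {layers}: |delta| = {np.linalg.norm(delta):.3e}")
for l in range(layers - 2, -1, -1):
    delta = (Ws[l + 1].T @ delta) * dsigmoid(zs[l])
    print(f"layer {l + 1}: |delta| = {np.linalg.norm(delta):.3e}")
# The norms typically shrink by roughly a factor of 4 or more per layer: vanishing gradients.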
Acknowledgement
  All figures in this presentation are based on Michael Nielsen's Neural Networks and Deep Learning book, Chapters 4 and 5.