ECE 6504: Deep Learning for Perception
Dhruv Batra, Virginia Tech
Topics: Neural Networks, Backprop, Modular Design


ECE 6504: Deep Learning for Perception
Dhruv Batra
Virginia Tech
Topics:
–Neural Networks
–Backprop
–Modular Design

Administrativia
Scholar
–Anybody not have access?
–Please post questions on the Scholar forum.
–Please check the forum regularly; someone else's question may answer one you didn't know you had.
Sign up for Presentations
–https://docs.google.com/spreadsheets/d/1m76E4mC0wfRjc4HRBWFdAlXKPIzlEwfw1-u7rBw9TJ8/edit#gid=
(C) Dhruv Batra

Plan for Today
–Notation + Setup
–Neural Networks
–Chain Rule + Backprop

Supervised Learning
Input: x (images, text, …)
Output: y (spam or non-spam, …)
(Unknown) Target Function
–f: X → Y (the "true" mapping / reality)
Data
–(x1, y1), (x2, y2), …, (xN, yN)
Model / Hypothesis Class
–g: X → Y
–y = g(x) = sign(wᵀx)
Learning = Search in hypothesis space
–Find the best g in the model class.
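The hypothesis on the slide, y = g(x) = sign(wᵀx), is just a dot product followed by a threshold. A minimal sketch in pure Python (the weight vector and inputs below are hypothetical, chosen only for illustration):

```python
def sign(z):
    # Threshold: +1 for non-negative scores, -1 otherwise
    return 1 if z >= 0 else -1

def g(w, x):
    # Linear hypothesis: y = g(x) = sign(w^T x)
    return sign(sum(wi * xi for wi, xi in zip(w, x)))

# Hypothetical 2-D example
w = [1.0, -2.0]
print(g(w, [3.0, 1.0]))   # score = 3 - 2 = 1  -> +1
print(g(w, [1.0, 2.0]))   # score = 1 - 4 = -3 -> -1
```

Learning then amounts to searching the space of weight vectors w for the one whose g best matches the data.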

Basic Steps of Supervised Learning
Set up a supervised learning problem
Data collection
–Start with training data for which we know the correct outcome, provided by a teacher or oracle.
Representation
–Choose how to represent the data.
Modeling
–Choose a hypothesis class: H = {g: X → Y}
Learning/Estimation
–Find the best hypothesis you can in the chosen class.
Model Selection
–Try different models. Pick the best one. (More on this later)
If happy, stop
–Else refine one or more of the above.
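The model-selection step above can be sketched as scoring each candidate hypothesis on held-out data and keeping the best. The candidates and validation set below are hypothetical stand-ins for trained models:

```python
def evaluate(g, data):
    # Fraction of examples the hypothesis classifies correctly
    return sum(1 for x, y in data if g(x) == y) / len(data)

# Hypothetical candidates: always +1, always -1, and a sign rule
candidates = [lambda x: 1, lambda x: -1, lambda x: 1 if x >= 0 else -1]
val_data = [(-2, -1), (-1, -1), (1, 1), (2, 1)]

# Model selection: pick the candidate that does best on held-out data
best = max(candidates, key=lambda g: evaluate(g, val_data))
print(evaluate(best, val_data))  # 1.0
```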

Error Decomposition
[Figure: the gap between reality and the model class decomposes into modeling error, estimation error, and optimization error]


Error Decomposition
[Figure: the same decomposition, with "Higher-Order Potentials" marking a richer model class]

Biological Neuron

Recall: The Neuron Metaphor
Neurons accept information from multiple inputs and transmit it to other neurons.
Artificial neuron:
–Multiply inputs by weights along edges
–Apply some function to the weighted inputs at each node
Image Credit: Andrej Karpathy, CS231n
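The two steps above (weight the inputs, apply a function) can be sketched as a single artificial neuron. The weights, bias, and sigmoid choice here are hypothetical, just to make the pattern concrete:

```python
import math

def neuron(w, x, b):
    # Step 1: weighted sum of inputs plus a bias term
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    # Step 2: apply a nonlinearity (sigmoid here)
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical example: the weighted sum cancels to 0, so sigmoid gives 0.5
out = neuron([0.5, -1.0], [2.0, 1.0], 0.0)
print(out)  # 0.5
```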

Types of Neurons
–Linear Neuron
–Logistic Neuron
–Perceptron
–Potentially more. These pair with a convex loss function, so gradient-descent training finds a global optimum.
Slide Credit: HKUST

Activation Functions: sigmoid vs. tanh

A quick note
Image Credit: LeCun et al. '98

Rectified Linear Units (ReLU)
[Krizhevsky et al., NIPS12]
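The three activations from the last two slides can be written down directly. A minimal sketch using only the standard library:

```python
import math

def sigmoid(z):
    # Squashes any real input into (0, 1)
    return 1.0 / (1.0 + math.exp(-z))

def tanh(z):
    # Squashes into (-1, 1); zero-centered, unlike sigmoid
    return math.tanh(z)

def relu(z):
    # Rectified Linear Unit: identity for positive inputs, zero otherwise
    return max(0.0, z)

for z in (-2.0, 0.0, 2.0):
    print(z, sigmoid(z), tanh(z), relu(z))
```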

Limitation
A single "neuron" still gives only a linear decision boundary.
What to do?
Idea: Stack a bunch of them together!

Multilayer Networks
–Cascade neurons together
–The output from one layer is the input to the next
–Each layer has its own set of weights
Image Credit: Andrej Karpathy, CS231n
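Stacking fixes the linear-boundary limitation. The classic example is XOR, which no single linear neuron can compute but a two-layer network can. The weights below are hand-picked for the construction, not learned:

```python
def step(z):
    # Hard-threshold activation
    return 1 if z > 0 else 0

def xor_net(x1, x2):
    # Hidden layer: each unit has its own weights
    h1 = step(x1 + x2 - 0.5)   # fires if at least one input is 1 (OR)
    h2 = step(x1 + x2 - 1.5)   # fires only if both inputs are 1 (AND)
    # Output layer combines the hidden activations: OR but not AND = XOR
    return step(h1 - h2 - 0.5)

for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_net(a, b))
```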

Universal Function Approximators
Theorem
–A 3-layer network with linear outputs can uniformly approximate any continuous function to arbitrary accuracy, given enough hidden units [Funahashi '89]

Neural Networks Demo
–http://neuron.eng.wayne.edu/bpFunctionApprox/bpFunctionApprox.html

Key Computation: Forward-Prop
Slide Credit: Marc'Aurelio Ranzato, Yann LeCun
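In the modular view this lecture builds toward, forward-prop is just chaining layers: each module's output is the next module's input. A sketch (class and method names are my own, not from the slides):

```python
import math

class Linear:
    def __init__(self, w, b):
        # w: one weight row per output unit; b: one bias per output unit
        self.w, self.b = w, b
    def forward(self, x):
        return [sum(wi * xi for wi, xi in zip(row, x)) + bi
                for row, bi in zip(self.w, self.b)]

class Sigmoid:
    def forward(self, x):
        return [1.0 / (1.0 + math.exp(-xi)) for xi in x]

def forward_prop(layers, x):
    # Each layer consumes the previous layer's output
    for layer in layers:
        x = layer.forward(x)
    return x

# Hypothetical 2-input, 1-output network
net = [Linear([[1.0, -1.0]], [0.0]), Sigmoid()]
print(forward_prop(net, [2.0, 2.0]))  # weighted sum is 0 -> [0.5]
```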

Key Computation: Back-Prop
Slide Credit: Marc'Aurelio Ranzato, Yann LeCun
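Back-prop runs the chain rule through the same modules in reverse: each module caches what it saw on the forward pass, then multiplies the upstream gradient by its local derivative. A scalar sketch (module choices here are illustrative, not from the slides):

```python
import math

class Square:
    def forward(self, x):
        self.x = x
        return x * x
    def backward(self, grad_out):
        # Chain rule: dL/dx = dL/dout * dout/dx, with dout/dx = 2x
        return grad_out * 2 * self.x

class Sigmoid:
    def forward(self, x):
        self.out = 1.0 / (1.0 + math.exp(-x))
        return self.out
    def backward(self, grad_out):
        # Local derivative of sigmoid: out * (1 - out)
        return grad_out * self.out * (1.0 - self.out)

layers = [Square(), Sigmoid()]
x = 1.0
for layer in layers:            # forward pass, caching intermediates
    x = layer.forward(x)
grad = 1.0
for layer in reversed(layers):  # backward pass, applying the chain rule
    grad = layer.backward(grad)
print(grad)  # d/dx sigmoid(x^2) at x = 1
```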


Visualizing Loss Functions
–The total loss is a sum of individual per-example losses
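Written out, the slide's point is that the training loss is a sum (or average) of per-example losses, so its gradient is the corresponding sum of per-example gradients (notation follows the supervised-learning setup earlier in the deck):

```latex
L(\mathbf{w}) = \frac{1}{N}\sum_{i=1}^{N} \ell\big(y_i,\, g(\mathbf{x}_i; \mathbf{w})\big),
\qquad
\nabla_{\mathbf{w}} L(\mathbf{w}) = \frac{1}{N}\sum_{i=1}^{N} \nabla_{\mathbf{w}}\, \ell\big(y_i,\, g(\mathbf{x}_i; \mathbf{w})\big)
```

This additive structure is what makes stochastic/minibatch gradient estimates possible: a subset of terms gives an unbiased estimate of the full gradient.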

Detour

Logistic Regression as a Cascade
Slide Credit: Marc'Aurelio Ranzato, Yann LeCun
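The cascade view decomposes logistic regression into three modules: a linear score, a sigmoid, and a log loss. Forward-prop chains them; back-prop collapses, via the chain rule, to the well-known gradient dloss/dz = p - y. A sketch with hypothetical weights and data:

```python
import math

def forward(w, x, y):
    # Cascade: linear module -> sigmoid module -> loss module
    z = sum(wi * xi for wi, xi in zip(w, x))                # z = w^T x
    p = 1.0 / (1.0 + math.exp(-z))                          # p = sigmoid(z)
    loss = -math.log(p) if y == 1 else -math.log(1.0 - p)   # -log p(correct)
    return z, p, loss

def backward(x, y, p):
    # Chain rule through sigmoid + log loss collapses to dloss/dz = p - y
    dz = p - y
    return [dz * xi for xi in x]                            # dloss/dw

w, x, y = [0.5, -0.5], [1.0, 1.0], 1
z, p, loss = forward(w, x, y)   # z = 0 -> p = 0.5
grad = backward(x, y, p)        # dz = -0.5 -> grad = [-0.5, -0.5]
print(loss, grad)
```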

Forward Propagation
–On board