
CS 621 Artificial Intelligence
Lecture 24 - 11/10/05
Prof. Pushpak Bhattacharyya, IIT Bombay
Feedforward Nets

Perceptron
- A perceptron cannot handle non-linearly separable data.
- Real-life problems are typically non-linear.

Basic Computing Paradigm
The perceptron works by setting up hyperplanes. To go beyond linear separability, one can:
- Use higher-order (higher power) surfaces
- Tolerate error
- Use multiple perceptrons

A quadratic surface can separate such data, but it is difficult to train.

Pocket Algorithm
- Evolved in 1985; essentially uses the PTA (Perceptron Training Algorithm).
- Basic idea: always preserve the best weights obtained so far in the "pocket".
- Change the pocketed weights only if better ones are found (i.e., the changed weights result in reduced error).
- Tolerates error.
- Used in connectionist expert systems.
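A minimal Python sketch of the pocket idea (the data layout, function name, and epoch count are illustrative assumptions, not from the lecture):

```python
import numpy as np

def pocket_train(X, y, epochs=100, seed=0):
    """Perceptron training with a 'pocket': keep the best weights seen so far.

    X: (n_samples, n_features) with a bias column already appended; y: labels in {0, 1}.
    """
    rng = np.random.default_rng(seed)
    w = rng.normal(size=X.shape[1])            # current weights
    pocket_w, pocket_err = w.copy(), np.inf    # best-so-far weights and their error
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            pred = 1 if xi @ w > 0 else 0
            w = w + (yi - pred) * xi           # standard PTA update
        err = np.sum((X @ w > 0).astype(int) != y)
        if err < pocket_err:                   # better weights found: pocket them
            pocket_w, pocket_err = w.copy(), err
    return pocket_w
```

On non-separable data the current weights keep oscillating, but the pocketed weights only ever improve, which is what lets the algorithm tolerate error.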

Multilayer Feedforward Network
Geometrically: [figure: inputs x1, x2 feed hidden neurons h1, h2, whose outputs feed the output neuron y]

Algebraically: Linearization
X1 XOR X2 = X1·X2' + X1'·X2
          = OR(AND(X1, X2'), AND(X1', X2))
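A quick truth-table verification of this identity (a small illustrative sketch):

```python
# Check X1 XOR X2 == OR(AND(X1, NOT X2), AND(NOT X1, X2)) for all Boolean inputs.
for x1 in (0, 1):
    for x2 in (0, 1):
        lhs = x1 ^ x2
        rhs = int((x1 and not x2) or (not x1 and x2))
        assert lhs == rhs
        print(x1, x2, lhs)
```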

Example
[figure: a three-layer network with input layer neurons x1, x2, hidden layer neurons, and an output layer neuron y; thresholds 0.5, 1, 1.5 are marked on the units]
The hidden and output neurons (1 & 3 in the figure) are also called computation neurons.

Hidden Layer Neurons
- They contribute to the power of the network.
- How many hidden layers? How many neurons per layer?
- Pure feedforward network: no jumping of connections (no layer is skipped).

XOR Example
[figure: output neuron with threshold 0.5 and weights w1 = 1, w2 = 1 over two hidden threshold neurons computing x1x2' and x1'x2; each hidden neuron receives inputs x1, x2 with weights 1.5 and -1]
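Below is a runnable sketch of this two-layer threshold network; the hidden-unit thresholds (1.0) are an assumption chosen so that the AND-NOT gates fire correctly, and may differ from the figure:

```python
def step(v, theta):
    """Threshold unit: fires (outputs 1) iff net input v exceeds threshold theta."""
    return 1 if v > theta else 0

def xor_net(x1, x2):
    # Hidden layer: h1 computes x1 AND NOT x2, h2 computes (NOT x1) AND x2.
    h1 = step(1.5 * x1 - 1.0 * x2, 1.0)
    h2 = step(-1.0 * x1 + 1.5 * x2, 1.0)
    # Output layer: OR of the hidden units (w1 = w2 = 1, threshold 0.5).
    return step(1.0 * h1 + 1.0 * h2, 0.5)

for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_net(a, b))  # prints the XOR truth table
```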

Constraints
Constraints on neurons in multilayer perceptrons:
- The computation neurons must be non-linear.
- Non-linearity is the source of computing power.

Explanation
Suppose all the neurons were linear:
y = m1(w1·h1 + w2·h2) + c1
h1 = m2(w3·x1 + w4·x2) + c2
h2 = m3(w5·x1 + w6·x2) + c3
Substituting h1 and h2:
y = k1·x1 + k2·x2 + c'
i.e., the whole network collapses to a single linear neuron.
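A small numerical check of this collapse using numpy (the specific weights are random and purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
W1, b1 = rng.normal(size=(2, 2)), rng.normal(size=2)  # hidden layer (linear)
w2, c1 = rng.normal(size=2), rng.normal()             # output neuron (linear)

x = rng.normal(size=2)
y_two_layer = w2 @ (W1 @ x + b1) + c1                 # two linear layers
k = w2 @ W1                                           # collapsed weights k1, k2
c_prime = w2 @ b1 + c1                                # collapsed bias c'
y_one_layer = k @ x + c_prime                         # single linear neuron
assert np.isclose(y_two_layer, y_one_layer)
```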

Explanation (Contd.)
For a linear characteristic y = mx + c, fix two levels yU > yL:
- y > yU is regarded as y = 1
- y < yL is regarded as y = 0

Linear Neuron
Can a linear neuron compute XOR?
We want the characteristic y = w1·x1 + w2·x2 + c.

Linear Neuron (Contd. 1)
For (1,1) and (0,0): y < yL
For (0,1) and (1,0): y > yU
with yU > yL.
Can such (w1, w2, c) be found?

Linear Neuron (Contd. 2)
(0,0): y = w1·0 + w2·0 + c = c, and y < yL gives
  c < yL                    (1)
(1,0): y = w1·1 + w2·0 + c, and y > yU gives
  w1 + c > yU               (2)

Linear Neuron (Contd. 3)
(0,1): w2 + c > yU          (3)
(1,1): w1 + w2 + c < yL     (4)
and yU > yL                 (5)

Linear Neuron (Contd. 4)
c < yL                      (1)
w1 + c > yU                 (2)
w2 + c > yU                 (3)
w1 + w2 + c < yL            (4)
yU > yL                     (5)
Adding (2) and (3): w1 + w2 + 2c > 2yU. Adding (1) and (4): w1 + w2 + 2c < 2yL.
Together these give yU < yL, contradicting (5). The system is inconsistent.
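The inconsistency can also be checked mechanically. Here is an illustrative sketch using scipy's linear programming as a feasibility test (strict inequalities are tightened by a small margin eps; this is my encoding, not from the slides):

```python
from scipy.optimize import linprog

# Variables: v = (w1, w2, c, yL, yU). Each constraint is rewritten as A @ v <= -eps.
eps = 1e-6
A = [
    [0, 0, 1, -1, 0],    # (1)  c - yL <= -eps
    [-1, 0, -1, 0, 1],   # (2)  yU - w1 - c <= -eps
    [0, -1, -1, 0, 1],   # (3)  yU - w2 - c <= -eps
    [1, 1, 1, -1, 0],    # (4)  w1 + w2 + c - yL <= -eps
    [0, 0, 0, 1, -1],    # (5)  yL - yU <= -eps
]
b = [-eps] * 5
res = linprog(c=[0, 0, 0, 0, 0], A_ub=A, b_ub=b, bounds=[(None, None)] * 5)
print(res.status)  # 2 => infeasible: no (w1, w2, c, yL, yU) satisfies (1)-(5)
```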

Observations
- A linear neuron cannot compute XOR.
- A multilayer network of linear-characteristic neurons is collapsible to a single linear neuron; therefore adding layers does not contribute to computing power.
- Neurons in a feedforward network must be non-linear.
- Threshold elements will do iff we can linearize the non-linearly separable function (as was done for XOR).

Linearization
- Linearization is not in general possible: we need to know the function in closed form.
- The space of candidate functions is very large, even for Boolean data.

Training Algorithm
- Looks at the pre-classified data.
- Arrives at the weight values.

Why won't PTA do?
Since we do not know the desired outputs at the hidden layer neurons, PTA cannot be applied. So we apply a training method called GRADIENT DESCENT.

Minima
[figure: the error E plotted against the parameters (w1, w2, ...), showing the minima of the error surface]

Gradient Descent
Movement towards a minimum of the error is ensured by gradient descent:
  Δwmn ∝ −∂E/∂wmn
where E is the error and wmn is a parameter (weight).
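A minimal gradient-descent sketch on a one-parameter error E(w) = (w − w0)²; the learning rate eta and the constants are illustrative assumptions:

```python
def grad_E(w, w0=2.0):
    """Derivative of E(w) = (w - w0)^2 with respect to w."""
    return 2.0 * (w - w0)

w, eta = 0.0, 0.1           # initial weight and learning rate (illustrative)
for _ in range(100):
    w -= eta * grad_E(w)    # delta-w proportional to minus the gradient
print(w)                    # converges towards w0 = 2.0
```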

Sigmoid Neurons
- Gradient descent needs a derivative computation, which is not possible for the perceptron because of the discontinuous step function it uses!
- Sigmoid neurons, with their easy-to-compute derivatives, are required (radial basis functions are also differentiable).
- The computing power comes from the non-linearity of the sigmoid function.
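The sigmoid and its convenient derivative, which can be computed from the function's own output (a short sketch):

```python
import numpy as np

def sigmoid(x):
    """Sigmoid characteristic: smooth, differentiable squashing onto (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_prime(x):
    """Its derivative: sigmoid(x) * (1 - sigmoid(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)

print(sigmoid(0.0), sigmoid_prime(0.0))  # 0.5, 0.25
```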

Summary
- Feedforward networks, pure or non-pure
- XOR computed by a multilayer perceptron
- Non-linearity is a must
- Gradient descent
- Sigmoid neurons