CS623: Introduction to Computing with Neural Nets (lecture-9)


CS623: Introduction to Computing with Neural Nets (lecture-9) Pushpak Bhattacharyya Computer Science and Engineering Department IIT Bombay

Transformation from set-splitting to positive linear confinement (for proving NP-completeness of the latter)
Example instance: S = {s1, s2, s3}, c1 = {s1, s2}, c2 = {s2, s3}
[Figure: points plotted on axes x1, x2, x3. +ve: (0,0,0), (1,1,0), (0,1,1). -ve: (1,0,0), (0,1,0), (0,0,1).]
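The construction can be read off the example: the origin is a +ve point, each subset ci contributes its 0/1 indicator vector as a +ve point, and each element si contributes the unit vector along axis xi as a -ve point. A minimal sketch under that reading follows; the function name plc_instance and the 1-based element numbering are assumptions made for illustration, not part of the lecture.

```python
from typing import List, Sequence, Set, Tuple

Point = Tuple[int, ...]


def plc_instance(n: int, subsets: Sequence[Set[int]]) -> Tuple[List[Point], List[Point]]:
    """Build (+ve points, -ve points) for elements s1..sn and subsets c1..cm.

    Elements are identified by their 1-based index (an assumption of this sketch).
    """
    origin = tuple(0 for _ in range(n))
    # One +ve point per subset: the 0/1 indicator vector of that subset.
    positives = [origin] + [
        tuple(1 if i in c else 0 for i in range(1, n + 1)) for c in subsets
    ]
    # One -ve point per element: the unit vector along that element's axis.
    negatives = [
        tuple(1 if i == j else 0 for i in range(1, n + 1)) for j in range(1, n + 1)
    ]
    return positives, negatives


# The instance from the slide: S = {s1, s2, s3}, c1 = {s1, s2}, c2 = {s2, s3}.
positives, negatives = plc_instance(3, [{1, 2}, {2, 3}])
print(positives)  # [(0, 0, 0), (1, 1, 0), (0, 1, 1)]
print(negatives)  # [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
```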

Proving the transformation
Statement: the set-splitting problem has a solution if and only if the positive linear confinement problem has a solution.
Proof in two parts: the if part and the only if part.
If part: given that the set-splitting problem has a solution, show that the constructed Positive Linear Confinement (PLC) problem has a solution, i.e. show that since S1 and S2 exist, planes P1 and P2 exist which confine the positive points.

Proof of if part
P1 and P2 are as follows:
P1: a1x1 + a2x2 + ... + anxn = -1/2 -- Eqn A
P2: b1x1 + b2x2 + ... + bnxn = -1/2 -- Eqn B
ai = -1 if si ∈ S1, = n otherwise
bi = -1 if si ∈ S2, = n otherwise

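What is proved
The origin (a +ve point) is on one side of P1: it evaluates to 0 > -1/2.
The +ve points corresponding to the ci's are on the same side as the origin: no ci lies wholly inside S1, so at least one coordinate has coefficient n and at most n-1 have coefficient -1, giving a value of at least 1.
The -ve points corresponding to S1 are on the opposite side of P1: each evaluates to -1 < -1/2.
The same holds for P2 and S2.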

Illustrative Example
S = {s1, s2, s3}, c1 = {s1, s2}, c2 = {s2, s3}
Splitting: S1 = {s1, s3}, S2 = {s2}
+ve points: (<0, 0, 0>, +), (<1, 1, 0>, +), (<0, 1, 1>, +)
-ve points: (<1, 0, 0>, -), (<0, 1, 0>, -), (<0, 0, 1>, -)

Example (contd.)
The constructed planes are:
P1: a1x1 + a2x2 + a3x3 = -1/2, i.e. -x1 + 3x2 - x3 = -1/2
P2: b1x1 + b2x2 + b3x3 = -1/2, i.e. 3x1 - x2 + 3x3 = -1/2
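The if-part construction can be checked mechanically on this example. The sketch below builds the coefficients from a given split (coefficient -1 for elements in the part, n otherwise) and tests on which side of each plane every point falls. The helper names plane_coefficients and side are hypothetical, and elements are numbered 1..n for convenience.

```python
from fractions import Fraction

THETA = Fraction(-1, 2)  # right-hand side of Eqn A and Eqn B


def plane_coefficients(n, part):
    """Coefficient -1 for elements in `part`, n otherwise (1-based elements)."""
    return [-1 if i in part else n for i in range(1, n + 1)]


def side(coeffs, point, theta=THETA):
    """+1 if the point is on the origin's side of the plane (value > theta), else -1."""
    value = sum(c * x for c, x in zip(coeffs, point))
    return 1 if value > theta else -1


n = 3
a = plane_coefficients(n, {1, 3})  # P1 from S1 = {s1, s3}
b = plane_coefficients(n, {2})     # P2 from S2 = {s2}
positives = [(0, 0, 0), (1, 1, 0), (0, 1, 1)]
negatives = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]

# Every +ve point must be on the origin's side of both planes.
confined = all(side(a, p) == 1 and side(b, p) == 1 for p in positives)
# Every -ve point must be cut off by at least one of the planes.
excluded = all(side(a, q) == -1 or side(b, q) == -1 for q in negatives)
print(a, b, confined, excluded)  # [-1, 3, -1] [3, -1, 3] True True
```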

Graphic for Example
[Figure: the +ve points <0, 0, 0>, <1, 1, 0>, <0, 1, 1> and the -ve points <1, 0, 0>, <0, 0, 1>, <0, 1, 0>, with plane P1 and the regions labelled S1 and S2.]

Proof of only if part
Given: for the +ve and -ve points constructed from the set-splitting problem, two hyperplanes P1 and P2 have been found which achieve positive linear confinement.
To show: S can be split into S1 and S2.

Proof (Only if part) contd.
Let the two planes be:
P1: a1x1 + a2x2 + ... + anxn = θ1
P2: b1x1 + b2x2 + ... + bnxn = θ2
Then,
S1 = {elements corresponding to -ve points separated by P1}
S2 = {elements corresponding to -ve points separated by P2}

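Proof (Only if part) contd.
Suppose ci ⊂ S1, so that every element of ci is contained in S1.
Let e1, e2, ..., em be the elements of ci (m = mi), and a1, a2, ..., am the coefficients of P1 for the corresponding coordinates.
Evaluating P1 at the -ve point of each element gives a1 < θ1, a2 < θ1, ..., am < θ1 -- (1)
But the +ve point of ci gives a1 + a2 + ... + am > θ1 -- (2)
and the origin gives 0 > θ1 -- (3)
CONTRADICTION: adding the inequalities in (1) gives a1 + ... + am < mθ1 ≤ θ1, since θ1 < 0 by (3) and m ≥ 1, which contradicts (2). The same argument applies to S2 and P2.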
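The only-if direction reads a split off the two planes. As a rough sketch, assuming the origin lies on the "greater than θ" side of each plane (as in inequality (3)), the unit-vector point for si evaluates to ai on P1 and bi on P2, so si is cut off by P1 exactly when ai < θ1. The function names below are hypothetical, and the planes used are those of the illustrative example.

```python
def split_from_planes(a, theta1, b, theta2):
    """Recover S1 and S2 (1-based element indices) from planes P1 and P2.

    Assumes the origin is on the '>' side of both planes, so a -ve unit-vector
    point is separated by a plane when its coefficient is below the threshold.
    """
    s1 = {i + 1 for i, ai in enumerate(a) if ai < theta1}
    s2 = {i + 1 for i, bi in enumerate(b) if bi < theta2}
    return s1, s2


def is_split(subsets, s1, s2):
    """True if no subset c_i lies wholly inside S1 or wholly inside S2."""
    return all(not c <= s1 and not c <= s2 for c in subsets)


# Planes from the example: -x1 + 3x2 - x3 = -1/2 and 3x1 - x2 + 3x3 = -1/2.
s1, s2 = split_from_planes([-1, 3, -1], -0.5, [3, -1, 3], -0.5)
print(s1, s2, is_split([{1, 2}, {2, 3}], s1, s2))  # {1, 3} {2} True
```

An element cut off by both planes would land in both sets under this definition; removing it from either one still leaves a valid split.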

What has been shown
Positive Linear Confinement is NP-complete.
Confinement on any set of points of one kind is NP-complete (easy to show).
The architecture is special: only one hidden layer with two nodes.
The neurons are special: 0-1 threshold neurons, NOT sigmoid.
Hence, can we generalize and say that FF NN training is NP-complete? Not rigorously, perhaps; but it is strongly indicated.

Accelerating Backpropagation

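Quickprop: Fahlman, '88
Assume a parabolic approximation to the error surface.
[Figure: error E plotted against weight w, showing the actual error surface, its parabolic approximation, the true error minimum, the weights w(t-1) and w(t), and the new weight chosen by QP.]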

Quickprop basically makes use of the second derivative
E = aw^2 + bw + c
First derivative: E' = ∂E/∂w = 2aw + b
E'(t-1) = 2aw(t-1) + b -- (1)
E'(t) = 2aw(t) + b -- (2)
Solving,
a = [E'(t) - E'(t-1)] / (2[w(t) - w(t-1)])
b = E'(t) - [E'(t) - E'(t-1)] w(t) / [w(t) - w(t-1)]

Next weight w(t+1)
This is found by minimizing E(t+1): set E'(t+1) to 0.
2aw(t+1) + b = 0
Substituting for a and b,
w(t+1) - w(t) = [w(t) - w(t-1)] × E'(t) / [E'(t-1) - E'(t)]
i.e. Δw(t) = Δw(t-1) × E'(t) / [E'(t-1) - E'(t)], where Δw(t) = w(t+1) - w(t)
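The update rule for a single weight can be sketched in a few lines. The function name quickprop_step and the toy error E(w) = (w - 3)^2 are assumptions chosen for illustration; on an exactly parabolic error the step lands on the minimum in one move, which is the idea behind the approximation.

```python
def quickprop_step(w_prev, w_curr, grad_prev, grad_curr):
    """One Quickprop update for a single weight.

    Implements  dw(t) = dw(t-1) * E'(t) / (E'(t-1) - E'(t)).
    """
    dw_prev = w_curr - w_prev
    denom = grad_prev - grad_curr
    if denom == 0:
        # Equal slopes: the fitted parabola is degenerate and has no minimum.
        raise ValueError("E'(t-1) equals E'(t); Quickprop step is undefined")
    return w_curr + dw_prev * grad_curr / denom


def grad(w):
    """Derivative of the toy error E(w) = (w - 3)^2 (hypothetical example)."""
    return 2.0 * (w - 3.0)


w_next = quickprop_step(0.0, 1.0, grad(0.0), grad(1.0))
print(w_next)  # 3.0, the minimum of the toy parabola
```

Practical implementations usually add safeguards such as a cap on the step size; those are outside what the slide derives and are not shown here.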

How to fix the hidden layer

By trial and error, mostly.
No theory yet.
Read papers by Eric Baum.
Heuristics exist, like the mean of the sizes of the i/p and the o/p layers (arithmetic, geometric or harmonic).
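For concreteness, the three means mentioned in the heuristic can be computed as below. This is only a sketch: the function name hidden_size_heuristics is hypothetical, and how the result is rounded to an integer number of hidden units is left open, since the slide does not say.

```python
import math


def hidden_size_heuristics(n_in, n_out):
    """Arithmetic, geometric and harmonic means of the i/p and o/p layer sizes."""
    arithmetic = (n_in + n_out) / 2
    geometric = math.sqrt(n_in * n_out)
    harmonic = 2 * n_in * n_out / (n_in + n_out)
    return arithmetic, geometric, harmonic


# Example sizes (assumed): 8 input units and 2 output units.
print(hidden_size_heuristics(8, 2))  # arithmetic 5.0, geometric 4.0, harmonic about 3.2
```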

Grow the network: Tiling Algorithm
A kind of divide and conquer strategy.
Given the classes in the data, run the perceptron training algorithm (PTA).
If the classes are linearly separable, PTA converges without any hidden layer.
If not, do as well as you can (pocket algorithm).
This will produce classes with misclassified points.

Tiling Algorithm (contd)
Take the class with misclassified points and break it into subclasses which contain no outliers.
Run PTA again after recruiting the required number of perceptrons.
Do this until homogeneous classes are obtained.
Apply the same procedure to the first hidden layer to obtain the second hidden layer, and so on.
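The building block of the procedure is PTA with a pocket: run ordinary perceptron updates, but remember the weight vector that classified the most training points so far. The sketch below is one simple way to do this, not the lecture's implementation; the names predict, count_correct and pocket, the zero initial weights, the fixed presentation order and the epoch limit are all assumptions.

```python
def predict(w, x):
    """0-1 threshold unit on the augmented input (x1, ..., xn, 1): fires when w.x > 0."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x + (1,))) > 0 else 0


def count_correct(w, data):
    """Number of (x, y) pairs in `data` that w classifies correctly."""
    return sum(predict(w, x) == y for x, y in data)


def pocket(data, epochs=20):
    """Perceptron training that keeps the best weights seen so far 'in the pocket'."""
    n = len(data[0][0])
    w = [0.0] * (n + 1)  # last entry acts as the bias (negated threshold)
    best_w, best_correct = list(w), count_correct(w, data)
    for _ in range(epochs):
        for x, y in data:
            error = y - predict(w, x)
            if error != 0:
                # Standard perceptron update on a misclassified point.
                w = [wi + error * xi for wi, xi in zip(w, x + (1,))]
                correct = count_correct(w, data)
                if correct > best_correct:
                    best_w, best_correct = list(w), correct
    return best_w, best_correct


# XOR is not linearly separable, so at most 3 of the 4 points can be correct.
xor_data = [((0, 0), 0), ((1, 1), 0), ((0, 1), 1), ((1, 0), 1)]
weights, correct = pocket(xor_data)
print(weights, correct)  # pocket holds weights that classify 3 of the 4 points
```

Which point ends up as the outlier depends on the presentation order; with a fixed sweep the pocket can only hold weights the perceptron actually visits, so a randomized order is a common alternative.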

Illustration: XOR problem
Classes are {(1, 0), (0, 1)} and {(0, 0), (1, 1)}.

As good a classification as possible
[Figure: the points (0,1), (1,1), (0,0), (1,0) with their +ve / -ve labels.]

Classes with error
{(0,0)} and {(1,0), (0,1), (1,1)}, where (1,1) is the outlier.

How to achieve this classification
Give the labels as shown (equivalent to an OR problem):
-: (0,0)
+: (1,0), (0,1), (1,1) (outlier)

The partially developed n/w
Get the first neuron in the hidden layer, which computes OR.
[Figure: neuron h1 with threshold 0.5 and weights 1.0, 1.0 on inputs x1, x2.]

Break the incorrect class
+: (1,0), (0,1)
-: (1,1) (the outlier)
Don't care: (0,0); make it +.

Solve classification for h2
+: (0,0), (0,1), (1,0)
-: (1,1)
This is NOT(x1x2), the complement of x1x2.

Next stage of the n/w
[Figure: two hidden neurons on inputs x1, x2. h1 computes x1+x2 (threshold 0.5, weights 1.0, 1.0). h2 computes NOT(x1x2) (threshold -1.5, weights -1.0, -1.0).]

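Getting the output layer
Solve a tiling algo problem for the hidden layer:
x2  x1  h1 (x1+x2)  h2 (NOT(x1x2))  y
0   0   0           1               0
0   1   1           1               1
1   0   1           1               1
1   1   1           0               0
In the (h1, h2) plane, (1,1) is + and (0,1), (1,0) are -; (0,0) is not produced by any input. This is an AND problem.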

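Final n/w
[Figure: output neuron y is the AND n/w over h1 and h2, with weights 1.0, 1.0. h1 computes x1+x2 (threshold 0.5, weights 1.0, 1.0 from x1, x2). h2 computes NOT(x1x2) (threshold -1.5, weights -1.0, -1.0 from x1, x2).]
The threshold printed for y is 0.5; with weights 1.0, 1.0 a threshold unit computes AND only if its threshold lies between 1 and 2 (e.g. 1.5).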
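The finished network can be checked against the XOR truth table. In the sketch below the hidden units use the values given on the slides (h1: threshold 0.5, weights 1.0, 1.0; h2: threshold -1.5, weights -1.0, -1.0, which realises the complement of x1x2). The output threshold of 1.5 is an assumption: any value between 1 and 2 makes a unit with weights 1.0, 1.0 compute AND. The function names are hypothetical.

```python
def threshold_unit(weights, theta, inputs):
    """0-1 threshold neuron: outputs 1 when the weighted sum exceeds theta."""
    return 1 if sum(w * x for w, x in zip(weights, inputs)) > theta else 0


def xor_network(x1, x2):
    h1 = threshold_unit((1.0, 1.0), 0.5, (x1, x2))      # computes x1 + x2 (OR)
    h2 = threshold_unit((-1.0, -1.0), -1.5, (x1, x2))   # computes NOT(x1 x2)
    # Output layer: AND of h1 and h2; threshold 1.5 is an assumed value.
    return threshold_unit((1.0, 1.0), 1.5, (h1, h2))


table = [(x1, x2, xor_network(x1, x2)) for x1 in (0, 1) for x2 in (0, 1)]
print(table)  # [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]
```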

Lab exercise
Implement the tiling algorithm and run it for:
XOR
Majority
IRIS data