CS344: Introduction to Artificial Intelligence (associated lab: CS386)
Pushpak Bhattacharyya, CSE Dept., IIT Bombay
Lecture 25: Perceptrons; # of regions; training and convergence
14th March, 2011

Functions in a one-input Perceptron
For a perceptron with one input X, weight w and threshold θ (output Y = 1 iff w·X > θ), the four computable functions require:
True function (Y = 1 always): w·0 > θ, i.e. θ < 0, and w·1 > θ, i.e. w > θ
θ (always-0) function: w·0 ≤ θ, i.e. θ ≥ 0, and w·1 ≤ θ, i.e. w ≤ θ
Identity function (Y = X): w·0 ≤ θ, i.e. θ ≥ 0, and w·1 > θ, i.e. w > θ
Complement function (Y = ¬X): w·0 > θ, i.e. θ < 0, and w·1 ≤ θ, i.e. w ≤ θ

Functions in Simple Perceptron

Multiple representations of a concept: AND
y = x1 & x2
The perceptron (inputs x1, x2, weights w1, w2, threshold θ) computes AND iff
θ ≥ 0, w1 ≤ θ, w2 ≤ θ, w1 + w2 > θ
One satisfying assignment: θ = 0.5, w1 = 0.4, w2 = 0.4.
The concept of ANDing has many equivalent representations: plain English, truth table, linear inequalities, algebra, hyper-planes, and a perceptron.
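As a quick sanity check of the particular assignment above (θ = 0.5, w1 = w2 = 0.4), here is a minimal Python sketch (the helper name perceptron is ours, not from the slides) that evaluates the unit on all four inputs and reproduces the AND truth table:

```python
def perceptron(x1, x2, w1=0.4, w2=0.4, theta=0.5):
    """Output 1 iff the weighted sum strictly exceeds the threshold."""
    return 1 if w1 * x1 + w2 * x2 > theta else 0

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, "->", perceptron(x1, x2))   # matches y = x1 AND x2
```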

Inductive Bias
Once we have decided to use a particular representation, we have assumed an "inductive bias".
The inductive bias of a learning algorithm is the set of assumptions that the learner uses to predict outputs given inputs that it has not encountered (Mitchell, 1980).
You can refer to: "A Theory of the Learnable", L. G. Valiant, Communications of the ACM, 1984.

Fundamental Observation
The number of threshold functions (TFs) computable by a perceptron with n inputs equals the number of regions produced by the 2^n hyper-planes obtained by plugging the 2^n input vectors into the equation ∑_{i=1}^{n} w_i x_i = θ.

The geometrical observation
Problem: given m linear surfaces called hyper-planes (each hyper-plane is (d-1)-dimensional) in d-dim space, what is the maximum number of regions produced by their intersection? That is, what is R(m, d)?

Co-ordinate Spaces
We work either in the input space <x1, x2> or in the weight-threshold space <w1, w2, θ>.
Example in the <x1, x2> space: the four input points (0,0), (1,0), (0,1), (1,1), and the hyper-plane (a line in 2-D) x1 + x2 = 0.5 obtained with w1 = w2 = 1, θ = 0.5.
General equation of a hyper-plane: Σ wi xi = θ

Regions produced by lines
Regions produced by lines L1, L2, L3, L4 in the (x1, x2) plane, not necessarily passing through the origin:
L1: 2
L2: 2 + 2 = 4
L3: 2 + 2 + 3 = 7
L4: 2 + 2 + 3 + 4 = 11
New regions created by an incoming line = (number of intersections made on the incoming line by the existing lines) + 1, i.e. the number of segments into which the existing lines cut it.
Total number of regions = original number of regions + new regions created.
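The incremental rule is easy to run mechanically. A small sketch, assuming the m lines are in general position (no two parallel, no three concurrent), that reproduces the counts 2, 4, 7, 11:

```python
def regions_in_plane(m):
    """Max regions produced by m lines in general position in 2-D.

    The k-th incoming line is cut by the k-1 existing lines into k pieces,
    and each piece splits one region into two, adding k new regions.
    """
    regions = 1                    # the empty plane is one region
    for k in range(1, m + 1):
        regions += k               # (k-1 intersections) + 1 new regions
    return regions

print([regions_in_plane(m) for m in (1, 2, 3, 4)])   # [2, 4, 7, 11]
```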

Number of computable functions by a neuron
Consider a neuron with inputs x1, x2, weights w1, w2, threshold θ and output y. Plugging each of the four input combinations into w1·x1 + w2·x2 = θ gives four planes P1, P2, P3, P4 in the <w1, w2, θ> space: θ = 0, w2 = θ, w1 = θ, and w1 + w2 = θ.

Number of computable functions by a neuron (contd.)
P1 produces 2 regions.
P2 is intersected by P1 in a line; 2 more new regions are produced. Number of regions = 2 + 2 = 4.
P3 is intersected by P1 and P2 in 2 intersecting lines; 4 more regions are produced. Number of regions = 4 + 4 = 8.
P4 is intersected by P1, P2 and P3 in 3 intersecting lines; 6 more regions are produced. Number of regions = 8 + 6 = 14.
Thus, a single two-input neuron can compute 14 Boolean functions, namely those which are linearly separable.
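The count of 14 can be checked by brute force: enumerate (w1, w2, θ) over a small grid (the particular grid below is our assumption; it happens to be rich enough for two inputs) and collect the distinct truth tables realized. Only XOR and XNOR never appear:

```python
from itertools import product

def truth_table(w1, w2, theta):
    """Outputs for inputs (0,0), (0,1), (1,0), (1,1) with y = 1 iff w.x > theta."""
    return tuple(int(w1 * x1 + w2 * x2 > theta)
                 for x1, x2 in ((0, 0), (0, 1), (1, 0), (1, 1)))

realizable = set()
for w1, w2 in product((-1, 0, 1), repeat=2):      # small grid of weights
    for theta in (-1.5, -0.5, 0.5, 1.5):          # thresholds between integer sums
        realizable.add(truth_table(w1, w2, theta))

print(len(realizable))             # 14 of the 16 Boolean functions of two inputs
print((0, 1, 1, 0) in realizable)  # False: XOR is not linearly separable
```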

Points in the same region
Consider two points <w1, w2, θ> and <w1', w2', θ'> in the weight-threshold space. If, for every input <x1, x2>,
w1·x1 + w2·x2 > θ  if and only if  w1'·x1 + w2'·x2 > θ',
then the two points compute the same function. Hence, if <w1, w2, θ> and <w1', w2', θ'> share a region, they compute the same function.
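A small illustration of this point: the first triple below is the AND assignment from the earlier slide, and the second is an arbitrary triple chosen for illustration. Both lie on the same side of each of the four planes (i.e. in the same region of the <w1, w2, θ> space), and, as the sketch confirms, they realize the same Boolean function:

```python
def boolean_function(w1, w2, theta):
    """Truth table computed by a two-input perceptron with the given parameters."""
    return tuple(int(w1 * x1 + w2 * x2 > theta)
                 for x1, x2 in ((0, 0), (0, 1), (1, 0), (1, 1)))

print(boolean_function(0.4, 0.4, 0.5))   # (0, 0, 0, 1) -> AND
print(boolean_function(1.0, 1.0, 1.5))   # (0, 0, 0, 1) -> the same function
```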

No. of Regions produced by Hyperplanes

The number of regions R(n, d) formed by n hyper-planes in d dimensions, all passing through the origin, is given by the recurrence relation
R(n, d) = R(n-1, d) + R(n-1, d-1)
which we solve using a generating function.
Boundary conditions:
1 hyper-plane in d-dim: R(1, d) = 2
n hyper-planes in 1-dim (they all reduce to the point at the origin): R(n, 1) = 2
The generating function is
f(x, y) = Σ_{n≥1} Σ_{d≥1} R(n, d) x^n y^d

From the recurrence relation we have: R(n-1, d) corresponds to 'shifting' n by one place, i.e. multiplication by x, and R(n-1, d-1) corresponds to 'shifting' both n and d by one place, i.e. multiplication by xy. Substituting the recurrence into f(x, y) and expanding therefore expresses f(x, y) in terms of x·f(x, y), xy·f(x, y) and the boundary terms.

After all this expansion, the other two terms become zero, leaving a closed expression for f(x, y).

Comparing the coefficients of x^n y^d on the two sides of this expression gives the closed form
R(n, d) = 2 Σ_{k=0}^{d-1} C(n-1, k)
For example, R(4, 3) = 2 (C(3,0) + C(3,1) + C(3,2)) = 2 (1 + 3 + 3) = 14, matching the count of 14 computable Boolean functions obtained above.
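A short sketch that evaluates the recurrence directly and compares it with the closed form; R(4, 3) = 14 matches the number of functions counted for the two-input neuron above:

```python
from math import comb
from functools import lru_cache

@lru_cache(maxsize=None)
def regions(n, d):
    """Regions formed by n hyper-planes through the origin in d dimensions
    (general position), via R(n, d) = R(n-1, d) + R(n-1, d-1)."""
    if n == 1 or d == 1:           # boundary conditions from the slides
        return 2
    return regions(n - 1, d) + regions(n - 1, d - 1)

def regions_closed_form(n, d):
    """Closed form: R(n, d) = 2 * sum_{k=0}^{d-1} C(n-1, k)."""
    return 2 * sum(comb(n - 1, k) for k in range(d))

print(regions(4, 3), regions_closed_form(4, 3))   # 14 14 -> the two-input neuron
print(regions(8, 4), regions_closed_form(8, 4))   # the two agree for larger n, d
```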

Perceptron Training Algorithm (PTA)
Preprocessing:
1. The computation law is modified to
y = 1 if Σ wi xi > θ
y = 0 if Σ wi xi < θ
i.e. the threshold unit with the '≤' test is replaced by one with a strict '<' test.

PTA – preprocessing contd.
2. Absorb θ as a weight: add a constant input x0 = -1 with weight w0 = θ, so that the test Σ_{i=1}^{n} wi xi > θ becomes Σ_{i=0}^{n} wi xi > 0.
3. Negate all the zero-class examples.

Example to demonstrate preprocessing: OR perceptron
1-class: <0, 1>, <1, 0>, <1, 1>;  0-class: <0, 0>
Augmented x vectors (with x0 = -1):
1-class: <-1, 0, 1>, <-1, 1, 0>, <-1, 1, 1>;  0-class: <-1, 0, 0>
Negate 0-class: <1, 0, 0>

Example to demonstrate preprocessing (contd.)
Now the vectors are (components x0, x1, x2):
X1 = <-1, 0, 1>
X2 = <-1, 1, 0>
X3 = <-1, 1, 1>
X4 = <1, 0, 0>

Perceptron Training Algorithm
1. Start with a random value of w, ex: <0, 0, 0, …>.
2. Test for w·xi > 0. If the test succeeds for every i = 1, 2, …, n, then return w.
3. Otherwise modify w: w_next = w_prev + x_fail, where x_fail is a vector that failed the test, and go back to step 2.

Tracing PTA on the OR example
Starting from the initial w, the test fails in turn on X1, X4, X2, X1, X4, X2 and X4; after each failure the failing vector is added to w. The test then succeeds on all four vectors and the final w is returned.
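Putting the preprocessing and the training loop together, here is a minimal sketch of PTA on the OR example. It assumes the x0 = -1 augmentation convention from the preprocessing slides and a zero initial weight vector, so the exact sequence of failures it prints need not match the trace above, but it terminates with a separating w:

```python
import numpy as np

def preprocess(one_class, zero_class):
    """Absorb theta as w0 (constant input x0 = -1) and negate 0-class examples."""
    pos = [np.array([-1] + list(x)) for x in one_class]
    neg = [-np.array([-1] + list(x)) for x in zero_class]
    return pos + neg

def pta(vectors, max_iters=100):
    """Perceptron Training Algorithm: add any failing vector until w.x > 0 for all."""
    w = np.zeros(len(vectors[0]))          # start from the zero vector
    for _ in range(max_iters):
        failed = [x for x in vectors if np.dot(w, x) <= 0]
        if not failed:
            return w                       # all tests succeed
        print("w =", w, "fails on", failed[0])
        w = w + failed[0]                  # w_next = w_prev + x_fail
    raise RuntimeError("did not converge (data may not be linearly separable)")

# OR example: 1-class = {(0,1), (1,0), (1,1)}, 0-class = {(0,0)}
vectors = preprocess([(0, 1), (1, 0), (1, 1)], [(0, 0)])
w = pta(vectors)
print("final w =", w)                      # w0 plays the role of theta
```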