CS621: Artificial Intelligence
Pushpak Bhattacharyya, CSE Dept., IIT Bombay
Lecture 20: Neural Net Basics: Perceptron

Turing Machine & Von Neumann Machine

Challenges to Symbolic AI
Motivation for challenging Symbolic AI: a large number of computation and information-processing tasks that living beings handle comfortably are not performed well by computers. The differences:

Brain computation in living beings    TM computation in computers
Pattern recognition                   Numerical processing
Learning oriented                     Programming oriented
Distributed & parallel processing     Centralized & serial processing
Content addressable                   Location addressable

Perceptron

The Perceptron Model
A perceptron is a computing element with input lines having associated weights and a cell having a threshold value. The perceptron model is motivated by the biological neuron.
[Figure: inputs x1, ..., xn enter the cell through weighted connections w1, ..., wn; the cell has threshold θ and produces output y.]

Step function / threshold function:
y = 1 for Σ w_i x_i ≥ θ
  = 0 otherwise
[Figure: y plotted against Σ w_i x_i; y steps from 0 to 1 at Σ w_i x_i = θ.]
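
A minimal sketch of this unit in Python (the function name and calling convention are illustrative, not from the lecture):

    def perceptron_output(weights, inputs, theta):
        # Net input: weighted sum of the inputs.
        net = sum(w * x for w, x in zip(weights, inputs))
        # Step function from the slide: y = 1 if net >= theta, else 0.
        return 1 if net >= theta else 0

    print(perceptron_output([0.5, 0.5], [1, 1], 0.8))   # 1, since net = 1.0 >= 0.8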

Features of Perceptron
- Input-output behavior is discontinuous, and the derivative does not exist at Σ w_i x_i = θ.
- Σ w_i x_i - θ is the net input, denoted net.
- Referred to as a linear threshold element: linearity because x appears with power 1.
- y = f(net): the relation between y and net is non-linear.

Computation of Boolean functions
AND of 2 inputs:

x1  x2  y
0   0   0
0   1   0
1   0   0
1   1   1

The parameter values (weights & threshold) need to be found.
[Figure: a two-input perceptron with inputs x1, x2, weights w1, w2, threshold θ, and output y.]

Computing parameter values
w1·0 + w2·0 ≤ θ ⇒ θ ≥ 0; since y = 0
w1·0 + w2·1 ≤ θ ⇒ w2 ≤ θ; since y = 0
w1·1 + w2·0 ≤ θ ⇒ w1 ≤ θ; since y = 0
w1·1 + w2·1 > θ ⇒ w1 + w2 > θ; since y = 1
w1 = w2 = θ = 0.5 satisfies these inequalities and gives parameters for computing the AND function.

Other Boolean functions
OR can be computed using the values w1 = w2 = 1 and θ = 0.5.
The XOR function gives rise to the following inequalities:
w1·0 + w2·0 ≤ θ ⇒ θ ≥ 0
w1·0 + w2·1 > θ ⇒ w2 > θ
w1·1 + w2·0 > θ ⇒ w1 > θ
w1·1 + w2·1 ≤ θ ⇒ w1 + w2 ≤ θ
No set of parameter values satisfies these inequalities: adding the second and third gives w1 + w2 > 2θ, and since θ ≥ 0 this means w1 + w2 > θ, contradicting the fourth.
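
A quick sanity check of these parameter choices (a sketch; the inequalities above put net = θ on the y = 0 side, so the comparison here is strict):

    def fires(w1, w2, theta, x1, x2):
        # y = 1 iff w1*x1 + w2*x2 > theta (strict, matching the inequalities above).
        return 1 if w1 * x1 + w2 * x2 > theta else 0

    for x1 in (0, 1):
        for x2 in (0, 1):
            assert fires(0.5, 0.5, 0.5, x1, x2) == (x1 & x2)   # AND: w1 = w2 = theta = 0.5
            assert fires(1.0, 1.0, 0.5, x1, x2) == (x1 | x2)   # OR:  w1 = w2 = 1, theta = 0.5
    print("AND and OR parameters verified")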

Threshold functions

n    #Boolean functions (2^(2^n))    #Threshold functions (≤ 2^(n^2))
1    4                               4
2    16                              14
4    64K                             1008

Functions computable by perceptrons are the threshold functions. #TF becomes negligibly small compared to #BF for larger values of n. For n = 2, all functions except XOR and XNOR are computable.
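
The n = 2 entry can be cross-checked by brute force. A sketch (the weight/threshold grid is an arbitrary choice, not from the lecture) that enumerates small perceptrons and counts the distinct 2-input Boolean functions they realize:

    from itertools import product

    vals = [i / 2 for i in range(-4, 5)]   # grid: -2.0, -1.5, ..., 2.0
    realized = set()
    for w1, w2, theta in product(vals, repeat=3):
        # Truth table of this perceptron over the four input pairs (0,0)..(1,1).
        table = tuple(int(w1 * x1 + w2 * x2 >= theta)
                      for x1, x2 in product((0, 1), repeat=2))
        realized.add(table)
    print(len(realized))   # 14: every 2-input function except XOR and XNOR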

Computational Capacity of Perceptrons

Separating plane
Σ w_i x_i = θ defines a linear surface in the (W, θ) space, where W = <w1, w2, ..., wn> is an n-dimensional vector. A point in this (W, θ) space defines a perceptron.
[Figure: a perceptron with inputs x1, x2, x3, ..., xn, weights w1, w2, w3, ..., wn, threshold θ, and output y.]

The Simplest Perceptron
[Figure: a single-input perceptron with input x1, weight w1, threshold θ, and output y.]
Depending on the values of w and θ, four different functions are possible.

Simplest perceptron contd.
The four functions, by region of the (w, θ) parameter space:
f1: θ ≥ 0, w ≤ θ : 0-function (f(0) = 0, f(1) = 0)
f2: θ ≥ 0, w > θ : identity function (f(0) = 0, f(1) = 1)
f3: θ < 0, w ≤ θ : complement function (f(0) = 1, f(1) = 0)
f4: θ < 0, w > θ : true-function (f(0) = 1, f(1) = 1)

Counting the #functions for the simplest perceptron
For the simplest perceptron, the equation is w·x = θ. Substituting x = 0 and x = 1, we get the lines θ = 0 and w = θ. These two lines intersect to divide the (w, θ) plane into four regions R1, R2, R3, R4, which correspond to the four functions.
[Figure: the (w, θ) plane cut by the lines θ = 0 and w = θ into regions R1, R2, R3, R4.]
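
A small sketch confirming the region-to-function correspondence (the sample points are arbitrary picks from the interior of each region, not from the lecture):

    # One sample (w, theta) from each region of the (w, theta) plane.
    samples = {
        "R1: theta >= 0, w <= theta": (0.0, 1.0),    # expect 0-function
        "R2: theta >= 0, w > theta":  (2.0, 1.0),    # expect identity
        "R3: theta < 0, w <= theta":  (-2.0, -1.0),  # expect complement
        "R4: theta < 0, w > theta":   (1.0, -1.0),   # expect true-function
    }
    for region, (w, theta) in samples.items():
        # Step rule from the earlier slide: y = 1 iff w*x >= theta.
        table = tuple(int(w * x >= theta) for x in (0, 1))
        print(region, "-> (f(0), f(1)) =", table)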

Fundamental Observation
The number of TFs computable by a perceptron is equal to the number of regions produced by the 2^n hyperplanes obtained by plugging the input values into the equation Σ_{i=1..n} w_i x_i = θ.
Intuition: how many lines do the existing planes produce on the new plane, and how many regions do these lines produce on the new plane?

The geometrical observation
Problem: given m linear surfaces (hyperplanes, each of dimension d-1) in d-dimensional space, what is the maximum number of regions produced by their intersection? That is, R(m,d) = ?

Concept forming examples
Max #regions formed by m lines in 2 dimensions: R(m,2) = R(m-1,2) + ?
The new line intersects the m-1 existing lines at m-1 points, which cut it into m segments; each segment splits a region in two, so the new line adds m new regions:
R(m,2) = R(m-1,2) + m,  R(1,2) = 2
Max #regions formed by m planes in 3 dimensions:
R(m,3) = R(m-1,3) + R(m-1,2),  R(1,3) = 2

Concept forming examples contd.
Max #regions formed by m hyperplanes in 4 dimensions:
R(m,4) = R(m-1,4) + R(m-1,3),  R(1,4) = 2
In general:
R(m,d) = R(m-1,d) + R(m-1,d-1)
subject to R(1,d) = 2, R(m,1) = 2

General Equation
R(m,d) = R(m-1,d) + R(m-1,d-1)
subject to R(1,d) = 2, R(m,1) = 2
Here all the hyperplanes pass through the origin.
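
A direct transcription of this recurrence (a sketch; the function name is illustrative):

    from functools import lru_cache

    @lru_cache(maxsize=None)
    def regions(m, d):
        # Max #regions from m hyperplanes through the origin in d dimensions:
        # R(m,d) = R(m-1,d) + R(m-1,d-1), with R(1,d) = R(m,1) = 2.
        if m == 1 or d == 1:
            return 2
        return regions(m - 1, d) + regions(m - 1, d - 1)

    print(regions(4, 2))   # 8: four lines through the origin cut the plane into 8 regions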

Method of Observation for lines in 2-D
R(m,2) = R(m-1,2) + m
R(m-1,2) = R(m-2,2) + (m-1)
R(m-2,2) = R(m-3,2) + (m-2)
:
R(2,2) = R(1,2) + 2
Telescoping,
R(m,2) = 2 + m + (m-1) + (m-2) + ... + 2
       = 1 + (1 + 2 + ... + m)
       = 1 + m(m+1)/2
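
A quick numeric check of this closed form against the recurrence (a sketch; names are illustrative):

    def r_lines(m):
        # R(m,2) for lines in general position: R(m,2) = R(m-1,2) + m, R(1,2) = 2.
        r = 2
        for k in range(2, m + 1):
            r += k
        return r

    for m in range(1, 10):
        assert r_lines(m) == 1 + m * (m + 1) // 2
    print("closed form verified for m = 1..9")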

Method of generating function
R(m,2) = R(m-1,2) + m
f(x) = R(1,2)x + R(2,2)x^2 + R(3,2)x^3 + ... + R(m,2)x^m + ...   (Eq. 1)
x f(x) = R(1,2)x^2 + R(2,2)x^3 + R(3,2)x^4 + ... + R(m,2)x^(m+1) + ...   (Eq. 2)
Observe that R(m,2) - R(m-1,2) = m.

Method of generating functions contd.
Eq. 1 - Eq. 2 gives:
(1-x)f(x) = R(1,2)x + (R(2,2) - R(1,2))x^2 + (R(3,2) - R(2,2))x^3 + ... + (R(m,2) - R(m-1,2))x^m + ...
(1-x)f(x) = 2x + 2x^2 + 3x^3 + ... + m x^m + ...
f(x) = (2x + 2x^2 + 3x^3 + ... + m x^m + ...)(1-x)^(-1)

Method of generating functions contd.
f(x) = (2x + 2x^2 + 3x^3 + ... + m x^m + ...)(1 + x + x^2 + x^3 + ...)   (Eq. 3)
Coefficient of x^m: R(m,2) = 2 + (2 + 3 + ... + m) = 1 + m(m+1)/2
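
The coefficient extraction can be checked numerically; multiplying by (1-x)^(-1) turns the coefficients into prefix sums (a sketch, with an arbitrary cutoff M):

    M = 10
    # Coefficients a_k of 2x + 2x^2 + 3x^3 + ...: a_1 = 2 and a_k = k for k >= 2.
    a = [0, 2] + list(range(2, M + 1))
    # Coefficient of x^m in the product is the prefix sum a_0 + a_1 + ... + a_m.
    coeff = [sum(a[:k + 1]) for k in range(M + 1)]
    for m in range(1, M + 1):
        assert coeff[m] == 1 + m * (m + 1) // 2
    print("coefficient of x^m matches 1 + m(m+1)/2 for m = 1..10")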

The general problem of m hyperplanes in d-dimensional space
c(m,d) = c(m-1,d) + c(m-1,d-1)
subject to c(m,1) = 2, c(1,d) = 2

Generating function
f(x,y) = R(1,1)xy + R(1,2)xy^2 + R(1,3)xy^3 + ...
       + R(2,1)x^2 y + R(2,2)x^2 y^2 + R(2,3)x^2 y^3 + ...
       + R(3,1)x^3 y + R(3,2)x^3 y^2 + ...
i.e. f(x,y) = Σ_{m≥1} Σ_{d≥1} R(m,d) x^m y^d

#Regions formed by m hyperplanes passing through the origin in d-dimensional space:
c(m,d) = 2 Σ_{i=0..d-1} C(m-1, i)
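
A check that this closed form agrees with the recurrence above (a sketch; names are illustrative):

    from functools import lru_cache
    from math import comb

    @lru_cache(maxsize=None)
    def c_rec(m, d):
        # Recurrence: c(m,d) = c(m-1,d) + c(m-1,d-1), with c(m,1) = c(1,d) = 2.
        if m == 1 or d == 1:
            return 2
        return c_rec(m - 1, d) + c_rec(m - 1, d - 1)

    def c_closed(m, d):
        # Closed form: c(m,d) = 2 * sum of C(m-1, i) for i = 0..d-1.
        return 2 * sum(comb(m - 1, i) for i in range(d))

    for m in range(1, 8):
        for d in range(1, 6):
            assert c_rec(m, d) == c_closed(m, d)
    print("recurrence and closed form agree for m = 1..7, d = 1..5")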