Prof. Pushpak Bhattacharyya, IIT Bombay 1 CS 621 Artificial Intelligence Lecture 21 - 04/10/05 Prof. Pushpak Bhattacharyya Artificial Neural Networks: Models, Algorithms, Applications

Prof. Pushpak Bhattacharyya, IIT Bombay 2 Softcomputing A neural network is a computing device composed of very simple processing elements, called neurons, which are interconnected. The processing power of such a device comes from its many connections. There are different neural network models, such as the Perceptron, Backpropagation networks, Hopfield nets, Adaptive Resonance Theory and Kohonen nets. The first three models are supervised learning algorithms and the rest are unsupervised.

Prof. Pushpak Bhattacharyya, IIT Bombay 3 Fundamental Unit: Perceptron The perceptron is a single neuron, motivated by brain cells. The weighted sum of the inputs, ∑_{i=1}^{n} w_i x_i, is compared with the threshold θ. If the net input exceeds the threshold, the neuron fires, i.e., assumes state 1; otherwise the state is 0.
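A minimal sketch of this firing rule (the helper name and signature are illustrative, not from the slides):

    def perceptron_fires(weights, inputs, theta):
        """Linear threshold unit: output 1 if the weighted sum exceeds theta, else 0."""
        net = sum(w * x for w, x in zip(weights, inputs))
        return 1 if net > theta else 0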

Prof. Pushpak Bhattacharyya, IIT Bombay 4 Computing with Perceptron The AND function imposes the following constraints:
X1  X2  Y   Constraint
0   0   0   0.w1 + 0.w2 < θ
0   1   0   0.w1 + 1.w2 < θ
1   0   0   1.w1 + 0.w2 < θ
1   1   1   1.w1 + 1.w2 > θ
From this, one possible solution is θ = 1.5, w1 = 1.0 and w2 = 1.0.
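Using the hypothetical perceptron_fires helper sketched above, the suggested solution can be checked against the AND truth table:

    # w1 = w2 = 1.0 and theta = 1.5 satisfy all four constraints
    for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
        y = perceptron_fires([1.0, 1.0], [x1, x2], theta=1.5)
        print(x1, x2, y)   # prints y = 0, 0, 0, 1 in turn: the AND function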

Prof. Pushpak Bhattacharyya, IIT Bombay 5 Geometrical View

Prof. Pushpak Bhattacharyya, IIT Bombay 6 Perceptron Training Algorithm (PTA): Preprocess

Prof. Pushpak Bhattacharyya, IIT Bombay 7 Pre-processing for PTA The threshold θ is absorbed as a weight: a new input line is introduced which is given a constant input of -1, with θ as its weight, so the firing test ∑_{i=1}^{n} w_i x_i > θ becomes ∑_{i=0}^{n} w_i x_i > 0. Also, we negate the components of the vectors in the 0-class, so that only the single test W.X > 0 has to be performed on every vector. In PTA we need to find weight values that satisfy this test for all the input vectors.
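A small sketch of this preprocessing step for a two-input Boolean function (the function and variable names are illustrative):

    def preprocess(one_class, zero_class):
        """Append a constant -1 input (absorbing the threshold) and negate the
        0-class vectors, so that the single test W.X > 0 remains."""
        augmented_1 = [x + [-1] for x in one_class]
        augmented_0 = [[-v for v in x + [-1]] for x in zero_class]
        return augmented_1 + augmented_0

    # AND function: 1-class {<1,1>}, 0-class {<0,0>, <0,1>, <1,0>}
    vectors = preprocess([[1, 1]], [[0, 0], [0, 1], [1, 0]])
    print(vectors)   # [[1, 1, -1], [0, 0, 1], [0, -1, 1], [-1, 0, 1]]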

Prof. Pushpak Bhattacharyya, IIT Bombay 8 Perceptron Training Algorithm Input: preprocessed vectors X_i, i = 1 to m
1. Choose random values for w_i, i = 0 to n
2. i := 1
3. IF W.X_i > 0 goto 6
4. W := W + X_i
5. goto 2
6. i := i + 1
7. IF (i > m) goto 9
8. goto 3
9. End
Thus whenever an example is misclassified, we add it to the weight vector, and this goes on until the test W.X_i > 0 is satisfied for all input vectors.
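A compact Python rendering of the same loop, for illustration only (the slide itself gives just the goto-style pseudocode):

    import random

    def pta(vectors, n):
        """Perceptron Training Algorithm on preprocessed (augmented, sign-corrected) vectors."""
        w = [random.uniform(-1, 1) for _ in range(n + 1)]       # step 1: random w_0..w_n
        while True:
            for x in vectors:                                   # steps 2, 3, 6-8: scan the vectors
                if sum(wi * xi for wi, xi in zip(w, x)) <= 0:   # misclassified
                    w = [wi + xi for wi, xi in zip(w, x)]       # step 4: add the failed vector
                    break                                       # step 5: restart the scan
            else:
                return w                                        # every vector passes W.X > 0

    # AND example from the next slide; terminates because AND is linearly separable
    print(pta([[1, 1, -1], [0, 0, 1], [0, -1, 1], [-1, 0, 1]], n=2))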

Prof. Pushpak Bhattacharyya, IIT Bombay 9 Example of PTA AND function 1-class: {<1,1>} 0-class: {<0,0>, <0,1>, <1,0>} Augmented classes (with the constant -1 input appended): 1-class: {<1,1,-1>} 0-class: {<0,0,-1>, <0,1,-1>, <1,0,-1>} Negating the 0-class vectors, the total set of vectors is: X1 = <1,1,-1>, X2 = <0,0,1>, X3 = <0,-1,1>, X4 = <-1,0,1>

Prof. Pushpak Bhattacharyya, IIT Bombay 10 Example of PTA (contd) Start with a value for the weight vector, say W = <0,0,0>. Then W.X1 = 0 and hence the test fails; W_new = W_old + X1 = <1,1,-1>, which in turn fails for X2 (W.X2 = -1). Keep on adding the failed vectors. Finally the result must come. WHY?

Prof. Pushpak Bhattacharyya, IIT Bombay 11 Perceptron Training Convergence Theorem Whatever the initial choice of weights, the Perceptron Training Algorithm will eventually converge to correct weight values, provided the function being trained is linearly separable.

Prof. Pushpak Bhattacharyya, IIT Bombay 12 Non-Linearity How to deal with non-linearity: 1. Tolerate some error, e.g. in the case of XOR, classify the X points correctly but let the separating line include one O point as well. 2. Separate the classes by a higher-order surface, but in that case the neuron does not remain a linear threshold element.

Prof. Pushpak Bhattacharyya, IIT Bombay 13 Non-Linearity (contd) Introduce more hyperplanes so that one segment encloses points of only one kind and no point of the other kind.

Prof. Pushpak Bhattacharyya, IIT Bombay 14 Multilayer Perceptrons Multilayer perceptrons are more powerful than the single-layer perceptron. The neurons use the sigmoid function as their input-output relationship. Merits of the sigmoid: 1. Biological plausibility: it mimics the behaviour of actual brain cells. 2. Easy-to-compute derivative: if y = 1/(1 + e^(-x)), then dy/dx = y(1-y).
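A tiny sketch of the sigmoid and the derivative property quoted above (the function names are illustrative):

    import math

    def sigmoid(x):
        """y = 1 / (1 + e^(-x)): a smooth, differentiable squashing function."""
        return 1.0 / (1.0 + math.exp(-x))

    def sigmoid_derivative(y):
        """Given y = sigmoid(x), dy/dx is simply y * (1 - y)."""
        return y * (1.0 - y)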

Prof. Pushpak Bhattacharyya, IIT Bombay 15 Backpropagation Algorithm

Prof. Pushpak Bhattacharyya, IIT Bombay 16 Backpropagation Algorithm (contd) Each neuron in one layer is connected to all neurons in the next layer, if any. Let I = <i_1, i_2, ..., i_m> be the input vector, O = <o_1, o_2, ..., o_n> be the output vector and T = <t_1, t_2, ..., t_n> be the target vector. Then the error is E = ½ ∑_{i=1}^{n} (t_i - o_i)^2.

Prof. Pushpak Bhattacharyya, IIT Bombay 17 BP: Gradient Descent Δw_ji ∝ -∂E/∂w_ji, where Δw_ji is the change in the weight between the i-th (feeding) and j-th (fed) neuron. With the learning rate η as the constant of proportionality, Δw_ji = -η ∂E/∂w_ji. From this, the weight-change values for the whole network can easily be found. The weight change always reduces the error; hence BP is by nature a greedy strategy.
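A minimal numerical sketch of this update for a single sigmoid output neuron trained on one example (a sketch only, not the full layered algorithm; it reuses the hypothetical sigmoid helpers above):

    # One gradient-descent step for a single sigmoid neuron with E = 1/2 (t - o)^2.
    def train_step(weights, inputs, target, eta=0.5):
        net = sum(w * x for w, x in zip(weights, inputs))
        o = sigmoid(net)
        delta = (target - o) * sigmoid_derivative(o)                     # -dE/dnet
        new_w = [w + eta * delta * x for w, x in zip(weights, inputs)]   # Δw = -η dE/dw
        return new_w, 0.5 * (target - o) ** 2

    w = [0.1, -0.2]
    for _ in range(1000):
        w, err = train_step(w, [1.0, -1.0], target=1.0)
    print(w, err)   # the error falls towards zero as the weights are adjusted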

Prof. Pushpak Bhattacharyya, IIT Bombay 18 Analysis of BP Influence of the learning rate: a large value gives fast progress but causes oscillation about the minimum; too small a value makes progress very slow. Symmetry breaking: if the mapping demands different weights but we start with the same weights everywhere, then BP will never converge. Momentum factor: the momentum term hastens the operating point towards the minimum, and once near the minimum it dampens the oscillations.
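In code, the momentum factor simply adds a fraction of the previous weight change to the current one; a hedged sketch (the names, and the 0.6/0.3 values echoed from the design slide later, are illustrative):

    # Weight update with momentum: part of the previous step is reapplied.
    def momentum_update(w, grad, prev_delta, eta=0.6, alpha=0.3):
        delta = -eta * grad + alpha * prev_delta   # gradient step plus momentum term
        return w + delta, delta                    # keep delta for the next update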

Prof. Pushpak Bhattacharyya, IIT Bombay 19 Local Minima Due to the greedy nature of BP, it can get stuck in a local minimum m and never reach the global minimum g, since the error can only decrease with every weight change.

Prof. Pushpak Bhattacharyya, IIT Bombay 20 Some useful tips for BP If the network weights do not change, one of the following might have happened: a) the learning rate is too small; b) network paralysis: neurons operate near the saturation region (output close to 1 or 0) because the inputs are highly positive or negative; in this case, scale the inputs; c) the network is stuck in a local minimum; in this case, restart with a fresh random set of weights.

Prof. Pushpak Bhattacharyya, IIT Bombay 21 Tips (contd) If there is an equal distribution of positive and negative values in the input then use the tanh(x) curve.

Prof. Pushpak Bhattacharyya, IIT Bombay 22 Application Loan defaulter recognition: a classification task.
Name      Age  Sex  Address                        Income (Rs/yr)  Defaulter (Y/N)
A.K. Das  43   M    5th Street, Juhu Reclamation   5 lakhs         N
S. Singh  52   M    A.S. Marg, Powai               0.7             Y

Prof. Pushpak Bhattacharyya, IIT Bombay 23 Network Design Input layer with 4 neurons (in reality there may be hundreds of attributes). Hidden layer with 2 neurons. Output layer with a single neuron. Learning rate of about 0.6 and momentum factor of about 0.3. Training can be done with very efficient packages/hardware.
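A hedged configuration sketch of this design; the slides do not name a package, and scikit-learn's MLPClassifier is used here purely as one illustration:

    from sklearn.neural_network import MLPClassifier

    # 4 inputs -> 2 hidden sigmoid neurons -> 1 output, trained by SGD with momentum,
    # mirroring the design on this slide (the library choice is illustrative only).
    net = MLPClassifier(hidden_layer_sizes=(2,), activation='logistic',
                        solver='sgd', learning_rate_init=0.6, momentum=0.3,
                        max_iter=2000)
    # net.fit(X, y) would then be called on the encoded loan records.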

Prof. Pushpak Bhattacharyya, IIT Bombay 24 Hopfield net Inspired by associative memory, in which retrieval is not by address but by part of the data. It consists of N neurons, fully connected with symmetric weights w_ij = w_ji and no self-connections, so the weight matrix is 0-diagonal and symmetric. Each computing element or neuron is a linear threshold element with threshold = 0.

Prof. Pushpak Bhattacharyya, IIT Bombay 25 Computation

Prof. Pushpak Bhattacharyya, IIT Bombay 26 Stability Asynchronous mode of operation: at any instant a randomly selected neuron compares its net input with the threshold. In the synchronous mode of operation, all neurons update themselves simultaneously at each instant of time. Since there are feedback connections in the Hopfield net, the question of stability arises. At every time instant the network evolves and finally settles into a stable state. How does the Hopfield net function as associative memory? One needs to store or stabilize a vector, which is the memory element.

Prof. Pushpak Bhattacharyya, IIT Bombay 27 Example Weights: w_12 = w_21 = 5, w_13 = w_31 = 3, w_23 = w_32 = 2. At time t = 0: s_1(t) = 1, s_2(t) = -1, s_3(t) = 1. The net input to neuron 1 is w_12 s_2 + w_13 s_3 = -5 + 3 = -2 < 0, while its state is 1, so this is an unstable state: neuron 1 will flip. A stable pattern is called an attractor for the net.
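A small sketch that replays this example with asynchronous updates (bipolar states, threshold 0; the relaxation loop is an illustrative reading of the procedure):

    import numpy as np

    W = np.array([[0, 5, 3],
                  [5, 0, 2],
                  [3, 2, 0]], dtype=float)   # symmetric weights, zero diagonal
    s = np.array([1, -1, 1], dtype=float)    # state at t = 0 from the slide

    def net_input(s, i):
        return W[i] @ s                      # threshold is 0

    # Asynchronous relaxation: flip any neuron that disagrees with its net input.
    changed = True
    while changed:
        changed = False
        for i in range(len(s)):
            desired = 1.0 if net_input(s, i) >= 0 else -1.0
            if desired != s[i]:
                s[i] = desired               # neuron 1 flips first, as on the slide
                changed = True
    print(s)                                 # settles into a stable pattern (an attractor)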

Prof. Pushpak Bhattacharyya, IIT Bombay 28 Energy Consideration Stable patterns correspond to minimum-energy states. The energy of a state is E = -1/2 ∑_i ∑_{j≠i} w_ji x_i x_j. The change in energy always comes out to be negative in the asynchronous mode of operation, so the energy always decreases and stability is ensured.
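Continuing the sketch above, the energy can be evaluated at each visited state to confirm that it decreases (the values are specific to this small example):

    def energy(state):
        """E = -1/2 * sum_i sum_{j != i} w_ji x_i x_j (the diagonal of W is zero)."""
        return -0.5 * float(state @ W @ state)

    print(energy(np.array([1., -1., 1.])))     # unstable initial state:        4.0
    print(energy(np.array([-1., -1., 1.])))    # after neuron 1 flips:          0.0
    print(energy(np.array([-1., -1., -1.])))   # the attractor reached above: -10.0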

Prof. Pushpak Bhattacharyya, IIT Bombay 29 Association Finding Association detection is a very important problem in data mining. A Hopfield net can be trained to store associations. Classic example: people who buy bread also buy butter, so stores keep bread and butter together. Similarly, sweet shops serving Bengali sweets also keep popular Bengali magazines.

Prof. Pushpak Bhattacharyya, IIT Bombay 30 Hopfield Net as a Storehouse of Associations Get the features of situations in terms of pairs. Train the Hopfield net with an appropriate learning algorithm (Hopfield rule). Use this to detect associations in future examples.
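The slide names the "Hopfield rule" without spelling it out; the standard Hebbian storage prescription for bipolar patterns (w_ij = sum over stored patterns of x_i x_j, zero diagonal) is one common reading, sketched below:

    import numpy as np

    def store(patterns):
        """Hebbian storage: accumulate outer products of bipolar patterns, zero the diagonal."""
        n = len(patterns[0])
        W = np.zeros((n, n))
        for p in patterns:
            p = np.asarray(p, dtype=float)
            W += np.outer(p, p)
        np.fill_diagonal(W, 0.0)
        return W

    # Storing one association, e.g. (buys-bread, buys-butter) coded as the pattern (+1, +1)
    print(store([[1, 1]]))   # [[0. 1.], [1. 0.]] -- a positive weight couples the two features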

Prof. Pushpak Bhattacharyya, IIT Bombay 31 Conclusion Neural nets and fuzzy logic are alternative ways of computation that provide means of discovering knowledge from high volumes of data. The first crucial step is feature selection. The selected features can be fed into the neural net, and they can also be described qualitatively, supported with profiles, for data mining tasks.

Prof. Pushpak Bhattacharyya, IIT Bombay 32 References
Books:
R.J. Schalkoff, Artificial Neural Networks, McGraw Hill.
G.J. Klir and T.A. Folger, Fuzzy Sets, Uncertainty and Information, Prentice Hall India.
E. Cox, Fuzzy Logic, Charles River Media Inc.
J. Han and M. Kamber, Data Mining: Concepts and Techniques, Morgan Kaufmann.
Journals:
Neural Computation, IEEE Transactions on Neural Networks, Data and Knowledge Engineering Journal, SIGKDD, IEEE Expert, etc.