Neural Networks. Plan: Perceptron, Linear discriminant, Associative memories, Hopfield networks, Chaotic networks, Multilayer perceptron, Backpropagation.

Presentation transcript:

Neural Networks

Plan
Perceptron: linear discriminant
Associative memories: Hopfield networks, chaotic networks
Multilayer perceptron: backpropagation

Perceptron
Historically the first neural net, inspired by the human brain.
Proposed by Rosenblatt between 1957 and 1961, when the brain appeared to be the best computer available.
Goal: associate input patterns with recognition outputs.
Akin to a linear discriminant.

Perceptron
Constitution: an input layer (retina) of binary units (1/0), connections/synapses, and an output layer of summation units producing {0/1} decisions.

Perceptron
Constitution: each output neuron j receives the inputs x_0, x_1, x_2, x_3 through the weights w_0j, w_1j, w_2j, w_3j, computes the activation a_j = ∑_i x_i w_ij, and produces o_j = f(a_j), where f is a sigmoid.

Perceptron
Constitution: a_j = ∑_i x_i w_ij and o_j = f(a_j)
a_j: activation of output neuron j
x_i: activation of input neuron i
w_i,j: connection weight between input neuron i and output neuron j
o_j: decision rule, o_j = 0 for a_j < θ_j and o_j = 1 for a_j ≥ θ_j
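To make the notation concrete, here is a minimal sketch in Python/NumPy of this activation and threshold decision rule; the input, weight and threshold values are invented for illustration.

```python
import numpy as np

def perceptron_output(x, W, theta):
    """o_j = 1 if a_j >= theta_j else 0, with a_j = sum_i x_i * w_ij."""
    a = x @ W                         # activations of the output neurons
    return (a >= theta).astype(int)   # threshold decision rule

# Illustrative values (not taken from the slides)
x = np.array([1, 0, 1, 1])            # retina activations x_0..x_3
W = np.array([[ 0.2, -0.5],           # w_ij: 4 inputs, 2 output neurons
              [ 0.7,  0.1],
              [-0.3,  0.4],
              [ 0.5,  0.2]])
theta = np.array([0.3, 0.3])          # thresholds theta_j
print(perceptron_output(x, W, theta))  # -> [1 0]
```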

Perceptron
Needs an associated learning procedure. Learning is supervised, based on pairs of an input pattern and its desired output.
If the activation of the output neuron is correct, nothing happens.
Otherwise, inspired by neurophysiological data:
if the neuron is active when it should not be, decrease the values of its connections;
if it is inactive when it should be active, increase them.
This is iterated until the output neurons reach the desired values.

Perceptron
Supervised learning: how to decrease or increase the connections?
Learning rule of Widrow-Hoff, close to Hebbian learning:
w_i,j(t+1) = w_i,j(t) + η (t_j − o_j) x_i = w_i,j(t) + Δw_i,j
where t_j is the desired value of output neuron j and η is the learning rate.
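A minimal sketch of this update rule, using the threshold decision of the previous slide for o_j; the learning rate and the training pair are assumptions made for illustration.

```python
import numpy as np

def widrow_hoff_step(W, x, t, theta, eta=0.1):
    """w_ij(t+1) = w_ij(t) + eta * (t_j - o_j) * x_i."""
    o = (x @ W >= theta).astype(float)   # current outputs o_j
    return W + eta * np.outer(x, t - o)  # delta_w_ij = eta * (t_j - o_j) * x_i

# Illustrative pattern / desired output (values are made up)
x = np.array([1.0, 0.0, 1.0, 1.0])
t = np.array([0.0, 1.0])
theta = np.array([0.3, 0.3])
W = np.zeros((4, 2))
for _ in range(10):                      # iterate until the outputs are correct
    W = widrow_hoff_step(W, x, t, theta)
print((x @ W >= theta).astype(int))      # -> [0 1]
```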

Theory of the linear discriminant
Compute g(x) = W^T x + w_0 and choose class 1 if g(x) > 0, class 2 otherwise.
But how to find W on the basis of the data?
(The figure shows the decision boundary g(x) = 0 separating the regions g(x) > 0 and g(x) < 0.)
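As a toy numerical illustration of this decision rule (all values arbitrary):

```python
import numpy as np

W  = np.array([2.0, -1.0])     # weight vector (illustrative)
w0 = -0.5                      # offset / bias
x  = np.array([1.0, 0.5])

g = W @ x + w0                 # g(x) = W^T x + w0 = 2.0 - 0.5 - 0.5 = 1.0
label = 1 if g > 0 else 2      # class 1 if g(x) > 0, class 2 otherwise
print(g, label)                # -> 1.0 1
```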

Gradient descent
In general a sigmoid is used for the statistical interpretation: Y = 1/(1 + e^(−W^T x)), with values in (0,1).
The error could be least squares, (Y − Y_d)^2, or a maximum-likelihood criterion.
Either way, you end up with a gradient-descent learning rule on W.
The sigmoid derivative is easy to derive: dY/da = Y(1 − Y).
Decision: class 1 if Y > 0.5, class 2 otherwise.
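A short sketch of batch gradient descent on the least-squares error with a sigmoid output; the data, learning rate and number of iterations are assumptions made for illustration.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))          # statistical interpretation in (0, 1)

def gradient_step(W, X, Yd, eta=0.5):
    """One gradient-descent step on the squared error, using dY/da = Y(1 - Y)."""
    Y = sigmoid(X @ W)
    grad = X.T @ ((Y - Yd) * Y * (1 - Y))    # derivative of sum of (Y - Yd)^2
    return W - eta * grad

# Illustrative linearly separable data; last column is a constant bias input
X  = np.array([[0., 0., 1.], [1., 0., 1.], [0., 1., 1.], [1., 1., 1.]])
Yd = np.array([0., 0., 1., 1.])              # desired outputs
W  = np.zeros(3)
for _ in range(2000):
    W = gradient_step(W, X, Yd)
print((sigmoid(X @ W) > 0.5).astype(int))    # -> [0 0 1 1] (class 1 if Y > 0.5)
```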

Perceptron limitations
Not always easy to train, and above all it cannot separate data that are not linearly separable.
Why does this matter? The XOR problem froze neural network research for 20 years (Minsky and Papert were responsible).
Consequence: we had to wait for the magical hidden layer, and for backpropagation.
(The XOR points (0,0) and (1,1) versus (0,1) and (1,0) cannot be separated by a single line.)

Associative memories
Around 1970. Two types: hetero-associative and auto-associative. We will treat here only the auto-associative ones.
They make an interesting connection between neuroscience and the physics of complex systems (John Hopfield).

Auto-associative memories
Constitution: the input is fed into a fully connected neural network.

Associative memories: Hopfield (-> DEMO)
A fully connected graph: the input layer, the output layer and the network coincide (IN = OUT).
The connections have to be symmetric.
Learning is again a Hebbian rule.

Associative memories: Hopfield
The network becomes a dynamical machine: it has been shown to converge to a fixed point, and this fixed point is a minimum of a Lyapunov energy.
These fixed points are used for storing "patterns".
Discrete time and asynchronous updating, with inputs in {-1, 1}: x_i ← sign(∑_j w_ij x_j)

Associative memories: Hopfield
The learning is done by Hebbian learning, summed over all patterns to learn:
w_ij ∝ ∑_p x_i^p x_j^p (for i ≠ j), where x^p is the p-th pattern to store.
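A minimal sketch combining this Hebbian storage rule with the asynchronous sign-update dynamics of the previous slide; the patterns and network size are invented for illustration.

```python
import numpy as np

def train_hopfield(patterns):
    """Hebbian learning over all patterns: w_ij proportional to sum_p x_i^p x_j^p."""
    n = patterns.shape[1]
    W = patterns.T @ patterns / n        # symmetric by construction
    np.fill_diagonal(W, 0.0)             # no self-connections
    return W

def recall(W, x, n_steps=100, rng=np.random.default_rng(0)):
    """Asynchronous updates x_i <- sign(sum_j w_ij x_j) until a fixed point."""
    x = x.copy()
    for _ in range(n_steps):
        i = rng.integers(len(x))         # pick one neuron at random
        x[i] = 1 if W[i] @ x >= 0 else -1
    return x

# Two illustrative patterns with entries in {-1, +1}
patterns = np.array([[ 1, -1,  1, -1,  1, -1],
                     [ 1,  1, -1, -1,  1,  1]])
W = train_hopfield(patterns)
noisy = np.array([1, -1, 1, -1, -1, -1])   # first pattern with one bit flipped
print(recall(W, noisy))                    # converges back to the stored pattern
```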

My research: chaotic encoding of memories in the brain

Multilayer perceptron
Constitution: an input layer x of I neurons, a hidden layer h of L neurons, and an output layer o of J neurons, linked by the connection matrices W (input to hidden) and Z (hidden to output).

Multilayer Perceptron
Constitution: each hidden neuron j computes a_j = ∑_i x_i w_ij and o_j = f(a_j), which is then passed on to the next layer through the weights Z_j0, Z_j1, Z_j2.

Error backpropagation
The learning algorithm proceeds as follows: inject an input, get the output, compute the error with respect to the desired output, and propagate this error back from the output layer to the input layer of the network.
It is just a consequence of applying the chain rule when deriving the gradient descent.

Backpropagation
Select a differentiable transfer function. Classically the logistic function is used, f(a) = 1/(1 + e^(−a)), whose derivative is f'(a) = f(a)(1 − f(a)).
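A quick numerical check of the identity f'(a) = f(a)(1 − f(a)) for the logistic function (the test point is arbitrary):

```python
import numpy as np

def f(a):
    return 1.0 / (1.0 + np.exp(-a))            # logistic transfer function

a = 0.7                                        # arbitrary test point
numeric  = (f(a + 1e-6) - f(a - 1e-6)) / 2e-6  # finite-difference derivative
analytic = f(a) * (1 - f(a))                   # derivative used in backpropagation
print(abs(numeric - analytic) < 1e-8)          # -> True
```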

Backpropagation The algorithm 1. Inject an input

Backpropagation Algorithm 1. Inject an input 2. Compute the intermediate h = f(W*x)

Backpropagation Algorithm 1. Inject an input 2. Compute the intermediate h = f(W*x) 3. Compute the output o = f(Z*h)

Backpropagation Algorithm 1. Inject an input 2. Compute the intermediate h 3. Compute the output o = f(Z*h) 4. Compute the output error δ_output = f'(Z*h)*(t − o)

Backpropagation Algorithm 1. Inject an input 2. Compute the intermediate h 3. Compute the output o 4. Compute the output error δ_output = f'(Z*h)*(t − o) 5. Adjust Z on the basis of the error: Z(t+1) = Z(t) + η δ_output h = Z(t) + ΔZ(t)

Backpropagation Algorithm 1. Inject an input 2. Compute the intermediate h 3. Compute the output o 4. Compute the output error 5. Adjust Z on the basis of the error: Z(t+1) = Z(t) + η δ_output h = Z(t) + ΔZ(t) 6. Compute the error on the hidden layer: δ_hidden = f'(W*x)*(Z^T δ_output)

Backpropagation Algorithm 1. Inject an input 2. Compute the intermediate h 3. Compute the output o 4. Compute the output error 5. Adjust Z on the basis of the error 6. Compute the error on the hidden layer: δ_hidden = f'(W*x)*(Z^T δ_output) 7. Adjust W on the basis of this error: W(t+1) = W(t) + η δ_hidden x = W(t) + ΔW(t)

Backpropagation Algorithm 1. Inject an input 2. Compute the intermediate h 3. Compute the output o 4. Compute the output error 5. Adjust Z on the basis of the error 6. Compute the error on the hidden layer 7. Adjust W on the basis of this error: W(t+1) = W(t) + η δ_hidden x = W(t) + ΔW(t)

Backpropagation Algorithm 1. Inject an input 2. Compute the intermediate h 3. Compute the output o 4. Compute the output error 5. Adjust Z on the basis of the error 6. Compute the error on the hidden layer 7. Adjust W on the basis of this error
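Putting the seven steps together, here is a minimal NumPy sketch of the algorithm for one hidden layer, trained on XOR (the task from the earlier limitation slide). The learning rate, layer sizes and the use of bias units (a constant 1 appended to x and h) are assumptions not spelled out in the slides.

```python
import numpy as np

def f(a):
    return 1.0 / (1.0 + np.exp(-a))              # logistic transfer function

def backprop_step(W, Z, x, t, eta=0.5):
    """One pass of the 7-step algorithm for a single training pair (x, t)."""
    x = np.append(x, 1.0)                        # 1. inject an input (+ bias unit)
    h = f(W @ x)                                 # 2. intermediate activations h = f(Wx)
    hb = np.append(h, 1.0)                       #    hidden activations + bias unit
    o = f(Z @ hb)                                # 3. output o = f(Zh)
    d_out = o * (1 - o) * (t - o)                # 4. output error f'(Zh)*(t - o)
    Z = Z + eta * np.outer(d_out, hb)            # 5. adjust Z
    d_hid = h * (1 - h) * (Z[:, :-1].T @ d_out)  # 6. hidden error f'(Wx)*(Z^T d_out)
    W = W + eta * np.outer(d_hid, x)             # 7. adjust W
    return W, Z

rng = np.random.default_rng(0)
W = rng.normal(scale=0.5, size=(4, 3))           # 2 inputs + bias -> 4 hidden neurons
Z = rng.normal(scale=0.5, size=(1, 5))           # 4 hidden + bias -> 1 output neuron
data = [([0., 0.], [0.]), ([0., 1.], [1.]), ([1., 0.], [1.]), ([1., 1.], [0.])]
for _ in range(10000):
    for x, t in data:
        W, Z = backprop_step(W, Z, np.array(x), np.array(t))
for x, t in data:
    o = f(Z @ np.append(f(W @ np.append(np.array(x), 1.0)), 1.0))
    print(x, o)                                  # outputs should approach 0, 1, 1, 0
```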

Neural network Simple linear discriminant

Neural networks Few layers – Little learning

Neural networks More layers – More learning


Neural networks Tricks
Favour simple networks (you can penalize complexity by adding a structure term to the error).
A few layers are enough (theoretically a single hidden layer suffices).
Exploit cross-validation…
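One hedged way to apply these tricks with an off-the-shelf library (scikit-learn is used here purely as an illustration; the dataset and hyperparameter values are arbitrary): the L2 penalty alpha plays the role of adding structure to the error, and cross-validation is used to pick the hidden-layer size.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier

X, y = make_moons(n_samples=200, noise=0.2, random_state=0)     # toy 2-D data

for hidden in (1, 2, 4, 8, 16):
    net = MLPClassifier(hidden_layer_sizes=(hidden,),           # one hidden layer
                        alpha=1e-3,                             # penalize large weights
                        max_iter=2000, random_state=0)
    scores = cross_val_score(net, X, y, cv=5)                   # 5-fold cross-validation
    print(hidden, round(scores.mean(), 3))                      # favour the simplest good net
```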