Machine Learning Dr. Shazzad Hosain Department of EECS North South University

Biological inspiration
Animals are able to react adaptively to changes in their external and internal environment, and they use their nervous system to perform these behaviours. An appropriate model/simulation of the nervous system should be able to produce similar responses and behaviours in artificial systems. The nervous system is built from relatively simple units, the neurons, so copying their behaviour and functionality is a natural starting point.

The Structure of Neurons

The Structure of Neurons
A neuron only fires if its input signal exceeds a certain amount (the threshold) within a short time period.
Synapses play a role in the formation of memory:
– The connection between two neurons is strengthened when both neurons are active at the same time.
– This strengthening of connections is thought to be how information is stored, resulting in memory.
Synapses vary in strength:
– Good connections allow a large signal.
– Slight connections allow only a weak signal.
– Synapses can be either excitatory or inhibitory.

Definition of Neural Network A Neural Network is a system composed of many simple processing elements operating in parallel which can acquire, store, and utilize experiential knowledge.

Features of the Brain
– Ten billion (10^10) neurons
– Neuron switching time ~10^-3 secs
– Face recognition ~0.1 secs
– On average, each neuron has several thousand connections
– Hundreds of operations per second
– High degree of parallel computation
– Distributed representations
– Neurons die off frequently (and are never replaced)
– The brain compensates for such problems through massive parallelism

Brain vs. Digital Computer
The Von Neumann architecture uses a single processing unit:
– Tens of millions of operations per second
– Absolute arithmetic precision
The brain uses many slow, unreliable processors acting in parallel.

Brain vs. Digital Computer

What is an Artificial Neural Network?

Neurons vs. Units (1)
– Each element of an NN is a node, called a unit.
– Units are connected by links.
– Each link has a numeric weight.

Biological NN vs. Artificial NN NASA: A Prediction of Plant Growth in Space

Neuron or Node
– Transfer Function
– Activation Function
– Activation Level or Threshold

Perceptron
A simple neuron used to classify its inputs into one of two categories.

How Does a Perceptron Learn?
– Start with random weights w1, w2.
– Calculate the weighted sum X = (w1 × x1) + (w2 × x2) and apply the threshold to get the output: Y = 1 if X > t, otherwise Y = 0.
– If the output differs from the target, find the error as e = target – output.
– If a is the learning rate, then adjust each weight as wi ← wi + (a × xi × e), as sketched below.
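A minimal sketch of this learning rule in Python (the function name, and passing the threshold t as a parameter, are illustrative choices, not from the slides):

```python
def perceptron_step(weights, inputs, target, a, t=0.0):
    """One perceptron learning step: weighted sum, threshold, weight update."""
    X = sum(w * x for w, x in zip(weights, inputs))   # weighted sum
    Y = 1 if X > t else 0                             # threshold activation
    e = target - Y                                    # error = target - output
    # Each weight moves in proportion to its own input and the error.
    new_weights = [w + a * x * e for w, x in zip(weights, inputs)]
    return new_weights, e
```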

Training Perceptrons
Let us learn the logical OR function for two inputs, using a threshold of zero (t = 0) and a learning rate of 0.2.
Initialize the weights to random values between -1 and +1; here w1 = -0.2 and w2 = 0.4.
First training example: x1 = 0, x2 = 0, expected output 0.
Applying the two formulas gives X = (0 × -0.2) + (0 × 0.4) = 0, therefore Y = 0.
There is no error (e = 0), so the weights do not change: no learning takes place.

Training Perceptrons
Now, for x1 = 0, x2 = 1, the expected output is 1.
Applying the two formulas gives X = (0 × -0.2) + (1 × 0.4) = 0.4, therefore Y = 1.
There is no error (e = 0), so again the weights do not change.

Training Perceptrons
Now, for x1 = 1, x2 = 0, the expected output is 1.
Applying the two formulas gives X = (1 × -0.2) + (0 × 0.4) = -0.2, therefore Y = 0.
This time there is an error: e = (target – output) = 1 – 0 = 1.
So change the weights according to wi ← wi + (a × xi × e): w1 becomes -0.2 + (0.2 × 1 × 1) = 0.
w2 is not adjusted, because its input x2 = 0 did not contribute to the error.

Training Perceptrons
Now, for x1 = 1, x2 = 1, the expected output is 1.
With the updated weights, X = (1 × 0) + (1 × 0.4) = 0.4, therefore Y = 1.
There is no error, so the weights do not change.
This is the end of the first epoch.
The method then runs again, repeating until every example is classified correctly.
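The whole walkthrough can be reproduced with a short training loop. A sketch, assuming the same initial weights (-0.2 and 0.4), threshold t = 0, and learning rate 0.2 as above:

```python
# Perceptron training on logical OR, reproducing the walkthrough above.
data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]  # OR truth table
weights, a, t = [-0.2, 0.4], 0.2, 0.0                        # values from the slides

epoch = 0
while True:
    epoch += 1
    errors = 0
    for inputs, target in data:
        X = sum(w * x for w, x in zip(weights, inputs))
        Y = 1 if X > t else 0
        e = target - Y
        if e != 0:
            errors += 1
            weights = [w + a * x * e for w, x in zip(weights, inputs)]
    if errors == 0:        # an epoch with no errors: training is done
        break

print(epoch, weights)      # converges after a few epochs, here to weights [0.2, 0.4]
```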

Linear Separability
Perceptrons can only learn models that are linearly separable.
Thus a perceptron can classify the AND and OR functions, but not XOR.
However, most real-world problems are not linearly separable.

Multilayer Neural Networks

Multilayer Feed-Forward NN
Example architectures

Multilayer Feed-Forward NN
– Hidden layers solve the classification problem for non-linear sets.
– The additional hidden layers can be interpreted geometrically as additional hyper-planes, which enhance the separation capacity of the network (a forward pass through such a network is sketched below).
– But how do we train the hidden units, for which the desired output is not known? The backpropagation algorithm offers a solution to this problem.
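Here is a minimal sketch of that forward pass for one hidden layer of sigmoid units (the layer sizes and weight values are arbitrary illustrations):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(inputs, hidden_weights, output_weights):
    """Forward pass: inputs -> sigmoid hidden layer -> sigmoid output layer.
    Each row of hidden_weights / output_weights holds one unit's weights."""
    hidden = [sigmoid(sum(w * x for w, x in zip(row, inputs)))
              for row in hidden_weights]
    outputs = [sigmoid(sum(w * h for w, h in zip(row, hidden)))
               for row in output_weights]
    return hidden, outputs

# 2 inputs -> 2 hidden units -> 1 output, with made-up weights
print(forward([1.0, 0.0], [[0.5, -0.4], [0.3, 0.8]], [[1.2, -0.6]]))
```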

Back Propagation Algorithm

1. The network is initialized with weights.
2. Next, the input pattern is applied and the output is calculated (the forward pass).
3. If there is an error, the weights are adjusted so that the error gets smaller.
4. The process is repeated until the error is minimal.

Back Propagation Algorithm
1. Initialize the network with weights and work out the output.
2. Find the error for output neuron B: Error_B = Output_B × (1 – Output_B) × (Target_B – Output_B)
3. The factor Output (1 – Output) is necessary because of the sigmoid function; otherwise the error would simply be (Target – Output), as explained later on.

Back Propagation Algorithm
1. Initialize the network with weights and work out the output.
2. Find the error for output neuron B, as above.
3. Change the weight. Let W+_AB be the new value of W_AB, the weight from hidden neuron A to output neuron B: W+_AB = W_AB + (learning rate × Error_B × Output_A)
4. Calculate the errors for the hidden-layer neurons. Hidden layers have no output target, so the error is calculated back from the output errors: Error_A = Output_A × (1 – Output_A) × Σ_B (Error_B × W_AB)
5. Now go back to step 3 to change the hidden-layer weights.

Back Propagation Algorithm Example
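The worked example on this slide is an image in the transcript, so the numbers below are assumed values for illustration only (Output_A = 0.4, Output_B = 0.9, Target_B = 0.5, W_AB = 0.6, learning rate 1.0); the calculation follows steps 2 and 3 above:

```python
# Assumed values for illustration only; the slide's own numbers
# are not present in the transcript.
out_A, out_B, target_B, W_AB, rate = 0.4, 0.9, 0.5, 0.6, 1.0

error_B = out_B * (1 - out_B) * (target_B - out_B)  # 0.9 * 0.1 * -0.4 = -0.036
W_AB_new = W_AB + rate * error_B * out_A            # 0.6 + (-0.036 * 0.4) = 0.5856
print(error_B, W_AB_new)
```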

Gradient Descent Method
The sigmoid function: y = 1 / (1 + e^(-x))
Let i represent a node of the input layer, j a node of the hidden layer, and k a node of the output layer. The input to hidden node j is then x_j = Σ_i (w_ij × y_i) – θ_j, where θ_j is the threshold value used for node j.
The error signal is e_k = d_k – y_k, where d_k is the desired value and y_k is the output.

Gradient Descent Method
The error gradient for output node k is δ_k = y_k × (1 – y_k) × e_k, since y is defined as the sigmoid function of x and dy/dx = y(1 – y).
Similarly, the error gradient for each node j in the hidden layer is δ_j = y_j × (1 – y_j) × Σ_k (δ_k × w_jk).
Now each weight in the network, w_ij or w_jk, is updated as follows: w_ij ← w_ij + (α × y_i × δ_j) and w_jk ← w_jk + (α × y_j × δ_k), where α is the learning rate.
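A sketch of one gradient-descent update implementing these formulas for a single hidden layer; the threshold terms θ are omitted for brevity, and the function name and default learning rate are illustrative:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def backprop_step(x, d, w_ij, w_jk, alpha=0.5):
    """One weight update for a 1-hidden-layer sigmoid network.
    w_ij[j][i]: input-to-hidden weights; w_jk[k][j]: hidden-to-output weights.
    Threshold (bias) terms are left out to keep the sketch short."""
    # Forward pass
    y_j = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_ij]
    y_k = [sigmoid(sum(w * yj for w, yj in zip(row, y_j))) for row in w_jk]
    # Output-layer gradients: delta_k = y_k (1 - y_k)(d_k - y_k)
    delta_k = [yk * (1 - yk) * (dk - yk) for yk, dk in zip(y_k, d)]
    # Hidden-layer gradients: delta_j = y_j (1 - y_j) sum_k delta_k w_jk
    delta_j = [yj * (1 - yj) * sum(dk * w_jk[k][j] for k, dk in enumerate(delta_k))
               for j, yj in enumerate(y_j)]
    # Weight updates: w_jk += alpha y_j delta_k and w_ij += alpha y_i delta_j
    w_jk = [[w + alpha * yj * dk for w, yj in zip(row, y_j)]
            for row, dk in zip(w_jk, delta_k)]
    w_ij = [[w + alpha * xi * dj for w, xi in zip(row, x)]
            for row, dj in zip(w_ij, delta_j)]
    return w_ij, w_jk
```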

More Examples
Train the network to recognize the first four letters of the alphabet.

More Examples

Stopping Training
1. When should training stop?
2. When the network recognizes all characters successfully.
3. In practice, we let the error fall to some suitably low value.
4. This ensures that all characters are being recognized well.

Stopping Training with a Validation Set
In the figure, the black dots are positive examples and the others are negative. The two lines represent two hypotheses. The thick line is a complex hypothesis that correctly classifies all of the data; the thin line is a simple hypothesis that misclassifies some of it. The simple hypothesis makes some errors but reasonably closely represents the trend in the data; the complex solution does not represent the full set of data at all. Stopping training when the error on a validation set stops falling prevents this overtraining, or overfitting, problem.

Overfitting Problem
When the network is overtrained (becoming too accurate on the training data), the validation-set error starts rising. If overtrained, the network will not be able to handle noisy data so well.
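One common way to act on this is early stopping. A sketch, where train_epoch and validation_error stand in for whatever training and evaluation routines are used (assumed helpers, not from the slides):

```python
def train_with_early_stopping(weights, train_epoch, validation_error,
                              max_epochs=1000, patience=10):
    """Keep the weights from the epoch with the lowest validation error.
    train_epoch and validation_error are caller-supplied helpers."""
    best_error, best_weights, since_best = float("inf"), weights, 0
    for _ in range(max_epochs):
        weights = train_epoch(weights)
        err = validation_error(weights)
        if err < best_error:
            best_error, best_weights, since_best = err, weights, 0
        else:
            since_best += 1
            if since_best >= patience:  # validation error has started rising
                break
    return best_weights
```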

Problems with Backpropagation
– It can get stuck in local minima, because the algorithm always changes the weights in whatever direction causes the error to fall.
– One solution is to start again with different random weights and retrain.
– Another solution is to add momentum to the weight change, so that the weight change of an iteration depends partly on the previous change (see the sketch below).
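A sketch of the momentum idea (the coefficient name mu and its default value are assumptions; the slides do not give one):

```python
def momentum_update(weight, grad_step, prev_change, mu=0.9):
    """Blend this iteration's gradient step with the previous weight change."""
    change = grad_step + mu * prev_change   # current step plus momentum term
    return weight + change, change          # new weight, change to carry forward
```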

Network Size
– The most common arrangement is one input layer, one hidden layer, and one output layer; the input and output sizes depend on the problem.
– Suppose we want to recognize characters drawn on a 5x7 grid (35 inputs), with 26 such characters (26 outputs).
– For the number of hidden units and layers there is no hard and fast rule; for the problem above, 6 to 22 hidden units work fine.
– With 'traditional' backpropagation, a deep network gets stuck in local minima and does not learn well.

Strengths and Weaknesses of BP
– It can recognize patterns of the kind provided in the training examples (usually better than humans can).
– It cannot handle noisy data well, such as a face in a crowd; in that case data preprocessing is necessary.

References Chapter 11 of “AI Illuminated” by Ben Coppin. PDF provided in class