2101INT – Principles of Intelligent Systems Lecture 10

Neural Networks
Since we are interested in creating artificial intelligence in systems, it is reasonable to attempt to mimic the human brain. The concept of an artificial neuron has been around since at least 1943. This type of field is usually described as artificial neural networks (ANNs), connectionism, parallel distributed processing or neural computation. The field is also of interest to cognitive psychologists, who seek to better understand the human brain.

Neurons
A neuron is a cell in the brain that collects, processes and disseminates electrical signals. On their own, neurons are not particularly complex. Much of the brain's information-processing capacity is thought to stem from the number of neurons and the interrelationships between them: the capacity is an emergent property of the network, since no single neuron has the power of the whole. The human brain contains about 10^11 neurons, each on average connected to about 10,000 others.

Neurons
Signals in a brain are noisy "spike trains" of electrical energy.

Neurons

The axon endings almost touch the dendrites or cell body of the next neuron; this junction is termed a synapse. Electrical signals are transferred with the aid of neurotransmitters: chemicals which are released from one neuron and which bind to another. Signal transmission depends on:
– the quantity of neurotransmitter
– the number and arrangement of receptors
– neurotransmitter re-absorption
– etc.

Carbon vs Silicon
– Elements: 10^14 synapses vs 10^8 transistors
– Size: 10^-6 m vs 10^-6 m
– Energy: 30 W vs 30 W (CPU)
– Speed: 100 Hz vs 10^9 Hz
– Architecture: parallel/distributed vs serial/centralised
– Fault tolerant: yes vs a little
– Learns: yes vs maybe
– Intelligent: usually vs not yet

Mathematical Model of a Neuron
McCulloch and Pitts (1943). (R&N p. 737)

Mathematical Model of a Neuron
ANNs are composed of many units. A link from unit j to unit i propagates the activation of unit j, a_j, to unit i. Each link also has a weight W_{j,i}, which determines the strength and sign of the link. Each unit computes the weighted sum of its inputs:

in_i = \sum_j W_{j,i} \, a_j

and then applies an activation function g to derive the output:

a_i = g(in_i)

A bias weight W_{0,i} is also present (in the textbook's convention, attached to a fixed input a_0 = -1). (R&N p. 738)

Activation Functions
The activation function should produce a "high" output (say 1) when the correct inputs are given and a "low" output (say 0) otherwise. The function also needs to be non-linear (not of the form y = mx + c); otherwise the network as a whole would be a simple linear function, which is not particularly powerful. Two common choices are the threshold (unit step) function and the sigmoid function, g(x) = 1/(1 + e^{-x}). (R&N p. 738)
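As a concrete illustration, here is a minimal Python sketch of a single unit with both activation functions. The names (threshold, sigmoid, unit_output) are mine, not from the lecture:

```python
import math

def threshold(x):
    """Unit step: output 1 ('high') when the weighted sum exceeds 0, else 0 ('low')."""
    return 1 if x > 0 else 0

def sigmoid(x):
    """Smooth, differentiable alternative to the hard threshold: 1/(1+e^-x)."""
    return 1.0 / (1.0 + math.exp(-x))

def unit_output(weights, inputs, g=threshold):
    """a_i = g(in_i), where in_i = sum_j W_ji * a_j.
    The bias is handled as a fixed extra input (here inputs[0] = -1)."""
    in_i = sum(w_j * a_j for w_j, a_j in zip(weights, inputs))
    return g(in_i)
```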

Neurons as Logic Gates
A single neuron can implement the three most basic Boolean logic functions, AND, OR and NOT (and also NAND and NOR). A single unit cannot represent XOR. (R&N p. 738)
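The weight choices below are one standard way of realising these gates with a threshold unit and a fixed bias input a_0 = -1; the particular values are illustrative rather than taken from the slides:

```python
def step_unit(weights, inputs):
    """Threshold unit: fires (1) iff the weighted sum of its inputs is positive."""
    return 1 if sum(w * a for w, a in zip(weights, inputs)) > 0 else 0

# The first weight multiplies the fixed bias input a0 = -1, so it acts as a threshold.
AND_W = [1.5, 1.0, 1.0]   # fires iff x1 + x2 > 1.5, i.e. both inputs are 1
OR_W  = [0.5, 1.0, 1.0]   # fires iff x1 + x2 > 0.5, i.e. at least one input is 1
NOT_W = [-0.5, -1.0]      # fires iff x < 0.5, i.e. the input is 0

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2,
              "AND:", step_unit(AND_W, [-1, x1, x2]),
              "OR:",  step_unit(OR_W,  [-1, x1, x2]))
```

No single choice of weights makes such a unit compute XOR, which is the point of the linear-separability discussion below.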

Usefulness of ANNs
So neurons and ANNs can be designed to exhibit particular behaviours. More often, though, they are used to learn to recognise or classify particular patterns. Rosenblatt's work (1958) explicitly considered this problem in the setting where a teacher provides advice to the ANN; from this we derive the terms supervised classification and supervised learning. It was Rosenblatt who introduced perceptrons: ANNs that change their link weights when they make incorrect decisions/classifications.

Networks of Neurons
Two main categories: feed-forward networks and recurrent networks.
Feed-forward networks are acyclic: all links feed forward through the network. A feed-forward network is simply a function of its current input; it has no internal state.
Recurrent networks are cyclic: links can feed back into the network. The activation levels of the network therefore form a dynamic system, which can exhibit stable, oscillatory or even chaotic behaviour. A recurrent network's response depends on its initial state, which in turn depends on prior inputs. (R&N p. 738)
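The distinction can be seen in miniature below; a hypothetical single-unit sketch of my own, not from the slides:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def feedforward_step(x, w_in):
    """A feed-forward unit's output is a pure function of the current input."""
    return sigmoid(w_in * x)

def recurrent_step(x, prev_activation, w_in, w_rec):
    """A recurrent unit also feeds its previous activation back in, so its
    response depends on the history of inputs (i.e. it has internal state)."""
    return sigmoid(w_in * x + w_rec * prev_activation)
```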

A Single-Layer Perceptron
A network with all inputs connected directly to the outputs is called a single-layer neural network or a perceptron network. (R&N p. 740)

What can single-layer networks do?
We have already seen that they can implement simple Boolean logic functions. They can also represent other, more complex functions, like the majority function, which returns true iff more than half of its inputs are true; a decision-tree representation of this function would require O(2^n) nodes. So why can't we represent XOR?
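A single threshold unit handles majority: weight 1 on every input and a threshold of n/2. A quick sketch (the naming is mine):

```python
def step_unit(weights, inputs):
    return 1 if sum(w * a for w, a in zip(weights, inputs)) > 0 else 0

def majority_weights(n):
    """Bias weight n/2 against a fixed input of -1, weight 1 on each real input:
    the unit fires iff strictly more than half of the n inputs are 1."""
    return [n / 2.0] + [1.0] * n

w = majority_weights(5)
print(step_unit(w, [-1, 1, 1, 1, 0, 0]))  # 3 of 5 set -> 1
print(step_unit(w, [-1, 1, 1, 0, 0, 0]))  # 2 of 5 set -> 0
```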

Linear Separability
The output of a threshold perceptron can be described as:

\sum_j W_j x_j > 0, or in vector form: W \cdot x > 0

This equation defines a hyperplane in the input space; for the AND unit above, for example, the decision boundary is the line x_1 + x_2 = 1.5. (R&N p. 740)

Linear Separability cont.
Functions whose inputs can be divided by such a hyperplane are termed linearly separable. In general, threshold perceptrons can represent only linearly separable functions. There are nevertheless many problems for which such networks are entirely adequate. (R&N p. 741)

How does the brain learn?
Brains learn by altering the strength of connections between neurons. They learn online, often without the benefit of explicit training examples.
Hebb's Postulate: "When an axon of cell A... excite[s] cell B and repeatedly or persistently takes part in firing it, some growth process or metabolic change takes place in one or both cells so that A's efficiency as one of the cells firing B is increased."

Learning
What is learning? Rosenblatt (1958) provided a learning scheme with the property that "if the patterns of the training set can be separated by some choice of weights and threshold, then the scheme will eventually yield a satisfactory setting of the weights."
1. Pick a "representative set of patterns" – the training set
2. Expose the network to this set, adjusting the synaptic weights using a learning rule – such as one that minimises the error (R&N p. 741)

Learning
Minimise the error, in this case the sum of squared errors (SSE). For a single training example with true output y, the error is:

E = \frac{1}{2} Err^2 = \frac{1}{2} \left(y - g\left(\sum_j W_j x_j\right)\right)^2

Use gradient descent to minimise the error with respect to each particular link weight, using the chain rule. (R&N p. 741)

Learning cont.
Using the chain rule:

\frac{\partial E}{\partial W_j} = Err \cdot \frac{\partial Err}{\partial W_j}

The first term, \partial E / \partial Err, reduces to Err. The second term:

\frac{\partial Err}{\partial W_j} = \frac{\partial}{\partial W_j}\left(y - g\left(\sum_j W_j x_j\right)\right) = -g'(in) \cdot x_j

(R&N p. 741)

Learning cont.
Combining these:

\frac{\partial E}{\partial W_j} = -Err \cdot g'(in) \cdot x_j

Definitions of terms:
– g'() is the derivative of the activation function
– in is the weighted sum of inputs, \sum_j W_j x_j
(R&N p. 741)

Updating weights
Having calculated the impact of each weight on the overall error, we can now adjust each W_j accordingly:

W_j \leftarrow W_j + \alpha \cdot Err \cdot g'(in) \cdot x_j

Note that the minus sign has been dropped from the previous equation: a positive error requires increased output. α is called the learning rate. The network is shown each training example and the weights are updated; exposure to a complete set of training examples is termed an epoch. The process is repeated until convergence occurs. (R&N p. 742)
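Putting the update rule together with a sigmoid unit gives a complete training loop. This is a minimal sketch under the conventions above (bias as a fixed -1 input; all names mine), shown learning OR, which is linearly separable:

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train(examples, n_inputs, alpha=0.5, epochs=5000):
    """Gradient-descent training of a single sigmoid unit.
    examples: list of (inputs, y) pairs; the bias input (-1) is added here."""
    w = [random.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
    for _ in range(epochs):                       # one pass over the set = one epoch
        for x, y in examples:
            a = [-1.0] + list(x)                  # prepend the fixed bias input
            in_ = sum(wj * aj for wj, aj in zip(w, a))
            out = sigmoid(in_)
            err = y - out                         # Err = y - g(in)
            gprime = out * (1.0 - out)            # sigmoid' expressed via its output
            for j in range(len(w)):               # W_j <- W_j + alpha*Err*g'(in)*x_j
                w[j] += alpha * err * gprime * a[j]
    return w

OR_EXAMPLES = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
w = train(OR_EXAMPLES, n_inputs=2)
print([round(sigmoid(sum(wj * aj for wj, aj in zip(w, [-1.0] + list(x)))))
       for x, _ in OR_EXAMPLES])                  # expect [0, 1, 1, 1]
```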

Learning Examples (R&N p. 743)

Learning Examples
Graph shows total error on a training set of 100 examples. (R&N p. 748)

Multilayer Feed-Forward Networks
We will now consider networks where the inputs do not connect directly to the outputs, introducing some intervening units between input and output which are termed hidden units. Why? (R&N p. 744)

Effect of Hidden Units
We call these networks multilayer perceptron networks. Hidden layers remove the restriction to linearly separable functions. Using a sigmoid activation function, two hidden units can represent a ridge, four a bump, and more than four, additional bumps, etc. (R&N p. 744)

Weight Learning in MLPs
A similar procedure is used as for a single layer, except that the error must be propagated back through the hidden layers. This gives rise to back-propagation learning. The equations for back-propagation learning are derived in the text. (R&N pp. 745ff)
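A compact sketch of the idea in numpy: a two-input, two-hidden-unit, one-output network trained on XOR (which a single layer cannot represent). The output error is attributed back to the hidden units through the output weights. This is a generic illustration rather than the textbook's exact derivation; with an unlucky random initialisation such a small network can stall in a local minimum:

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)        # XOR targets

W1, b1 = rng.uniform(-1, 1, (2, 2)), np.zeros((1, 2))  # input -> hidden
W2, b2 = rng.uniform(-1, 1, (2, 1)), np.zeros((1, 1))  # hidden -> output
alpha = 0.5

for _ in range(20000):
    # forward pass
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # backward pass: Err * g'(in) at the output, then propagated to hidden units
    d_out = (y - out) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    # gradient-descent updates, same sign convention as the single-layer rule
    W2 += alpha * h.T @ d_out
    b2 += alpha * d_out.sum(axis=0, keepdims=True)
    W1 += alpha * X.T @ d_h
    b1 += alpha * d_h.sum(axis=0, keepdims=True)

print(np.round(out.ravel(), 2))   # should approach [0, 1, 1, 0]
```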

Performance Comparison: MLP vs Single-Layer Network

Learning in ANNs

Readings for Week
Russell and Norvig, Section 20.5 (25pp)