Artificial Neural Networks

Slides from: Doug Gray, David Poole

Presentation transcript:

Artificial Neural Networks
- Threshold units
- Gradient descent
- Multilayer networks
- Backpropagation
- Hidden layer representations
- Example: Face recognition
- Advanced topics
CS 5751 Machine Learning Chapter 4 Artificial Neural Networks

Connectionist Models
Consider humans:
- Neuron switching time: ~0.001 second
- Number of neurons: ~10^10
- Connections per neuron: ~10^4 to 10^5
- Scene recognition time: ~0.1 second
- 100 inference steps do not seem like enough, so the brain must use lots of parallel computation
Properties of artificial neural nets (ANNs):
- Many neuron-like threshold switching units
- Many weighted interconnections among units
- Highly parallel, distributed process
- Emphasis on tuning weights automatically

When to Consider Neural Networks
- Input is high-dimensional, discrete or real-valued (e.g., raw sensor input)
- Output is discrete or real-valued
- Output is a vector of values
- Possibly noisy data
- Form of target function is unknown
- Human readability of result is unimportant
Examples:
- Speech phoneme recognition [Waibel]
- Image classification [Kanade, Baluja, Rowley]
- Financial prediction

ALVINN drives 70 mph on highways

Perceptron

Decision Surface of Perceptron
- Represents some useful functions
- What weights represent g(x1, x2) = AND(x1, x2)?
- But some functions are not representable, e.g., those that are not linearly separable
- Therefore, we will want networks of these units
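One possible answer to the AND question, as a minimal sketch. The weight values below are an assumption chosen for illustration (many other choices work), and the unit is assumed to output +1 or -1 with inputs in {0, 1}.

```python
def perceptron(weights, x):
    """Threshold unit: weights[0] is the bias weight w0; output is +1 or -1."""
    total = weights[0] + sum(w * xi for w, xi in zip(weights[1:], x))
    return 1 if total > 0 else -1

# Hand-picked weights (illustrative assumption): fire only when x1 + x2 > 1.5.
AND_WEIGHTS = [-1.5, 1.0, 1.0]

and_outputs = {(a, b): perceptron(AND_WEIGHTS, (a, b))
               for a in (0, 1) for b in (0, 1)}
print(and_outputs)
```

The decision surface here is the line x1 + x2 = 1.5. No single line separates the positive and negative cases of XOR, which is why a single unit cannot represent it.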

Perceptron Training Rule
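The slide's equations were not captured in this transcript. The sketch below assumes the standard form of the rule, w_i <- w_i + eta * (t - o) * x_i, with outputs in {+1, -1}. The function names, the OR data set, and the stopping criterion are illustrative assumptions.

```python
def predict(w, x):
    """Threshold unit with bias weight w[0]; returns +1 or -1."""
    s = w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))
    return 1 if s > 0 else -1

def train_perceptron(examples, eta=0.1, max_epochs=100):
    """Apply w_i <- w_i + eta * (t - o) * x_i until an epoch has no mistakes."""
    n_inputs = len(examples[0][0])
    w = [0.0] * (n_inputs + 1)
    for _ in range(max_epochs):
        mistakes = 0
        for x, t in examples:
            o = predict(w, x)
            if o != t:
                mistakes += 1
                w[0] += eta * (t - o)          # bias input is fixed at 1
                for i, xi in enumerate(x):
                    w[i + 1] += eta * (t - o) * xi
        if mistakes == 0:
            break
    return w

# OR is linearly separable, so the rule is expected to converge on it.
OR_DATA = [((0, 0), -1), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
w_or = train_perceptron(OR_DATA)
```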

Gradient Descent

Summary
Perceptron training rule is guaranteed to succeed if:
- Training examples are linearly separable
- Learning rate η is sufficiently small
Linear unit training rule uses gradient descent:
- Guaranteed to converge to the hypothesis with minimum squared error
- Given a sufficiently small learning rate η
- Even when training data contains noise
- Even when training data is not separable by H
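A minimal sketch of gradient descent for a single linear unit, assuming the squared error E = 1/2 * sum_d (t_d - o_d)^2 and the batch update w_i <- w_i + eta * sum_d (t_d - o_d) * x_id. The data set, learning rate, and epoch count are made up for illustration.

```python
def train_linear_unit(examples, eta=0.05, epochs=2000):
    """Batch gradient descent on squared error for o = w0 + sum_i w_i * x_i."""
    n_inputs = len(examples[0][0])
    w = [0.0] * (n_inputs + 1)
    for _ in range(epochs):
        delta = [0.0] * (n_inputs + 1)
        for x, t in examples:
            o = w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))
            delta[0] += eta * (t - o)              # bias input is 1
            for i, xi in enumerate(x):
                delta[i + 1] += eta * (t - o) * xi
        # Weights change only after the whole training set has been seen.
        w = [wi + di for wi, di in zip(w, delta)]
    return w

# Toy data generated from t = 1 + 2x (an assumption for this example).
DATA = [((0.0,), 1.0), ((1.0,), 3.0), ((2.0,), 5.0), ((3.0,), 7.0)]
w_fit = train_linear_unit(DATA)
```

With a learning rate that is too large for the data the same loop diverges, which is the reason for the "sufficiently small η" condition.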

Incremental (Stochastic) Gradient Descent
Batch mode gradient descent:
- Do until satisfied:
Incremental mode gradient descent:
- Do until satisfied:
  - For each training example d in D
Incremental gradient descent can approximate batch gradient descent arbitrarily closely if η is made small enough
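The update steps inside each loop were lost from the transcript. The sketch below assumes the usual reading for a linear unit: batch mode sums the gradient over all of D before changing the weights, while incremental mode changes them after every example. All names and the toy data are illustrative assumptions.

```python
def output(w, x):
    return w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))

def batch_epoch(w, data, eta):
    """One batch step: accumulate the gradient over all examples, then update."""
    grad = [0.0] * len(w)
    for x, t in data:
        err = t - output(w, x)
        grad[0] += err
        for i, xi in enumerate(x):
            grad[i + 1] += err * xi
    return [wi + eta * gi for wi, gi in zip(w, grad)]

def incremental_epoch(w, data, eta):
    """One incremental pass: update the weights after each example."""
    w = list(w)
    for x, t in data:
        err = t - output(w, x)
        w[0] += eta * err
        for i, xi in enumerate(x):
            w[i + 1] += eta * err * xi
    return w

def one_epoch_gap(eta):
    """Largest weight difference between the two modes after one pass from zero."""
    start = [0.0, 0.0]
    b = batch_epoch(start, DATA, eta)
    s = incremental_epoch(start, DATA, eta)
    return max(abs(bi - si) for bi, si in zip(b, s))

DATA = [((0.0,), 1.0), ((1.0,), 3.0), ((2.0,), 5.0), ((3.0,), 7.0)]  # t = 1 + 2x
w_batch = w_inc = [0.0, 0.0]
for _ in range(2000):
    w_batch = batch_epoch(w_batch, DATA, eta=0.01)
    w_inc = incremental_epoch(w_inc, DATA, eta=0.01)
```

On this toy problem both modes end near the same weights, and `one_epoch_gap` shrinks as η is reduced, which is the sense in which the incremental mode approximates the batch mode.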

Multilayer Networks of Sigmoid Units

Multilayer Decision Space

Sigmoid Unit

The Sigmoid Function
- Sort of a rounded step function
- Unlike the step function, it has a derivative everywhere (which makes gradient-based learning possible)
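A small sketch of the point above, assuming the logistic form sigma(x) = 1 / (1 + e^(-x)), whose derivative is sigma(x) * (1 - sigma(x)). The step function is included only for contrast.

```python
import math

def sigmoid(x):
    """Logistic function: a smooth, rounded version of a 0/1 step."""
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_derivative(x):
    """d sigma / dx = sigma(x) * (1 - sigma(x)), defined for every x."""
    s = sigmoid(x)
    return s * (1.0 - s)

def step(x):
    """Hard threshold: slope is zero away from 0 and undefined at 0."""
    return 1.0 if x > 0 else 0.0

for x in (-4.0, -1.0, 0.0, 1.0, 4.0):
    print(x, step(x), round(sigmoid(x), 4), round(sigmoid_derivative(x), 4))
```

This sketch does not guard against overflow in `math.exp` for very large negative inputs.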

Error Gradient for a Sigmoid Unit
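The derivation on this slide did not survive in the transcript. A sketch of the standard chain-rule argument follows, assuming squared error over the training set D and a single sigmoid unit. The symbols net_d, t_d, o_d and x_{i,d} are notation assumed here.

```latex
\begin{align*}
E &= \tfrac{1}{2}\sum_{d \in D} (t_d - o_d)^2,
  \qquad o_d = \sigma(net_d),
  \qquad net_d = \sum_i w_i \, x_{i,d} \\
\frac{\partial E}{\partial w_i}
  &= \sum_{d \in D} (t_d - o_d)\,\frac{\partial}{\partial w_i}(t_d - o_d)
   = -\sum_{d \in D} (t_d - o_d)\,
      \frac{\partial o_d}{\partial net_d}\,
      \frac{\partial net_d}{\partial w_i} \\
\frac{\partial o_d}{\partial net_d} &= o_d\,(1 - o_d),
  \qquad \frac{\partial net_d}{\partial w_i} = x_{i,d} \\
\frac{\partial E}{\partial w_i}
  &= -\sum_{d \in D} (t_d - o_d)\; o_d\,(1 - o_d)\; x_{i,d}
\end{align*}
```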

Backpropagation Algorithm
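The algorithm listing itself is missing from the transcript. The following is a minimal sketch, assuming one hidden layer of sigmoid units, incremental updates, and the usual error terms delta_k = o_k (1 - o_k)(t_k - o_k) for output units and delta_h = o_h (1 - o_h) * sum_k w_kh delta_k for hidden units. The network layout, XOR data, seed, and hyperparameters are all illustrative assumptions.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def init_network(n_in, n_hidden, n_out, rng):
    """Small random weights; each unit's list starts with its bias weight."""
    def layer(n_units, n_inputs):
        return [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
                for _ in range(n_units)]
    return {"hidden": layer(n_hidden, n_in), "output": layer(n_out, n_hidden)}

def forward(net, x):
    h = [sigmoid(w[0] + sum(wi * xi for wi, xi in zip(w[1:], x)))
         for w in net["hidden"]]
    o = [sigmoid(w[0] + sum(wi * hi for wi, hi in zip(w[1:], h)))
         for w in net["output"]]
    return h, o

def backprop_step(net, x, t, eta):
    """One incremental update for a single training example (x, t)."""
    h, o = forward(net, x)
    # Error terms for output units.
    delta_out = [ok * (1 - ok) * (tk - ok) for ok, tk in zip(o, t)]
    # Error terms for hidden units, using the output weights before they change.
    delta_hid = [
        hj * (1 - hj) * sum(net["output"][k][j + 1] * delta_out[k]
                            for k in range(len(o)))
        for j, hj in enumerate(h)
    ]
    # Weight updates: w_ji <- w_ji + eta * delta_j * x_ji (bias input is 1).
    for k, w in enumerate(net["output"]):
        w[0] += eta * delta_out[k]
        for j, hj in enumerate(h):
            w[j + 1] += eta * delta_out[k] * hj
    for j, w in enumerate(net["hidden"]):
        w[0] += eta * delta_hid[j]
        for i, xi in enumerate(x):
            w[i + 1] += eta * delta_hid[j] * xi

def total_error(net, data):
    """Sum of squared errors over the data set, with the conventional 1/2."""
    return 0.5 * sum((tk - ok) ** 2
                     for x, t in data
                     for tk, ok in zip(t, forward(net, x)[1]))

XOR = [((0, 0), (0,)), ((0, 1), (1,)), ((1, 0), (1,)), ((1, 1), (0,))]
net = init_network(2, 3, 1, random.Random(0))
error_before = total_error(net, XOR)
for _ in range(5000):
    for x, t in XOR:
        backprop_step(net, x, t, eta=0.5)
error_after = total_error(net, XOR)
print(error_before, error_after)
```

Because gradient descent finds a local minimum that depends on the initial weights, the final error can differ from seed to seed.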

More on Backpropagation
- Gradient descent over the entire network weight vector
- Easily generalized to arbitrary directed graphs
- Will find a local, not necessarily global, error minimum
  - In practice, often works well (can run multiple times)
- Often include weight momentum α
- Minimizes error over training examples
  - Will it generalize well to subsequent examples?
- Training can take thousands of iterations (slow!)
- Using the network after training is fast
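A sketch of what a momentum term does, assuming the common form in which each weight change is the current gradient step plus α times the previous change. The one-dimensional quadratic error surface and the parameter values are made up for illustration.

```python
def descend(grad, w0, eta, alpha, steps):
    """Gradient descent on one weight; alpha is the momentum coefficient."""
    w, prev_delta = w0, 0.0
    for _ in range(steps):
        # Current gradient step plus a fraction of the previous weight change.
        delta = -eta * grad(w) + alpha * prev_delta
        w += delta
        prev_delta = delta
    return w

def grad_E(w):
    """Gradient of the toy error E(w) = 0.5 * (w - 3)^2 (minimum at w = 3)."""
    return w - 3.0

w_plain = descend(grad_E, w0=0.0, eta=0.01, alpha=0.0, steps=100)
w_momentum = descend(grad_E, w0=0.0, eta=0.01, alpha=0.9, steps=100)
print(w_plain, w_momentum)
```

On this surface the run with momentum ends closer to the minimum after the same number of steps. With α too close to 1 or η too large, the weight can overshoot and oscillate.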

Learning Hidden Layer Representations

Output Unit Error during Training

Hidden Unit Encoding

Input to Hidden Weights

Convergence of Backpropagation
Gradient descent to some local minimum:
- Perhaps not the global minimum
- Momentum can cause quicker convergence
- Stochastic gradient descent also results in faster convergence
- Can train multiple networks and get different results (using different initial weights)
Nature of convergence:
- Initialize weights near zero
- Therefore, initial networks are near-linear
- Increasingly non-linear functions as training progresses

Expressive Capabilities of ANNs
Boolean functions:
- Every Boolean function can be represented by a network with a single hidden layer
- But that might require a number of hidden units exponential in the number of inputs
Continuous functions:
- Every bounded continuous function can be approximated with arbitrarily small error by a network with one hidden layer [Cybenko 1989; Hornik et al. 1989]
- Any function can be approximated to arbitrary accuracy by a network with two hidden layers [Cybenko 1988]
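As one concrete instance of the Boolean claim, XOR (which no single threshold unit can represent) can be written as a network with one hidden layer. The weights below are hand-picked for illustration, not learned, and the units are assumed to output 0 or 1.

```python
def threshold_unit(weights, inputs):
    """Outputs 1 if bias + weighted sum is positive, else 0."""
    total = weights[0] + sum(w * x for w, x in zip(weights[1:], inputs))
    return 1 if total > 0 else 0

def xor_network(x1, x2):
    h_or = threshold_unit([-0.5, 1, 1], (x1, x2))      # 1 if x1 OR x2
    h_nand = threshold_unit([1.5, -1, -1], (x1, x2))   # 0 only if both are 1
    return threshold_unit([-1.5, 1, 1], (h_or, h_nand))  # AND of hidden units

xor_table = {(a, b): xor_network(a, b) for a in (0, 1) for b in (0, 1)}
print(xor_table)
```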

Overfitting in ANNs

Neural Nets for Face Recognition
- 90% accurate learning head pose, and recognizing 1-of-20 faces

Learned Network Weights

Alternative Error Functions

Recurrent Networks

Neural Network Summary
- physiologically (neurons) inspired model
- powerful (accurate), slow, opaque (hard to understand resulting model)
- bias: preferential
  - based on gradient descent
  - finds local minimum
  - affected by initial conditions, parameters
- neural units
  - linear
  - linear threshold
  - sigmoid

Neural Network Summary (cont.)
- gradient descent
  - convergence
- linear units
  - limitation: hyperplane decision surface
  - learning rule
- multilayer network
  - advantage: can have non-linear decision surface
  - backpropagation to learn
    - backprop learning rule
- learning issues
  - units used

Neural Network Summary (cont.)
- learning issues (cont.)
  - batch versus incremental (stochastic)
  - parameters
    - initial weights
    - learning rate
    - momentum
  - cost (error) function
    - sum of squared errors
    - can include penalty terms
- recurrent networks
  - simple
  - backpropagation through time