CPSC 322 Introduction to Artificial Intelligence December 3, 2004.

Things... Slides for the last couple of weeks will be up this weekend. Did I mention that you have a final exam at noon on Friday, December 10, in MCML 166? Check the announcements part of the web page occasionally between now and then in case anything important comes up.

This should be fun Artificial Intelligence and Interactive Digital Entertainment First Annual Conference June 1-3, 2005 in Marina Del Rey, California

Learning Definition: learning is the adaptive changes that occur in a system which enable that system to perform the same task or similar tasks more efficiently or more effectively over time. This could mean: The range of behaviors is expanded: the agent can do more The accuracy on tasks is improved: the agent can do things better The speed is improved: the agent can do things faster

Learning is about choosing the best representation That’s certainly true in a logic-based AI world. Our arch learner started with some internal representation of an arch. As examples were presented, the arch learner modified its internal representation to either make the representation accommodate positive examples (generalization) or exclude negative examples (specialization). There’s really nothing else the learner could modify... the reasoning system is what it is. So any learning problem can be mapped onto one of choosing the best representation...

Learning is about search...but wait, there’s more! By now, you’ve figured out that the arch learner was doing nothing more than searching the space of possible representations, right? So learning, like everything else, boils down to search. If that wasn’t obvious, you probably will want to do a little extra preparation for the final exam....

Same problem - different representation The arch learner could have represented the arch concept as a decision tree if we wanted

Same problem - different representation The arch learner could have represented the arch concept as a decision tree if we wanted [tree so far: a single node labeled arch]

Same problem - different representation The arch learner could have represented the arch concept as a decision tree if we wanted [tree: do upright blocks support sideways block? no → not arch; yes → arch]

Same problem - different representation The arch learner could have represented the arch concept as a decision tree if we wanted [tree: do upright blocks support sideways block? no → not arch; yes → do upright blocks touch each other? yes → not arch; no → arch]

Same problem - different representation The arch learner could have represented the arch concept as a decision tree if we wanted [tree: do upright blocks support sideways block? no → not arch; yes → do upright blocks touch each other? yes → not arch; no → is the top block either a rectangle or a wedge? no → not arch; yes → arch]
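The finished tree can be written as nested ifs. This Python sketch is not from the slides (the course code was Scheme); the function and argument names are invented for illustration:

```python
# A sketch of the final decision tree as nested if-statements.
# Names are invented for illustration; the slides give only the questions.
def classify(supports, uprights_touch, top_is_rect_or_wedge):
    """Classify one block arrangement as 'arch' or 'not arch'."""
    if not supports:              # do upright blocks support the sideways block?
        return "not arch"
    if uprights_touch:            # do the upright blocks touch each other?
        return "not arch"
    if top_is_rect_or_wedge:      # is the top block a rectangle or a wedge?
        return "arch"
    return "not arch"
```

Each question becomes one branch point, and each leaf is a classification, exactly mirroring the tree above.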

Other issues with learning by example The learning process requires that there is someone to say which examples are positive and which are negative. This approach must start with a positive example to specialize or generalize from. Learning by example is sensitive to the order in which examples are presented. Learning by example doesn’t work well with noisy, randomly erroneous data.

Reinforcement learning Learning operator sequences based on reward or punishment Let’s say you want your robot to learn how to vacuum the living room iRobot Roomba 4210 Discovery Floorvac Robotic Vacuum $ It’s the ideal Christmas gift!

Reinforcement learning Learning operator sequences based on reward or punishment Let's say you want your robot to learn how to vacuum the living room: goto livingroom, vacuum floor, goto trashcan, empty bag. Good Robot!

Reinforcement learning Learning operator sequences based on reward or punishment Let's say you want your robot to learn how to vacuum the living room. Good Robot: goto livingroom, vacuum floor, goto trashcan, empty bag. Bad Robot: the same actions in a scrambled order.

Reinforcement learning Learning operator sequences based on reward or punishment Let's say you want your robot to learn how to vacuum the living room. Good Robot: goto livingroom, vacuum floor, goto trashcan, empty bag. Bad Robot: the same actions in a scrambled order. (actually, there are no bad robots, there is only bad behavior... and Roomba can't really empty its own bag)

Reinforcement learning Should the robot learn from success? How does it figure out which part of the sequence of actions is right (credit assignment problem)? Good Robot: goto livingroom, vacuum floor, goto trashcan, empty bag.

Reinforcement learning Should the robot learn from failure? How does it figure out which part of the sequence of actions is wrong (blame assignment problem)? Bad Robot: the same actions in a scrambled order.

Reinforcement learning However you answer those questions, the learning task again boils down to the search for the best representation. Here’s another type of learning that searches for the best representation, but the representation is very different from what you’ve seen so far.

Learning in neural networks The perceptron is one of the earliest neural network models, dating to the late 1950s. [diagram: inputs x1 … xn, each with a weight w1 … wn, feeding a summation unit Σ followed by a threshold]

Learning in neural networks The perceptron can’t compute everything, but what it can compute it can learn to compute. Here’s how it works. Inputs are 1 or 0. Weights are reals (-n to +n). Each input is multiplied by its corresponding weight. If the sum of the products is greater than the threshold, then the perceptron outputs 1, otherwise the output is 0. x1x1 x2x2 x3x3 xnxn  w1w1 w2w2 w3w3 wnwn inputs weights sum threshold

Learning in neural networks The perceptron can’t compute everything, but what it can compute it can learn to compute. Here’s how it works. The output, 1 or 0, is a guess or prediction about the input: does it fall into the desired classification (output = 1) or not (output = 0)? x1x1 x2x2 x3x3 xnxn  w1w1 w2w2 w3w3 wnwn inputs weights sum threshold

Learning in neural networks That’s it? Big deal. No, there’s more to it.... x1x1 x2x2 x3x3 xnxn  w1w1 w2w2 w3w3 wnwn inputs weights sum threshold

Learning in neural networks That’s it? Big deal. No, there’s more to it.... Say you wanted your perceptron to classify arches. x1x1 x2x2 x3x3 xnxn  w1w1 w2w2 w3w3 wnwn inputs weights sum threshold

Learning in neural networks That’s it? Big deal. No, there’s more to it.... Say you wanted your perceptron to classify arches. That is, you present inputs representing an arch, and the output should be 1. x1x1 x2x2 x3x3 xnxn  w1w1 w2w2 w3w3 wnwn inputs weights sum threshold

Learning in neural networks That’s it? Big deal. No, there’s more to it.... Say you wanted your perceptron to classify arches. That is, you present inputs representing an arch, and the output should be 1. You present inputs not representing an arch, and the output should be 0. x1x1 x2x2 x3x3 xnxn  w1w1 w2w2 w3w3 wnwn inputs weights sum threshold

Learning in neural networks That’s it? Big deal. No, there’s more to it.... Say you wanted your perceptron to classify arches. That is, you present inputs representing an arch, and the output should be 1. You present inputs not representing an arch, and the output should be 0. If your perceptron does that correctly for all inputs, it knows the concept of arch. x1x1 x2x2 x3x3 xnxn  w1w1 w2w2 w3w3 wnwn inputs weights sum threshold

Learning in neural networks But what if you present inputs for an arch, and your perceptron outputs a 0???

Learning in neural networks But what if you present inputs for an arch, and your perceptron outputs a 0??? What could be done to make it more likely that the output will be 1 the next time the 'tron sees those same inputs?

Learning in neural networks But what if you present inputs for an arch, and your perceptron outputs a 0??? What could be done to make it more likely that the output will be 1 the next time the 'tron sees those same inputs? You increase the weights. Which ones? How much?

Learning in neural networks But what if you present inputs for not an arch, and your perceptron outputs a 1?

Learning in neural networks But what if you present inputs for not an arch, and your perceptron outputs a 1? What could be done to make it more likely that the output will be 0 the next time the 'tron sees those same inputs?

Learning in neural networks But what if you present inputs for not an arch, and your perceptron outputs a 1? What could be done to make it more likely that the output will be 0 the next time the 'tron sees those same inputs? You decrease the weights. Which ones? How much?
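Both cases collapse into one update rule. The slides leave "which ones? how much?" open; this Python sketch assumes the simple answer the deck uses later: adjust only the weights whose input was 1, by a fixed step (the step size 0.1 is an assumption):

```python
def update_weights(weights, inputs, output, target, rate=0.1):
    """Nudge weights toward the desired output; rate is an assumed step size."""
    if output == target:
        return weights                      # right answer: leave weights alone
    delta = rate if target == 1 else -rate  # raise or lower next time's sum
    return [w + delta if x == 1 else w for w, x in zip(weights, inputs)]
```

Weights on inputs that were 0 contributed nothing to the sum, so they are left untouched.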

Let’s make one... First we need to come up with a representation language. We’ll abstract away most everything to make it simple.

Let’s make one... First we need to come up with a representation language. We’ll abstract away most everything to make it simple. All training examples have three blocks. A and B are upright blocks. A is always left of B. C is a sideways block. Our language will assume those things always to be true. The only things our language will represent are the answers to these five questions...

Let’s make one... yes = 1, no = 0 Does A support C? Does B support C? Does A touch C? Does B touch C? Does A touch B?

Let’s make one... yes = 1, no = 0 Does A support C?1 Does B support C?1 Does A touch C?1 Does B touch C?1 Does A touch B?0 AB C arch

Let’s make one... yes = 1, no = 0 Does A support C?1 Does B support C?1 Does A touch C?1 Does B touch C?1 Does A touch B?1 AB C not arch

Let’s make one... yes = 1, no = 0 Does A support C?0 Does B support C?0 Does A touch C?1 Does B touch C?1 Does A touch B?0 AB C not arch

Let’s make one... yes = 1, no = 0 Does A support C?0 Does B support C?0 Does A touch C?1 Does B touch C?1 Does A touch B?0 AB C not arch and so on.....

Our very simple arch learner [diagram: a perceptron with five inputs x1 … x5, five weights, a summation unit, and a threshold]

[diagram: the same perceptron with its initial weights and threshold filled in]

Our very simple arch learner [first example presented as input: an arch]

Our very simple arch learner [arch example] sum = 1*w1 + 1*w2 + 1*w3 + 1*w4 + 0*w5

Our very simple arch learner [arch example] sum = 0, which is not > threshold, so output is 0

Our very simple arch learner [arch example] sum = 0, which is not > threshold, so output is 0. 'tron said no when it should say yes, so increase weights where input = 1

Our very simple arch learner [arch example] sum = 0, which is not > threshold, so output is 0. so we increase the weights where the input is 1

Our very simple arch learner now we look at the next example [a not-arch]

Our very simple arch learner [not-arch example] sum = -0.1, which is not > 0.5, so output is 0

Our very simple arch learner [not-arch example] sum = -0.1, which is not > 0.5, so output is 0. that's the right output for this input, so we don't touch the weights

Our very simple arch learner now we look at the next example [another not-arch]

Our very simple arch learner [not-arch example] sum = 0.7, which is > 0.5, so output = 1

Our very simple arch learner [not-arch example] the 'tron said yes when it should have said no, so we decrease the weights where the inputs = 1

Our very simple arch learner [not-arch example, weights now decreased] the 'tron said yes when it should have said no, so we decrease the weights where the inputs = 1

We could do this for days......but let’s have the computer do it all for us. First, take a look at the training examples we’ve constructed...

Training Set [table: one row per example; columns are "a supports c", "b supports c", "a touches c", "b touches c", "a touches b", and "in class?"; the in-class answers for these five examples are yes, no, no, no, no]

Training Set [table continued: five more examples with the same columns; the in-class answers are no, no, no, no, no]

Now let’s look at the program The program isn’t in CILOG!

What’s going on? The perceptron goes through the training set, making a guess for each example and comparing it to the actual answer. Based on that comparison, the perceptron adjusts weights up, down, or not at all. For some concepts, the process converges on a set of weights such that the perceptron guesses correctly for every example in the training set -- that’s when the weights stop changing.

What’s going on? Another way of looking at it: Each of the possible inputs (2 5 in our case) maps onto a 1 or a 0. The perceptron is trying to find a set of weights such that it can draw a line through the set of all inputs and say “these inputs belong on one side of the line (output = 1) and those belong on the other side (output = 0)”.

What’s going on? one set of weights

What’s going on? another set of weights

What’s going on? still another set of weights

What’s going on? still another set of weights The perceptron looks for linear separability. That is, in the n-space defined by the inputs, it’s looking for a line or plane that divides the inputs.

Observations This perceptron can learn concepts involving "and" and "or". This perceptron can't learn "exclusive or" (try ex3 in the Scheme code). Not linearly separable... it won't converge. Even a network of perceptrons arranged in a single layer can't compute XOR (as well as other things).
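A quick experiment makes the contrast concrete (a Python sketch, not the course's Scheme code; the step size and epoch limit are assumptions): the same training loop that settles on weights for "and" never settles for "exclusive or".

```python
def train(examples, n_inputs, threshold=0.5, rate=0.4, max_epochs=1000):
    """Perceptron training loop: stop when a full pass changes no weight."""
    weights = [0.0] * n_inputs
    for _ in range(max_epochs):
        changed = False
        for inputs, target in examples:
            total = sum(x * w for x, w in zip(inputs, weights))
            output = 1 if total > threshold else 0
            if output != target:
                delta = rate if target == 1 else -rate
                weights = [w + delta if x == 1 else w
                           for w, x in zip(weights, inputs)]
                changed = True
        if not changed:
            return weights      # linearly separable: a dividing line was found
    return None                 # e.g. XOR: the loop never settles

AND = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
XOR = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]
```

XOR fails for a provable reason: outputting 1 on [0, 1] and [1, 0] forces both weights above the threshold, and then [1, 1] can't sum below it.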

Observations The representation for the learned concept (e.g., the arch concept) is just five numbers. What does that say about the physical symbol system hypothesis? However, if you know which questions or relations each individual weight is associated with, you still have a sense of what the numbers/weights mean.

Observations This is another example of intelligence as search for the best representation. The final set of weights that was the solution to the arch-learning problem is not the only solution. In fact, there are infinitely many solutions, corresponding to the infinitely many ways to draw the line or plane.

Beyond perceptrons Perceptrons were popular in the 1960s, and some folks thought they were the key to intelligent machines. You needed multiple-layer perceptron networks to compute some things, but to make those work you had to build them by hand. Multiple-layer perceptron nets don't necessarily converge, and when they don't converge, they don't learn. Building big nets by hand was too time consuming, so interest in perceptrons died off in the 1970s.

The return of the perceptron In the 1980s, people figured out that with some changes, multiple layers of perceptron-like units could compute anything. First, you get rid of the threshold -- replace the step function that generated output with a continuous function. Then you put hidden layers between the inputs and the output(s).

The return of the perceptron [diagram: input layer, hidden layer, output layer]

The return of the perceptron Then for each input example you let the activation feed forward to the outputs, compare outputs to desired outputs, and then backpropagate the information about the differences to inform decisions about how to adjust the weights in the multiple layers of network. Changes to weights give rise to immediate changes in output, because there’s no threshold.
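A minimal sketch of that idea in Python, with everything the slide doesn't state (layer sizes, starting weights, step size, the absence of bias terms) assumed for illustration. The sigmoid replaces the hard threshold, and the deltas carry the output error back to the hidden weights:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))   # smooth replacement for the threshold

def forward(x, w_hidden, w_out):
    """Feed activation forward: inputs -> hidden layer -> single output."""
    h = [sigmoid(sum(xi * wi for xi, wi in zip(x, ws))) for ws in w_hidden]
    y = sigmoid(sum(hj * wj for hj, wj in zip(h, w_out)))
    return h, y

def backprop_step(x, target, w_hidden, w_out, rate=0.5):
    """One gradient step: compare output to target, propagate error backward."""
    h, y = forward(x, w_hidden, w_out)
    d_out = (y - target) * y * (1 - y)             # output delta
    d_hid = [d_out * w_out[j] * h[j] * (1 - h[j])  # hidden deltas: shared blame
             for j in range(len(h))]
    w_out = [w_out[j] - rate * d_out * h[j] for j in range(len(h))]
    w_hidden = [[w_hidden[j][i] - rate * d_hid[j] * x[i] for i in range(len(x))]
                for j in range(len(h))]
    return w_hidden, w_out
```

Because there is no threshold, every weight change moves the output immediately: repeated steps on one example nudge the output steadily toward its target.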

The return of the perceptron This generation of networks is called neural networks, or backpropagation nets, or connectionist networks (the term perceptron is no longer used for them). They're capable of computing anything computable. They learn well. They degrade gracefully. They handle noisy input well. They're really good at modelling low-level perceptual processing and simple learning, but...

The return of the perceptron This generation of networks is called neural networks, or backpropagation nets, or connectionist networks (the term perceptron is no longer used for them). They can't explain their reasoning. Neither can we, easily... what the pattern of weights in the hidden layers corresponds to is opaque. Not a lot of success yet in modelling higher levels of cognitive behavior (planning, story understanding, etc.)

The return of the perceptron But this discussion of perceptrons should put you in a position to read Chapter 11.2 and see how those backprop networks came about. Which leads us to the end of this term...

Things... Good bye! Thanks for being nice to the new guy. We’ll see you at noon on Friday, December 10, in MCML 166 for the final exam Have a great holiday break, and come visit next term.

Questions?