Artificial Intelligence 9. Perceptron
Japan Advanced Institute of Science and Technology (JAIST)
Yoshimasa Tsuruoka

Outline
- Feature space
- Perceptrons
- The averaged perceptron
- Lecture slides: http://www.jaist.ac.jp/~tsuruoka/lectures/

Feature space
Instances are represented by vectors in a feature space.
- Positive example: <Outlook = sunny, Temperature = cool, Humidity = normal>
- Negative example: <Outlook = rain, Temperature = high, Humidity = high>
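
A minimal sketch of how such an instance can be mapped to a binary feature vector. The feature list, its ordering, and the function name are my illustration, not taken from the slides:

```python
# Binary (one-hot) feature encoding of PlayTennis-style instances.
# The feature inventory and ordering below are illustrative assumptions.
FEATURES = [
    "Outlook=Sunny", "Outlook=Overcast", "Outlook=Rain",
    "Temperature=Hot", "Temperature=Mild", "Temperature=Cool",
    "Humidity=High", "Humidity=Normal",
    "Wind=Strong", "Wind=Weak",
]

def encode(instance):
    """Map e.g. {"Outlook": "Sunny", ...} to a 0/1 vector over FEATURES."""
    active = {f"{name}={value}" for name, value in instance.items()}
    return [1.0 if f in active else 0.0 for f in FEATURES]

# The positive example from this slide:
print(encode({"Outlook": "Sunny", "Temperature": "Cool", "Humidity": "Normal"}))
# -> [1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0]
```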

Separating instances with a hyperplane
Find a hyperplane that separates the positive and negative examples.

Perceptron learning
Perceptron learning can always find such a hyperplane if the given examples are linearly separable.

Linear classification
Binary classification with a linear model:
- $x$: instance
- $\phi(x)$: feature vector
- $\mathbf{w}$: weight vector
- $b$: bias (folded into $\mathbf{w}$ as the weight of a constant feature on the later slides)
If the inner product of the feature vector with the weights is greater than or equal to zero, i.e. $\mathbf{w} \cdot \phi(x) \ge 0$, the instance is classified as positive; otherwise it is classified as negative.
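
As a sketch, the decision rule in Python (NumPy and the function name are my choices, and the bias is assumed folded into the weight vector):

```python
import numpy as np

def classify(w, phi_x):
    """Perceptron decision rule: +1 if w . phi(x) >= 0, else -1."""
    return 1 if np.dot(w, phi_x) >= 0 else -1
```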

The perceptron learning algorithm
1. Initialize the weight vector: $\mathbf{w} = \mathbf{0}$
2. Choose an example (randomly) from the training data
3. If it is not classified correctly:
   - If it is a positive example: $\mathbf{w} \leftarrow \mathbf{w} + \phi(x)$
   - If it is a negative example: $\mathbf{w} \leftarrow \mathbf{w} - \phi(x)$
Steps 2 and 3 are repeated until all examples are correctly classified.
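
A runnable sketch of the algorithm above. The epoch cap and the deterministic example order are my simplifications; the slide chooses examples randomly:

```python
import numpy as np

def train_perceptron(examples, n_features, max_epochs=1000):
    """examples: list of (phi_x, y) pairs, where phi_x is a NumPy array
    and y is +1 (positive) or -1 (negative)."""
    w = np.zeros(n_features)                      # 1. initialize the weight vector
    for _ in range(max_epochs):
        mistakes = 0
        for phi_x, y in examples:                 # 2. choose an example
            pred = 1 if np.dot(w, phi_x) >= 0 else -1
            if pred != y:                         # 3. not classified correctly
                w += y * phi_x                    #    add phi(x) if positive, subtract if negative
                mistakes += 1
        if mistakes == 0:                         # stop once all examples are correct
            return w
    return w
```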

Learning the concept OR
Training data (feature vector $(1, s, t)$, where the first element is the constant bias feature):
- x1 = (1, 0, 0): Negative
- x2 = (1, 0, 1): Positive
- x3 = (1, 1, 0): Positive
- x4 = (1, 1, 1): Positive
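
The trace on the following slides can be reproduced, up to the random order in which examples are visited, with the train_perceptron sketch above:

```python
import numpy as np

# OR concept; feature vector (1, s, t) with a constant bias feature
examples = [
    (np.array([1.0, 0.0, 0.0]), -1),  # x1: negative
    (np.array([1.0, 0.0, 1.0]), +1),  # x2: positive
    (np.array([1.0, 1.0, 0.0]), +1),  # x3: positive
    (np.array([1.0, 1.0, 1.0]), +1),  # x4: positive
]
w = train_perceptron(examples, n_features=3)  # from the sketch above
print(w)  # a weight vector that separates x1 from x2, x3, x4
```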

Iterations (the accompanying figures are not preserved in this transcript):
- Iteration 1: x1 is classified wrongly (update the weights)
- Iteration 2: x4 is classified wrongly (update the weights)
- Iteration 3: x2 is classified correctly
- Iteration 4: x3 is classified correctly
- Iteration 5: x1 is classified wrongly (update the weights)

Separating hyperplane
(Figure: the final weight vector and the separating hyperplane drawn in the (s, t) plane; s and t are the inputs, i.e. the second and third elements of the feature vector.)

Why the update rule works
Suppose a positive example $x$ has not been classified correctly: the score $\mathbf{w} \cdot \phi(x)$ was too small. After the update $\mathbf{w}' = \mathbf{w} + \phi(x)$, the new score is
$\mathbf{w}' \cdot \phi(x) = \mathbf{w} \cdot \phi(x) + \|\phi(x)\|^2$,
i.e. the original value plus $\|\phi(x)\|^2$, which is always positive. The update rule therefore makes it less likely for the perceptron to make the same mistake.
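
A concrete instance, with numbers chosen purely for illustration: take $\mathbf{w} = (-1, 0, 0)$ and the positive example $\phi(x_2) = (1, 0, 1)$. Then $\mathbf{w} \cdot \phi(x_2) = -1 < 0$, a mistake. After the update, $\mathbf{w}' = (0, 0, 1)$ and $\mathbf{w}' \cdot \phi(x_2) = 1 \ge 0$, so $x_2$ is now classified correctly.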

Convergence
Provided the training examples are linearly separable, the perceptron training algorithm converges after a finite number of iterations to a hyperplane that perfectly classifies the training data.
- The number of iterations can nevertheless be very large.
- The algorithm does not converge if the training data are not linearly separable.
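
The slides do not state it, but a classical result (Novikoff, 1962) makes "finite" precise: if $\|\phi(x)\| \le R$ for all examples and some unit-norm weight vector separates the data with margin $\gamma > 0$, then the perceptron makes at most $(R/\gamma)^2$ mistakes. The bound, and hence the number of iterations, blows up as the margin $\gamma$ shrinks.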

Learning the PlayTennis concept
- Feature space: 11 binary features
- Perceptron learning: converged in 239 steps

Final weight vector (entries left blank were not preserved in this transcript):
  Bias:
  Outlook = Sunny:       -3
  Outlook = Overcast:     5
  Outlook = Rain:        -2
  Temperature = Hot:
  Temperature = Mild:     3
  Temperature = Cool:
  Humidity = High:       -4
  Humidity = Normal:      4
  Wind = Strong:
  Wind = Weak:

Averaged perceptron
- A variant of the perceptron learning algorithm
- Outputs the weight vector averaged over all iterations, rather than the final weight vector
- Does not wait until convergence: decides when to stop by observing performance on a validation set
- Practical and widely used
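
A sketch of one common formulation, which averages the weight vector after every example; the fixed epoch count stands in for the validation-based stopping the slide recommends:

```python
import numpy as np

def train_averaged_perceptron(examples, n_features, n_epochs=10):
    """Return the weight vector averaged over all update steps."""
    w = np.zeros(n_features)
    w_sum = np.zeros(n_features)   # running sum of weight vectors
    n_steps = 0
    for _ in range(n_epochs):
        for phi_x, y in examples:
            pred = 1 if np.dot(w, phi_x) >= 0 else -1
            if pred != y:
                w += y * phi_x
            w_sum += w             # accumulate after every example
            n_steps += 1
    return w_sum / n_steps         # averaged weight vector
```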

Naive Bayes vs perceptrons
- The naive Bayes model assumes conditional independence between features, so adding informative features does not necessarily improve performance.
- Perceptrons allow one to incorporate diverse types of features, but training takes longer.