Artificial Intelligence 9. Perceptron
Japan Advanced Institute of Science and Technology (JAIST)
Yoshimasa Tsuruoka

Outline
- Feature space
- Perceptrons
- The averaged perceptron
- Lecture slides: http://www.jaist.ac.jp/~tsuruoka/lectures/

Feature space
Instances are represented by vectors in a feature space.
- Positive example: <Outlook = sunny, Temperature = cool, Humidity = normal>
- Negative example: <Outlook = rain, Temperature = high, Humidity = high>
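
A minimal sketch of how such an instance can be mapped to a binary feature vector. The feature list, its ordering, and the function name are my illustration, not taken from the slides:

```python
# Binary (one-hot) feature encoding of PlayTennis-style instances.
# The feature inventory and ordering below are illustrative assumptions.
FEATURES = [
    "Outlook=Sunny", "Outlook=Overcast", "Outlook=Rain",
    "Temperature=Hot", "Temperature=Mild", "Temperature=Cool",
    "Humidity=High", "Humidity=Normal",
    "Wind=Strong", "Wind=Weak",
]

def encode(instance):
    """Map e.g. {"Outlook": "Sunny", ...} to a 0/1 vector over FEATURES."""
    active = {f"{name}={value}" for name, value in instance.items()}
    return [1.0 if f in active else 0.0 for f in FEATURES]

# The positive example from this slide:
print(encode({"Outlook": "Sunny", "Temperature": "Cool", "Humidity": "Normal"}))
# -> [1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0]
```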

Separating instances with a hyperplane
Find a hyperplane that separates the positive and negative examples.

Perceptron learning
Perceptron learning can always find such a hyperplane if the given examples are linearly separable.

Linear classification
Binary classification with a linear model:
- $x$: instance
- $\phi(x)$: feature vector
- $\mathbf{w}$: weight vector
- $b$: bias (folded into $\mathbf{w}$ as the weight of a constant feature on the later slides)
If the inner product of the feature vector with the weights is greater than or equal to zero, i.e. $\mathbf{w} \cdot \phi(x) \ge 0$, the instance is classified as positive; otherwise it is classified as negative.
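
As a sketch, the decision rule in Python (NumPy and the function name are my choices, and the bias is assumed folded into the weight vector):

```python
import numpy as np

def classify(w, phi_x):
    """Perceptron decision rule: +1 if w . phi(x) >= 0, else -1."""
    return 1 if np.dot(w, phi_x) >= 0 else -1
```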

The perceptron learning algorithm
1. Initialize the weight vector: $\mathbf{w} = \mathbf{0}$
2. Choose an example (randomly) from the training data
3. If it is not classified correctly:
   - If it is a positive example: $\mathbf{w} \leftarrow \mathbf{w} + \phi(x)$
   - If it is a negative example: $\mathbf{w} \leftarrow \mathbf{w} - \phi(x)$
Steps 2 and 3 are repeated until all examples are correctly classified.
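
A runnable sketch of the algorithm above. The epoch cap and the deterministic example order are my simplifications; the slide chooses examples randomly:

```python
import numpy as np

def train_perceptron(examples, n_features, max_epochs=1000):
    """examples: list of (phi_x, y) pairs, where phi_x is a NumPy array
    and y is +1 (positive) or -1 (negative)."""
    w = np.zeros(n_features)                      # 1. initialize the weight vector
    for _ in range(max_epochs):
        mistakes = 0
        for phi_x, y in examples:                 # 2. choose an example
            pred = 1 if np.dot(w, phi_x) >= 0 else -1
            if pred != y:                         # 3. not classified correctly
                w += y * phi_x                    #    add phi(x) if positive, subtract if negative
                mistakes += 1
        if mistakes == 0:                         # stop once all examples are correct
            return w
    return w
```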

Learning the concept OR
Training data (feature vector $(1, s, t)$, where the first element is the constant bias feature):
- x1 = (1, 0, 0): Negative
- x2 = (1, 0, 1): Positive
- x3 = (1, 1, 0): Positive
- x4 = (1, 1, 1): Positive
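
The trace on the following slides can be reproduced, up to the random order in which examples are visited, with the train_perceptron sketch above:

```python
import numpy as np

# OR concept; feature vector (1, s, t) with a constant bias feature
examples = [
    (np.array([1.0, 0.0, 0.0]), -1),  # x1: negative
    (np.array([1.0, 0.0, 1.0]), +1),  # x2: positive
    (np.array([1.0, 1.0, 0.0]), +1),  # x3: positive
    (np.array([1.0, 1.0, 1.0]), +1),  # x4: positive
]
w = train_perceptron(examples, n_features=3)  # from the sketch above
print(w)  # a weight vector that separates x1 from x2, x3, x4
```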

Iterations (the accompanying figures are not preserved in this transcript):
- Iteration 1: x1 is classified wrongly (update the weights)
- Iteration 2: x4 is classified wrongly (update the weights)
- Iteration 3: x2 is classified correctly
- Iteration 4: x3 is classified correctly
- Iteration 5: x1 is classified wrongly (update the weights)

Separating hyperplane
(Figure: the final weight vector and the separating hyperplane drawn in the (s, t) plane; s and t are the inputs, i.e. the second and third elements of the feature vector.)

Why the update rule works
Suppose a positive example $x$ has not been classified correctly: the score $\mathbf{w} \cdot \phi(x)$ was too small. After the update $\mathbf{w}' = \mathbf{w} + \phi(x)$, the new score is
$\mathbf{w}' \cdot \phi(x) = \mathbf{w} \cdot \phi(x) + \|\phi(x)\|^2$,
i.e. the original value plus $\|\phi(x)\|^2$, which is always positive. The update rule therefore makes it less likely for the perceptron to make the same mistake.
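
A concrete instance, with numbers chosen purely for illustration: take $\mathbf{w} = (-1, 0, 0)$ and the positive example $\phi(x_2) = (1, 0, 1)$. Then $\mathbf{w} \cdot \phi(x_2) = -1 < 0$, a mistake. After the update, $\mathbf{w}' = (0, 0, 1)$ and $\mathbf{w}' \cdot \phi(x_2) = 1 \ge 0$, so $x_2$ is now classified correctly.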

Convergence
Provided the training examples are linearly separable, the perceptron training algorithm converges after a finite number of iterations to a hyperplane that perfectly classifies the training data.
- The number of iterations can nevertheless be very large.
- The algorithm does not converge if the training data are not linearly separable.
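
The slides do not state it, but a classical result (Novikoff, 1962) makes "finite" precise: if $\|\phi(x)\| \le R$ for all examples and some unit-norm weight vector separates the data with margin $\gamma > 0$, then the perceptron makes at most $(R/\gamma)^2$ mistakes. The bound, and hence the number of iterations, blows up as the margin $\gamma$ shrinks.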

Learning the PlayTennis concept
- Feature space: 11 binary features
- Perceptron learning: converged in 239 steps

Final weight vector (entries left blank were not preserved in this transcript):
  Bias:
  Outlook = Sunny:       -3
  Outlook = Overcast:     5
  Outlook = Rain:        -2
  Temperature = Hot:
  Temperature = Mild:     3
  Temperature = Cool:
  Humidity = High:       -4
  Humidity = Normal:      4
  Wind = Strong:
  Wind = Weak:

Averaged perceptron
- A variant of the perceptron learning algorithm
- Outputs the weight vector averaged over all iterations, rather than the final weight vector
- Does not wait until convergence: decides when to stop by observing performance on a validation set
- Practical and widely used
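
A sketch of one common formulation, which averages the weight vector after every example; the fixed epoch count stands in for the validation-based stopping the slide recommends:

```python
import numpy as np

def train_averaged_perceptron(examples, n_features, n_epochs=10):
    """Return the weight vector averaged over all update steps."""
    w = np.zeros(n_features)
    w_sum = np.zeros(n_features)   # running sum of weight vectors
    n_steps = 0
    for _ in range(n_epochs):
        for phi_x, y in examples:
            pred = 1 if np.dot(w, phi_x) >= 0 else -1
            if pred != y:
                w += y * phi_x
            w_sum += w             # accumulate after every example
            n_steps += 1
    return w_sum / n_steps         # averaged weight vector
```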

Naive Bayes vs perceptrons
- The naive Bayes model assumes conditional independence between features, so adding informative features does not necessarily improve performance.
- Perceptrons allow one to incorporate diverse types of features, but training takes longer.