CS 188: Artificial Intelligence
Learning II: Linear Classification and Neural Networks
Instructors: Stuart Russell and Pat Virtue
University of California, Berkeley

Regression vs Classification
[Figure: two plots with x and y axes, one illustrating regression and one illustrating classification]

Threshold perceptron as linear classifier

Binary Decision Rule
- A threshold perceptron is a single unit that outputs
  y = h_w(x) = 1 when w·x ≥ 0
             = 0 when w·x < 0
- In the input vector space
  - Examples are points x
  - The equation w·x = 0 defines a hyperplane
  - One side corresponds to y=1, the other to y=0
[Figure: spam example with weights w_0 = -3, w_free = 4, w_money = 2; axes "free" and "money"; the line w·x = 0 separates y=1 (SPAM) from y=0 (HAM)]

Example
"Dear Stuart, I'm leaving Macrosoft to return to academia. The money is great here but I prefer to be free to do my own research, and I really love teaching undergrads! Do I need to finish my BA first before applying? Best wishes, Bill"
Weights: w_0 = -3, w_free = 4, w_money = 2
Features: x_0 = 1, x_free = 1, x_money = 1
w·x = -3·1 + 4·1 + 2·1 = 3 ≥ 0, so the message is classified y=1 (SPAM)
[Figure: decision boundary w·x = 0 in the free/money plane, with y=1 (SPAM) on one side and y=0 (HAM) on the other]
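A minimal Python sketch of this decision rule and computation (the dictionary representation and feature names are illustrative, not from the slides):

```python
# Threshold perceptron decision rule, reproducing the slide's spam example.
def h_w(w, x):
    """Return 1 (SPAM) if w.x >= 0, else 0 (HAM)."""
    score = sum(w[f] * x.get(f, 0) for f in w)
    return 1 if score >= 0 else 0

w = {"bias": -3, "free": 4, "money": 2}   # w_0, w_free, w_money
x = {"bias": 1, "free": 1, "money": 1}    # the message contains "free" and "money"
print(h_w(w, x))   # w.x = -3*1 + 4*1 + 2*1 = 3 >= 0, so output 1 (SPAM)
```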

Weight Updates

Perceptron learning rule
- If the true y ≠ h_w(x) (an error), adjust the weights
  - If w·x < 0 but the output should be y=1, this is called a false negative
    - Should increase weights on positive inputs
    - Should decrease weights on negative inputs
  - If w·x ≥ 0 but the output should be y=0, this is called a false positive
    - Should decrease weights on positive inputs
    - Should increase weights on negative inputs
- The perceptron learning rule does this:
  w ← w + α (y − h_w(x)) x
  where α is the learning rate and (y − h_w(x)) is +1, −1, or 0 (no error)

Example
"Dear Stuart, I wanted to let you know that I have decided to leave Macrosoft and return to academia. The money is great here but I prefer to be free to pursue more interesting research, and I really love teaching undergraduates! Do I need to finish my BA first before applying? Best wishes, Bill"
Weights: w_0 = -3, w_free = 4, w_money = 2; features: x_0 = 1, x_free = 1, x_money = 1
w·x = -3·1 + 4·1 + 2·1 = 3 ≥ 0, so the perceptron predicts y=1 (SPAM), but this message is actually y=0 (HAM)
Update with learning rate α = 0.5:
w ← w + α (y − h_w(x)) x = (-3, 4, 2) + 0.5 · (0 − 1) · (1, 1, 1) = (-3.5, 3.5, 1.5)
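A short sketch of this update step that reproduces the numbers above (illustrative only, not the course's reference code):

```python
# Perceptron weight update: w <- w + alpha * (y - h_w(x)) * x.
def h_w(w, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else 0

def perceptron_update(w, x, y, alpha):
    error = y - h_w(w, x)          # +1, -1, or 0 (no error)
    return [wi + alpha * error * xi for wi, xi in zip(w, x)]

w = [-3.0, 4.0, 2.0]               # (w_0, w_free, w_money)
x = [1.0, 1.0, 1.0]                # (x_0, x_free, x_money)
print(perceptron_update(w, x, y=0, alpha=0.5))   # [-3.5, 3.5, 1.5]
```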

Perceptron convergence theorem
- A learning problem is linearly separable iff there is some hyperplane exactly separating positive from negative examples
- Convergence: if the training data are separable, perceptron learning applied repeatedly to the training set will eventually converge to a perfect separator
[Figure: two scatter plots, one separable and one non-separable]

Example: Earthquakes vs. nuclear explosions (63 examples, 657 updates required)

Perceptron convergence theorem
- A learning problem is linearly separable iff there is some hyperplane exactly separating positive from negative examples
- Convergence: if the training data are separable, perceptron learning applied repeatedly to the training set will eventually converge to a perfect separator
- Convergence: if the training data are non-separable, perceptron learning will converge to a minimum-error solution provided the learning rate α is decayed appropriately (e.g., α = 1/t)
[Figure: two scatter plots, one separable and one non-separable]
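A sketch of the full training loop with the learning rate decayed as α = 1/t, run on a made-up toy dataset (the earthquake data and the 657-update count above come from the slides, not from this code):

```python
# Perceptron training with decaying learning rate alpha = 1/t.
def train_perceptron(data, n_features, epochs=100):
    """data: list of (x, y) pairs, where x includes the bias feature x_0 = 1."""
    w = [0.0] * n_features
    t = 0
    for _ in range(epochs):
        for x, y in data:
            t += 1
            alpha = 1.0 / t        # decaying learning rate
            error = y - (1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else 0)
            w = [wi + alpha * error * xi for wi, xi in zip(w, x)]
    return w

# Tiny separable toy set: label is 1 iff x_1 > x_2 (x_0 = 1 is the bias feature).
data = [([1, 2, 1], 1), ([1, 0, 3], 0), ([1, 3, 0], 1), ([1, 1, 2], 0)]
print(train_perceptron(data, n_features=3))
```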

Perceptron learning with fixed α

Perceptron learning with decaying α

Other Linear Classifiers
- Support Vector Machines (SVM)
  - Maximize the margin between the boundary and the nearest points
[Figure: two plots with x and y axes comparing decision boundaries]
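For comparison, a maximum-margin linear classifier can be fit with scikit-learn's LinearSVC; this sketch assumes scikit-learn is available and uses made-up toy data (the slides do not prescribe any particular library):

```python
# Fit a max-margin linear classifier on a tiny separable toy dataset.
from sklearn.svm import LinearSVC

X = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.5], [0.5, 2.0]]   # toy 2-D points
y = [0, 1, 1, 0]                                        # toy labels
clf = LinearSVC(C=1.0).fit(X, y)
print(clf.coef_, clf.intercept_)      # learned weights and bias of the hyperplane
print(clf.predict([[1.5, 0.5]]))
```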

Neural Networks

Very Loose Inspiration: Human Neurons

Simple Model of a Neuron (McCulloch & Pitts, 1943)
- Inputs a_i come from the output of node i to this node j (or from "outside")
- Each input link has a weight w_i,j
- There is an additional fixed input a_0 with bias weight w_0,j
- The total input is in_j = Σ_i w_i,j a_i
- The output is a_j = g(in_j) = g(Σ_i w_i,j a_i) = g(w·a)
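A sketch of this neuron model as code, using a sigmoid for g (the choice of activation is discussed later; the weights are just the earlier spam-example values, reused for illustration):

```python
import math

# Simple neuron: in_j = sum_i w_ij * a_i, then a_j = g(in_j).
def neuron_output(weights, inputs, g=lambda z: 1.0 / (1.0 + math.exp(-z))):
    """weights and inputs are parallel lists; inputs[0] is the fixed bias input a_0 = 1."""
    in_j = sum(w * a for w, a in zip(weights, inputs))
    return g(in_j)

print(neuron_output([-3.0, 4.0, 2.0], [1.0, 1.0, 1.0]))   # sigmoid(3) ~= 0.95
```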

Single Neuron

Minimize Single Neuron Loss

Choice of Activation Function
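The slide text here does not list the candidates; commonly considered activations include the hard threshold, the sigmoid, and the ReLU, sketched below for illustration:

```python
import math

# A few common activation functions g(in_j) a unit might use.
def threshold(z):
    return 1.0 if z >= 0 else 0.0          # hard threshold (perceptron)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))      # smooth, differentiable, output in (0, 1)

def relu(z):
    return max(0.0, z)                     # rectified linear unit

for g in (threshold, sigmoid, relu):
    print(g.__name__, [round(g(z), 3) for z in (-2.0, 0.0, 2.0)])
```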

Multiclass Classification

softmax function
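Only the name appears in the slide text; as a reminder, softmax maps a vector of scores z to a probability distribution via softmax(z)_i = exp(z_i) / Σ_j exp(z_j). A minimal, numerically stable sketch:

```python
import math

# Numerically stable softmax: subtracting max(z) leaves the result unchanged
# but avoids overflow in exp for large scores.
def softmax(z):
    m = max(z)
    exps = [math.exp(zi - m) for zi in z]
    total = sum(exps)
    return [e / total for e in exps]

print(softmax([2.0, 1.0, 0.1]))   # roughly [0.66, 0.24, 0.10]
```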

Multilayer Perceptrons  A multilayer perceptron (MLP) is a feedforward neural network with at least one hidden layer (nodes that are neither inputs nor outputs)  With enough hidden nodes, MLPs can represent (approximate to arbitrary accuracy) any continuous function
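A sketch of a forward pass through an MLP with one hidden layer, assuming sigmoid activations (the layer sizes and weights are made up for illustration):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def layer(weights, inputs):
    """One dense layer: each row of weights is [bias, w_1, ..., w_n] for one unit."""
    return [sigmoid(row[0] + sum(w * a for w, a in zip(row[1:], inputs)))
            for row in weights]

def mlp_forward(hidden_w, output_w, x):
    hidden = layer(hidden_w, x)       # hidden layer activations
    return layer(output_w, hidden)    # output layer activations

# 2 inputs -> 3 hidden units -> 1 output, with arbitrary illustrative weights
hidden_w = [[0.1, 0.5, -0.3], [-0.2, 0.8, 0.4], [0.0, -0.6, 0.9]]
output_w = [[0.05, 1.0, -1.0, 0.5]]
print(mlp_forward(hidden_w, output_w, [1.0, 2.0]))
```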

Neural Network Equations
[Figure: network diagram with labeled weights]

Minimize Neural Network Loss

Error Backpropagation
[Figure: network diagram with labeled activations]

Deep Learning: Convolutional Neural Networks
- LeNet5 (LeCun et al., 1998)
- Convnets for digit recognition
LeCun, Yann, et al. "Gradient-based learning applied to document recognition." Proceedings of the IEEE (1998).

Convolutional Neural Networks
- AlexNet (Alex Krizhevsky, Geoffrey Hinton, et al., 2012)
- Convnets for image classification
- More data & more compute power
Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "ImageNet classification with deep convolutional neural networks." Advances in Neural Information Processing Systems (2012).

Deep Learning: GoogLeNet
Szegedy, Christian, et al. "Going deeper with convolutions." CVPR (2015).

Neural Nets
- Incredible success in the last three years
  - Data (ImageNet)
  - Compute power
- Optimization
- Activation functions (ReLU)
- Regularization
  - Reducing overfitting (dropout)
- Software packages
  - Caffe (UC Berkeley)
  - Theano (Université de Montréal)
  - Torch (Facebook, Yann LeCun)
  - TensorFlow (Google)

Practical Issues