Machine learning
Image source: https://www.coursera.org/course/ml

Machine learning
Definition:
– Getting a computer to do well on a task without explicitly programming it
– Improving performance on a task based on experience

Learning for episodic tasks
We have just looked at learning in sequential environments. Now let's consider the "easier" problem of episodic environments:
– The agent gets a series of unrelated problem instances and has to make some decision or inference about each of them
– In this case, "experience" comes in the form of training data

Example: Image classification
[Figure: input images and their desired output labels — apple, pear, tomato, cow, dog, horse]

Training data
[Figure: labeled training images for each class — apple, pear, tomato, cow, dog, horse]

Example 2: Seismic data classification
[Figure: scatter plot of surface wave magnitude vs. body wave magnitude, separating nuclear explosions from earthquakes]

Example 3: Spam filter

Example 4: Sentiment analysis

Example 5: Robot grasping
L. Pinto and A. Gupta, "Supersizing self-supervision: Learning to grasp from 50K tries and 700 robot hours," arXiv (YouTube video)

The basic supervised learning framework
y = f(x), where x is the input, f is the classification function, and y is the output
– Learning: given a training set of labeled examples {(x1, y1), …, (xN, yN)}, estimate the parameters of the prediction function f
– Inference: apply f to a never-before-seen test example x and output the predicted value y = f(x)
The key challenge is generalization
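
To make the two phases concrete, here is a minimal Python sketch, with made-up toy data and a simple nearest-mean rule standing in for f (neither the data nor the rule comes from the slides):

```python
import numpy as np

# Toy training set: N labeled examples {(x1, y1), ..., (xN, yN)}.
# Each x is a 2-D feature vector; each y is a class label (0 or 1).
X_train = np.array([[1.0, 2.0], [2.0, 1.0], [6.0, 5.0], [7.0, 6.0]])
y_train = np.array([0, 0, 1, 1])

# Learning: estimate the parameters of f -- here, one mean per class.
means = {c: X_train[y_train == c].mean(axis=0) for c in np.unique(y_train)}

# Inference: apply f to a never-before-seen test example x.
def f(x):
    # Predict the class whose mean is closest to x (a nearest-mean rule).
    return min(means, key=lambda c: np.linalg.norm(x - means[c]))

print(f(np.array([6.5, 5.5])))  # -> 1
```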

Learning and inference pipeline
[Figure: Training — training samples and training labels → features → learning → learned model. Inference — test sample → features → learned model → prediction]

Experimentation cycle
– Learn parameters on the training set
– Tune hyperparameters (implementation choices) on the held-out validation set
– Evaluate performance on the test set
– Very important: do not peek at the test set!
Generalization and overfitting
– Want a classifier that does well on never-before-seen data
– Overfitting: good performance on the training/validation set, poor performance on the test set
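
A minimal sketch of this cycle, assuming a simple k-nearest-neighbor classifier and a made-up dataset (both illustrative, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 200 examples, 2 features, labels from a noisy linear rule.
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] + 0.3 * rng.normal(size=200) > 0).astype(int)

# Fixed three-way split: the test set is held out and never used for tuning.
X_tr, y_tr = X[:120], y[:120]
X_val, y_val = X[120:160], y[120:160]
X_te, y_te = X[160:], y[160:]

def knn_accuracy(k, X_fit, y_fit, X_eval, y_eval):
    # Accuracy of a k-nearest-neighbor rule fit on (X_fit, y_fit).
    d = np.linalg.norm(X_eval[:, None, :] - X_fit[None, :, :], axis=2)
    idx = np.argsort(d, axis=1)[:, :k]
    preds = (y_fit[idx].mean(axis=1) > 0.5).astype(int)
    return (preds == y_eval).mean()

# Tune the hyperparameter k on the validation set only.
best_k = max([1, 3, 5, 7], key=lambda k: knn_accuracy(k, X_tr, y_tr, X_val, y_val))

# Evaluate exactly once on the test set -- no peeking during tuning.
print("best k:", best_k, "test accuracy:", knn_accuracy(best_k, X_tr, y_tr, X_te, y_te))
```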

What’s the big deal?

Naïve Bayes classifier
P(y | x) ∝ P(y) ∏d P(xd | y), where xd is a single dimension or attribute of x
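
A hedged sketch of one common variant, Gaussian Naïve Bayes, on toy data (the slide does not specify which per-attribute model P(xd | y) it uses):

```python
import numpy as np

# Assume each attribute x_d is conditionally independent given the class,
# so P(y | x) is proportional to P(y) * prod_d P(x_d | y).
X = np.array([[1.0, 2.1], [0.9, 1.8], [3.2, 0.5], [3.0, 0.7]])
y = np.array([0, 0, 1, 1])

classes = np.unique(y)
priors = {c: (y == c).mean() for c in classes}
stats = {c: (X[y == c].mean(axis=0), X[y == c].var(axis=0) + 1e-6) for c in classes}

def log_gaussian(x, mean, var):
    # log N(x; mean, var), evaluated per attribute.
    return -0.5 * (np.log(2 * np.pi * var) + (x - mean) ** 2 / var)

def predict(x):
    # Pick the class maximizing log P(y) + sum_d log P(x_d | y).
    scores = {c: np.log(priors[c]) + log_gaussian(x, *stats[c]).sum() for c in classes}
    return max(scores, key=scores.get)

print(predict(np.array([1.0, 2.0])))  # -> 0
```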

Decision tree classifier
Example problem: decide whether to wait for a table at a restaurant, based on the following attributes:
1. Alternate: is there an alternative restaurant nearby?
2. Bar: is there a comfortable bar area to wait in?
3. Fri/Sat: is today Friday or Saturday?
4. Hungry: are we hungry?
5. Patrons: number of people in the restaurant (None, Some, Full)
6. Price: price range ($, $$, $$$)
7. Raining: is it raining outside?
8. Reservation: have we made a reservation?
9. Type: kind of restaurant (French, Italian, Thai, Burger)
10. WaitEstimate: estimated waiting time (0-10, 10-30, 30-60, >60)
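
A sketch of fitting a decision tree to data of this kind with scikit-learn; the three attributes and five examples below are made up for illustration, a small subset standing in for the full restaurant dataset:

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Tiny made-up sample in the spirit of the restaurant problem.
# Columns: Hungry (0/1), Patrons (0=None, 1=Some, 2=Full), Fri/Sat (0/1).
X = [[1, 1, 0],
     [1, 2, 1],
     [0, 0, 0],
     [1, 2, 0],
     [0, 1, 1]]
y = [1, 0, 0, 0, 1]  # WillWait: 1 = wait, 0 = leave

tree = DecisionTreeClassifier(criterion="entropy").fit(X, y)
print(export_text(tree, feature_names=["Hungry", "Patrons", "FriSat"]))
print(tree.predict([[1, 1, 1]]))  # decide for a new situation
```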

Decision tree classifier

Nearest neighbor classifier
f(x) = label of the training example nearest to x
– All we need is a distance function for our inputs
– No training required!
[Figure: a test example surrounded by training examples from class 1 and class 2]

K-nearest neighbor classifier
– For a new point, find the k closest points from the training data
– Vote for the class label with the labels of the k points
[Figure: example with k = 5]
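
A minimal sketch of the voting rule on toy data (illustrative; with k = 1 it reduces to the nearest neighbor classifier above):

```python
import numpy as np
from collections import Counter

def knn_predict(x, X_train, y_train, k=5):
    # Euclidean distance from x to every training point.
    dists = np.linalg.norm(X_train - x, axis=1)
    nearest = np.argsort(dists)[:k]
    # Vote for the class label with the labels of the k points.
    return Counter(y_train[nearest]).most_common(1)[0][0]

X_train = np.array([[0.0, 0.0], [0.2, 0.1], [1.0, 1.0], [0.9, 1.1], [1.1, 0.9]])
y_train = np.array([0, 0, 1, 1, 1])
print(knn_predict(np.array([0.8, 0.8]), X_train, y_train, k=3))  # -> 1
```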

K-nearest neighbor classifier
Which classifier is more robust to outliers?
Credit: Andrej Karpathy

K-nearest neighbor classifier
Credit: Andrej Karpathy

Linear classifier
Find a linear function to separate the classes:
f(x) = sgn(w1x1 + w2x2 + … + wDxD + b) = sgn(w · x + b)
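
A minimal sketch of evaluating such a classifier with hand-picked (not learned) weights, just to show the decision rule:

```python
import numpy as np

# f(x) = sgn(w . x + b): the sign of a weighted sum of the input features.
w = np.array([2.0, -1.0])  # illustrative weights, not learned here
b = -0.5

def f(x):
    return np.sign(w @ x + b)

print(f(np.array([1.0, 0.5])))  # -> 1.0 (one side of the hyperplane)
print(f(np.array([0.0, 1.0])))  # -> -1.0 (the other side)
```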

NN vs. linear classifiers
NN pros:
+ Simple to implement
+ Decision boundaries not necessarily linear
+ Works for any number of classes
+ Nonparametric method
NN cons:
– Need good distance function
– Slow at test time
Linear pros:
+ Low-dimensional parametric representation
+ Very fast at test time
Linear cons:
– Works for two classes
– How to train the linear function?
– What if data is not linearly separable?
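
One classic answer to "how to train the linear function?" is the perceptron rule (a standard algorithm, though this slide does not name it); a minimal sketch on a made-up linearly separable toy set:

```python
import numpy as np

def perceptron(X, y, epochs=100):
    """Train w, b with the perceptron rule. Labels y must be +1 / -1."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if yi * (w @ xi + b) <= 0:  # misclassified (or on the boundary)
                w += yi * xi            # nudge the hyperplane toward xi
                b += yi
    return w, b

X = np.array([[1.0, 1.0], [2.0, 2.5], [-1.0, -1.5], [-2.0, -1.0]])
y = np.array([1, 1, -1, -1])
w, b = perceptron(X, y)
print(np.sign(X @ w + b))  # should match y on this separable toy set
```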

Other machine learning scenarios
Other prediction scenarios:
– Regression
– Structured prediction
Other supervision scenarios:
– Unsupervised learning
– Semi-supervised learning
– Active learning
– Lifelong learning

Beyond simple classification: Structured prediction
[Figure: image → word]
Source: B. Taskar

Structured Prediction
[Figure: sentence → parse tree]
Source: B. Taskar

Structured Prediction
[Figure: sentence in two languages → word alignment]
Source: B. Taskar

Structured Prediction
[Figure: amino-acid sequence → bond structure]
Source: B. Taskar

Structured Prediction
Many image-based inference tasks can loosely be thought of as "structured prediction"
Source: D. Ramanan

Unsupervised Learning
– Idea: given only unlabeled data as input, learn some sort of structure
– The objective is often more vague or subjective than in supervised learning
– This is more of an exploratory/descriptive style of data analysis

Unsupervised Learning
Clustering
– Discover groups of "similar" data points
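
A minimal sketch of one standard clustering algorithm, k-means (not named on the slide), on made-up data:

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Minimal k-means: alternate assignment and center updates."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # Assign each point to its nearest center...
        labels = np.argmin(np.linalg.norm(X[:, None] - centers[None], axis=2), axis=1)
        # ...then move each center to the mean of its assigned points.
        centers = np.array([X[labels == c].mean(axis=0) for c in range(k)])
    return labels, centers

rng = np.random.default_rng(0)
X = np.vstack([rng.normal((0, 0), 0.3, (20, 2)),   # blob near the origin
               rng.normal((3, 3), 0.3, (20, 2))])  # blob near (3, 3)
labels, centers = kmeans(X, k=2)
print(centers)  # the two discovered group centers
```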

Unsupervised Learning
Quantization
– Map a continuous input to a discrete (more compact) output
[Figure: continuous points mapped to discrete codes 1, 2, 3]

Unsupervised Learning
Dimensionality reduction, manifold learning
– Discover a lower-dimensional surface on which the data lives
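
A minimal sketch using PCA, the simplest linear instance of this idea (the slide speaks more generally of manifold learning), on made-up nearly one-dimensional data:

```python
import numpy as np

rng = np.random.default_rng(0)
t = rng.normal(size=100)
X = np.column_stack([t, 2 * t]) + 0.05 * rng.normal(size=(100, 2))  # ~1-D data in 2-D

Xc = X - X.mean(axis=0)            # center the data
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
Z = Xc @ Vt[:1].T                  # project onto the top principal direction

print(Vt[0])    # direction of maximum variance, ~ (1, 2)/sqrt(5) up to sign
print(Z.shape)  # (100, 1): each point reduced to one coordinate
```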

Unsupervised Learning
Density estimation
– Find a function that approximates the probability density of the data (i.e., the value of the function is high for "typical" points and low for "atypical" points)
– Can be used for anomaly detection
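
A minimal kernel density estimation sketch (one standard method among several) on made-up data:

```python
import numpy as np

def kde(query, data, bandwidth=0.3):
    """Gaussian kernel density estimate: high for 'typical' points,
    low for 'atypical' ones."""
    diffs = (query[:, None] - data[None, :]) / bandwidth
    return np.exp(-0.5 * diffs ** 2).sum(axis=1) / (len(data) * bandwidth * np.sqrt(2 * np.pi))

data = np.random.default_rng(0).normal(0, 1, 500)  # 1-D samples around 0
queries = np.array([0.0, 5.0])
print(kde(queries, data))  # high near 0 (typical), ~0 at 5 (candidate anomaly)
```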

Semi-supervised learning
Lots of data is available, but only a small portion is labeled (e.g., because labeling is expensive)
– Why is learning from labeled and unlabeled data better than learning from labeled data alone?

Active learning
The learning algorithm can choose its own training examples, or ask a "teacher" for an answer on selected inputs
S. Vijayanarasimhan and K. Grauman, "Cost-Sensitive Active Visual Category Learning," 2009
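
One common instantiation is uncertainty sampling: query the unlabeled example the current model is least confident about. A minimal sketch with made-up probabilities (the cited paper uses a more sophisticated cost-sensitive criterion):

```python
import numpy as np

# Suppose the current classifier outputs P(class 1 | x) for each unlabeled x:
probs = np.array([0.95, 0.51, 0.10, 0.62])

# Query the example whose prediction is closest to 0.5 (most uncertain).
query_idx = np.argmin(np.abs(probs - 0.5))
print("ask the teacher to label example", query_idx)  # -> 1
```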

Lifelong learning

Xinlei Chen, Abhinav Shrivastava, and Abhinav Gupta. NEIL: Extracting Visual Knowledge from Web Data. In ICCV 2013.