ECE 5424: Introduction to Machine Learning

ECE 5424: Introduction to Machine Learning
Topics:
  (Finish) Regression
  Model selection, Cross-validation
  Error decomposition
Readings: Barber 17.1, 17.2
Stefan Lee, Virginia Tech

Administrative
  Project Proposal
    Due: Fri 09/23, 11:55 pm (NOTE: DEADLINE SHIFTED)
    <=2 pages, NIPS format
  HW2
    Due: Wed 09/28, 11:55 pm
    Implement linear regression, Naïve Bayes, Logistic Regression
  Reminder
    Participation on Scholar forum is part of your grade
    Ask questions if you have them!
(C) Dhruv Batra

Recap of last time

Regression

Slide Credit: Greg Shakhnarovich

But, why? Why sum squared error??? Gaussians, Watson, Gaussians…
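One way to answer the question on this slide, sketched under the assumption of i.i.d. Gaussian noise with fixed variance σ² (the model y ~ N(w’x, σ²) that appears later in the deck). The sample size N and the index i are notation introduced here, not taken from the slides.

```latex
\begin{align*}
p(y_i \mid x_i, w) &= \frac{1}{\sqrt{2\pi\sigma^2}}
    \exp\!\left(-\frac{(y_i - w^\top x_i)^2}{2\sigma^2}\right) \\
\log \prod_{i=1}^{N} p(y_i \mid x_i, w) &= -\frac{N}{2}\log(2\pi\sigma^2)
    - \frac{1}{2\sigma^2}\sum_{i=1}^{N} (y_i - w^\top x_i)^2 \\
\hat{w}_{\mathrm{MLE}} &= \arg\max_w \log \prod_{i=1}^{N} p(y_i \mid x_i, w)
    = \arg\min_w \sum_{i=1}^{N} (y_i - w^\top x_i)^2
\end{align*}
```

The first term of the log-likelihood does not depend on w, so maximizing the likelihood is the same as minimizing the sum of squared errors.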

Is OLS Robust?
  Demo: http://www.calpoly.edu/~srein/StatDemo/All.html
  Bad things happen when the data does not come from your model!
  How do we fix this?
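A minimal sketch of the effect the demo is meant to show, using made-up data rather than anything from the lecture: a single corrupted point noticeably shifts the least-squares line. The helper name fit_ols and all numbers are assumptions for illustration.

```python
import numpy as np

def fit_ols(x, y):
    """Least-squares fit of y ~= w0 + w1 * x; returns [w0, w1]."""
    A = np.column_stack([np.ones(len(x)), x])   # bias column plus the input
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return w

x = np.arange(10, dtype=float)
y_clean = 2.0 * x + 1.0          # noiseless line: intercept 1, slope 2
y_outlier = y_clean.copy()
y_outlier[-1] += 100.0           # corrupt a single observation

w_clean = fit_ols(x, y_clean)
w_outlier = fit_ols(x, y_outlier)
print("clean fit:  ", w_clean)
print("outlier fit:", w_outlier)
```

Because the squared loss grows quadratically with the residual, the one outlier dominates the objective and drags the slope away from 2.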

Robust Linear Regression
  y ~ Lap(w’x, b)
  On paper
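The derivation was done on paper in the lecture and is not part of the transcript. A sketch of the usual argument, assuming i.i.d. observations and a fixed scale b (N and i are notation introduced here):

```latex
\begin{align*}
p(y_i \mid x_i, w) &= \frac{1}{2b}\exp\!\left(-\frac{|y_i - w^\top x_i|}{b}\right) \\
-\log \prod_{i=1}^{N} p(y_i \mid x_i, w) &= N\log(2b) + \frac{1}{b}\sum_{i=1}^{N} |y_i - w^\top x_i| \\
\hat{w}_{\mathrm{MLE}} &= \arg\min_w \sum_{i=1}^{N} |y_i - w^\top x_i|
\end{align*}
```

The absolute-value loss grows only linearly in the residual, which is why a Laplace likelihood is less sensitive to outliers than a Gaussian one.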

Plan for Today
  (Finish) Regression
    Bayesian Regression
    Different prior vs likelihood combination
    Polynomial Regression
  Error Decomposition
    Bias-Variance
    Cross-validation

Robustify via Prior
  Ridge Regression
  y ~ N(w’x, σ²)
  w ~ N(0, t²I)
  P(w | x,y) =
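A hedged sketch of where this model leads. Taking the negative log of the posterior under the two Gaussians above gives, up to constants, ||y - Xw||² + (σ²/t²)||w||², so the MAP estimate is the ridge solution with λ = σ²/t². The function name ridge_map and the synthetic data are assumptions for illustration.

```python
import numpy as np

def ridge_map(X, y, sigma2, t2):
    """MAP weights for y ~ N(Xw, sigma2 * I) with prior w ~ N(0, t2 * I).

    Closed form: w = (X^T X + lam * I)^{-1} X^T y, with lam = sigma2 / t2.
    """
    lam = sigma2 / t2
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# Synthetic data, made up for this sketch.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + 0.1 * rng.normal(size=50)

w_weak_prior = ridge_map(X, y, sigma2=0.01, t2=1e6)     # nearly flat prior
w_strong_prior = ridge_map(X, y, sigma2=0.01, t2=1e-4)  # tight prior around 0
print(w_weak_prior, w_strong_prior)
```

A very broad prior (large t²) recovers ordinary least squares, while a tight prior shrinks the weights toward zero.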

Summary
  Likelihood   Prior      Name
  Gaussian     Uniform    Least Squares
  Gaussian     Gaussian   Ridge Regression
  Gaussian     Laplace    Lasso
  Laplace      Uniform    Robust Regression
  Student      Uniform    Robust Regression

Example
  Demo: http://www.princeton.edu/~rkatzwer/PolynomialRegression/
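For readers without access to the demo, a small sketch of polynomial regression as linear regression on the features 1, x, x², and so on. The data, the degrees tried, and the helper names are assumptions chosen for illustration.

```python
import numpy as np

def fit_poly(x, y, degree):
    """Least-squares fit of y ~= w0 + w1*x + ... + w_degree * x**degree."""
    Phi = np.vander(x, degree + 1, increasing=True)   # columns: 1, x, x^2, ...
    w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    return w

def train_mse(w, x, y):
    """Mean squared error of the polynomial with weights w on (x, y)."""
    Phi = np.vander(x, len(w), increasing=True)
    return float(np.mean((Phi @ w - y) ** 2))

# Synthetic data, made up for this sketch.
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 15)
y = np.sin(3.0 * x) + 0.1 * rng.normal(size=x.size)

train_errors = {d: train_mse(fit_poly(x, y, d), x, y) for d in (1, 3, 9)}
print(train_errors)
```

The model stays linear in w even though it is nonlinear in x, and the training error can only go down as the degree grows because each model class contains the previous one.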

What you need to know
  Linear Regression
    Model
    Least Squares Objective
    Connections to Max Likelihood with Gaussian Conditional
    Robust regression with Laplacian Likelihood
    Ridge Regression with priors
    Polynomial and General Additive Regression

New Topic: Model Selection and Error Decomposition

Example for Regression
  Demo: http://www.princeton.edu/~rkatzwer/PolynomialRegression/
  How do we pick the hypothesis class?

Model Selection
  How do we pick the right model class?
  Similar questions:
    How do I pick magic hyper-parameters?
    How do I do feature selection?
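One common answer to these questions is k-fold cross-validation, listed in this lecture's topics. A minimal sketch that picks a polynomial degree this way; the data, the range of degrees, k = 5, and the helper names are all assumptions for illustration.

```python
import numpy as np

def poly_features(x, degree):
    """Feature matrix with columns 1, x, ..., x**degree."""
    return np.vander(x, degree + 1, increasing=True)

def kfold_mse(x, y, degree, k=5):
    """Average held-out mean squared error over k folds."""
    all_idx = np.arange(len(x))
    errors = []
    for held_out in np.array_split(all_idx, k):
        train = np.setdiff1d(all_idx, held_out)
        w, *_ = np.linalg.lstsq(poly_features(x[train], degree), y[train], rcond=None)
        pred = poly_features(x[held_out], degree) @ w
        errors.append(np.mean((pred - y[held_out]) ** 2))
    return float(np.mean(errors))

# Synthetic data, made up for this sketch; x is drawn at random, so
# contiguous index folds are already a random partition.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=60)
y = np.sin(3.0 * x) + 0.1 * rng.normal(size=60)

degrees = list(range(0, 10))
cv_errors = [kfold_mse(x, y, d) for d in degrees]
best_degree = degrees[int(np.argmin(cv_errors))]
print(best_degree, cv_errors)
```

The same loop works for other hyper-parameters (for example a ridge penalty) by swapping what is varied inside the list comprehension.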

Errors
  Expected Loss/Error
  Training Loss/Error
  Validation Loss/Error
  Test Loss/Error
  Reporting Training Error (instead of Test) is CHEATING
  Optimizing parameters on Test Error is CHEATING
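A sketch of how the three empirical errors are typically kept apart in practice: weights are fit on the training split, the model class is chosen on the validation split, and the test split is evaluated once at the end. The 60/20/20 split, the data, and the helper names are assumptions for illustration.

```python
import numpy as np

def features(x, degree):
    """Polynomial feature matrix with columns 1, x, ..., x**degree."""
    return np.vander(x, degree + 1, increasing=True)

def mse(w, X, y):
    return float(np.mean((X @ w - y) ** 2))

# Synthetic data, made up for this sketch.
rng = np.random.default_rng(1)
x = rng.uniform(-1.0, 1.0, size=300)
y = np.sin(3.0 * x) + 0.1 * rng.normal(size=300)

idx = rng.permutation(300)
train, val, test = idx[:180], idx[180:240], idx[240:]

# Fit weights on the training split; score each degree on the validation split.
results = {}
for degree in range(0, 8):
    w, *_ = np.linalg.lstsq(features(x[train], degree), y[train], rcond=None)
    results[degree] = (w, mse(w, features(x[val], degree), y[val]))

best = min(results, key=lambda d: results[d][1])
w_best = results[best][0]
train_err = mse(w_best, features(x[train], best), y[train])
val_err = results[best][1]
test_err = mse(w_best, features(x[test], best), y[test])   # evaluated exactly once
print(best, train_err, val_err, test_err)
```

Only test_err is an honest estimate of the expected error here, since neither the weights nor the degree were chosen using the test split.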

Typical Behavior

Overfitting (Slide Credit: Carlos Guestrin)
  Overfitting: a learning algorithm overfits the training data if it outputs a solution w when there exists another solution w’ such that:
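The condition that followed on the slide did not survive in the transcript. The usual textbook form of this definition, given here as an assumption rather than as the slide's exact content, is:

```latex
\mathrm{error}_{\mathrm{train}}(w) < \mathrm{error}_{\mathrm{train}}(w')
\quad\text{and}\quad
\mathrm{error}_{\mathrm{true}}(w') < \mathrm{error}_{\mathrm{true}}(w)
```

In words: w looks better than w’ on the training data but is worse on the underlying distribution.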

Error Decomposition
  [Figure labels: Reality; model class; Modeling Error; Estimation Error; Optimization Error]
  Modeling: right off the bat
  Estimation: finite sample
  Optimization: non-convex optimization error

Error Decomposition
  [Figure labels: Reality; model class; Modeling Error; Estimation Error; Optimization Error]
  Modeling error goes up, estimation error decreases, possible optimization error too

Error Decomposition
  [Figure labels: Reality; model class; Higher-Order Potentials; Modeling Error; Estimation Error; Optimization Error]

Error Decomposition
  Approximation/Modeling Error: You approximated reality with model
  Estimation Error: You tried to learn model with finite data
  Optimization Error: You were lazy and couldn’t/didn’t optimize to completion
  (Next time) Bayes Error: Reality just sucks
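One way to write this decomposition down, using notation that is assumed here and not taken from the slides: R is the expected loss, f* the best possible predictor, f_H the best predictor in the model class H, f̂_n the minimizer of the training loss over H on n samples, and f̃ what the optimizer actually returned.

```latex
\begin{align*}
R(\tilde{f}) =
  \underbrace{R(f^{*})}_{\text{Bayes error}}
  + \underbrace{R(f_{\mathcal{H}}) - R(f^{*})}_{\text{approximation/modeling error}}
  + \underbrace{R(\hat{f}_n) - R(f_{\mathcal{H}})}_{\text{estimation error}}
  + \underbrace{R(\tilde{f}) - R(\hat{f}_n)}_{\text{optimization error}}
\end{align*}
```

The sum telescopes, so the identity holds by construction; the value of the decomposition is in naming which choice (model class, sample size, optimizer) each term depends on.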