Presentation transcript:

CORRECTIONS
L2 regularization uses $\|w\|_2^2$, not $\|w\|_2$.
On exams, show that the second derivative is positive (or negative), or show convexity directly.
– The latter is easier (e.g., for $x^2$).
Loss = error associated with one data point.
Risk = sum of all losses.
The pseudoinverse gives the least-squares solution, NOT an exact solution.
The magnitude of w matters for SVMs.
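As a quick illustration of the convexity check above (my own sketch, not part of the original slides): for $f(x) = x^2$ the second derivative is positive everywhere, and the same argument covers the ridge-regularized least-squares objective.

```latex
\[
f(x) = x^2, \qquad f''(x) = 2 > 0 \;\Rightarrow\; f \text{ is convex.}
\]
\[
J(w) = \|Xw - y\|_2^2 + \lambda \|w\|_2^2, \qquad
\nabla^2 J(w) = 2X^\top X + 2\lambda I \succ 0 \ \text{for } \lambda > 0
\;\Rightarrow\; J \text{ is strictly convex.}
\]
```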

HW 3
Will be released today. Probably harder than HW1 or HW2.
Due Oct 6 (two Tuesdays from now).
HW party: Oct 1.
I wrote (some of) it.

Downsides of using kernels
Speed & memory
– Need to store all the training data; each test point must be evaluated against every training point.
– SVMs only need a subset of the data (the support vectors).
Overfitting
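A minimal sketch (my own illustration, not from the slides) of the speed and memory point: a generic kernelized predictor evaluates the kernel against every stored training point, while an SVM only needs its support vectors. The function names and the RBF kernel choice are assumptions for illustration.

```python
import numpy as np

def rbf_kernel(x, z, gamma=1.0):
    """Gaussian (RBF) kernel between two feature vectors."""
    return np.exp(-gamma * np.sum((x - z) ** 2))

def kernel_predict(x_test, X_train, alphas, b=0.0, gamma=1.0):
    """Generic kernel predictor: O(n_train) kernel evaluations per test
    point, and all of X_train must stay in memory."""
    return sum(a * rbf_kernel(x_test, x_i, gamma)
               for a, x_i in zip(alphas, X_train)) + b

def svm_predict(x_test, support_vectors, sv_alphas, b=0.0, gamma=1.0):
    """SVM predictor: most alphas are exactly zero, so only the support
    vectors (a small subset of the training data) are stored and used."""
    return sum(a * rbf_kernel(x_test, x_i, gamma)
               for a, x_i in zip(sv_alphas, support_vectors)) + b
```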

3 Perspectives on Linear Regression

1. Minimize Loss (see lecture)
Take the derivative of $\|Xw - y\|_2^2$ with respect to w and set it to 0.
Result: $X^\top X w = X^\top y$ (the normal equations).
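A small numerical sketch of this result (not from the slides): solving the normal equations directly and calling a least-squares solver give the same answer. In practice np.linalg.lstsq (or a QR/SVD-based solver) is preferred over forming $X^\top X$ explicitly.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))          # design matrix
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + 0.1 * rng.normal(size=100)

# Solve the normal equations X^T X w = X^T y.
w_normal = np.linalg.solve(X.T @ X, X.T @ y)

# Equivalent (and numerically safer) least-squares solve.
w_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print(w_normal, w_lstsq)               # both recover something close to w_true
```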

2. Projections
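The original slide here was a figure; as a hedged sketch of the projection view (my own example): the fitted values $\hat{y} = Xw$ are the orthogonal projection of y onto the column space of X, so the residual is orthogonal to every column of X.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 2))
y = rng.normal(size=50)

# Least-squares fit.
w, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ w                          # projection of y onto col(X)
residual = y - y_hat

# The residual is (numerically) orthogonal to the columns of X.
print(X.T @ residual)                  # ~ [0, 0]

# Equivalently, y_hat = P y with the projection ("hat") matrix P.
P = X @ np.linalg.inv(X.T @ X) @ X.T
print(np.allclose(P @ y, y_hat))       # True
```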

3. Gaussian noise
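The slide itself carried no text; the standard argument behind this perspective (assuming i.i.d. Gaussian noise, $\varepsilon_i \sim \mathcal{N}(0, \sigma^2)$) is sketched below: maximizing the likelihood of w is equivalent to minimizing the squared loss.

```latex
\[
y_i = w^\top x_i + \varepsilon_i, \qquad
p(y_i \mid x_i, w) = \frac{1}{\sqrt{2\pi\sigma^2}}
  \exp\!\Big(-\frac{(y_i - w^\top x_i)^2}{2\sigma^2}\Big)
\]
\[
\log p(y \mid X, w) = -\frac{1}{2\sigma^2}\sum_{i=1}^{n} (y_i - w^\top x_i)^2
  + \text{const}
\]
\[
\hat{w}_{\text{MLE}} = \arg\max_w \log p(y \mid X, w)
  = \arg\min_w \|Xw - y\|_2^2
\]
```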

HW 3 – first problem has a question on this

Bias & Variance Bias: – Incorrect assumptions in your model – Your algorithm is only able to capture models of complexity C Variance – Sensitivity of your algorithm to noise in the data. – How much your model changes per “unit” change in the data.

Bias & Variance Bias vs. variance is a tradeoff Bias – you assume data is linear, when it’s nonlinear. Variance – you assume data could be polynomial, when it’s always linear. – By assuming data could be polynomial, lots of free parameters that move around if the training data changes. – High variance = “overfitting”

Bias & Variance If variance if too high, will often add bias in order to reduce variance. This is the reason regularization exists. – Increase bias, reduce variance. Usually depends on amount of data – More data  fix down all those free parameters. Will revisit this with random forests.

Problem 1
a) Do at home.
b) Follow the Gaussian noise interpretation of linear regression.

Problem 2 Credit: Yun Park


Problem 3 & 4 3) Write loss function, find derivative. 4) Practice problems – “Extra for experts” is inaccurate – there is a very simple answer.