Linear regression with one variable: Cost function
Training set:
  Size in feet^2 (x) | Price ($) in 1000's (y)
  2104 | 460
  1416 | 232
  1534 | 315
   852 | 178
   ... | ...
Hypothesis: h_\theta(x) = \theta_0 + \theta_1 x
Parameters: \theta_0, \theta_1
How to choose the \theta's?
Idea: Choose \theta_0, \theta_1 so that h_\theta(x) is close to y for our training examples (x, y).
Linear regression with one variable: Cost function intuition I
Simplified setting:
Hypothesis: h_\theta(x) = \theta_1 x   (i.e. \theta_0 = 0)
Parameters: \theta_1
Cost function: J(\theta_1) = \frac{1}{2m} \sum_{i=1}^{m} (h_\theta(x^{(i)}) - y^{(i)})^2
Goal: \min_{\theta_1} J(\theta_1)
[Plots: left, h_\theta(x) for a fixed \theta_1 (a function of x, with the training points); right, J(\theta_1) (a function of the parameter \theta_1), evaluated at several values of \theta_1.]
Linear regression with one variable: Cost function intuition II
Hypothesis: h_\theta(x) = \theta_0 + \theta_1 x
Parameters: \theta_0, \theta_1
Cost function: J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} (h_\theta(x^{(i)}) - y^{(i)})^2
Goal: \min_{\theta_0, \theta_1} J(\theta_0, \theta_1)
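A minimal Octave sketch of this cost function, assuming x is an m-by-1 vector of sizes and y an m-by-1 vector of prices (the function and variable names are illustrative, not from the slides):

% computeCost.m -- squared-error cost for univariate linear regression
function J = computeCost(x, y, theta0, theta1)
  m = length(y);                            % number of training examples
  h = theta0 + theta1 * x;                  % hypothesis evaluated at every example
  J = (1 / (2 * m)) * sum((h - y) .^ 2);    % average of the halved squared errors
end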
[Plots: left, h_\theta(x) for fixed \theta_0, \theta_1 (a function of x; axes: Size in feet^2 (x) vs. Price ($) in 1000's); right, J(\theta_0, \theta_1) (a function of the parameters \theta_0, \theta_1), shown as a surface plot and as contour plots for several choices of \theta_0, \theta_1.]
Linear regression with one variable: Gradient descent
Have some function J(\theta_0, \theta_1). Want \min_{\theta_0, \theta_1} J(\theta_0, \theta_1).
Outline: Start with some \theta_0, \theta_1. Keep changing \theta_0, \theta_1 to reduce J(\theta_0, \theta_1), until we hopefully end up at a minimum.
[Two surface plots of J(\theta_0, \theta_1) over the (\theta_0, \theta_1) plane.]
Gradient descent algorithm:
repeat until convergence {
  \theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta_0, \theta_1)   (for j = 0 and j = 1)
}
Correct (simultaneous update):
  temp0 := \theta_0 - \alpha \frac{\partial}{\partial \theta_0} J(\theta_0, \theta_1)
  temp1 := \theta_1 - \alpha \frac{\partial}{\partial \theta_1} J(\theta_0, \theta_1)
  \theta_0 := temp0
  \theta_1 := temp1
Incorrect (overwrites \theta_0 before computing \theta_1's update):
  temp0 := \theta_0 - \alpha \frac{\partial}{\partial \theta_0} J(\theta_0, \theta_1)
  \theta_0 := temp0
  temp1 := \theta_1 - \alpha \frac{\partial}{\partial \theta_1} J(\theta_0, \theta_1)
  \theta_1 := temp1
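A minimal Octave sketch of the simultaneous update. To keep the snippet runnable on its own it uses a made-up quadratic J(\theta_0, \theta_1) = \theta_0^2 + \theta_1^2 and an illustrative starting point; the actual J for linear regression appears a few slides later.

dJ_dtheta0 = @(t0, t1) 2 * t0;    % partial derivative w.r.t. theta0 for the toy J
dJ_dtheta1 = @(t0, t1) 2 * t1;    % partial derivative w.r.t. theta1 for the toy J
alpha = 0.1;  theta0 = 3;  theta1 = -2;               % illustrative values
temp0 = theta0 - alpha * dJ_dtheta0(theta0, theta1);  % both temps use the OLD thetas
temp1 = theta1 - alpha * dJ_dtheta1(theta0, theta1);
theta0 = temp0;   % overwrite the parameters only after both updates are computed
theta1 = temp1;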
Linear regression with one variable: Gradient descent intuition
Gradient descent algorithm (one parameter, for intuition): repeat { \theta_1 := \theta_1 - \alpha \frac{d}{d\theta_1} J(\theta_1) }. If the derivative is positive, \theta_1 decreases; if negative, \theta_1 increases; either way the step moves downhill.
If α is too small, gradient descent can be slow. If α is too large, gradient descent can overshoot the minimum. It may fail to converge, or even diverge.
At a local optimum the derivative is zero, so the update \theta_1 := \theta_1 - \alpha \cdot 0 leaves the current value of \theta_1 unchanged.
Gradient descent can converge to a local minimum even with the learning rate α held fixed: as we approach a local minimum, the derivative shrinks toward zero, so gradient descent automatically takes smaller steps. There is no need to decrease α over time.
Linear regression with one variable: Gradient descent for linear regression
Gradient descent algorithm:
repeat until convergence {
  \theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta_0, \theta_1)   (for j = 0 and j = 1)
}
Linear regression model:
  h_\theta(x) = \theta_0 + \theta_1 x
  J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} (h_\theta(x^{(i)}) - y^{(i)})^2
Gradient descent algorithm with the derivatives plugged in:
repeat until convergence {
  \theta_0 := \theta_0 - \alpha \frac{1}{m} \sum_{i=1}^{m} (h_\theta(x^{(i)}) - y^{(i)})
  \theta_1 := \theta_1 - \alpha \frac{1}{m} \sum_{i=1}^{m} (h_\theta(x^{(i)}) - y^{(i)}) \, x^{(i)}
}
Update \theta_0 and \theta_1 simultaneously; a code sketch follows the worked example below.
Gradient descent example: \theta_1 = 2, \theta_0 = -1, \alpha = 0.01
  x | y  | h = \theta_0 + \theta_1 x | h - y | (h - y) x
  1 | 2  | 1                         | -1    | -1
  3 | 6  | 5                         | -1    | -3
  5 | 10 | 9                         | -1    | -5
Sums: \sum (h - y) = -3 and \sum (h - y) x = -9, so one step gives \theta_0 := -1 - 0.01 \cdot \frac{1}{3}(-3) = -0.99 and \theta_1 := 2 - 0.01 \cdot \frac{1}{3}(-9) = 2.03.
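A small Octave sketch that runs these updates on the example data above (the variable names and iteration count are my own, for illustration):

x = [1; 3; 5];              % inputs from the table above
y = [2; 6; 10];             % targets from the table above
theta0 = -1;  theta1 = 2;   % initial parameters from the example
alpha = 0.01;               % learning rate
m = length(y);
for iter = 1:5000
  h = theta0 + theta1 * x;              % hypothesis on all examples
  grad0 = (1 / m) * sum(h - y);         % partial derivative w.r.t. theta0
  grad1 = (1 / m) * sum((h - y) .* x);  % partial derivative w.r.t. theta1
  theta0 = theta0 - alpha * grad0;      % simultaneous update: both gradients were
  theta1 = theta1 - alpha * grad1;      % computed from the old theta0, theta1
end
% After many iterations theta0 approaches 0 and theta1 approaches 2,
% since the data lie exactly on the line y = 2x.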
[Surface plot of J(\theta_0, \theta_1); axes \theta_0 and \theta_1.]
[Plots at successive gradient descent iterations: left, h_\theta(x) for the current fixed \theta_0, \theta_1 (a function of x); right, the contour plot of J(\theta_0, \theta_1) (a function of the parameters), with the current (\theta_0, \theta_1) marked as it moves toward the minimum.]
Logistic Regression: Classification
Classification examples:
  Email: Spam / Not Spam?
  Online transactions: Fraudulent (Yes / No)?
  Tumor: Malignant / Benign?
The label y takes one of two values:
  0: "Negative class" (e.g., benign tumor)
  1: "Positive class" (e.g., malignant tumor)
Classification: y = 0 or 1. If we apply linear regression, h_\theta(x) can be > 1 or < 0, even though y is always 0 or 1.
Logistic regression: 0 \le h_\theta(x) \le 1.
Logistic Regression: Hypothesis Representation
Logistic regression model: want 0 \le h_\theta(x) \le 1.
  h_\theta(x) = g(\theta^T x), where g(z) = \frac{1}{1 + e^{-z}}
g is called the sigmoid function, or logistic function; it rises from 0 toward 1 and crosses 0.5 at z = 0.
Logistic regression: h_\theta(x) = g(\theta^T x) with g(z) = \frac{1}{1 + e^{-z}}.
Suppose we predict "y = 1" if h_\theta(x) \ge 0.5, i.e. whenever \theta^T x \ge 0,
and predict "y = 0" if h_\theta(x) < 0.5, i.e. whenever \theta^T x < 0.
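A minimal Octave sketch of this hypothesis and decision rule (theta and x below are illustrative placeholders, not values from the slides):

g = @(z) 1 ./ (1 + exp(-z));    % sigmoid / logistic function
theta = [-3; 1; 1];             % hypothetical parameters
x = [1; 2; 4];                  % one example, with x0 = 1 as the intercept term
h = g(theta' * x);              % h_theta(x), a number between 0 and 1
prediction = (h >= 0.5);        % predict y = 1 exactly when theta' * x >= 0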
Logistic Regression: Cost function
Training set: {(x^{(1)}, y^{(1)}), (x^{(2)}, y^{(2)}), ..., (x^{(m)}, y^{(m)})}, m examples, with x \in R^{n+1} (x_0 = 1) and y \in {0, 1}.
Hypothesis: h_\theta(x) = \frac{1}{1 + e^{-\theta^T x}}.
How to choose the parameters \theta?
Cost function. Linear regression uses J(\theta) = \frac{1}{m} \sum_{i=1}^{m} \frac{1}{2} (h_\theta(x^{(i)}) - y^{(i)})^2. With the sigmoid hypothesis plugged in, this squared-error cost is "non-convex" (many local optima), so we want a different cost that is "convex".
Logistic regression cost function, case y = 1: Cost(h_\theta(x), y) = -\log(h_\theta(x)).
Cost = 0 if y = 1 and h_\theta(x) = 1; as h_\theta(x) \to 0, Cost \to \infty, a very large penalty for confidently predicting 0 when y = 1.
Logistic regression cost function, case y = 0: Cost(h_\theta(x), y) = -\log(1 - h_\theta(x)).
Cost = 0 if y = 0 and h_\theta(x) = 0; as h_\theta(x) \to 1, Cost \to \infty.
Logistic Regression: Simplified cost function and gradient descent
Logistic regression cost function:
  J(\theta) = \frac{1}{m} \sum_{i=1}^{m} Cost(h_\theta(x^{(i)}), y^{(i)})
  Cost(h_\theta(x), y) = -y \log(h_\theta(x)) - (1 - y) \log(1 - h_\theta(x))
Since y is always 0 or 1, this single expression reproduces both cases above.
J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log h_\theta(x^{(i)}) + (1 - y^{(i)}) \log(1 - h_\theta(x^{(i)})) \right]
To fit parameters \theta: \min_\theta J(\theta).
To make a prediction given a new x: output h_\theta(x) = \frac{1}{1 + e^{-\theta^T x}}, interpreted as the estimated probability that y = 1 on input x.
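A minimal Octave sketch of this cost, assuming X is an m-by-(n+1) design matrix whose first column is all ones and y is an m-by-1 vector of 0/1 labels (names are illustrative):

% logisticCost.m -- cross-entropy cost for logistic regression
function J = logisticCost(theta, X, y)
  m = length(y);
  h = 1 ./ (1 + exp(-X * theta));   % sigmoid applied to every example
  J = -(1 / m) * sum(y .* log(h) + (1 - y) .* log(1 - h));
end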
Gradient descent. Want \min_\theta J(\theta):
repeat {
  \theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta)
}   (simultaneously update all \theta_j)
Plugging in the derivative of the logistic regression cost:
repeat {
  \theta_j := \theta_j - \alpha \frac{1}{m} \sum_{i=1}^{m} (h_\theta(x^{(i)}) - y^{(i)}) \, x_j^{(i)}
}   (simultaneously update all \theta_j)
The algorithm looks identical to linear regression! What has changed is the hypothesis: h_\theta(x) is now \frac{1}{1 + e^{-\theta^T x}} rather than \theta_0 + \theta_1 x.
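A vectorized Octave sketch of this update loop; the tiny dataset, learning rate, and iteration count are made up purely for illustration:

X = [1 1; 1 2; 1 4; 1 5];            % hypothetical data; first column is x0 = 1
y = [0; 0; 1; 1];                    % hypothetical 0/1 labels
alpha = 0.1;                         % illustrative learning rate
theta = zeros(size(X, 2), 1);        % start from all-zero parameters
m = length(y);
for iter = 1:400
  h = 1 ./ (1 + exp(-X * theta));    % current predictions for all examples
  grad = (1 / m) * (X' * (h - y));   % all partial derivatives computed at once
  theta = theta - alpha * grad;      % one simultaneous update of every theta_j
end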
Chain rule: if J depends on \theta_j only through z = \theta^T x, then \frac{\partial J}{\partial \theta_j} = \frac{\partial J}{\partial z} \cdot \frac{\partial z}{\partial \theta_j}.
Derivation of the logistic regression gradient.
Now derive \frac{\partial}{\partial \theta_j} J(\theta) from J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log h_\theta(x^{(i)}) + (1 - y^{(i)}) \log(1 - h_\theta(x^{(i)})) \right], using h_\theta(x) = g(\theta^T x) and the sigmoid derivative g'(z) = g(z)(1 - g(z)).
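A compact version of that derivation, written out as a reference (standard calculus; the intermediate steps are mine, not copied from the slides):

\begin{aligned}
g(z) &= \frac{1}{1+e^{-z}}, \qquad g'(z) = g(z)\,(1 - g(z)) \\
\frac{\partial}{\partial \theta_j} \log h_\theta(x) &= \frac{g'(\theta^T x)}{g(\theta^T x)}\, x_j = (1 - h_\theta(x))\, x_j \\
\frac{\partial}{\partial \theta_j} \log\!\big(1 - h_\theta(x)\big) &= \frac{-\,g'(\theta^T x)}{1 - g(\theta^T x)}\, x_j = -\,h_\theta(x)\, x_j \\
\frac{\partial}{\partial \theta_j} J(\theta)
 &= -\frac{1}{m} \sum_{i=1}^{m} \Big[ y^{(i)} \big(1 - h_\theta(x^{(i)})\big) - \big(1 - y^{(i)}\big)\, h_\theta(x^{(i)}) \Big]\, x_j^{(i)} \\
 &= \frac{1}{m} \sum_{i=1}^{m} \big( h_\theta(x^{(i)}) - y^{(i)} \big)\, x_j^{(i)}
\end{aligned}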
theta = [theta_0; theta_1; ...; theta_n]   (an (n+1)-dimensional parameter vector)

function [jVal, gradient] = costFunction(theta)
  jVal = [code to compute J(theta)];
  gradient(1) = [code to compute \partial J / \partial \theta_0];
  gradient(2) = [code to compute \partial J / \partial \theta_1];
  ...
  gradient(n+1) = [code to compute \partial J / \partial \theta_n];
end
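Such a costFunction handle can then be handed to an advanced optimizer; a minimal Octave sketch using the built-in fminunc (the option values and starting point are illustrative):

options = optimset('GradObj', 'on', 'MaxIter', 100);   % tell fminunc we supply the gradient
initialTheta = zeros(n + 1, 1);                        % n+1 parameters, as in the template above
[optTheta, functionVal, exitFlag] = fminunc(@costFunction, initialTheta, options);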