Logistic Regression
Linear regression fits a line to a set of points.
Given x, you can use the line to predict y.
Logistic regression fits a logistic function (sigmoid) to a set of points and binary labels.
Given a new point, the sigmoid gives the predicted probability that the class is positive.
Logistic Regression
For ease of notation, let x = (x0, x1, …, xn), where x0 = 1. Let w = (w0, w1, …, wn), where w0 is the bias weight. Class y ∈ {0, 1}.
The model is:

    P(y = 1 | x, w) = 1 / (1 + e^(−w·x)),    P(y = 0 | x, w) = 1 − P(y = 1 | x, w)
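As a concrete illustration (not part of the slides), here is a minimal NumPy sketch of the model; the names sigmoid and prob_positive are my own:

```python
import numpy as np

def sigmoid(z):
    # Logistic function: maps any real z into (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

def prob_positive(w, x):
    # P(y = 1 | x, w) under the logistic regression model.
    # Both w and x include the bias component (x0 = 1, w0 = bias weight).
    return sigmoid(np.dot(w, x))
```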
Learning: use the training data to determine the weights w.
To classify a new x, assign the class y that maximizes P(y | x).
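A short sketch of that decision rule, reusing prob_positive from above: with two classes, picking the y that maximizes P(y | x) is the same as thresholding the predicted probability at 0.5.

```python
def classify(w, x):
    # Assign y = 1 iff P(y = 1 | x, w) >= 0.5, i.e. iff w . x >= 0.
    return 1 if prob_positive(w, x) >= 0.5 else 0
```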
Logistic Regression: Learning Weights
Goal is to learn the weights w. Let (x^j, y^j) be the j-th training example and its label. We want:

    w ← argmax_w ∏_j P(y^j | x^j, w)

This is equivalent to:

    w ← argmax_w ∑_j ln P(y^j | x^j, w)

The quantity ∑_j ln P(y^j | x^j, w) is called the "log conditional likelihood".
We can write the log conditional likelihood this way:

    l(w) = ∑_l [ y^l ln P(y^l = 1 | x^l, w) + (1 − y^l) ln P(y^l = 0 | x^l, w) ]

Since y^l is either 0 or 1, exactly one of the two terms inside the brackets is nonzero for each example. This is what we want to maximize.
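Continuing the sketch (reusing sigmoid and np from above), this is one way to compute l(w). X is assumed to hold one bias-augmented example per row; the clipping constant is purely a numerical-safety choice, not something from the slides:

```python
def log_conditional_likelihood(w, X, y):
    # l(w) = sum_l [ y^l ln p^l + (1 - y^l) ln (1 - p^l) ],
    # where p^l = P(y^l = 1 | x^l, w).
    p = sigmoid(X @ w)
    p = np.clip(p, 1e-12, 1 - 1e-12)  # guard against log(0)
    return np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))
```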
Use gradient ascent to maximize l(w). This is called "maximum likelihood estimation" (MLE).

Recall:

    P(y = 1 | x, w) = 1 / (1 + e^(−w·x))

We have:

    l(w) = ∑_l [ y^l ln P(y^l = 1 | x^l, w) + (1 − y^l) ln P(y^l = 0 | x^l, w) ]

Let's find the gradient with respect to w_i:
Using the chain rule and algebra:

    ∂l(w) / ∂w_i = ∑_l x_i^l ( y^l − P(y^l = 1 | x^l, w) )
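In matrix form this gradient is X^T (y − p), which the following sketch computes (again reusing sigmoid from above; the function name gradient is my own):

```python
def gradient(w, X, y):
    # Component i of the return value is sum_l x_i^l (y^l - p^l),
    # where p^l = P(y^l = 1 | x^l, w).
    p = sigmoid(X @ w)
    return X.T @ (y - p)
```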
Stochastic Gradient Ascent for Logistic Regression
Start with small random initial weights, both positive and negative: w = (w0, w1, …, wn).
Repeat until convergence, or for some maximum number of epochs:
    For each training example (x^l, y^l), update each weight:

        w_i ← w_i + η x_i^l ( y^l − P(y^l = 1 | x^l, w) )

    where η > 0 is the learning rate.
Note again that w includes the bias weight w0, and x includes the bias term x0 = 1.
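Putting the pieces together, a runnable sketch of the loop (reusing sigmoid and np from above). The learning rate, the ±0.05 initialization range, and the convergence test are illustrative choices, not specified on the slide:

```python
def train_sga(X, y, eta=0.1, max_epochs=100, tol=1e-6, rng=None):
    # Stochastic gradient ascent on the log conditional likelihood.
    # X: (m, n+1) array, one training example per row, with x0 = 1.
    # y: (m,) array of 0/1 labels. eta is the learning rate.
    rng = np.random.default_rng() if rng is None else rng
    w = rng.uniform(-0.05, 0.05, size=X.shape[1])  # small random init, both signs
    for epoch in range(max_epochs):
        old_w = w.copy()
        for l in rng.permutation(len(y)):      # visit examples in random order
            p = sigmoid(np.dot(w, X[l]))       # P(y = 1 | x^l, w)
            w += eta * (y[l] - p) * X[l]       # w_i += eta * x_i^l * (y^l - p)
        if np.max(np.abs(w - old_w)) < tol:    # crude convergence check
            break
    return w
```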
Homework 4, Part 2