Logistic Regression
Linear regression fits a line to a set of points.
Given x, you can use the line to predict y.
Logistic regression fits a logistic function (sigmoid) to a set of points and binary labels.
Given a new point, the sigmoid gives the predicted probability that the class is positive.
Logistic Regression
For ease of notation, let x = (x0, x1, …, xn), where x0 = 1. Let w = (w0, w1, …, wn), where w0 is the bias weight. Class y ∈ {0, 1}.
The model is:

    P(y = 1 | x, w) = 1 / (1 + e^(−w·x)),    P(y = 0 | x, w) = 1 − P(y = 1 | x, w)
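As a concrete illustration (not part of the slides), here is a minimal NumPy sketch of the model; the names sigmoid and prob_positive are my own:

```python
import numpy as np

def sigmoid(z):
    # Logistic function: maps any real z into (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

def prob_positive(w, x):
    # P(y = 1 | x, w) under the logistic regression model.
    # Both w and x include the bias component (x0 = 1, w0 = bias weight).
    return sigmoid(np.dot(w, x))
```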
Learning: use the training data to determine the weights w.
To classify a new x, assign the class y that maximizes P(y | x).
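A short sketch of that decision rule, reusing prob_positive from above: with two classes, picking the y that maximizes P(y | x) is the same as thresholding the predicted probability at 0.5.

```python
def classify(w, x):
    # Assign y = 1 iff P(y = 1 | x, w) >= 0.5, i.e. iff w . x >= 0.
    return 1 if prob_positive(w, x) >= 0.5 else 0
```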
Logistic Regression: Learning Weights
Goal is to learn the weights w. Let (x^j, y^j) be the j-th training example and its label. We want:

    w ← argmax_w ∏_j P(y^j | x^j, w)

This is equivalent to:

    w ← argmax_w ∑_j ln P(y^j | x^j, w)

The quantity ∑_j ln P(y^j | x^j, w) is called the "log conditional likelihood".
We can write the log conditional likelihood this way:

    l(w) = ∑_l [ y^l ln P(y^l = 1 | x^l, w) + (1 − y^l) ln P(y^l = 0 | x^l, w) ]

Since y^l is either 0 or 1, exactly one of the two terms inside the brackets is nonzero for each example. This is what we want to maximize.
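Continuing the sketch (reusing sigmoid and np from above), this is one way to compute l(w). X is assumed to hold one bias-augmented example per row; the clipping constant is purely a numerical-safety choice, not something from the slides:

```python
def log_conditional_likelihood(w, X, y):
    # l(w) = sum_l [ y^l ln p^l + (1 - y^l) ln (1 - p^l) ],
    # where p^l = P(y^l = 1 | x^l, w).
    p = sigmoid(X @ w)
    p = np.clip(p, 1e-12, 1 - 1e-12)  # guard against log(0)
    return np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))
```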
Use gradient ascent to maximize l(w). This is called "maximum likelihood estimation" (MLE).

Recall:

    P(y = 1 | x, w) = 1 / (1 + e^(−w·x))

We have:

    l(w) = ∑_l [ y^l ln P(y^l = 1 | x^l, w) + (1 − y^l) ln P(y^l = 0 | x^l, w) ]

Let's find the gradient with respect to w_i:
Using the chain rule and algebra:

    ∂l(w) / ∂w_i = ∑_l x_i^l ( y^l − P(y^l = 1 | x^l, w) )
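In matrix form this gradient is X^T (y − p), which the following sketch computes (again reusing sigmoid from above; the function name gradient is my own):

```python
def gradient(w, X, y):
    # Component i of the return value is sum_l x_i^l (y^l - p^l),
    # where p^l = P(y^l = 1 | x^l, w).
    p = sigmoid(X @ w)
    return X.T @ (y - p)
```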
Stochastic Gradient Ascent for Logistic Regression
Start with small random initial weights, both positive and negative: w = (w0, w1, …, wn).
Repeat until convergence, or for some maximum number of epochs:
    For each training example (x^l, y^l), update each weight:

        w_i ← w_i + η x_i^l ( y^l − P(y^l = 1 | x^l, w) )

    where η > 0 is the learning rate.
Note again that w includes the bias weight w0, and x includes the bias term x0 = 1.
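Putting the pieces together, a runnable sketch of the loop (reusing sigmoid and np from above). The learning rate, the ±0.05 initialization range, and the convergence test are illustrative choices, not specified on the slide:

```python
def train_sga(X, y, eta=0.1, max_epochs=100, tol=1e-6, rng=None):
    # Stochastic gradient ascent on the log conditional likelihood.
    # X: (m, n+1) array, one training example per row, with x0 = 1.
    # y: (m,) array of 0/1 labels. eta is the learning rate.
    rng = np.random.default_rng() if rng is None else rng
    w = rng.uniform(-0.05, 0.05, size=X.shape[1])  # small random init, both signs
    for epoch in range(max_epochs):
        old_w = w.copy()
        for l in rng.permutation(len(y)):      # visit examples in random order
            p = sigmoid(np.dot(w, X[l]))       # P(y = 1 | x^l, w)
            w += eta * (y[l] - p) * X[l]       # w_i += eta * x_i^l * (y^l - p)
        if np.max(np.abs(w - old_w)) < tol:    # crude convergence check
            break
    return w
```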
Homework 4, Part 2