CSC 4510 – Machine Learning Dr. Mary-Angela Papalaskari Department of Computing Sciences Villanova University Course website: www.csc.villanova.edu/~map/4510/




1 CSC 4510 – Machine Learning, 6: Logistic Regression. Dr. Mary-Angela Papalaskari, Department of Computing Sciences, Villanova University. Course website: www.csc.villanova.edu/~map/4510/ The slides in this presentation are adapted from Andrew Ng's ML course: http://www.ml-class.org/ CSC 4510 - M.A. Papalaskari - Villanova University

2 Machine learning problems. Supervised learning: classification, regression. Unsupervised learning. Others: reinforcement learning, recommender systems. Also: practical advice for applying learning algorithms.


4 Classification. Examples: email: spam / not spam? Online transactions: fraudulent (yes / no)? Tumor: malignant / benign? The target is y ∈ {0, 1}, where 0 is the "negative class" (e.g., benign tumor) and 1 is the "positive class" (e.g., malignant tumor).

5 Threshold classifier output at 0.5: if h_θ(x) ≥ 0.5, predict "y = 1"; if h_θ(x) < 0.5, predict "y = 0". Example: predicting whether a tumor is malignant (1) or benign (0) from tumor size.

6 Classification: y = 0 or 1, but a linear regression hypothesis h_θ(x) can be > 1 or < 0. Logistic regression produces 0 ≤ h_θ(x) ≤ 1.

7 Logistic Regression Model. Old regression model: h_θ(x) = θᵀx. New model: use the sigmoid function (or logistic function): h_θ(x) = g(θᵀx), where g(z) = 1 / (1 + e^(−z)). We want 0 ≤ h_θ(x) ≤ 1, and g(z) ranges from 0 (as z → −∞) through 0.5 (at z = 0) to 1 (as z → +∞).
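The sigmoid is simple enough to write down directly. A minimal NumPy sketch (the function name `sigmoid` and the sample inputs are our own, chosen for illustration):

```python
import numpy as np

def sigmoid(z):
    """Logistic (sigmoid) function g(z) = 1 / (1 + e^(-z)); output always lies in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

print(sigmoid(0.0))    # 0.5, the threshold point
print(sigmoid(10.0))   # close to 1
print(sigmoid(-10.0))  # close to 0
```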

8 Interpretation of hypothesis output: h_θ(x) = estimated probability that y = 1 on input x, i.e. h_θ(x) = P(y = 1 | x; θ), "the probability that y = 1, given x, parameterized by θ". Example: if h_θ(x) = 0.7, tell the patient there is a 70% chance of the tumor being malignant.

9 Logistic regression: since g(z) ≥ 0.5 exactly when z ≥ 0, predict "y = 1" if θᵀx ≥ 0 and predict "y = 0" if θᵀx < 0.

10 Decision boundary. Example: h_θ(x) = g(θ₀ + θ₁x₁ + θ₂x₂) with θ = (−3, 1, 1)ᵀ. Predict "y = 1" if −3 + x₁ + x₂ ≥ 0, i.e. x₁ + x₂ ≥ 3; the line x₁ + x₂ = 3 is the decision boundary.
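The threshold rule can be sketched in code with example parameters θ = (−3, 1, 1), a common illustration of a linear decision boundary (the helper name `predict` and the sample points are our own):

```python
import numpy as np

def predict(theta, x):
    """Predict y = 1 exactly when theta^T x >= 0 (equivalently, h_theta(x) >= 0.5)."""
    return 1 if theta @ x >= 0.0 else 0

theta = np.array([-3.0, 1.0, 1.0])    # decision boundary: x1 + x2 = 3
x_above = np.array([1.0, 2.0, 2.0])   # x0 = 1 intercept; x1 + x2 = 4, above the line
x_below = np.array([1.0, 0.5, 1.0])   # x1 + x2 = 1.5, below the line
print(predict(theta, x_above))        # 1
print(predict(theta, x_below))        # 0
```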

11 Non-linear decision boundaries. Example: h_θ(x) = g(θ₀ + θ₁x₁ + θ₂x₂ + θ₃x₁² + θ₄x₂²) with θ = (−1, 0, 0, 1, 1)ᵀ. Predict "y = 1" if x₁² + x₂² ≥ 1; the decision boundary is the unit circle x₁² + x₂² = 1.

12 Training set: {(x^(1), y^(1)), (x^(2), y^(2)), …, (x^(m), y^(m))}, m examples, with y ∈ {0, 1} and h_θ(x) = 1 / (1 + e^(−θᵀx)). How to choose the parameters θ?

13 Cost function. Linear regression used J(θ) = (1/m) Σᵢ (1/2)(h_θ(x^(i)) − y^(i))². With the sigmoid hypothesis this squared-error cost is non-convex in θ, so logistic regression needs a different cost.

14 Logistic regression cost function, case y = 1: Cost(h_θ(x), y) = −log(h_θ(x)). The cost is 0 when h_θ(x) = 1 and grows without bound as h_θ(x) → 0.

15 Case y = 0: Cost(h_θ(x), y) = −log(1 − h_θ(x)). The cost is 0 when h_θ(x) = 0 and grows without bound as h_θ(x) → 1.

16 Logistic regression cost function: J(θ) = (1/m) Σᵢ Cost(h_θ(x^(i)), y^(i)), where Cost(h_θ(x), y) = −log(h_θ(x)) if y = 1 and −log(1 − h_θ(x)) if y = 0.

17 Logistic regression cost function, more compact expression: since y is always 0 or 1, the two cases combine into Cost(h_θ(x), y) = −y log(h_θ(x)) − (1 − y) log(1 − h_θ(x)).

18 Putting it together: J(θ) = −(1/m) Σᵢ [y^(i) log h_θ(x^(i)) + (1 − y^(i)) log(1 − h_θ(x^(i)))]. To fit parameters θ: min_θ J(θ). To make a prediction given a new x: output h_θ(x) = 1 / (1 + e^(−θᵀx)).
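The compact cost can be computed vectorized over all m examples at once. A sketch (the names `cost`, `X`, `y` and the toy data are our own, for illustration only):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(theta, X, y):
    """J(theta) = -(1/m) * sum(y*log(h) + (1-y)*log(1-h)) over all m examples."""
    m = len(y)
    h = sigmoid(X @ theta)  # hypothesis for every row of X at once
    return -(1.0 / m) * np.sum(y * np.log(h) + (1 - y) * np.log(1 - h))

# Toy data: first column of X is the intercept feature x0 = 1
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])

# With theta = 0, h = 0.5 for every example, so J(0) = log 2 ~= 0.693
print(cost(np.zeros(2), X, y))
```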

19 Gradient Descent. Want min_θ J(θ): repeat { θⱼ := θⱼ − α ∂J(θ)/∂θⱼ } (simultaneously update all θⱼ).

20 Gradient Descent. Plugging in the derivative: repeat { θⱼ := θⱼ − α (1/m) Σᵢ (h_θ(x^(i)) − y^(i)) xⱼ^(i) } (simultaneously update all θⱼ). The update rule looks identical to linear regression's, but h_θ(x) is now the sigmoid of θᵀx, so it is a different algorithm.
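The update loop might be sketched as follows, assuming a design matrix X whose first column is the intercept feature x₀ = 1 (all names, default hyperparameters, and the toy data are illustrative choices, not from the slides):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gradient_descent(X, y, alpha=0.5, iters=5000):
    """Batch gradient descent for logistic regression.

    Implements theta_j := theta_j - alpha * (1/m) * sum_i (h(x^(i)) - y^(i)) * x_j^(i),
    updating every theta_j simultaneously via one vectorized step.
    """
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(iters):
        h = sigmoid(X @ theta)      # hypothesis on all m examples
        grad = X.T @ (h - y) / m    # all n partial derivatives at once
        theta -= alpha * grad       # simultaneous update
    return theta

# Toy one-feature data with an intercept column
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])
theta = gradient_descent(X, y)
```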

21 Optimization algorithms. Given θ, we have code that can compute J(θ) and ∂J(θ)/∂θⱼ (for j = 0, 1, …, n). Optimization algorithms: gradient descent, conjugate gradient, BFGS, L-BFGS (BFGS = Broyden-Fletcher-Goldfarb-Shanno). Advantages of the latter three: no need to manually pick α; often faster than gradient descent. Disadvantage: more complex.
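As an illustration of handing J(θ) and its gradient to an off-the-shelf optimizer, here is a sketch using SciPy's `scipy.optimize.minimize` with `method='BFGS'` (this assumes SciPy is available; the toy data and all names are made up for the example):

```python
import numpy as np
from scipy.optimize import minimize  # assumes SciPy is installed

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(theta, X, y):
    h = sigmoid(X @ theta)
    return -np.mean(y * np.log(h) + (1 - y) * np.log(1 - h))

def grad(theta, X, y):
    return X.T @ (sigmoid(X @ theta) - y) / len(y)

# Toy non-separable data so the minimizer of J is finite
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0]])
y = np.array([0.0, 0.0, 1.0, 0.0, 1.0])

# BFGS chooses its own step sizes: no learning rate alpha to pick by hand
res = minimize(cost, np.zeros(2), args=(X, y), jac=grad, method='BFGS')
theta = res.x
```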

22 Multiclass classification. Email foldering/tagging: work, friends, family, hobby. Medical diagnosis: not ill, cold, flu. Weather: sunny, cloudy, rain, snow.

23 Binary classification vs. multi-class classification (illustrated as scatter plots in the (x₁, x₂) plane: two classes with one boundary vs. three or more classes).

24 One-vs-all (one-vs-rest): for each class i, train a binary classifier that treats class i as positive and all remaining classes as negative, e.g. class 1 vs. rest, class 2 vs. rest, class 3 vs. rest.

25 One-vs-all: train a logistic regression classifier h_θ^(i)(x) for each class i to predict the probability that y = i. On a new input x, to make a prediction, pick the class i that maximizes h_θ^(i)(x).
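The one-vs-all prediction step might be sketched as follows, assuming each class's parameter vector has already been trained (the function name, parameter vectors, and sample inputs are all illustrative):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict_one_vs_all(thetas, x):
    """Run every per-class classifier on x and pick the class with the highest probability."""
    probs = [sigmoid(theta @ x) for theta in thetas]
    return int(np.argmax(probs))

# Three made-up per-class parameter vectors (x0 = 1 intercept feature first)
thetas = [np.array([2.0, 0.0]), np.array([0.0, 0.0]), np.array([-5.0, 1.0])]
print(predict_one_vs_all(thetas, np.array([1.0, 1.0])))   # class 0 wins for small x1
print(predict_one_vs_all(thetas, np.array([1.0, 10.0])))  # class 2 wins for large x1
```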




