
Logistic Regression Classification Machine Learning.




1 Logistic Regression Classification Machine Learning

2 Classification: y = 0 or 1. A linear regression hypothesis can output values > 1 or < 0. Logistic Regression: 0 ≤ h_θ(x) ≤ 1.

3 Logistic Regression Model
Want 0 ≤ h_θ(x) ≤ 1.
h_θ(x) = g(θ^T x), where g(z) = 1 / (1 + e^(−z)) is the sigmoid function (logistic function).
[Plot: sigmoid curve, with g(0) = 0.5, approaching 0 as z → −∞ and 1 as z → +∞.]
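A minimal sketch of the sigmoid in Python (the slides use Octave; the NumPy version here is an assumption, not the course's code):

```python
import numpy as np

def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + e^(-z)); works on scalars and arrays."""
    return 1.0 / (1.0 + np.exp(-z))

# g(0) = 0.5; large positive z gives values near 1, large negative z near 0.
print(sigmoid(0.0))                      # 0.5
print(sigmoid(np.array([-10.0, 10.0])))  # close to [0, 1]
```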

4 Logistic Regression Cost function Machine Learning

5 Training set of m examples: {(x^(1), y^(1)), (x^(2), y^(2)), …, (x^(m), y^(m))}. How do we choose the parameters θ?

6 Logistic regression cost function
If y = 1: Cost(h_θ(x), y) = −log(h_θ(x)).
[Plot: cost vs. h_θ(x) on (0, 1]; the cost is 0 at h_θ(x) = 1 and grows to ∞ as h_θ(x) → 0.]

7 Logistic regression cost function
If y = 0: Cost(h_θ(x), y) = −log(1 − h_θ(x)).
[Plot: cost vs. h_θ(x) on [0, 1); the cost is 0 at h_θ(x) = 0 and grows to ∞ as h_θ(x) → 1.]

8 Logistic regression cost function
J(θ) = −(1/m) Σ_{i=1}^m [ y^(i) log h_θ(x^(i)) + (1 − y^(i)) log(1 − h_θ(x^(i))) ]

9 Logistic regression cost function
To fit parameters θ: min_θ J(θ).
To make a prediction given new x: output h_θ(x) = 1 / (1 + e^(−θ^T x)).
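The cost and prediction rule can be sketched in Python (a NumPy translation, assumed rather than taken from the course's Octave code):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(theta, X, y):
    """J(theta) = -(1/m) * sum( y*log(h) + (1-y)*log(1-h) ),
    with h = sigmoid(X @ theta). X has one example per row."""
    m = len(y)
    h = sigmoid(X @ theta)
    return -(y @ np.log(h) + (1 - y) @ np.log(1 - h)) / m

def predict(theta, X):
    """Output 1 when h_theta(x) >= 0.5, equivalently theta'x >= 0."""
    return (sigmoid(X @ theta) >= 0.5).astype(int)
```

With θ = 0, h_θ(x) = 0.5 for every example, so J(θ) = log 2 regardless of the data.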

10 Gradient Descent
Want min_θ J(θ). Repeat {
  θ_j := θ_j − α (1/m) Σ_{i=1}^m (h_θ(x^(i)) − y^(i)) x_j^(i)
} (simultaneously update all θ_j).

11 Gradient Descent
Repeat {
  θ_j := θ_j − α (1/m) Σ_{i=1}^m (h_θ(x^(i)) − y^(i)) x_j^(i)
} (simultaneously update all θ_j).
The algorithm looks identical to linear regression! The difference is the hypothesis: here h_θ(x) = 1 / (1 + e^(−θ^T x)) rather than θ^T x.
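The repeated update above can be written as one vectorized step per iteration. A Python sketch (the slides use Octave; the step counts and learning rate below are illustrative assumptions):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gradient_descent(X, y, alpha=0.5, iters=3000):
    """Repeat: theta_j := theta_j - alpha*(1/m)*sum((h - y)*x_j).
    The vectorized update changes all theta_j simultaneously."""
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(iters):
        h = sigmoid(X @ theta)
        theta = theta - (alpha / m) * (X.T @ (h - y))
    return theta
```

On a small linearly separable set, the learned θ classifies every training example correctly.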

12 Example:
function [jVal, gradient] = costFunction(theta)
  jVal = (theta(1)-5)^2 + (theta(2)-5)^2;
  gradient = zeros(2,1);
  gradient(1) = 2*(theta(1)-5);
  gradient(2) = 2*(theta(2)-5);

options = optimset('GradObj', 'on', 'MaxIter', 100);
initialTheta = zeros(2,1);
[optTheta, functionVal, exitFlag] ...
    = fminunc(@costFunction, initialTheta, options);

13 theta = [θ_0; θ_1; …; θ_n] (stored as theta(1) … theta(n+1))
function [jVal, gradient] = costFunction(theta)
  jVal = [code to compute J(θ)];
  gradient(1) = [code to compute ∂J(θ)/∂θ_0];
  gradient(2) = [code to compute ∂J(θ)/∂θ_1];
  …
  gradient(n+1) = [code to compute ∂J(θ)/∂θ_n];
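The same pattern (a cost function that returns both the value and the gradient, handed to an advanced optimizer) can be sketched with SciPy instead of Octave's fminunc; `scipy.optimize.minimize` with `jac=True` is an assumed stand-in, not the course's tool:

```python
import numpy as np
from scipy.optimize import minimize

def cost_function(theta):
    """Same toy objective as the slide: J = (theta1-5)^2 + (theta2-5)^2,
    returned together with its gradient."""
    jval = (theta[0] - 5) ** 2 + (theta[1] - 5) ** 2
    gradient = np.array([2 * (theta[0] - 5), 2 * (theta[1] - 5)])
    return jval, gradient

initial_theta = np.zeros(2)
# jac=True tells minimize that cost_function returns (value, gradient).
res = minimize(cost_function, initial_theta, jac=True,
               method='BFGS', options={'maxiter': 100})
print(res.x)  # approximately [5., 5.]
```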

14 Regularization: The problem of overfitting. Machine Learning

15 Example: Linear regression (housing prices)
[Three plots of price vs. size: an underfit straight line, a good quadratic fit, and an overfit high-order polynomial.]
Overfitting: If we have too many features, the learned hypothesis may fit the training set very well (J(θ) ≈ 0), but fail to generalize to new examples (predict prices on new examples).

16 Example: Logistic regression
h_θ(x) = g(θ^T x) (g = sigmoid function).
[Three decision-boundary plots: underfit, just right, overfit.]

17 Addressing overfitting:
[Plot: price vs. size.]
Features: size of house, no. of bedrooms, no. of floors, age of house, average income in neighborhood, kitchen size, …

18 Addressing overfitting:
Options:
1. Reduce the number of features. Manually select which features to keep, or use a model selection algorithm (later in course).
2. Regularization. Keep all the features, but reduce the magnitude/values of the parameters θ_j. Works well when we have a lot of features, each of which contributes a bit to predicting y.

19 Regularization Cost function Machine Learning

20 Intuition
[Two plots of price vs. size of house: a quadratic fit, and a wiggly higher-order overfit.]
Suppose we penalize θ_3 and θ_4 and make them really small, e.g. minimize (1/2m) Σ (h_θ(x^(i)) − y^(i))^2 + 1000·θ_3^2 + 1000·θ_4^2. The penalized terms are driven toward zero and the fit stays close to quadratic.

21 Regularization. Small values for the parameters θ_0, θ_1, …, θ_n give a "simpler" hypothesis that is less prone to overfitting. Housing: Features: x_1, x_2, …, x_100. Parameters: θ_0, θ_1, …, θ_100. Regularized cost: J(θ) = (1/2m) [ Σ_{i=1}^m (h_θ(x^(i)) − y^(i))^2 + λ Σ_{j=1}^n θ_j^2 ].

22 Regularization. [Plot: price vs. size of house; the regularized fit is smoother than the unregularized one.]

23 In regularized linear regression, we choose θ to minimize
J(θ) = (1/2m) [ Σ_{i=1}^m (h_θ(x^(i)) − y^(i))^2 + λ Σ_{j=1}^n θ_j^2 ].
What if λ is set to an extremely large value (perhaps too large for our problem, say λ = 10^10)? All of θ_1, …, θ_n are driven toward 0, leaving h_θ(x) ≈ θ_0.
[Plot: price vs. size of house; the fit is a flat horizontal line, i.e. underfitting.]

24 Regularization: Regularized linear regression. Machine Learning

25 Regularized linear regression
min_θ (1/2m) [ Σ_{i=1}^m (h_θ(x^(i)) − y^(i))^2 + λ Σ_{j=1}^n θ_j^2 ]

26 Gradient descent
Repeat {
  θ_0 := θ_0 − α (1/m) Σ_{i=1}^m (h_θ(x^(i)) − y^(i)) x_0^(i)
  θ_j := θ_j (1 − α λ/m) − α (1/m) Σ_{i=1}^m (h_θ(x^(i)) − y^(i)) x_j^(i)    (j = 1, …, n)
}
(θ_0 is not regularized; the factor (1 − α λ/m) is slightly less than 1, so each step shrinks θ_j a little before the usual gradient update.)
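One step of the regularized gradient descent update can be sketched in Python (NumPy here is an assumption; the slides use Octave):

```python
import numpy as np

def regularized_gd_step(theta, X, y, alpha, lam):
    """One regularized linear-regression gradient descent step:
    theta_0 is left unregularized; for j >= 1,
    theta_j := theta_j*(1 - alpha*lam/m) - alpha*(1/m)*sum((h - y)*x_j)."""
    m = len(y)
    h = X @ theta                          # linear hypothesis h = theta'x
    grad = (X.T @ (h - y)) / m             # unregularized gradient
    shrink = np.full_like(theta, 1.0 - alpha * lam / m)
    shrink[0] = 1.0                        # do not shrink theta_0
    return theta * shrink - alpha * grad
```

Setting `lam=0` recovers the plain gradient descent step.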

27 Regularization: Regularized logistic regression. Machine Learning

28 Regularized logistic regression.
[Plot: x_1 vs. x_2 with a complicated, overfit decision boundary.]
Cost function:
J(θ) = −(1/m) Σ_{i=1}^m [ y^(i) log h_θ(x^(i)) + (1 − y^(i)) log(1 − h_θ(x^(i))) ] + (λ/2m) Σ_{j=1}^n θ_j^2

29 Gradient descent
Repeat {
  θ_0 := θ_0 − α (1/m) Σ_{i=1}^m (h_θ(x^(i)) − y^(i)) x_0^(i)
  θ_j := θ_j (1 − α λ/m) − α (1/m) Σ_{i=1}^m (h_θ(x^(i)) − y^(i)) x_j^(i)    (j = 1, …, n)
}
(Same update as regularized linear regression, but with h_θ(x) = 1 / (1 + e^(−θ^T x)).)

30 Advice for applying machine learning: Evaluating a hypothesis. Machine Learning

31 Evaluating your hypothesis
Fails to generalize to new examples not in the training set.
[Plot: price vs. size of house, overfit curve.]
Features: size of house, no. of bedrooms, no. of floors, age of house, average income in neighborhood, kitchen size, …

32 Evaluating your hypothesis. Dataset:
Size   Price
2104   400
1600   330
2400   369
1416   232
3000   540
1985   300
1534   315
1427   199
1380   212
1494   243
(Split the data, e.g. roughly 70% as the training set and 30% as the test set.)

33 Advice for applying machine learning: Model selection and training/validation/test sets. Machine Learning

34 Overfitting example
Once parameters were fit to some set of data (training set), the error of the parameters as measured on that data (the training error J_train(θ)) is likely to be lower than the actual generalization error.
[Plot: price vs. size, overfit curve through the training points.]

35 Model selection
Try hypotheses of increasing polynomial degree d:
1. h_θ(x) = θ_0 + θ_1 x
2. h_θ(x) = θ_0 + θ_1 x + θ_2 x^2
3. h_θ(x) = θ_0 + θ_1 x + … + θ_3 x^3
…
10. h_θ(x) = θ_0 + θ_1 x + … + θ_10 x^10
Choose the degree whose fitted θ gives the lowest test set error. How well does the model generalize? Report the test set error J_test(θ).
Problem: J_test(θ) is likely to be an optimistic estimate of the generalization error, i.e. our extra parameter (d = degree of polynomial) is fit to the test set.

36 Evaluating your hypothesis. Dataset:
Size   Price
2104   400
1600   330
2400   369
1416   232
3000   540
1985   300
1534   315
1427   199
1380   212
1494   243
(Split the data three ways, e.g. roughly 60% training set, 20% cross validation set, 20% test set.)

37 Train/validation/test error
Training error: J_train(θ) = (1/2m) Σ_{i=1}^m (h_θ(x^(i)) − y^(i))^2
Cross validation error: J_cv(θ) = (1/2m_cv) Σ_{i=1}^{m_cv} (h_θ(x_cv^(i)) − y_cv^(i))^2
Test error: J_test(θ) = (1/2m_test) Σ_{i=1}^{m_test} (h_θ(x_test^(i)) − y_test^(i))^2
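These three errors share one squared-error formula evaluated on different subsets. A Python sketch (the 60/20/20 split function is a hypothetical helper, not from the slides):

```python
import numpy as np

def squared_error(theta, X, y):
    """J(theta) = 1/(2m) * sum((h_theta(x) - y)^2) over the given set,
    for a linear hypothesis h = X @ theta."""
    m = len(y)
    h = X @ theta
    return ((h - y) ** 2).sum() / (2 * m)

def split(X, y):
    """Hypothetical 60/20/20 split into (train, cv, test) subsets."""
    m = len(y)
    i1, i2 = int(0.6 * m), int(0.8 * m)
    return (X[:i1], y[:i1]), (X[i1:i2], y[i1:i2]), (X[i2:], y[i2:])
```

In practice the data should be shuffled before splitting; the slice-based split above assumes the examples are already in random order.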

38 Advice for applying machine learning: Diagnosing bias vs. variance. Machine Learning

39 Bias/variance
[Three plots of price vs. size: high bias (underfit), "just right", high variance (overfit).]

40 Bias/variance
Training error: J_train(θ). Cross validation error: J_cv(θ).
[Plot: error vs. degree of polynomial d. J_train(θ) decreases as d grows; J_cv(θ) is U-shaped, high for both very low d (underfit) and very high d (overfit).]

41 Diagnosing bias vs. variance
Suppose your learning algorithm is performing less well than you were hoping (J_cv(θ) or J_test(θ) is high). Is it a bias problem or a variance problem?
Bias (underfit): J_train(θ) is high, and J_cv(θ) ≈ J_train(θ).
Variance (overfit): J_train(θ) is low, and J_cv(θ) >> J_train(θ).
[Plot: error vs. degree of polynomial d, showing the cross validation error and training error curves.]

42 Advice for applying machine learning: Regularization and bias/variance. Machine Learning

43 Linear regression with regularization
Model: h_θ(x) = θ_0 + θ_1 x + θ_2 x^2 + θ_3 x^3 + θ_4 x^4, minimizing the regularized cost J(θ).
[Three plots of price vs. size:]
Large λ: high bias (underfit).
Intermediate λ: "just right".
Small λ: high variance (overfit).

44 Choosing the regularization parameter

45 Choosing the regularization parameter λ
Model: h_θ(x) = θ_0 + θ_1 x + θ_2 x^2 + θ_3 x^3 + θ_4 x^4.
Try a range of values, e.g. λ = 0, 0.01, 0.02, 0.04, 0.08, …, 10 (roughly doubling each time). For each λ, minimize the regularized cost to get θ, then evaluate the cross validation error J_cv(θ). Pick (say) the λ whose θ gives the lowest J_cv(θ), and report the test error J_test(θ).
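The λ-selection loop can be sketched in Python. The `fit` helper below uses the regularized normal equation as a stand-in minimizer, and `cv_error` is the unregularized squared-error cost; both helpers are assumptions for illustration, not the course's code:

```python
import numpy as np

def fit(X, y, lam):
    """Regularized normal equation: solve (X'X + lam*L) theta = X'y,
    where L is the identity with L[0,0] = 0 so theta_0 is not regularized."""
    n = X.shape[1]
    L = np.eye(n)
    L[0, 0] = 0.0
    return np.linalg.solve(X.T @ X + lam * L, X.T @ y)

def cv_error(theta, Xcv, ycv):
    """Unregularized squared-error cost on the cross validation set."""
    h = Xcv @ theta
    return ((h - ycv) ** 2).sum() / (2 * len(ycv))

def choose_lambda(X, y, Xcv, ycv,
                  lambdas=(0, 0.01, 0.02, 0.04, 0.08, 0.16, 0.32,
                           0.64, 1.28, 2.56, 5.12, 10.24)):
    """Fit a theta for each candidate lambda on the training set and
    pick the lambda whose theta has the lowest cross validation error."""
    thetas = [fit(X, y, lam) for lam in lambdas]
    errors = [cv_error(t, Xcv, ycv) for t in thetas]
    best = int(np.argmin(errors))
    return lambdas[best], thetas[best]
```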

46 Bias/variance as a function of the regularization parameter λ
[Plot: error vs. λ. J_train(θ) increases as λ grows; J_cv(θ) is U-shaped, high for small λ (overfit) and for large λ (underfit).]

47 Advice for applying machine learning: Learning curves. Machine Learning

48 Learning curves
[Plot: error vs. m (training set size). J_train(θ) grows with m; J_cv(θ) shrinks with m.]

49 High bias
[Plots: price vs. size, an underfit straight-line fit; error vs. m (training set size), where J_train(θ) and J_cv(θ) converge quickly to a high plateau.]
If a learning algorithm is suffering from high bias, getting more training data will not (by itself) help much.

50 High variance (and small λ)
[Plots: price vs. size, an overfit high-order fit; error vs. m (training set size), with a large gap between J_cv(θ) and J_train(θ) that narrows as m grows.]
If a learning algorithm is suffering from high variance, getting more training data is likely to help.
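A learning curve is built by refitting on growing prefixes of the training set and recording both errors. A Python sketch, where `fit` and `error` are assumed helpers (e.g. a least-squares fit and the squared-error cost from earlier slides):

```python
import numpy as np

def learning_curve(X, y, Xcv, ycv, fit, error):
    """For m = 1..len(y): fit theta on the first m training examples,
    record J_train on those m examples and J_cv on the full CV set."""
    train_err, cv_err = [], []
    for m in range(1, len(y) + 1):
        theta = fit(X[:m], y[:m])
        train_err.append(error(theta, X[:m], y[:m]))
        cv_err.append(error(theta, Xcv, ycv))
    return train_err, cv_err
```

Plotting `train_err` and `cv_err` against m reproduces the curves on the slides: a converging pair for high bias, a wide shrinking gap for high variance.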

