CS B553: ALGORITHMS FOR OPTIMIZATION AND LEARNING Gradient descent.

1 CS B553: ALGORITHMS FOR OPTIMIZATION AND LEARNING Gradient descent

2 KEY CONCEPTS
    Gradient descent
    Line search
    Convergence rates depend on scaling
    Variants: discrete analogues, coordinate descent
    Random restarts
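The point that convergence rates depend on scaling can be illustrated with a small experiment. The sketch below is not from the slides: it assumes a two-variable quadratic f(x) = 0.5*(a*x0^2 + b*x1^2), uses the closed-form exact step size that exists for quadratics, and counts iterations for a well-scaled case (a = b) and a stretched case (b = 100a). The function name `steepest_descent_iters` and the starting point are hypothetical choices made for illustration.

```python
import math

def steepest_descent_iters(a, b, x, tol=1e-8, max_iters=100000):
    """Count gradient steps with exact line search on f(x) = 0.5*(a*x0^2 + b*x1^2)."""
    x0, x1 = x
    for t in range(max_iters):
        g0, g1 = a * x0, b * x1                  # gradient of the quadratic
        if math.hypot(g0, g1) < tol:             # stop once the gradient is tiny
            return t
        # For a quadratic, the exact minimizer along -g is alpha = (g.g) / (g.A g).
        alpha = (g0 * g0 + g1 * g1) / (a * g0 * g0 + b * g1 * g1)
        x0 -= alpha * g0
        x1 -= alpha * g1
    return max_iters

# Same starting point, different scaling of the second coordinate.
iters_round = steepest_descent_iters(1.0, 1.0, (100.0, 1.0))
iters_stretched = steepest_descent_iters(1.0, 100.0, (100.0, 1.0))
print(iters_round, iters_stretched)
```

With circular level sets the negative gradient points straight at the minimum, so a single step suffices; with elongated level sets the iterates zig-zag and need far more steps.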

3

4

5

6

7

8

9

10

11

12 Line search: pick step size to lead to decrease in function value

13 (Use your favorite univariate optimization method) Choose α to minimize f(x - α∇f(x))
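As one possible "favorite univariate optimization method", the sketch below applies golden-section search to the one-dimensional function φ(α) = f(x - α∇f(x)). The choice of golden-section search, the test function, the bracket [0, 1], and all names (`golden_section_min`, `phi`, `alpha_star`) are assumptions made for illustration; the slides do not prescribe a particular method. Golden-section search assumes φ is unimodal on the bracket.

```python
import math

def golden_section_min(phi, lo, hi, tol=1e-8):
    """Minimize a unimodal function phi on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5) - 1) / 2        # reciprocal of the golden ratio
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)           # left interior point
        d = a + inv_phi * (b - a)           # right interior point
        if phi(c) < phi(d):
            b = d                           # minimum lies in [a, d]
        else:
            a = c                           # minimum lies in [c, b]
    return (a + b) / 2

# Hypothetical test problem: a poorly scaled quadratic.
def f(x):
    return x[0] ** 2 + 10 * x[1] ** 2

def grad_f(x):
    return [2 * x[0], 20 * x[1]]

x = [1.0, 1.0]
g = grad_f(x)

def phi(alpha):
    """Value of f along the negative gradient direction from x."""
    return f([xi - alpha * gi for xi, gi in zip(x, g)])

alpha_star = golden_section_min(phi, 0.0, 1.0)
print(alpha_star, phi(alpha_star), f(x))
```

For simplicity this version re-evaluates both interior points each iteration; a more careful implementation would reuse one of them.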

14 GRADIENT DESCENT PSEUDOCODE
Input: f, starting value x_1, termination tolerances
For t = 1, 2, …, maxIters:
    Compute the search direction d_t = -∇f(x_t)
    If ||d_t|| < ε_g then: return “Converged to critical point”, output x_t
    Find α_t so that f(x_t + α_t d_t) < f(x_t) using line search
    If ||α_t d_t|| < ε_x then: return “Converged in x”, output x_t
    Let x_{t+1} = x_t + α_t d_t
Return “Max number of iterations reached”, output x_maxIters
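The pseudocode above can be sketched in Python roughly as follows. This is a minimal illustration, not the course's reference implementation: the line search is a simple step-halving scheme that accepts the first step giving any decrease (the pseudocode only asks for f(x_t + α_t d_t) < f(x_t)), and the default tolerances, the helper names `norm` and `backtracking`, and the test function are all assumptions.

```python
import math

def norm(v):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(vi * vi for vi in v))

def backtracking(f, x, d, alpha0=1.0, shrink=0.5, max_halvings=60):
    """Halve alpha until f(x + alpha*d) < f(x); return 0.0 if no decrease is found."""
    fx = f(x)
    alpha = alpha0
    for _ in range(max_halvings):
        if f([xi + alpha * di for xi, di in zip(x, d)]) < fx:
            return alpha
        alpha *= shrink
    return 0.0

def gradient_descent(f, grad_f, x1, eps_g=1e-6, eps_x=1e-10, max_iters=10000):
    x = list(x1)
    for _ in range(max_iters):
        d = [-gi for gi in grad_f(x)]            # search direction d_t
        if norm(d) < eps_g:
            return "Converged to critical point", x
        alpha = backtracking(f, x, d)            # step size alpha_t
        if alpha * norm(d) < eps_x:
            return "Converged in x", x
        x = [xi + alpha * di for xi, di in zip(x, d)]
    return "Max number of iterations reached", x

# Hypothetical test problem: a poorly scaled quadratic with minimum at the origin.
def f(x):
    return x[0] ** 2 + 10 * x[1] ** 2

def grad_f(x):
    return [2 * x[0], 20 * x[1]]

status, x_min = gradient_descent(f, grad_f, [1.0, 1.0])
print(status, x_min)
```

Accepting any decrease keeps the sketch close to the pseudocode; practical line searches usually require a sufficient-decrease condition as well.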

15

16

17

18

19

20 RELATED METHODS
    Steepest descent (discrete)
    Coordinate descent
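Coordinate descent can be sketched as repeatedly minimizing f along one coordinate at a time while holding the others fixed. The version below is an illustrative assumption rather than the slides' formulation: it uses golden-section search within a fixed radius around the current value for each one-dimensional subproblem, and the names `minimize_1d`, `coordinate_descent`, `radius`, and the test function are hypothetical.

```python
import math

def minimize_1d(phi, lo, hi, tol=1e-9):
    """Golden-section search for the minimum of a unimodal phi on [lo, hi]."""
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if phi(c) < phi(d):
            b = d
        else:
            a = c
    return (a + b) / 2

def coordinate_descent(f, x0, radius=10.0, sweeps=50, tol=1e-7):
    """Cycle through the coordinates, minimizing f over one coordinate at a time."""
    x = list(x0)
    for _ in range(sweeps):
        largest_move = 0.0
        for i in range(len(x)):
            def phi(v, i=i):
                # f as a function of coordinate i only, the others held fixed
                return f(x[:i] + [v] + x[i + 1:])
            new_xi = minimize_1d(phi, x[i] - radius, x[i] + radius)
            largest_move = max(largest_move, abs(new_xi - x[i]))
            x[i] = new_xi
        if largest_move < tol:                   # no coordinate moved noticeably
            break
    return x

# Hypothetical convex test function with a mild coupling between coordinates.
def f(x):
    return (x[0] - 1) ** 2 + (x[1] + 2) ** 2 + 0.5 * x[0] * x[1]

x_min = coordinate_descent(f, [0.0, 0.0])
print(x_min)
```

No gradient is needed here, which is part of the method's appeal; the trade-off is slow progress when the coordinates are strongly coupled.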

21 Many local minima: good initialization, or random restarts
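Random restarts can be sketched as running a local descent from several random starting points and keeping the best result. The example below is only an illustration under stated assumptions: the one-dimensional objective sin(3x) + 0.1x², the fixed step size, the sampling interval [-5, 5], the seed, and the function names are all hypothetical choices, not taken from the slides.

```python
import math
import random

def f(x):
    """Hypothetical objective with several local minima."""
    return math.sin(3 * x) + 0.1 * x * x

def grad_f(x):
    return 3 * math.cos(3 * x) + 0.2 * x

def local_descent(x, step=0.05, tol=1e-8, max_iters=10000):
    """Fixed-step gradient descent from a single starting point."""
    for _ in range(max_iters):
        g = grad_f(x)
        if abs(g) < tol:
            break
        x -= step * g
    return x

def random_restarts(n_restarts=20, lo=-5.0, hi=5.0, seed=0):
    """Run local descent from random starts and keep the lowest value found."""
    rng = random.Random(seed)                    # seeded for reproducibility
    best_x, best_f = None, float("inf")
    for _ in range(n_restarts):
        x = local_descent(rng.uniform(lo, hi))
        if f(x) < best_f:
            best_x, best_f = x, f(x)
    return best_x, best_f

best_x, best_f = random_restarts()
print(best_x, best_f)
```

Each run only finds the local minimum of the basin it starts in, so the chance of reaching the global minimum grows with the number of restarts but is never guaranteed.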

