CS B553: Algorithms for Optimization and Learning
Gradient Descent
Key Concepts
- Gradient descent
- Line search
- Convergence rates depend on scaling
- Variants: discrete analogues, coordinate descent
- Random restarts
Line search: pick a step size that leads to a decrease in the function value.
(Use your favorite univariate optimization method.) That is, minimize f(x − α∇f(x)) over the step size α > 0.
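One common univariate method for this step is backtracking: start with a full step and shrink it geometrically until the function value decreases sufficiently. A minimal sketch in Python, assuming NumPy; the parameter names (alpha0, beta, c) and the Armijo sufficient-decrease test are standard choices, not something specified by the slides:

```python
import numpy as np

def backtracking_line_search(f, x, d, grad, alpha0=1.0, beta=0.5, c=1e-4):
    """Shrink the step size alpha until the Armijo sufficient-decrease
    condition holds. Parameter names and defaults are illustrative."""
    alpha = alpha0
    while f(x + alpha * d) > f(x) + c * alpha * grad.dot(d):
        alpha *= beta  # step too long: shrink geometrically
    return alpha

# Example: f(x) = x.x, descend from x = (2, 2) along -grad f(x)
f = lambda x: x.dot(x)
x = np.array([2.0, 2.0])
grad = 2 * x
d = -grad
alpha = backtracking_line_search(f, x, d, grad)
```

For this quadratic the full step overshoots, so one halving is enough and the accepted step lands on a lower function value.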
Gradient Descent Pseudocode
Input: f, starting value x_1, termination tolerances ε_g, ε_x
For t = 1, 2, …, maxIters:
    Compute the search direction d_t = −∇f(x_t)
    If ||d_t|| < ε_g then: return "Converged to critical point", output x_t
    Find α_t so that f(x_t + α_t d_t) < f(x_t) using line search
    If ||α_t d_t|| < ε_x then: return "Converged in x", output x_t
    Let x_{t+1} = x_t + α_t d_t
Return "Max number of iterations reached", output x_maxIters
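The pseudocode above can be transcribed into a runnable sketch, assuming NumPy; the inner line search here is a simple step-halving stand-in for whichever method finds α_t with f(x_t + α_t d_t) < f(x_t), and the example function is hypothetical:

```python
import numpy as np

def gradient_descent(f, grad_f, x1, eps_g=1e-6, eps_x=1e-12, max_iters=1000):
    x = np.asarray(x1, dtype=float)
    for t in range(max_iters):
        d = -grad_f(x)                        # search direction d_t = -grad f(x_t)
        if np.linalg.norm(d) < eps_g:
            return x, "Converged to critical point"
        alpha = 1.0                           # line search: halve until decrease
        while f(x + alpha * d) >= f(x):
            alpha *= 0.5
        if np.linalg.norm(alpha * d) < eps_x:
            return x, "Converged in x"
        x = x + alpha * d                     # x_{t+1} = x_t + alpha_t d_t
    return x, "Max number of iterations reached"

# Example: a quadratic bowl with minimizer at (1, -2)
f = lambda x: (x[0] - 1)**2 + 2 * (x[1] + 2)**2
grad_f = lambda x: np.array([2 * (x[0] - 1), 4 * (x[1] + 2)])
x_star, status = gradient_descent(f, grad_f, [0.0, 0.0])
```

Even on this mildly anisotropic quadratic, the full step α = 1 overshoots along the steeper coordinate, so the halving search is what makes each iteration a descent step.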
Related Methods
- Steepest descent (discrete)
- Coordinate descent
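Coordinate descent optimizes one variable at a time instead of moving along the full gradient. A minimal derivative-free sketch, assuming NumPy, that probes each coordinate with a fixed step and shrinks the step when no coordinate move helps (the function, parameters, and shrinking schedule are illustrative, not from the slides):

```python
import numpy as np

def coordinate_descent(f, x0, step0=1.0, tol=1e-6):
    """Cyclic coordinate descent using only function evaluations:
    try moving each coordinate by +/- step, accept improving moves,
    and halve the step once no coordinate move improves f."""
    x = np.asarray(x0, dtype=float)
    step = step0
    while step > tol:
        improved = False
        for i in range(len(x)):               # sweep the coordinates in turn
            for delta in (step, -step):
                trial = x.copy()
                trial[i] += delta
                if f(trial) < f(x):           # accept any improving move
                    x = trial
                    improved = True
        if not improved:
            step *= 0.5                       # no coordinate helped: refine
    return x

# Example: the same kind of quadratic bowl, minimizer at (1, -2)
f = lambda x: (x[0] - 1)**2 + (x[1] + 2)**2
x_star = coordinate_descent(f, [0.0, 0.0])
```

Because it only compares function values at discrete probe points, this sketch also illustrates the "discrete" flavor of descent mentioned above: no gradient is ever computed.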
When the function has many local minima: use a good initialization, or random restarts.
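Random restarts can be sketched as: run descent from several random starting points and keep the best result. A minimal Python sketch, assuming NumPy; the multimodal test function (two local minima along x0, the global one near x0 = −1), the fixed-step inner descent, and all parameters are illustrative:

```python
import numpy as np

# Hypothetical multimodal example: local minima near x0 = +1 and
# x0 = -1; the 0.2*x0 tilt makes the one near -1 the global minimum.
f = lambda x: (x[0]**2 - 1)**2 + 0.2 * x[0] + x[1]**2
grad_f = lambda x: np.array([4 * x[0] * (x[0]**2 - 1) + 0.2, 2 * x[1]])

def random_restarts(f, grad_f, n_restarts=20, dim=2, box=3.0, seed=0):
    """Run a plain fixed-step gradient descent from several random
    starting points and keep the best; all parameters illustrative."""
    rng = np.random.default_rng(seed)
    best_x, best_f = None, np.inf
    for _ in range(n_restarts):
        x = rng.uniform(-box, box, size=dim)   # random initialization
        for _ in range(500):
            x = x - 0.05 * grad_f(x)           # fixed-step inner descent
        if f(x) < best_f:
            best_x, best_f = x, f(x)
    return best_x, best_f

best_x, best_f = random_restarts(f, grad_f)
```

A single run of plain gradient descent started in the wrong basin would settle near x0 = +1; with 20 restarts, at least one start almost surely lands in the basin of the global minimum near x0 = −1.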