Published by Joseph Nash. Modified over 11 years ago.
1
Blind online optimization
Gradient descent without a gradient
Abie Flaxman (CMU), Adam Tauman Kalai (TTI), Brendan McMahan (CMU)
2
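Standard convex optimization
Convex feasible set $S \subseteq \mathbb{R}^d$
Concave function $f : S \to \mathbb{R}$
Goal: find $x$ with $f(x) \ge \max_{z \in S} f(z) - \epsilon = f(x^*) - \epsilon$
[Figure: feasible set in $\mathbb{R}^d$ with the maximizer $x^*$ marked]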
3
Steepest ascent
Move in the direction of steepest ascent
Compute $f'(x)$ ($\nabla f(x)$ in higher dimensions)
Works for convex optimization (and many other problems)
[Figure: successive iterates $x_1, x_2, x_3, x_4$]
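The steepest-ascent step can be sketched in a few lines. This is only an illustration: the concave profit function, its derivative `f_prime`, the step size, and the iteration count are assumptions chosen for the example, not values from the talk.

```python
def steepest_ascent(grad, x0, step=0.1, iters=200):
    """Repeatedly move in the direction of the derivative (gradient)."""
    x = x0
    for _ in range(iters):
        x = x + step * grad(x)
    return x


# Hypothetical concave profit f(x) = -(x - 3)^2, maximized at x = 3.
def f_prime(x):
    return -2.0 * (x - 3.0)


x_hat = steepest_ascent(f_prime, x0=0.0)
```

With a fixed step of 0.1 the distance to the maximizer shrinks by a constant factor each iteration, so `x_hat` ends up very close to 3.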
4
Typical application
Company produces certain numbers of cars per month
Vector $x \in \mathbb{R}^d$ (#Corollas, #Camrys, …)
Profit of company is concave function of production vector
Maximize total (eq. average) profit
PROBLEMS
5
Problem definition and results
Sequence of unknown concave functions
Period $t$: pick $x_t \in S$ ($S$ convex), find out only $f_t(x_t)$
Theorem: [statement not preserved in the transcript]
6
Online model
Holds for arbitrary sequences
Stronger than stochastic model:
–$f_1, f_2, \ldots$ i.i.d. from $D$
–$x^* = \arg\max_{x \in S} \mathbb{E}_D[f(x)]$
Expected regret: [formula not preserved in the transcript]
7
Outline
Problem definition
Simple algorithm
Analysis sketch
Variations
Related work & applications
8
First try
Zinkevich 03: If we could only compute gradients…
[Figure: PROFIT vs. #CAMRYS; curves $f_1, \ldots, f_4$ with plays $x_1, \ldots, x_4$, observed values $f_1(x_1), \ldots, f_4(x_4)$, and $x^*$]
9
Idea: one point gradient
With probability ½, estimate $= f(x + \delta)/\delta$
With probability ½, estimate $= -f(x - \delta)/\delta$
$\mathbb{E}[\text{estimate}] \approx f'(x)$
[Figure: PROFIT vs. #CAMRYS with points $x$, $x+\delta$, $x-\delta$]
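The one-point estimate can be checked numerically. The sketch below uses a small offset written as `delta`; the profit function `f`, the point `x`, the sample count, and the seed are all assumptions made for illustration. The expectation of the estimate is exactly the central difference $(f(x+\delta) - f(x-\delta)) / (2\delta)$, which approximates the derivative.

```python
import random


def one_point_estimate(f, x, delta, rng):
    """One function evaluation per call: +f(x + delta)/delta or
    -f(x - delta)/delta, each with probability 1/2."""
    if rng.random() < 0.5:
        return f(x + delta) / delta
    return -f(x - delta) / delta


def expected_estimate(f, x, delta):
    """Exact expectation of the estimate: the central difference."""
    return 0.5 * f(x + delta) / delta - 0.5 * f(x - delta) / delta


def f(x):
    # Hypothetical concave profit, maximized at x = 3.
    return -(x - 3.0) ** 2


rng = random.Random(0)
x, delta = 1.0, 0.1
samples = [one_point_estimate(f, x, delta, rng) for _ in range(200000)]
mean_estimate = sum(samples) / len(samples)
```

Each individual sample is far from the true derivative (its magnitude scales like $1/\delta$), so the estimate is only useful on average; this is the price of seeing a single evaluation per period.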
10
d-dimensional online algorithm
[Figure: feasible set $S$ with iterates $x_1, x_2, x_3, x_4$]
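The slide shows the d-dimensional algorithm only as a picture, so the following is a guess at one plausible form rather than the authors' stated method. It assumes the feasible set is the unit ball, perturbs the current point along a random unit direction `u`, uses `(d / delta) * f_t(x_t) * u` as the gradient estimate, and projects back onto a slightly shrunken ball. Every name, parameter value, and the test function here is hypothetical.

```python
import math
import random


def random_unit_vector(d, rng):
    """Uniform direction on the unit sphere via normalized Gaussians."""
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def project_to_ball(y, radius):
    """Euclidean projection onto the origin-centered ball of given radius."""
    norm = math.sqrt(sum(c * c for c in y))
    if norm <= radius:
        return y
    return [c * radius / norm for c in y]


def blind_online_ascent(fs, d, delta, eta, rng):
    """Play x_t = y_t + delta * u_t, observe only f_t(x_t), then step y_t
    along the one-point estimate. S is assumed to be the unit ball; y_t is
    kept in the ball of radius 1 - delta so every play x_t stays inside S."""
    y = [0.0] * d
    plays = []
    for f in fs:
        u = random_unit_vector(d, rng)
        x = [yi + delta * ui for yi, ui in zip(y, u)]
        value = f(x)  # the only feedback received this period
        plays.append(x)
        g = [(d / delta) * value * ui for ui in u]
        y = project_to_ball([yi + eta * gi for yi, gi in zip(y, g)], 1.0 - delta)
    return plays


# Hypothetical concave payoff with its maximum at (0.5, -0.3).
def payoff(x):
    return 1.0 - (x[0] - 0.5) ** 2 - (x[1] + 0.3) ** 2


plays = blind_online_ascent([payoff] * 2000, d=2, delta=0.2, eta=0.01,
                            rng=random.Random(1))
```

Projecting onto the ball of radius `1 - delta` rather than onto the unit ball itself is a design choice that keeps the perturbed play feasible.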
11
Outline
Problem definition
Simple algorithm
Analysis sketch
Variations
Related work & applications
12
Analysis ingredients
E[1-point estimate] is gradient of [symbol not preserved in the transcript]
[symbol not preserved in the transcript] is small
Online gradient ascent analysis [Z03]
Online expected gradient ascent analysis
(Hidden complications)
13
1-pt gradient analysis
[Figure: PROFIT vs. #CAMRYS with points $x+\delta$ and $x-\delta$]
14
1-pt gradient analysis (d-dim)
E[1-point estimate] is gradient of [symbol not preserved in the transcript]
[symbol not preserved in the transcript] is small
15
Online gradient ascent [Z03]
(concave, bounded gradient)
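The formulas on this slide did not survive extraction. As a rough illustration of projected online gradient ascent in the full-information setting, the sketch below applies the update $x_{t+1} = P_S(x_t + \eta_t \nabla f_t(x_t))$ on the interval $S = [0, 1]$. The step size $\eta_t = 1/\sqrt{t}$, the alternating quadratic payoffs, and all names are assumptions made for the example.

```python
import math


def online_gradient_ascent(grads, x0, lo, hi):
    """Play x_t, then take a gradient step on f_t and clip back to [lo, hi]."""
    x = x0
    plays = []
    for t, grad in enumerate(grads, start=1):
        plays.append(x)
        eta = 1.0 / math.sqrt(t)  # assumed step-size schedule
        x = min(hi, max(lo, x + eta * grad(x)))
    return plays


# Hypothetical sequence: concave payoffs whose peaks alternate between 0.2 and 0.8.
centers = [0.2 if t % 2 == 0 else 0.8 for t in range(1000)]
fs = [lambda x, c=c: -(x - c) ** 2 for c in centers]
grads = [lambda x, c=c: -2.0 * (x - c) for c in centers]

plays = online_gradient_ascent(grads, x0=0.0, lo=0.0, hi=1.0)
total = sum(f(x) for f, x in zip(fs, plays))
# Best fixed point in hindsight, searched on a grid of 101 candidates.
best_fixed = max(sum(f(z / 100.0) for f in fs) for z in range(101))
average_regret = (best_fixed - total) / len(fs)
```

Here `average_regret` compares the algorithm's total payoff with the best single point chosen in hindsight; with a bounded domain and bounded gradients it shrinks as the number of periods grows.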
16
Expected gradient ascent analysis
Regular deterministic gradient ascent on $g_t$
(concave, bounded gradient)
17
Adaptive adversary…
18
Hidden complication…
[Figure: feasible set $S$]
19
[Figure: feasible set $S$]
20
[Figure: feasible set $S$]
21
Thin sets are bad
[Figure: feasible set $S$]
22
Hidden complication…
Round sets are good
…reshape into isotropic position [LV03]
23
Outline
Problem definition
Simple algorithm
Analysis sketch
Variations
Related work & applications
24
Variations
Works against adaptive adversary
–Chooses $f_t$ knowing $x_1, x_2, \ldots, x_{t-1}$
Also works if we only get a noisy estimate of $f_t(x_t)$, i.e. $\mathbb{E}[h_t(x_t) \mid x_t] = f_t(x_t)$
[Formula not preserved in the transcript; its annotations read "diameter" and "gradient bound"]
25
Related convex optimization
Columns: Sighted (see entire function(s)), Blind (evaluations only)
Regular (single f)
  Sighted: Gradient descent, ...
  Blind: Ellipsoid, Random walk [BV02], Sim. annealing [KV05], Finite difference
Stochastic (dist over fs or dist over errors)
  Sighted: Gradient descent (stoch.)
  Blind: 1-pt. gradient appx. [G89,S97], Finite difference
Online ($f_1, f_2, f_3, \ldots$)
  Sighted: Gradient descent (online) [Z03]
  Blind: 1-pt. gradient appx. [BKM04], Finite difference [Kleinberg04]
26
Related discrete optimization
Linear function(s) over discrete set
Columns: Sighted (see entire function(s)), Blind aka bandit (evaluations only)
Regular (single f)
  Sighted: Shortest path, max, …
Stochastic (dist over fs)
  Sighted: Huffman trees, …
Online ($f_1, f_2, f_3, \ldots$)
  Sighted: Weighted majority, …, Online linear optimization [Hannan57,KV03]
  Blind: Adversarial bandits, Blind linear optimization [AK04, MB04 (adaptive adversary)]
27
Switching lanes (experts)
[Figure: numeric example, values not recoverable from the transcript; set $S$]
28
Multi-armed bandit (experts)
[R52,ACFS95,…]
[Figure: numeric example, values not recoverable from the transcript; set $S$]
29
Driving to work (online routing)
Exponentially many paths…
Exponentially many slot machines?
Finite dimensions
Exploration/exploitation tradeoff
[TW02,KV02, AK04,BM04]
[Figure: set $S$]
30
Online product design
31
High dimensions
One-dimensional problem easy
–Discretize, special case of multi-armed bandit problem
–$1/\epsilon$ slot machines
–No need for convexity
d-dimensional problem harder
–Discretizing at granularity $\epsilon$
–Exp many $(1/\epsilon)^d$ slot machines $\Rightarrow$ exponential regret
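The blow-up from discretizing can be made concrete with a short count. The sketch assumes the feasible set is the unit cube $[0, 1]^d$ and that the granularity, written `eps`, divides 1 evenly; both are illustrative assumptions.

```python
def grid_size(eps, d):
    """Number of grid cells (slot machines) when [0, 1]^d is discretized
    at granularity eps along each axis: (1/eps)^d."""
    per_axis = round(1.0 / eps)
    return per_axis ** d


sizes = {d: grid_size(0.1, d) for d in (1, 2, 5, 10)}
```

At granularity 0.1, one dimension needs 10 slot machines while ten dimensions need 10 billion, which is why treating the problem as a plain multi-armed bandit stops being practical as the dimension grows.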
32
Non-linear applications
33
Conclusions and future work
Can learn to optimize a sequence of unrelated functions from evaluations
Answer to: What is the sound of one hand clapping?
Applications
–Cholesterol
–Paper airplanes
–Advertising
Future work
–Many players using same algorithm (game theory)