Blind online optimization Gradient descent without a gradient Abie Flaxman CMU Adam Tauman Kalai TTI Brendan McMahan CMU.


1 Blind online optimization: Gradient descent without a gradient. Abie Flaxman (CMU), Adam Tauman Kalai (TTI), Brendan McMahan (CMU)

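2 Standard convex optimization. Convex feasible set $S \subset \mathbb{R}^d$. Concave function $f : S \to \mathbb{R}$. Goal: find $x$ with $f(x) \ge \max_{z \in S} f(z) - \epsilon = f(x^*) - \epsilon$. (Figure: the set $S$ in $\mathbb{R}^d$ with the optimum $x^*$ marked.)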

3 Steepest ascent. Move in the direction of steepest ascent. Compute $f'(x)$ ($\nabla f(x)$ in higher dimensions). Works for convex optimization (and many other problems). (Figure: iterates $x_1, x_2, x_3, x_4$.)
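A minimal sketch of steepest ascent, assuming the gradient can be computed directly. The objective, step size, and iteration count are hypothetical choices for illustration, and the sketch ignores the feasible set (no projection).

```python
def gradient_ascent(grad, x0, step=0.1, iters=200):
    """Repeatedly move x a small step in the direction of the gradient."""
    x = list(x0)
    for _ in range(iters):
        g = grad(x)
        x = [xi + step * gi for xi, gi in zip(x, g)]
    return x

def example_grad(x):
    # Gradient of the hypothetical concave objective
    # f(x) = -(x[0] - 1)^2 - (x[1] + 2)^2, which is maximized at (1, -2).
    return [-2.0 * (x[0] - 1.0), -2.0 * (x[1] + 2.0)]

x_star = gradient_ascent(example_grad, [0.0, 0.0])
```

With this step size each coordinate's distance to the maximizer shrinks by a constant factor per iteration, so `x_star` ends up very close to (1, -2).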

4 Typical application. Company produces certain numbers of cars per month. Vector $x \in \mathbb{R}^d$ (#Corollas, #Camrys, …). Profit of company is concave function of production vector. Maximize total (eq. average) profit. (Callout: PROBLEMS.)

5 Problem definition and results. Sequence of unknown concave functions. Period $t$: pick $x_t \in S$ (convex), find out only $f_t(x_t)$. Theorem: [statement missing from the transcript]

6 Online model. Holds for arbitrary sequences. Stronger than stochastic model: $f_1, f_2, \ldots$ i.i.d. from $D$; $x^* = \arg\min_{x \in S} E_D[f(x)]$. (Callout: expected regret.)
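The transcript names regret without defining it. The sketch below assumes the usual meaning in the profit-maximizing setting of the earlier slides: the total profit of the best single point in hindsight minus the total profit actually earned. The functions, plays, and candidate grid are hypothetical.

```python
def regret(fs, xs, candidates):
    """Regret of the plays xs against the best single point among candidates."""
    earned = sum(f(x) for f, x in zip(fs, xs))
    best_fixed = max(sum(f(z) for f in fs) for z in candidates)
    return best_fixed - earned

# Two hypothetical concave profit functions with peaks at 0 and 1.
fs = [lambda x, c=c: -(x - c) ** 2 for c in (0.0, 1.0)]
xs = [0.0, 0.0]                       # the points played in periods 1 and 2
grid = [i / 10 for i in range(11)]    # candidate fixed points in [0, 1]

r = regret(fs, xs, grid)
```

Here the best fixed point is 0.5 with total profit -0.5, while the plays earn -1.0, so the regret is 0.5.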

7 Outline: Problem definition; Simple algorithm; Analysis sketch; Variations; Related work & applications

8 First try. (Figure: profit versus #Camrys, showing functions $f_1, \ldots, f_4$, the played points $x_1, \ldots, x_4$ with values $f_1(x_1), \ldots, f_4(x_4)$, and the optimum $x^*$.) Zinkevich 03: If we could only compute gradients…

9 Idea: one point gradient. (Figure: profit versus #Camrys, with points $x$, $x+\delta$, $x-\delta$.) With probability ½, estimate $= f(x+\delta)/\delta$. With probability ½, estimate $= -f(x-\delta)/\delta$. $E[\text{estimate}] \approx f'(x)$.
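A small sketch of the one-point estimate from this slide, written with a step size called `delta` (the symbol is lost in the transcript, so the name is an assumption). The profit curve and the sample count are hypothetical. The expectation of the estimate is the central finite difference, which is what makes it approximate the derivative.

```python
import random

def one_point_estimate(f, x, delta, rng):
    """One evaluation of f: +f(x+delta)/delta or -f(x-delta)/delta, each w.p. 1/2."""
    if rng.random() < 0.5:
        return f(x + delta) / delta
    return -f(x - delta) / delta

def expected_estimate(f, x, delta):
    """Exact mean of the estimate: (f(x+delta) - f(x-delta)) / (2*delta)."""
    return 0.5 * f(x + delta) / delta - 0.5 * f(x - delta) / delta

def profit(x):
    # Hypothetical concave profit curve, for illustration only; f'(1) = 4.
    return 10.0 - (x - 3.0) ** 2

rng = random.Random(0)
samples = [one_point_estimate(profit, 1.0, 0.1, rng) for _ in range(200000)]
sample_mean = sum(samples) / len(samples)
exact_mean = expected_estimate(profit, 1.0, 0.1)
```

Each individual sample is far from the derivative (its magnitude is roughly `f(x)/delta`), so the estimate is only useful on average. That high variance is the price of using a single evaluation.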

10 d-dimensional online algorithm. (Figure: iterates $x_1, x_2, x_3, x_4$ inside the feasible set $S$.)
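The slide shows only a picture, so the following is a sketch of one plausible d-dimensional version, not the authors' exact procedure. Assumptions: the feasible set is a Euclidean ball, the step size and `delta` are fixed constants, and the one-point estimate generalizes to `(d / delta) * f(x + delta*u) * u` for a random unit vector `u`. All function and parameter names are hypothetical.

```python
import math
import random

def random_unit_vector(d, rng):
    """Uniform direction on the unit sphere via normalized Gaussians."""
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]

def project_to_ball(x, radius):
    """Euclidean projection onto the origin-centered ball of the given radius."""
    norm = math.sqrt(sum(c * c for c in x))
    if norm <= radius:
        return x
    return [c * radius / norm for c in x]

def blind_gradient_ascent(evaluate, d, rounds, radius=1.0, delta=0.2,
                          step=0.01, seed=0):
    """Each round: play one perturbed point, observe one number, take a step."""
    rng = random.Random(seed)
    x = [0.0] * d
    plays = []
    for t in range(rounds):
        u = random_unit_vector(d, rng)
        y = [xi + delta * ui for xi, ui in zip(x, u)]  # point actually played
        value = evaluate(t, y)                         # the only feedback
        g = [(d / delta) * value * ui for ui in u]     # one-point gradient estimate
        x = [xi + step * gi for xi, gi in zip(x, g)]
        x = project_to_ball(x, radius - delta)         # keeps x + delta*u feasible
        plays.append(y)
    return x, plays

def profit(t, y):
    # Hypothetical concave profit, identical in every round for simplicity.
    return 1.0 - sum((yi - 0.3) ** 2 for yi in y)

final_x, plays = blind_gradient_ascent(profit, d=2, rounds=1000)
```

Projecting onto a slightly smaller ball is a design choice in this sketch: it guarantees that every perturbed point that is actually played still lies in the feasible set.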

11 Outline: Problem definition; Simple algorithm; Analysis sketch; Variations; Related work & applications

12 Analysis ingredients. E[1-point estimate] is gradient of [symbol missing from the transcript]; [quantity missing from the transcript] is small. Online gradient ascent analysis [Z03]. Online expected gradient ascent analysis. (Hidden complications)

13 1-pt gradient analysis. (Figure: profit versus #Camrys, with the two evaluation points to the right and left of $x$.)

14 1-pt gradient analysis (d-dim). E[1-point estimate] is gradient of [symbol missing from the transcript]; [quantity missing from the transcript] is small.
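The two statements on this slide lost their symbols in extraction. One standard reading, offered here as an assumption and not as the slide's exact wording, is that the estimate is an unbiased gradient of a smoothed version of $f$, and that smoothing changes $f$ only slightly. The names $\hat{f}$, $\delta$, $\mathbb{B}$ (unit ball), $\mathbb{S}$ (unit sphere), and $L$ (Lipschitz constant) are introduced for illustration.

```latex
% Smoothed function: average of f over a ball of radius \delta around x.
\hat{f}(x) = \mathbb{E}_{v \in \mathbb{B}}\bigl[f(x + \delta v)\bigr]

% One dimension: the mean of the one-point estimate is exactly \hat{f}'(x).
\hat{f}(x) = \frac{1}{2\delta}\int_{-\delta}^{\delta} f(x+s)\,ds,
\qquad
\hat{f}'(x) = \frac{f(x+\delta) - f(x-\delta)}{2\delta}

% d dimensions: u is uniform on the unit sphere.
\nabla \hat{f}(x) = \frac{d}{\delta}\,
  \mathbb{E}_{u \in \mathbb{S}}\bigl[f(x + \delta u)\,u\bigr]

% If f is L-Lipschitz, smoothing moves it by at most L\delta.
\bigl|\hat{f}(x) - f(x)\bigr| \le L\delta
```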

15 Online gradient ascent [Z03] (concave, bounded gradient)
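A sketch of projected online gradient ascent in one dimension, assuming gradients are observable as in [Z03]. The feasible interval, the $1/\sqrt{t}$ step size, and the sequence of functions are assumptions chosen for illustration.

```python
import math

def online_gradient_ascent(grads, lo, hi, x0):
    """Play x_t, then step along grad f_t(x_t) and clip back into [lo, hi]."""
    x = x0
    plays = []
    for t, grad in enumerate(grads, start=1):
        plays.append(x)
        eta = 1.0 / math.sqrt(t)                 # decreasing step size
        x = min(hi, max(lo, x + eta * grad(x)))  # projection onto the interval
    return plays

# Hypothetical sequence f_t(x) = -(x - c_t)^2 with alternating peaks c_t.
centers = [0.2 if t % 2 == 0 else 0.8 for t in range(400)]
grads = [lambda x, c=c: -2.0 * (x - c) for c in centers]
plays = online_gradient_ascent(grads, 0.0, 1.0, x0=0.0)
```

In this example the best fixed point in hindsight is 0.5, and the plays settle into a small oscillation around it as the step size shrinks.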

16 Expected gradient ascent analysis. Regular deterministic gradient ascent on $g_t$ (concave, bounded gradient).

17 Adaptive adversary…

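18 Hidden complication… (Figure: feasible set $S$.)

19 (Figure: feasible set $S$.)

20 (Figure: feasible set $S$.)

21 Thin sets are bad. (Figure: feasible set $S$.)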

22 Hidden complication… Round sets are good …reshape into isotropic position [LV03]

23 Outline: Problem definition; Simple algorithm; Analysis sketch; Variations; Related work & applications

24 Variations. Works against adaptive adversary: chooses $f_t$ knowing $x_1, x_2, \ldots, x_{t-1}$. Also works if we only get a noisy estimate of $f_t(x_t)$, i.e. $E[h_t(x_t) \mid x_t] = f_t(x_t)$. (Callouts on a formula missing from the transcript: diameter, gradient bound.)

25 Related convex optimization. Table columns: Sighted (see entire function(s)); Blind (evaluations only). Table rows: Regular (single f); Stochastic (dist. over f's or dist. over errors); Online ($f_1, f_2, f_3, \ldots$). Table entries, in transcript order (cell positions are not recoverable): Gradient descent (stoch.); Gradient descent, ..., Ellipsoid, Random walk [BV02], Sim. annealing [KV05], Finite difference; Gradient descent (online) [Z03]; 1-pt. gradient appx. [BKM04]; Finite difference [Kleinberg04]; 1-pt. gradient appx. [G89, S97]; Finite difference.

26 Related discrete optimization. Linear function(s) over discrete set. Table columns: Sighted (see entire function(s)); Blind aka bandit (evaluations only). Regular (single f): Shortest path, max, … Stochastic (dist. over f's): Huffman trees, … Online ($f_1, f_2, f_3, \ldots$): Weighted majority, …, Online linear optimization [Hannan57, KV03]; Adversarial bandits, Blind linear optimization [AK04, MB04 (adaptive adversary)].

27 Switching lanes (experts). (Figure: numeric values per lane and the set $S$; the numbers are not recoverable from the transcript.)

28 Multi-armed bandit (experts) [R52, ACFS95, …]. (Figure: numeric values per lane and the set $S$; the numbers are not recoverable from the transcript.)

29 Driving to work (online routing). Exponentially many paths… Exponentially many slot machines? Finite dimensions. Exploration/exploitation tradeoff. [TW02, KV02, AK04, BM04]

30 Online product design

31 High dimensions. One-dimensional problem easy: discretize, special case of multi-armed bandit problem; $1/\epsilon$ slot machines; no need for convexity. d-dimensional problem harder: discretizing at granularity $\epsilon$ gives exponentially many, $(1/\epsilon)^d$, slot machines $\Rightarrow$ exponential regret.
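The blow-up from discretizing is easy to check with a little arithmetic. The sketch below assumes the feasible set is the unit cube and that each coordinate is cut into `1/eps` cells; the function name and the granularity symbol are hypothetical.

```python
def num_slot_machines(eps, d):
    """Grid cells when each of d coordinates of [0, 1]^d is cut into 1/eps pieces."""
    per_axis = round(1.0 / eps)
    return per_axis ** d

# Number of arms at granularity 0.1 for a few dimensions.
counts = {d: num_slot_machines(0.1, d) for d in (1, 2, 5, 10)}
```

At granularity 0.1 a one-dimensional problem needs 10 arms, while a ten-dimensional one already needs ten billion, far too many for a bandit algorithm to try even once each.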

32 Non-linear applications

33 Conclusions and future work. Can learn to optimize a sequence of unrelated functions from evaluations. Answer to: What is the sound of one hand clapping? Applications: cholesterol, paper airplanes, advertising. Future work: many players using same algorithm (game theory).

