Connections between Learning Theory, Game Theory, and Optimization. Maria Florina (Nina) Balcan. Lecture 14, October 7th, 2010.


Improved Equilibria via Public Service Advertising

Good equilibria, Bad equilibria Many games have both bad and good equilibria. In some places, everyone drives their own car. In some, everybody uses and pays for good public transit.

Fair cost-sharing: n players in a weighted directed graph G. Player i wants to get from s_i to t_i, and the players who use an edge share its cost.

Fair cost-sharing: n players in a directed graph G; each edge e costs c_e. Player i wants to get from s_i to t_i, and each player wants to minimize his own cost. All players share the cost of the edges they use with the others using them. Example: two parallel edges from s to t, one of cost 1 and one of cost n. Good equilibrium: all use the edge of cost 1 (paying 1/n each). Bad equilibrium: all use the edge of cost n (paying 1 each).
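The two-edge example can be checked by brute force. The sketch below is only an illustration (the function names are made up here): it enumerates every split of n players over the two parallel edges and keeps the splits where no player strictly gains by switching.

```python
from fractions import Fraction

def fair_share(edge_cost, users):
    """Each of `users` players on an edge pays an equal share of its cost."""
    return Fraction(edge_cost, users)

def is_nash(n, on_cheap):
    """True if `on_cheap` players on the cost-1 edge and the remaining
    players on the cost-n edge form a pure Nash equilibrium."""
    cost = {"cheap": 1, "expensive": n}
    load = {"cheap": on_cheap, "expensive": n - on_cheap}
    for here, there in (("cheap", "expensive"), ("expensive", "cheap")):
        if load[here] == 0:
            continue  # nobody on this edge, so nobody can deviate from it
        current = fair_share(cost[here], load[here])
        after_move = fair_share(cost[there], load[there] + 1)
        if after_move < current:  # a strict improvement breaks the equilibrium
            return False
    return True

n = 10
equilibria = [k for k in range(n + 1) if is_nash(n, k)]
social_costs = {k: (1 if k else 0) + (n if k < n else 0) for k in equilibria}
```

Here `equilibria` lists the number of players on the cheap edge in each equilibrium. The all-on-the-expensive-edge state is an equilibrium only weakly: a deviating player would pay exactly 1 on the cheap edge, the same as now.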

Inefficiency of equilibria, PoA and PoS. Price of Anarchy (PoA): ratio of the worst Nash equilibrium to OPT [Koutsoupias-Papadimitriou '99]. Price of Stability (PoS): ratio of the best Nash equilibrium to OPT [Anshelevich et al., 2004]. Significant effort has been spent on understanding these in CS. E.g., for fair cost-sharing, PoS is log(n), whereas PoA is n. Reference: "Algorithmic Game Theory", Nisan, Roughgarden, Tardos, Vazirani.

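Fair Cost Sharing: n players in a directed graph G; each edge e costs c_e. Player i wants to get from s_i to t_i and minimize its cost. All players share the cost of the edges they use with the others using them. PoA is n; PoS is log(n). PoA is O(n): in any Nash equilibrium no player pays more than OPT, since otherwise he could switch to his path in OPT and pay at most its full cost. PoA is Ω(n): two parallel edges from s to t, of cost 1 and n, where all players using the edge of cost n is an equilibrium.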

Fair Cost Sharing: n players in a directed graph G; each edge e costs c_e. Player i wants to get from s_i to t_i and minimize its cost. All players share the cost of the edges they use with the others using them. PoA is n; PoS is log(n). PoS is Ω(log(n)): [Figure: sources s_1, ..., s_n and a common sink t. Player i has a private edge to t of cost 1/i (costs 1, 1/2, ..., 1/(n-1), 1/n); zero-cost edges lead from the sources to a shared edge into t of cost 1+ε.]
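A minimal sketch of why this instance gives a logarithmic gap, assuming the figure is the standard one: player i owns a private edge of cost 1/i, and all players can instead share one edge of cost 1+ε. The private costs and the 1+ε shared cost are assumptions read off the garbled figure labels. Starting from the shared solution, players peel off one at a time.

```python
from fractions import Fraction

def best_response_from_opt(n, eps):
    """Run improving moves starting from 'everyone on the shared edge'."""
    private = {i: Fraction(1, i) for i in range(1, n + 1)}  # player i's own edge
    shared = 1 + eps                       # assumed cost of the shared edge
    on_shared = set(private)               # OPT: all players share
    while True:
        mover = None
        for i in private:
            if i in on_shared:
                improves = private[i] < shared / len(on_shared)
            else:
                improves = shared / (len(on_shared) + 1) < private[i]
            if improves:
                mover = i
                break
        if mover is None:
            return on_shared               # no player can improve: pure Nash
        on_shared ^= {mover}               # the mover switches edges

def social_cost(n, eps, on_shared):
    own = sum(Fraction(1, i) for i in range(1, n + 1) if i not in on_shared)
    return own + ((1 + eps) if on_shared else 0)

n, eps = 6, Fraction(1, 100)
final = best_response_from_opt(n, eps)
ratio = social_cost(n, eps, final) / social_cost(n, eps, set(range(1, n + 1)))
```

In this run the dynamics end with every player on a private edge, at total cost H(n) = 1 + 1/2 + ... + 1/n, against roughly 1 for the shared solution.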

Fair Cost Sharing: n players in a directed graph G; each edge e costs c_e. Player i wants to get from s_i to t_i and minimize its cost. All players share the cost of the edges they use with the others using them. PoA is n; PoS is log(n). PoS is O(log(n)): potential function argument.

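Fair Cost Sharing: n players in a directed graph G; each edge e costs c_e. Player i wants to get from s_i to t_i and minimize its cost. All players share the cost of the edges they use with the others using them. Potential function: $\Phi(S) = \sum_e \sum_{k=1}^{n_e} c_e/k = \sum_e c_e H(n_e)$, where $n_e$ is the number of players using edge e in S and $H(k) = 1 + 1/2 + \cdots + 1/k$. The social cost of S is $\mathrm{cost}(S) = \sum_{e : n_e \ge 1} c_e$.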

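Fair Cost Sharing: n players in a directed graph G; each edge e costs c_e. Player i wants to get from s_i to t_i and minimize its cost. All players share the cost of the edges they use with the others using them. Potential: $\Phi(S) = \sum_e c_e H(n_e)$, where $n_e$ is the number of players using e in S. Claim: when a player moves, the change in the player's cost equals the change in potential. Proof: player i moves, taking the state from S to S'; let A be the edges on i's path in S but not in S', and B the edges on its path in S' but not in S. Its change in cost: $\sum_{e \in B} c_e/(n_e+1) - \sum_{e \in A} c_e/n_e$, with $n_e$ counted in S. Change in the potential: $\Phi(S') - \Phi(S) = \sum_{e \in B} c_e/(n_e+1) - \sum_{e \in A} c_e/n_e$, since each edge in B gains one user and each edge in A loses one.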
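The claim can be checked numerically on a toy instance. This is a sketch under assumptions: the edge names and costs are invented for illustration, a path is represented simply as a set of edge names, and the potential is taken to be the standard one for fair cost sharing, the sum over edges of c_e times H(n_e).

```python
from collections import Counter
from fractions import Fraction

def harmonic(k):
    return sum(Fraction(1, j) for j in range(1, k + 1))

def loads(profile):
    """Number of players using each edge."""
    return Counter(e for path in profile for e in path)

def player_cost(profile, i, edge_cost):
    n = loads(profile)
    return sum(Fraction(edge_cost[e], n[e]) for e in profile[i])

def potential(profile, edge_cost):
    n = loads(profile)
    return sum(edge_cost[e] * harmonic(n[e]) for e in n)

# Hypothetical instance: four edges, three players.
edge_cost = {"a": 6, "b": 4, "c": 3, "d": 5}
S = [frozenset("ab"), frozenset("bc"), frozenset("ab")]
S_new = [frozenset("ab"), frozenset("bc"), frozenset("cd")]  # player 2 switches

delta_cost = player_cost(S_new, 2, edge_cost) - player_cost(S, 2, edge_cost)
delta_phi = potential(S_new, edge_cost) - potential(S, edge_cost)
```

Both differences come out to 13/6 here: the mover's cost rises from 13/3 to 13/2, and the potential rises by the same amount.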


Fair Cost Sharing: n players in a directed graph G; each edge e costs c_e. Player i wants to get from s_i to t_i and minimize its cost. All players share the cost of the edges they use with the others using them. PoS is O(log(n)): potential function argument. Iterate best-response dynamics starting from an optimal solution [i.e., while there is a player that can improve, pick an arbitrary such player and let him play a best response]. The potential always decreases and there is a finite number of states, so we reach a pure Nash equilibrium. Because the potential does not increase and $\mathrm{cost}(S) \le \Phi(S) \le H(n) \cdot \mathrm{cost}(S)$, that equilibrium has cost $\le H(n) \cdot \mathrm{OPT}$.

Congestion games more generally: Game defined by n players and m resources. The cost of a resource j is a function f_j(n_j) of the number n_j of players using it. Each player i chooses a set of resources (e.g., a path) from a collection S_i of allowable sets of resources (e.g., paths from s_i to t_i). The cost incurred by player i is the sum, over all resources being used, of the cost of the resource. Generic potential function: $\Phi(S) = \sum_j \sum_{k=1}^{n_j} f_j(k)$. Best-response dynamics always gives an equilibrium.
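Best-response dynamics in a general congestion game can be sketched as follows. The instance (three resources with invented cost functions, five players who each pick a single resource) is hypothetical, and the potential is assumed to be Rosenthal's, the sum over resources j of f_j(1) + ... + f_j(n_j).

```python
from collections import Counter

def loads(profile):
    return Counter(r for strategy in profile for r in strategy)

def cost(profile, i, f):
    """Player i pays f_j(n_j) for every resource j in its strategy."""
    n = loads(profile)
    return sum(f[r](n[r]) for r in profile[i])

def rosenthal(profile, f):
    n = loads(profile)
    return sum(f[r](k) for r in n for k in range(1, n[r] + 1))

def best_response_dynamics(profile, options, f):
    """Let one improving player at a time switch to a best response."""
    profile = list(profile)
    history = [rosenthal(profile, f)]
    while True:
        for i, choices in enumerate(options):
            def cost_if(s):
                return cost(profile[:i] + [s] + profile[i + 1:], i, f)
            best = min(choices, key=cost_if)
            if cost_if(best) < cost(profile, i, f):
                profile[i] = best
                history.append(rosenthal(profile, f))
                break
        else:
            return profile, history  # nobody can improve: pure Nash

# Hypothetical instance with integer costs.
f = {"x": lambda k: k, "y": lambda k: 2 * k, "z": lambda k: 3}
options = [[("x",), ("y",), ("z",)]] * 5
final, history = best_response_dynamics([("x",)] * 5, options, f)
```

`history` records the potential after each move. It strictly decreases, which is why the loop has to terminate.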

Congestion games more generally: Nice general class of games with many players. Always have a pure-strategy equilibrium. Have a potential function s.t. whenever a player switches, the potential drops by exactly that player's improvement, so best-response dynamics always gives an equilibrium. But there may be a large gap between the quality of the best and the worst equilibrium. Lots of work on understanding properties of these games and the quality of their equilibria.


Guiding from Bad to Good Standard motivation for PoS: If a central authority could suggest a low-cost Nash (ride public transit), and everyone followed the suggestion, then this would be stable. Price of Anarchy (PoA): ratio of worst Nash equilibrium to OPT. Price of Stability (PoS): ratio of best Nash equilibrium to OPT. Can a helpful authority encourage (guide) behavior to move from a bad state to a good state?

Guiding from Bad to Good [Balcan-Blum-Mansour, SODA 2009]: What if only some α fraction will pay attention? Can the authority guide behavior to a good state? Will it just snap back? How does this depend on α?

Main Model:
0. n players initially playing some arbitrary equilibrium.
1. Authority launches advertising, proposing joint action s_ad. Each player i follows with probability α. Call the players that follow receptive players.
2. Remaining (non-receptive) players fall to some arbitrary equilibrium for themselves, given the play of the receptive players.
3. All players follow best-response dynamics to an overall Nash equilibrium.
Notes: potential games, pure Nash equilibria. Social cost: [formula not preserved in the transcript].
[Figure: sources s_1, ..., s_n and sink t, with edges labeled 0, 0, 0 and k.]

Main Results:
- Fair cost sharing (PoS = log(n), PoA = n): if only a constant fraction α of the players follow the advice, then we can still get within O(1/α) of the PoS. Extends to cost-sharing + linear delays.
- Party affiliation games (PoS = 1, PoA = Θ(n²)): threshold behavior. For α > ½, can get ratio O(1), but for α < ½, ratio stays Ω(n²) (assume degrees ω(log n)).

Fair Cost Sharing (PoS = log(n), PoA = n): If only a constant fraction α of the players follow the advice, then we get within O(1/α) of the PoS. Note: this is the best you can hope for. E.g., k = 2αn. [Figure: sources s_1, ..., s_n and sink t, with edges labeled 0, 0, 0 and k.]
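The three phases of the model can be traced on this example. The sketch rests on several assumptions that are not in the transcript: each player has a private edge of cost 1 as its alternative to the shared edge of cost k, the number of receptive players is fixed at exactly αn rather than drawn at random, and in Phase 2 the non-receptive players settle into the equilibrium reached by improving moves from the all-private state.

```python
from fractions import Fraction

def final_social_cost(n, k, receptive):
    """Social cost after Phases 1-3 when `receptive` players follow the ad."""
    # Phase 1: receptive players follow the advertised OPT (the shared edge).
    on_shared = receptive
    # Phase 2: non-receptive players join while that is strictly cheaper
    # than staying on their private edge of cost 1 (assumed alternative).
    joined = 0
    while joined < n - receptive and Fraction(k, on_shared + 1) < 1:
        on_shared += 1
        joined += 1
    # Phase 3: everyone best-responds until no one can improve.
    while True:
        if on_shared > 0 and Fraction(k, on_shared) > 1:
            on_shared -= 1          # a sharer does better on its private edge
        elif on_shared < n and Fraction(k, on_shared + 1) < 1:
            on_shared += 1          # a private player does better by sharing
        else:
            break
    return (n - on_shared) + (k if on_shared else 0)

n, receptive = 100, 25              # alpha = 1/4
snap_back = final_social_cost(n, k=50, receptive=receptive)  # k = 2 * alpha * n
stays_good = final_social_cost(n, k=12, receptive=receptive)
```

With k = 2αn each receptive player would pay 2 on the shared edge, so in Phase 3 they all return to their private edges and the cost snaps back to n, which is 1/(2α) times OPT = k. With a smaller k the non-receptive players join in Phase 2 and the outcome is optimal.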

Fair Cost Sharing (PoS = log(n), PoA = n): If only a constant fraction α of the players follow the advice, then we get within O(1/α) of the PoS. Phase 1: the advertiser proposes OPT (any approximation also works). [Formula not preserved in the transcript; annotated "random vars".]

Fair Cost Sharing (PoS = log(n), PoA = n): If only a constant fraction α of the players follow the advice, then we get within O(1/α) of the PoS. Cost of non-receptive players at the end of Phase 2:
- In any Nash equilibrium, a non-receptive player i can't improve by switching to his path P_i^OPT in OPT.
- Moreover, this option is guaranteed to be at least as good as if the other non-receptive (NR) players didn't exist.
- Calculate the total cost of these guaranteed options. Rearrange sum...

Fair Cost Sharing (PoS = log(n), PoA = n): If only a constant fraction α of the players follow the advice, then we get within O(1/α) of the PoS. Cost of non-receptive players at the end of Phase 2. Cost of receptive players at the end of Phase 2. Use: X ~ Bi(n,p).
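The formula that followed "Use: X ~ Bi(n,p)" did not survive in the transcript. One standard fact about binomials that fits this step, offered here as an assumption about what was intended, is that E[1/(1+X)] = (1 - (1-p)^(n+1)) / ((n+1)p), which is at most 1/((n+1)p). It is useful because a player's share of an edge is the edge cost divided by one plus the (binomial) number of other receptive players on it. The sketch below checks the identity exactly.

```python
from fractions import Fraction
from math import comb

def expected_inverse(n, p):
    """Exact E[1/(1+X)] for X ~ Bi(n, p), summed over the distribution."""
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k) / (1 + k)
        for k in range(n + 1)
    )

def closed_form(n, p):
    return (1 - (1 - p) ** (n + 1)) / ((n + 1) * p)

n, p = 20, Fraction(1, 3)
lhs = expected_inverse(n, p)
rhs = closed_form(n, p)
upper = 1 / ((n + 1) * p)
```

Using `Fraction` keeps the comparison exact, so `lhs == rhs` is a genuine check of the identity for these parameters rather than a floating-point approximation.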

Fair Cost Sharing (PoS = log(n), PoA = n): If only a constant fraction α of the players follow the advice, then we get within O(1/α) of the PoS. Expected total cost at the end of Phase 2: O(OPT/α). In Phase 3, the potential argument shows behavior cannot get worse by more than an additional log(n) factor.

Cost Sharing, Extension (+ linear delays):
- Still get the same guarantee, but the proof is trickier.
- Problem: can't argue as if the remaining NR players didn't exist, since they add to delays.

Cost Sharing, Extension (+ linear delays):
- Still get the same guarantee, but the proof is trickier.
- Shadow game with respect to the non-receptive players: pure linear latency functions, with the offset defined by the equilibrium at the end of phase 2 (# users on e at the end of phase 2).
- This game has good PoA (5/2).

Cost Sharing, Extension (+ linear delays):
- Still get the same guarantee, but the proof is trickier.
- Shadow game: pure linear latency functions.
- The behavior of the NR players at the end of phase 2 is an equilibrium for this game too.
- Show that the cost of the non-receptive players at the end of step 2 is O(OPT/α).

Cost Sharing, Extension (+ linear delays):
- Still get the same guarantee, but the proof is trickier.
- Cost of the non-receptive players at the end of step 2: O(OPT/α).
- Still need to argue about the cost of the receptive players. Edge-by-edge charging:
- More receptive players: lose a factor of two compared to OPT.
- More non-receptive players: already paid for, lose a factor of two.

Party affiliation games Given graph G, each edge labeled + or -. Vertices have two actions: RED or BLUE. Pay 1 for each + edge with endpoints of different color, and each – edge with endpoints of same color. Special cases: All + edges is consensus game. All – edges is cut-game.

Party affiliation games: OPT is an equilibrium, so PoS = 1. But even for consensus games, PoA = Θ(n²). Example: a clique with a perfect matching removed, all edges labeled plus.
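The bad equilibrium in the clique-minus-matching example can be verified directly. In this sketch the vertex numbering, the choice of which matching to remove, and the red/blue split are illustrative assumptions; the cost rule is the one stated for party affiliation games (pay 1 per + edge with differently colored endpoints and per - edge with equally colored endpoints).

```python
from itertools import combinations

def player_cost(v, colors, edges):
    """Number of v's incident edges that are violated under `colors`."""
    cost = 0
    for e, sign in edges.items():
        if v in e:
            (u,) = e - {v}
            same = colors[u] == colors[v]
            if (sign == "+" and not same) or (sign == "-" and same):
                cost += 1
    return cost

def is_equilibrium(colors, edges):
    """No vertex can strictly lower its cost by flipping its color."""
    for v in colors:
        flipped = dict(colors)
        flipped[v] = "BLUE" if colors[v] == "RED" else "RED"
        if player_cost(v, flipped, edges) < player_cost(v, colors, edges):
            return False
    return True

n = 12
half = n // 2
# Clique on n vertices minus the perfect matching {i, i + half}; all edges '+'.
edges = {
    frozenset((u, w)): "+"
    for u, w in combinations(range(n), 2)
    if abs(u - w) != half
}
split = {v: "RED" if v < half else "BLUE" for v in range(n)}
all_red = {v: "RED" for v in range(n)}
bad_edges = sum(1 for e in edges if len({split[v] for v in e}) == 2)
```

Every vertex has half - 1 neighbors of each color, so flipping never strictly helps and the split is a (weak) equilibrium with half * (half - 1) violated edges, on the order of n². The all-red coloring violates none.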

Party affiliation games (PoS = 1, PoA = Θ(n²)), lower bound:
- Threshold behavior: for α > ½, can get ratio O(1), but for α < ½, ratio stays Ω(n²) (assume degrees ω(log n)).
- Same example as for consensus PoA, but sparser across the cut: degree γn/8 across the cut, where γ = 1/2 − α.
- For large n, whp all nodes have at most a 1/2 − γ/2 fraction of their neighbors in R.
- Initially, each node has a γ/4 fraction of its neighbors of the other color. So players are "locked" into place.

Party affiliation games (PoS = 1, PoA = Θ(n²)), upper bound for consensus games:
- Threshold behavior: for α > ½, can get ratio O(1), but for α < ½, ratio stays Ω(n²) (assume degrees ω(log n)).
- Advertising strategy = follow OPT, e.g., all red.
- By Hoeffding, all nodes with degree at least log n/(α − 1/2)² have more than half of their neighbors in the set R, with probability 1 − 1/n.
- At the end of step two, all nodes are red.
Note: for general cut games, OPT might not have zero cost for each player.
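The Hoeffding step can be made concrete with a small calculation. This is a sketch under assumptions: each neighbor is taken to be receptive independently with probability α > 1/2, the bound used is the standard one-sided Hoeffding inequality for a sum of d independent 0/1 variables, and the values of n and α are arbitrary.

```python
import math

def hoeffding_failure_bound(degree, alpha):
    """Upper bound on P[at most half of `degree` neighbors are receptive]:
    the count deviates below its mean by degree * (alpha - 1/2), so the
    probability is at most exp(-2 * degree * (alpha - 1/2)^2)."""
    return math.exp(-2 * degree * (alpha - 0.5) ** 2)

n = 10_000
alpha = 0.6
# Smallest integer degree that is at least ln(n) / (alpha - 1/2)^2.
degree = math.ceil(math.log(n) / (alpha - 0.5) ** 2)
per_node = hoeffding_failure_bound(degree, alpha)
union_bound = n * per_node  # failure probability over all n nodes
```

At this degree the per-node failure probability is at most 1/n², so a union bound over the n nodes leaves total failure probability at most 1/n, matching the "with probability 1 − 1/n" on the slide.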

Party affiliation games (PoS = 1, PoA = Θ(n²)), upper bound for general party affiliation games:
- Threshold behavior: for α > ½, can get ratio O(1), but for α < ½, ratio stays Ω(n²) (assume degrees ω(log n)).
- Advertising strategy = follow OPT.
- Split nodes into those incurring low cost vs those incurring high cost under OPT.
- Show that the low-cost nodes will switch to their behavior in OPT. For high-cost nodes, don't care.
- Cost only improves in the final best-response process.

Party affiliation games (PoS = 1, PoA = Θ(n²)), upper bound for general party affiliation games:
- Threshold behavior: for α > ½, can get ratio O(1), but for α < ½, ratio stays Ω(n²) (assume degrees ω(log n)).
- S is β-dominating if every vertex not in S has more than a ½ + β fraction of its neighbors in S.
- If α > ½ + 2β, then the set R of receptive players is β-dominating whp.
- Split nodes into those incurring low cost (less than a β-fraction of incident edges incur a cost in OPT) vs those incurring high cost under OPT.
- Low-cost nodes will switch to their behavior in OPT. High-cost nodes can incur a cost of at most 1/β times their cost in OPT.

Summary: Analyze the ability of a central authority to guide behavior to a good equilibrium even if only an α fraction of the players are paying attention.

Influencing Dynamics: a more adaptive model [Balcan-Blum-Mansour, ICS 2010]. Each player has a few abstract actions (Expert 1, Expert 2), such as "play best response" and "play the advertised behavior", and uses a learning, experts-based algorithm to decide which one to use. [No rigid separation between receptive vs non-receptive players.]
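To give a feel for what an experts-based algorithm choosing between two abstract actions might look like, here is a generic multiplicative-weights (Hedge) sketch. It is not taken from the cited paper: the learning rate, the loss values, and the labeling of the two experts are all hypothetical.

```python
import math

def hedge(loss_rounds, eta=0.5):
    """Multiplicative weights over two experts.

    `loss_rounds` is a sequence of (loss_expert_1, loss_expert_2) pairs
    with losses in [0, 1]. Returns the probability placed on each expert
    at the start of every round.
    """
    weights = [1.0, 1.0]
    history = []
    for losses in loss_rounds:
        total = sum(weights)
        history.append([w / total for w in weights])
        # Penalize each expert in proportion to the loss it just suffered.
        weights = [w * math.exp(-eta * l) for w, l in zip(weights, losses)]
    return history

# Hypothetical losses: expert 1 = "play best response" costs 1.0 per round,
# expert 2 = "play the advertised behavior" costs 0.2 per round.
rounds = [(1.0, 0.2)] * 20
probabilities = hedge(rounds)
```

The player starts indifferent between the two experts and shifts almost all probability to the one with lower loss. In such a model, whether a player ends up following the advertised behavior depends on how well that behavior performs, rather than on a fixed receptive or non-receptive label.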

Open Questions Get around problem of natural dynamics converging to poor equilibrium without central authority by giving players more information about the game?