Best Reply Mechanisms Justin Thaler and Victor Shnayder.

What are best-reply dynamics? Start with an arbitrary strategy profile. In each step, let some player switch his strategy to a best reply to the current strategies of the others.

What are best-reply dynamics? Definition: A repeated-reply mechanism for a private-information game G is an extensive-form game with perfect recall (same players) and at most M steps. In each step, a single player i announces an element of $A_i$. Players play in round-robin order. Stop when all players “pass” in n consecutive steps, and enforce the action profile of the most recently announced actions. If M steps go by without stopping, penalize the players.

What are best-reply dynamics? A penalty is needed to ensure that non-convergence is not in the best interest of any player. This is a realistic modeling assumption for BGP, TCP, etc. Best-reply dynamics is the strategy profile of a repeated-reply mechanism in which each player i updates to i’s best reply to the other players’ strategies each time it is i’s turn.
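The definition above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' formalization: the names `best_reply_dynamics`, `table_utility`, `PRISONERS` and `PENNIES` are hypothetical, the two payoff tables are standard textbook games rather than examples from the slides, and the penalty for non-convergence is modeled simply as returning `None`.

```python
from typing import Callable, Dict, List, Optional, Tuple

Profile = Tuple[int, ...]
Utility = Callable[[int, Profile], float]


def table_utility(table: Dict[Profile, Tuple[float, ...]]) -> Utility:
    """Build a utility function from a payoff table (hypothetical helper)."""
    return lambda player, profile: table[profile][player]


def best_reply(player: int, profile: Profile,
               actions: List[List[int]], utility: Utility) -> Profile:
    """Profile in which `player` switches to a best reply to the others."""
    def swapped(action: int) -> Profile:
        return profile[:player] + (action,) + profile[player + 1:]
    return max((swapped(a) for a in actions[player]),
               key=lambda p: utility(player, p))


def best_reply_dynamics(start: Profile, actions: List[List[int]],
                        utility: Utility, max_steps: int) -> Optional[Profile]:
    """Round-robin best replies; stop after n consecutive passes.

    Returns the enforced profile, or None if `max_steps` (the M of the
    definition) elapse first, which stands in for the penalty.
    """
    profile, n, passes = tuple(start), len(start), 0
    for step in range(max_steps):
        player = step % n
        candidate = best_reply(player, profile, actions, utility)
        if utility(player, candidate) > utility(player, profile):
            profile, passes = candidate, 0   # strict improvement: announce it
        else:
            passes += 1                      # current action is already a best reply
            if passes == n:
                return profile
    return None


ACTIONS = [[0, 1], [0, 1]]
# Prisoner's dilemma (action 1 = defect) and matching pennies: assumed examples.
PRISONERS = {(0, 0): (3, 3), (0, 1): (0, 4), (1, 0): (4, 0), (1, 1): (1, 1)}
PENNIES = {(0, 0): (1, -1), (0, 1): (-1, 1), (1, 0): (-1, 1), (1, 1): (1, -1)}

print(best_reply_dynamics((0, 0), ACTIONS, table_utility(PRISONERS), 20))
print(best_reply_dynamics((0, 0), ACTIONS, table_utility(PENNIES), 20))
```

In the first game the dynamics settle on mutual defection; in the second they cycle until the step budget runs out, which is the situation the penalty is meant to rule out.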

Why best-reply dynamics? If convergence occurs, we have a highly justifiable Nash equilibrium. Computationally simple. Players only need private information. Feasible in distributed, asynchronous settings. Prescribed by existing protocols (e.g., BGP).

Why best-reply dynamics? In light of Theorems 1 and 2 (which we’ll see soon): Often gives a non-VCG way of creating incentive-compatible mechanisms (?), and sometimes without $$$. Often get collusion-proofness and Pareto efficiency.

Outline: When do best-reply dynamics work? Universal max-solvability (UMS). Theorem: UMS implies convergence to a unique NE and collusion-proofness. Example applications (correlated markets, BGP, etc.). Connections to strategy-proofness. Discussion.

Universal max-dominance. A subset T of S is universally max-dominated if: [formula not preserved in the transcript]. Very strong condition! Existence of a max-dominated set is strictly stronger than existence of a dominated strategy, which only requires that there exist $s_i, s_i'$ such that $u_i(s_i, s_{-i}) < u_i(s_i', s_{-i})$ for all $s_{-i}$.

Universal max-solvability (UMS). A game G is universally max-solvable if we can iteratively remove universally max-dominated strategy sets and get to a single strategy for each player. This is a stronger condition than being solvable by iterated removal of strictly dominated strategies (IRSDS).

Example 1

5, 5 | 0, 0
10, 0 | 4, 4

Solvable by IRSDS, but not UMS. Neither player has a universally max-dominated set. Note that the unique NE is not PE, and best-reply dynamics are not incentive compatible for the row player.
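The IRSDS claim for Example 1 can be checked mechanically. The sketch below assumes the flattened payoff cells read as the 2×2 matrix with rows (5, 5), (0, 0) and (10, 0), (4, 4), with the row player's payoff listed first; `irsds` and `strictly_dominated` are illustrative names, and only the strict-dominance half of the slide's claim is tested (universal max-dominance is not implemented here).

```python
from typing import Dict, List, Tuple

Payoffs = Dict[Tuple[int, int], Tuple[int, int]]

# Assumed reading of Example 1: (row action, column action) -> (row payoff, column payoff).
EXAMPLE_1: Payoffs = {
    (0, 0): (5, 5), (0, 1): (0, 0),
    (1, 0): (10, 0), (1, 1): (4, 4),
}


def strictly_dominated(player: int, action: int, rows: List[int],
                       cols: List[int], payoffs: Payoffs) -> bool:
    """True if some other surviving action beats `action` against every opponent action."""
    own, other = (rows, cols) if player == 0 else (cols, rows)

    def u(a: int, b: int) -> int:
        cell = (a, b) if player == 0 else (b, a)
        return payoffs[cell][player]

    return any(all(u(alt, b) > u(action, b) for b in other)
               for alt in own if alt != action)


def irsds(payoffs: Payoffs, rows: List[int],
          cols: List[int]) -> Tuple[List[int], List[int]]:
    """Iterated removal of strictly dominated strategies for a two-player game."""
    rows, cols = list(rows), list(cols)
    changed = True
    while changed:
        changed = False
        for player, own in ((0, rows), (1, cols)):
            for action in list(own):
                if len(own) > 1 and strictly_dominated(player, action, rows, cols, payoffs):
                    own.remove(action)
                    changed = True
    return rows, cols


surviving_rows, surviving_cols = irsds(EXAMPLE_1, [0, 1], [0, 1])
print(surviving_rows, surviving_cols, EXAMPLE_1[(surviving_rows[0], surviving_cols[0])])
```

Under this reading the bottom row strictly dominates the top row, after which the right column dominates the left, leaving the profile with payoffs (4, 4). That outcome is Pareto-dominated by (5, 5), matching the remark that the unique NE is not PE.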

Example 2 (UMS) 0, 1 | 1, 1 | 1, 0

Example 3 (UMS) 1, 9 | 2, 9 | 3, 1 | 3, 2 | 3, 1 | 4, 3 | 5, 4 (column labels L, M, R; row labels A, C, B)

Theorems. Theorem 1: G is UMS ⇒ G has a unique pure NE, and it is collusion-proof. Corollary: Collusion-proof NE ⇒ the NE is Pareto optimal. Note that being solvable by IRSDS suffices for a unique pure NE; UMS is needed for collusion-proofness and PE.

Proof of Theorem 1: By contradiction. G is UMS, so fix an elimination sequence of dominated strategy sets, and let $s^*$ be the final strategy profile. If $s^*$ is not a collusion-proof NE, some set of players T can deviate and be better off. Let $s$ be the new strategy profile, in which the players in T change strategy from $s^*$. Let $s_i$ be the first strategy in $s$ to be eliminated. Then it was max-dominated, so $s_i^*$ is strictly better, so i can’t be better off.

Theorems. Theorem 2: If G is UMS with private information, then best-reply dynamics are incentive-compatible in ex-post NE and converge to the unique NE of the induced full-information game. Proof: Similar to Theorem 1. The main idea is that a strategy eliminated in the $t$-th step of the UMS elimination process can never be used after the $(nt)$-th step of the best-reply mechanism.

Correlated two-sided markets. Agents: buyers and sellers. Game: weighted bipartite graph -- buyers on one side, sellers on the other. Buyers have a preference order over sellers (higher edge weight = higher preference). Sellers prefer buyers connected by heavier edges.

Correlated two-sided markets are UMS. Let e be the maximum-weight edge. Choosing it universally max-dominates all other strategies of both endpoints. Remove the two endpoints of e and all incident edges, and repeat. Therefore, best-reply dynamics converge to an ex-post NE.
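The elimination argument above is effectively a greedy procedure, sketched here under the assumption that all edge weights are distinct (ties are not addressed on the slide). The function name `greedy_market_outcome` and the sample weights are hypothetical.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str]  # (buyer, seller)


def greedy_market_outcome(weights: Dict[Edge, float]) -> List[Edge]:
    """Repeatedly pick the heaviest remaining edge and drop its endpoints.

    Mirrors the elimination order in the UMS argument: the heaviest edge is
    the top choice of both of its endpoints, so it is fixed first.
    Assumes distinct weights.
    """
    matched_buyers, matched_sellers = set(), set()
    outcome: List[Edge] = []
    for (buyer, seller), _ in sorted(weights.items(),
                                     key=lambda item: item[1], reverse=True):
        # Skip edges incident to an endpoint that was already removed.
        if buyer in matched_buyers or seller in matched_sellers:
            continue
        outcome.append((buyer, seller))
        matched_buyers.add(buyer)
        matched_sellers.add(seller)
    return outcome


# Illustrative weights only.
WEIGHTS = {("b1", "s1"): 9, ("b1", "s2"): 7, ("b2", "s1"): 8, ("b2", "s2"): 3}
print(greedy_market_outcome(WEIGHTS))
```

With these weights, edge (b1, s1) is fixed first, which removes b2's preferred seller and leaves b2 matched to s2.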

Extended Example: BGP

Internet routing: BGP. Receive update messages from neighbors announcing routes to d. Choose a single neighbor, whose route you prefer most, to send traffic through. Announce your new route to all your neighbors. [Figure: nodes 1 and 2 with destination d; node 1 ranks route 12d above 1d, and node 2 ranks 21d above 2d.]

Internet routing: BGP. BGP is asynchronous and distributed. It prescribes best-reply dynamics. But does BGP converge? And is BGP “incentive compatible”? Do ASes have an incentive to deviate from the protocol?

Does BGP Converge? We can break this into two questions: Does a stable solution even exist in the static game? If so, will BGP find such a solution? But we only need one answer.

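Does a Stable Solution Exist? [Figure: nodes 1, 2, 3 with destination d; node 1 ranks route 13d above 1d, node 2 ranks 21d above 2d, and node 3 ranks 32d above 3d.] In this example, no stable solution exists! It is actually NP-complete to determine existence in general networks.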
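The non-existence claim for the three-node example can be checked by brute force. The sketch assumes the figure labels mean that node 1 ranks 13d above 1d, node 2 ranks 21d above 2d, and node 3 ranks 32d above 3d, that every node has a direct link to d, and that all other paths are not permitted. The names `BAD_GADGET`, `best_available` and `stable_states` are illustrative.

```python
from itertools import product
from typing import Dict, List, Optional, Tuple

Path = Tuple[str, ...]
Ranking = Dict[str, List[Path]]

# Permitted paths per node, most preferred first (assumed reading of the figure).
BAD_GADGET: Ranking = {
    "1": [("1", "3", "d"), ("1", "d")],
    "2": [("2", "1", "d"), ("2", "d")],
    "3": [("3", "2", "d"), ("3", "d")],
}


def best_available(node: str, ranked: Ranking,
                   state: Dict[str, Optional[Path]]) -> Optional[Path]:
    """Most preferred permitted path consistent with the next hop's current path."""
    for path in ranked[node]:
        next_hop = path[1]
        # A direct path is always usable; otherwise the next hop must
        # currently be using exactly the remainder of the path.
        if next_hop == "d" or state.get(next_hop) == path[1:]:
            return path
    return None


def stable_states(ranked: Ranking) -> List[Dict[str, Optional[Path]]]:
    """Enumerate all path assignments in which every node is best-replying."""
    nodes = sorted(ranked)
    options = [ranked[n] + [None] for n in nodes]
    found = []
    for choice in product(*options):
        state = dict(zip(nodes, choice))
        if all(best_available(n, ranked, state) == state[n] for n in nodes):
            found.append(state)
    return found


print(len(stable_states(BAD_GADGET)))
```

Each node prefers to route through a neighbor that is itself routing directly, so whichever node routes directly invites its predecessor to switch, and the cycle never closes consistently.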

Does BGP Converge When A Stable Solution Exists? [Figure: nodes 1 and 2 with destination d; node 1 ranks route 12d above 1d, and node 2 ranks 21d above 2d.] Notice that multiple NE exist, and asynchronous best-reply dynamics do not necessarily converge. So the game must not be UMS.
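A small simulation illustrates both remarks for the two-node example, assuming node 1 ranks 12d above 1d and node 2 ranks 21d above 2d, with direct links to d always available. The activation schedules are hypothetical: one updates both nodes at once from stale information, the other updates them one at a time. `run_schedule` and `best_available` are illustrative names.

```python
from typing import Dict, List, Optional, Sequence, Set, Tuple

Path = Tuple[str, ...]
State = Dict[str, Optional[Path]]

# Permitted paths per node, most preferred first (assumed reading of the figure).
DISAGREE: Dict[str, List[Path]] = {
    "1": [("1", "2", "d"), ("1", "d")],
    "2": [("2", "1", "d"), ("2", "d")],
}


def best_available(node: str, ranked: Dict[str, List[Path]],
                   state: State) -> Optional[Path]:
    """Most preferred permitted path consistent with the next hop's current path."""
    for path in ranked[node]:
        next_hop = path[1]
        if next_hop == "d" or state.get(next_hop) == path[1:]:
            return path
    return None


def run_schedule(ranked: Dict[str, List[Path]], start: State,
                 schedule: Sequence[Set[str]]) -> List[State]:
    """Apply best replies; nodes activated in the same step all see the old state."""
    state = dict(start)
    history = [dict(state)]
    for active in schedule:
        updates = {n: best_available(n, ranked, state) for n in active}
        state.update(updates)
        history.append(dict(state))
    return history


START: State = {"1": ("1", "d"), "2": ("2", "d")}

simultaneous = run_schedule(DISAGREE, START, [{"1", "2"}] * 4)
one_at_a_time = run_schedule(DISAGREE, START, [{"1"}, {"2"}] * 2)

print([s["1"] for s in simultaneous])
print(one_at_a_time[-1])
```

Under the simultaneous schedule node 1 flips between 1d and 12d forever (node 2 does the same with 2d and 21d), while the one-at-a-time schedule settles on 12d for node 1 and 2d for node 2, one of the two stable states.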

So What Do We Do? Approach #1: Use mechanism design to achieve IC convergence, but solution must be distributed. Approach #2: Identify conditions (on network topology and/or AS preferences) under which BGP converges and is IC. Both approaches are canonical problems in Distributed Algorithmic Mechanism Design.

Approach #2 for Convergence. Griffin et al. (1999): If BGP fails to converge, then there exists a Dispute Wheel. Each $u_i$ would rather route clockwise through $u_{i+1}$ than along $Q_i$. Image source: Levin et al., “Interdomain Routing and Games,” 2008.

Approach #2 for Convergence Gao and Rexford (2001): Identified reasonable conditions based on economic structure of the Internet that guarantee No Dispute Wheel and hence convergence. (No bounds on convergence rate given). But limited progress made until recently on conditions for guaranteeing that BGP is IC.

Approach #2 for Incentive Compatibility. Theorem 3: Assuming that non-convergence after $n^3$ rounds is penalized and that No Dispute Wheel holds, routing games are UMS. Corollary: Under the above conditions, best-reply strategies are IC in collusion-proof ex-post NE. Corollary: Under the Gao-Rexford conditions, BGP converges in $O(n^3)$ time and is IC.

Theorem 3, proof sketch: The case of finding the first universally max-dominated action set is general. Find a node $a_1$ with at least 2 actions. Let R be $a_1$’s most preferred existing route. One of two cases must occur:

Theorem 3. 1. Every node $a_2$ on R prefers the suffix of R leading from $a_2$ to d. In this case, if u is the closest node to d on R with at least two actions, then (u, d) universally max-dominates all other actions of u, and we’re done. 2. Some node $a_2$ on R prefers some other path over the suffix of R leading from $a_2$ to d. In this case, we repeat the analysis at $a_2$. Eventually we either form a dispute wheel or find ourselves in Case 1.

What’s left in Routing? Complete characterization of BGP convergence (No Dispute Wheel sufficient, not necessary). Conditions for convergence to globally optimal solution. Can it even be efficiently found? Do mechanism design and/or $$$ have a role to play? Changes in network topology?

Other applications. Congestion control (criticism: best-reply dynamics are only somewhat descriptive of how TCP works in practice). Cost-sharing games. Matching games (stable-roommate, intern assignment). Auctions (unit-demand bidders, GSP): relies a lot on VCG results; the main contribution is the proof of convergence! (The opposite of BGP.)

Relationship to DSIC. [Diagram: θ → play s(θ) → ex-post NE → outcome.] Given a UMS game, best-replying is a strategy that gives an ex-post NE. Get a direct-revelation, dominant-strategy IC mechanism. Good: New way to create DSIC mechanisms. Bad: Impossibility results limit the class of problems amenable to this approach (at least without money or limits on preferences).

Discussion What is the main contribution? 1. Sufficient conditions for IC convergence of best-reply dynamics. General enough to encompass many applications, esp. BGP. 2. Bounds on time to convergence. 3. New framework for developing IC mechanisms?

Next Steps 1. Necessary conditions for best-reply dynamics to converge? To be IC (under what definition?)? 2. Better-reply dynamics? Other types of dynamics aka algorithms? What types of dynamics are reasonable or “natural”?

Economists and Complexity. See the recent blog post by Noam Nisan: Does complexity of equilibria matter? Kamal Jain: “If your laptop can’t find it then neither can the market.” Jeff Ely: “Solving the n-body problem is beyond the capabilities of the world’s smartest mathematicians. How do those rocks-for-brains planets manage to pull it off?”