1 Techniques for Computing Game-Theoretic Solutions Vincent Conitzer Duke University Parts of this talk are joint work with Tuomas Sandholm

2 Introduction
Increasingly, computer science is confronted with settings where multiple, self-interested parties interact:
– network routing
– job scheduling
– electronic commerce
– e-government
– …
Parties can be humans or computers/software agents.
[Figures on the slide: a routing network with nodes A, B, C, D; jobs 1–3 scheduled on machines 1 and 2; auction valuations v(·) = $500 and v(·) = $700; a preference ranking.]

3 Where do we face difficult computational problems?
– Running the mechanism under which the agents interact
– Computing how the agents should act under the mechanism (strategically) ← this talk
– Computing which mechanisms give the best outcomes (under strategic behavior)

4 Game theory

5 Rock-paper-scissors

              Rock     Paper    Scissors
  Rock        0, 0     -1, 1    1, -1
  Paper       1, -1    0, 0     -1, 1
  Scissors    -1, 1    1, -1    0, 0

6 The presentation game

                                       Pay attention (A)   Do not pay attention (NA)
  Put effort into presentation (E)     4, 4                -16, -14
  Do not put effort in (NE)            0, -2               0, 0

7 “Should I buy an SUV?” (aka. Prisoner’s Dilemma)

               Buy SUV     Don’t buy
  Buy SUV      -10, -10    -7, -11
  Don’t buy    -11, -7     -8, -8

Buying the SUV strictly dominates not buying it, for each driver (the original slide annotates the payoffs with the underlying costs: 2, 3, 5, 8).

8 Nash equilibrium
Nash equilibrium: a strategy for each player so that no player has an incentive to change strategies.
– The presentation game (slide 6) has two Nash equilibria: (E, A) with payoffs 4, 4 and (NE, NA) with payoffs 0, 0.
– Rock-paper-scissors (slide 5) has no pure-strategy Nash equilibria! Really? Can a game have no Nash equilibria?

9 Nash equilibrium…
Mixed strategy: a probability distribution over (pure) strategies.
– The presentation game has a third Nash equilibrium: the row player puts effort in with probability 1/10 (shirks with probability 9/10), and the column player pays attention with probability 4/5 (does not with probability 1/5).
– Rock-paper-scissors has exactly one Nash equilibrium: each player plays each strategy with probability 1/3.
At least one Nash equilibrium always exists (in finite games) [Nash 50].
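The mixed probabilities above come from indifference conditions: each player mixes exactly so that the opponent is indifferent between her pure strategies. A minimal sketch for fully mixed 2×2 equilibria (the function name and matrix encoding are my own, not from the talk):

```python
def mixed_2x2(A, B):
    """Fully mixed Nash equilibrium of a 2x2 game via indifference.

    A[i][j], B[i][j]: row/column player payoffs for row i, column j.
    Returns (p, q): the probability the row player puts on row 0 and the
    column player puts on column 0. Assumes a fully mixed equilibrium
    exists (denominators nonzero)."""
    # Row mixes with p so the column player is indifferent between columns.
    p = (B[1][1] - B[1][0]) / (B[0][0] - B[1][0] - B[0][1] + B[1][1])
    # Column mixes with q so the row player is indifferent between rows.
    q = (A[1][1] - A[0][1]) / (A[0][0] - A[0][1] - A[1][0] + A[1][1])
    return p, q

# The presentation game: rows E/NE, columns A/NA.
A = [[4, -16], [0, 0]]   # row player's payoffs
B = [[4, -14], [-2, 0]]  # column player's payoffs
print(mixed_2x2(A, B))   # effort w.p. 1/10, attention w.p. 4/5
```

This reproduces the equilibrium on the slide: effort with probability 1/10 and attention with probability 4/5.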

10 How do we compute solutions?
– Computing dominance is easy (and many, though not all, variants are as well [C. & Sandholm EC05]).
– Computing Nash equilibria is harder…
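As an illustration of why dominance is easy: whether a pure strategy is strictly dominated by some mixed strategy over the other strategies is a small linear program. A sketch (my own code, with scipy's linprog standing in for any LP solver):

```python
import numpy as np
from scipy.optimize import linprog

def strictly_dominated(A, i0):
    """Check via LP whether row strategy i0 of payoff matrix A is strictly
    dominated by some mixed strategy over the remaining rows.

    Maximize eps subject to: sum_i x_i A[i, j] >= A[i0, j] + eps for all
    columns j, with x a probability distribution over the other rows.
    i0 is dominated iff the optimal eps is strictly positive."""
    m, n = A.shape
    others = [i for i in range(m) if i != i0]
    c = np.zeros(m)
    c[-1] = -1.0                                   # minimize -eps
    # Rewrite each constraint as: -sum_i x_i A[i, j] + eps <= -A[i0, j]
    A_ub = np.hstack([-A[others, :].T, np.ones((n, 1))])
    b_ub = -A[i0, :]
    A_eq = np.zeros((1, m))
    A_eq[0, :-1] = 1.0                             # x sums to 1
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                  bounds=[(0, None)] * (m - 1) + [(None, None)])
    return bool(res.success and -res.fun > 1e-9)

# SUV game, row player's payoffs: buying (row 0) dominates not buying (row 1).
A = np.array([[-10., -7.], [-11., -8.]])
print(strictly_dominated(A, 1), strictly_dominated(A, 0))
```

On the SUV game this reports row 1 (not buying) as dominated and row 0 as undominated.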

11 A useful reduction (SAT → game) [C. & Sandholm IJCAI03/extended draft]
Formula: (x_1 or -x_2) and (-x_1 or x_2)
Solutions: x_1 = true, x_2 = true; x_1 = false, x_2 = false
Game: each player’s strategies are the variables (x_1, x_2), the literals (+x_1, -x_1, +x_2, -x_2), the clauses ((x_1 or -x_2), (-x_1 or x_2)), and a default strategy. [The full payoff matrix from the slide, with entries -2, 0, 1, 2, and ε, is not reproduced here.]
– Every satisfying assignment (if there are any) corresponds to an equilibrium with utilities 1, 1.
– Exactly one additional equilibrium, with utilities ε, ε, always exists.

12 What about just computing one (any) Nash equilibrium?
Complexity was completely open for a long time
– [Papadimitriou STOC01]: “together with factoring […] the most important concrete open question on the boundary of P today”
A recent sequence of papers shows that computing one (any) Nash equilibrium is PPAD-complete [Daskalakis, Goldberg, Papadimitriou 05; Chen, Deng 05]
– Just as hard in symmetric games [C. 03/Tardos 03]
All known algorithms require exponential time (in the worst case)

13 Search-based approaches
Suppose we know the support X_i of each player i’s mixed strategy in equilibrium. Then we have a simple linear feasibility problem:
– for both i, for any s_i ∈ X_i: Σ_{s_-i} p_-i(s_-i) u_i(s_i, s_-i) = u_i
– for both i, for any s_i ∈ S_i − X_i: Σ_{s_-i} p_-i(s_-i) u_i(s_i, s_-i) ≤ u_i
Thus, we can search over supports
– This is the basic idea underlying the methods in [Dickhaut & Kaplan 91; Porter, Nudelman, Shoham AAAI04; Sandholm, Gilpin, C. AAAI05]
Dominated strategies can be eliminated
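The feasibility problem above can be solved support by support; for nondegenerate two-player games the indifference conditions become square linear systems. A sketch of this idea (my own implementation, not the papers' code; it assumes equal-size supports, which suffices for nondegenerate games):

```python
import itertools
import numpy as np

def support_enumeration(A, B, tol=1e-9):
    """Find one Nash equilibrium of the bimatrix game (A, B) by searching
    over candidate supports, in the spirit of Dickhaut-Kaplan / Porter et al."""
    m, n = A.shape
    rhs = None
    for k in range(1, min(m, n) + 1):
        rhs = np.zeros(k + 1)
        rhs[k] = 1.0
        for I in itertools.combinations(range(m), k):
            for J in itertools.combinations(range(n), k):
                # Column mix y on J makes the row player indifferent on I:
                #   sum_j A[i, j] y_j - u_r = 0 for i in I, sum_j y_j = 1.
                M = np.zeros((k + 1, k + 1))
                M[:k, :k] = A[np.ix_(I, J)]
                M[:k, k] = -1.0
                M[k, :k] = 1.0
                # Row mix x on I makes the column player indifferent on J.
                M2 = np.zeros((k + 1, k + 1))
                M2[:k, :k] = B[np.ix_(I, J)].T
                M2[:k, k] = -1.0
                M2[k, :k] = 1.0
                try:
                    sol_y = np.linalg.solve(M, rhs)
                    sol_x = np.linalg.solve(M2, rhs)
                except np.linalg.LinAlgError:
                    continue
                y_J, u_r = sol_y[:k], sol_y[k]
                x_I, u_c = sol_x[:k], sol_x[k]
                if (y_J < -tol).any() or (x_I < -tol).any():
                    continue
                x = np.zeros(m)
                x[list(I)] = x_I
                y = np.zeros(n)
                y[list(J)] = y_J
                # No strategy outside the support may do strictly better.
                if (A @ y > u_r + tol).any() or (x @ B > u_c + tol).any():
                    continue
                return x, y
    return None

# Rock-paper-scissors (zero-sum): the unique equilibrium is uniform.
A = np.array([[0., -1., 1.], [1., 0., -1.], [-1., 1., 0.]])
x, y = support_enumeration(A, -A)
print(x, y)  # each strategy played with probability 1/3
```

All size-1 and size-2 supports fail the deviation or solvability checks, so the search reaches the full support and returns the 1/3–1/3–1/3 equilibrium from slide 9.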

14 A class of hard games [Sandholm, Gilpin, C. AAAI05]
[The slide shows a 6×6 payoff matrix from the paper, annotated with equilibrium probabilities (1/3 on some strategies, 0 on the rest).]

15 Eliminability concepts
Dominance: strategy always does worse than some other (mixed) strategy
– strong argument; local reasoning; easy to compute; often does not apply
Nash equilibrium: strategy does not appear in the support of any Nash equilibrium
– weaker argument; global reasoning; hard to compute; applies more often
Is there something “in between” that combines the good aspects of both? Yes! [C. & Sandholm AAAI05]
[Two small example matrices on the slide illustrate the two concepts.]

16 Definition as game between attacker and defender
Example on the slide: a 4×4 game with row strategies s_r1–s_r4 and column strategies s_c1–s_c4, with e_r* = s_r3, E_r = {s_r3, s_r4}, E_c = {s_c3, s_c4}.
Stage 1: Defender specifies probabilities on the E strategies (e_r* must get probability > 0)
Stage 2: Attacker chooses one of the E strategies with positive probability to attack, and chooses a (possibly mixed) attacking strategy
Stage 3: Defender chooses on which (non-E) strategy to place the remainder of the probability
– If the attacking strategy outperforms the attacked one, the attacker wins

17 A spectrum of elimination power
The larger the E_i sets, the more strategies are eliminable:
– If the E_i sets include all strategies, then a strategy is eliminable if and only if no Nash equilibrium places positive probability on it.
– If the E_i sets are empty (with the exception of e_r*), then e_r* is eliminable if and only if it is dominated.
dominance ←— larger E_i sets —→ Nash equilibrium

18 Alternative definition
(Same example as before: e_r* = s_r3, E_r = {s_r3, s_r4}, E_c = {s_c3, s_c4}.)
Stage 1: Defender specifies probabilities on the E sets (e_r* must get > 0)
Stage 2: Attacker chooses one of the E strategies with positive probability to attack
Stage 3: Defender distributes the remainder of the probability (on strategies not in E)
Stage 4: Attacker chooses the attacking strategy
– If the attacking strategy outperforms the attacked one, the attacker wins

19 Equivalence
Theorem. The alternative definition is equivalent to the original one.
Proof based on duality (more specifically, the Minimax Theorem [von Neumann 1927]).
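The Minimax Theorem invoked here is itself an LP-duality statement: in a zero-sum game, the row player's maximin value equals the column player's minimax value. A quick numerical check of that equality (a sketch, with scipy's linprog as the solver):

```python
import numpy as np
from scipy.optimize import linprog

def game_value(A):
    """Maximin value of the zero-sum game with row-maximizer payoffs A:
    maximize v subject to x^T A[:, j] >= v for every column j,
    where x is a probability distribution over the rows."""
    m, n = A.shape
    c = np.zeros(m + 1)
    c[-1] = -1.0                                    # minimize -v
    A_ub = np.hstack([-A.T, np.ones((n, 1))])       # v - x^T A[:, j] <= 0
    b_ub = np.zeros(n)
    A_eq = np.zeros((1, m + 1))
    A_eq[0, :m] = 1.0                               # x sums to 1
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                  bounds=[(0, None)] * m + [(None, None)])
    return -res.fun

# Rock-paper-scissors: the value is 0, and by duality the column player's
# minimax (the value of -A^T viewed from her side) agrees with it.
A = np.array([[0., -1., 1.], [1., 0., -1.], [-1., 1., 0.]])
print(game_value(A), -game_value(-A.T))
```

The two printed values coincide, as the Minimax Theorem guarantees.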

20 Mixed integer programming approach (using alternative definition)
Continuous variables: p_i(e_i), p_i^{e_-i}(s_i); binary variables: b_i(e_i)
maximize p_r(e_r*)
subject to
– for both i, for any e_i ∈ E_i: Σ_{e_-i} p_-i(e_-i) + Σ_{s_-i} p_-i^{e_i}(s_-i) = 1
– for both i, for any e_i ∈ E_i: p_i(e_i) ≤ b_i(e_i)
– for both i, for any e_i ∈ E_i and any d_i ∈ S_i: Σ_{e_-i} p_-i(e_-i)(u_i(e_i, e_-i) − u_i(d_i, e_-i)) + Σ_{s_-i} p_-i^{e_i}(s_-i)(u_i(e_i, s_-i) − u_i(d_i, s_-i)) ≥ (b_i(e_i) − 1) U_i
where U_i is the maximum difference between two of player i’s utilities.
Number of binary variables = |E_r| + |E_c|
– Exponential only in this!

21 Eliminating strategies in the hard game
[The same 6×6 game as on slide 14, with the chosen sets E_r and E_c marked on the strategies.]

22 Another preprocessing technique for computing a Nash equilibrium [C. & Sandholm AAMAS06]
[Schematic matrices on the slide: the original game contains a subgame G; the remaining payoffs have the special structure (a_j, d_ij and c_ij, b_i) described on the next slide, and G is collapsed into a reduced game with payoffs π_r, π_c; a_j, Σ_i p_G(s_i) d_ij; and Σ_j p_G(t_j) c_ij, b_i.]

23 Required structure on original game O
[The game has row strategies s_1, …, s_m and u_1, …, u_k, and column strategies t_1, …, t_n and v_1, …, v_l. The rows s_i and columns t_j form the subgame G; the remaining strategies form H. Against u_i and t_j the payoffs are c_ij, b_i; against s_i and v_j they are a_j, d_ij.]
That is:
– against any fixed v_j, all the s_i give the row player the same utility a_j
– against any fixed u_i, all the t_j give the column player the same utility b_i

24 Solve for equilibrium of G (recursively)
Obtain:
– the equilibrium distributions p_G(s_i), p_G(t_j)
– the players’ expected payoffs in equilibrium, π_r, π_c

25 Reduced game R
Replace all of G by a single row strategy s and a single column strategy t:
– cell (s, t) gives the expected payoffs π_r, π_c when both players play the equilibrium of G
– cell (s, v_j) gives the expected payoffs a_j, Σ_i p_G(s_i) d_ij when the row player plays the equilibrium of G and the column player plays v_j (and symmetrically, cell (u_i, t) gives Σ_j p_G(t_j) c_ij, b_i)
Theorem. p_R(u_i), p_R(s)·p_G(s_i); p_R(v_j), p_R(t)·p_G(t_j) constitutes a Nash equilibrium of the original game.

26 Example
Original game (rows u_1, s_1, s_2; columns v_1, t_1, t_2):

          v_1     t_1     t_2
  u_1     2, 2    0, 3    2, 3
  s_1     1, 2    4, 0    0, 4
  s_2     1, 4    0, 4    4, 0

The subgame G on {s_1, s_2} × {t_1, t_2} has its equilibrium at probability 0.5 on each strategy, with payoffs π_r = π_c = 2. Reduced game:

          v_1     t
  u_1     2, 2    1, 3
  s       1, 3    2, 2

Its equilibrium again puts probability 0.5 on each strategy, so in the original game u_1 and v_1 get probability 0.5 and s_1, s_2, t_1, t_2 get probability 0.25 each.

27 A more difficult example = the game that we solved before!
The same game with the strategies relabeled (s_1 = a_1, u_1 = a_2, s_2 = a_3; t_1 = b_1, v_1 = b_2, t_2 = b_3):

          b_1     b_2     b_3
  a_1     4, 0    1, 2    0, 4
  a_2     0, 3    2, 2    2, 3
  a_3     0, 4    1, 4    4, 0

But how (in general) do we find the correct labeling of the strategies as u_i, s_i, v_j, t_j? Can it be done in polynomial time?

28 Let’s try to use satisfiability
(For the relabeled game above.)
Say that v(σ) = true if we label σ as one of the s_i or t_j (that is, we put it “in” G).
If a_1 and a_2 are both in G, then b_1 must also be in G, because a_1 and a_2 get different payoffs against b_1.
Equivalently, v(a_1) and v(a_2) ⇒ v(b_1)
– or, as a clause: (-v(a_1) or -v(a_2) or v(b_1))
Theorem: satisfaction of all such clauses ⇔ the condition on slide 23 is satisfied.

29 Clauses for the example
v(a_1) and v(a_2) ⇒ v(b_1) and v(b_2) and v(b_3)
v(a_1) and v(a_3) ⇒ v(b_1) and v(b_3)
v(a_2) and v(a_3) ⇒ v(b_2) and v(b_3)
v(b_1) and v(b_2) ⇒ v(a_1) and v(a_2)
v(b_1) and v(b_3) ⇒ v(a_1) and v(a_3)
v(b_2) and v(b_3) ⇒ v(a_1) and v(a_2) and v(a_3)
Complete characterization of solutions:
– Set at most one variable to true for each player (does not reduce the game)
– Set all variables to true (G = the whole game!)
– Only nontrivial solution: set v(a_1), v(a_3), v(b_1), v(b_3) to true

30 Simple algorithm
Algorithm to find a nontrivial solution:
– Start with any two variables for the same agent set to true
– Follow the implications
– If all variables end up set to true, start over with the next pair of variables
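A sketch of this procedure (my own encoding: payoff matrices A and B for the row and column player, strategy indices instead of names; the clauses are applied implicitly by propagation to a fixpoint):

```python
import itertools

def find_decomposition(A, B):
    """Find a nontrivial subgame G for the preprocessing technique:
    seed with a pair of same-player strategies, propagate the clause
    implications to a fixpoint, and reject fixpoints covering everything."""
    m, n = len(A), len(A[0])

    def closure(rows, cols):
        changed = True
        while changed:
            changed = False
            # Two rows in G force in every column where their payoffs differ.
            for i, j in itertools.combinations(sorted(rows), 2):
                for k in range(n):
                    if A[i][k] != A[j][k] and k not in cols:
                        cols.add(k)
                        changed = True
            # Two columns in G force in every row where their payoffs differ.
            for k, l in itertools.combinations(sorted(cols), 2):
                for i in range(m):
                    if B[i][k] != B[i][l] and i not in rows:
                        rows.add(i)
                        changed = True
        return rows, cols

    for pair in itertools.combinations(range(m), 2):
        rows, cols = closure(set(pair), set())
        if len(rows) < m or len(cols) < n:
            return rows, cols
    for pair in itertools.combinations(range(n), 2):
        rows, cols = closure(set(), set(pair))
        if len(rows) < m or len(cols) < n:
            return rows, cols
    return None

# The relabeled 3x3 example: rows a1-a3, columns b1-b3 (0-indexed).
A = [[4, 1, 0], [0, 2, 2], [0, 1, 4]]   # row player's payoffs
B = [[0, 2, 4], [3, 2, 3], [4, 4, 0]]   # column player's payoffs
print(find_decomposition(A, B))  # → ({0, 2}, {0, 2})
```

On the example, the seed {a_1, a_2} blows up to the whole game (pass 1), while {a_1, a_3} closes at {a_1, a_3, b_1, b_3}, the nontrivial solution from slide 29.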

31 Solving the example with the algorithm (pass 1)
Start with v(a_1) and v(a_2) set to true. The clauses from slide 29 then force v(b_1), v(b_2), v(b_3), and in turn v(a_3): all variables end up true, so this pass fails.

32 Solving the example with the algorithm (pass 2)
Start with v(a_1) and v(a_3) set to true. The implications force only v(b_1) and v(b_3), and nothing further: a nontrivial solution.

33 Algorithm complexity
Theorem. Requires at most O((#rows + #columns)^4) clause applications
– That is, quadratic in the size (number of entries) of the game if the game is square
Can improve in practice by caching previous results

34 Preprocessing the hard game
[The slide applies the decomposition repeatedly to the 6×6 hard game of slide 14, showing the subgames obtained; the annotated equilibrium probabilities are 1/2, 1/3, 1, and 0.]

35 Another game

          Left    Right
  Up      2, 1    4, 0
  Down    1, 0    3, 1

Up dominates Down.

36 What if player 1 commits first?
If player 1 commits to the pure strategy Down (probability 0 on Up, 1 on Down), player 2’s best response is the right column, and player 1 gets 3 — more than the 2 obtained without commitment.

37 What if player 1 commits first?
Even better: commit to the mixed strategy (1/2 − ε on Up, 1/2 + ε on Down). Player 2’s best response is still the right column, and player 1 gets close to 3.5.

38 Computing optimal mixed strategies to commit to [C. & Sandholm EC06]
For every follower pure strategy t, solve:
maximize Σ_s p_s u_l(s, t)
subject to
– for all t': Σ_s p_s u_f(s, t) ≥ Σ_s p_s u_f(s, t')
– Σ_s p_s = 1
Choose the solution with the highest objective value.

39 Example
(For the game on slide 35.)
For the left column, solve:
maximize 2 p_Up + 1 p_Down
subject to 1 p_Up ≥ 1 p_Down
           p_Up + p_Down = 1
solution: p_Up = 1, p_Down = 0, objective = 2

For the right column, solve:
maximize 4 p_Up + 3 p_Down
subject to 1 p_Down ≥ 1 p_Up
           p_Up + p_Down = 1
solution: p_Up = .5, p_Down = .5, objective = 3.5
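These per-response LPs can be handed to any LP solver; a sketch with scipy's linprog (the function name and matrix encoding are mine):

```python
import numpy as np
from scipy.optimize import linprog

def optimal_commitment(L, F):
    """Optimal mixed strategy for the leader (row player) to commit to.

    L[s, t], F[s, t]: leader and follower payoffs. Solves one LP per
    follower pure response t and keeps the best solution
    [cf. C. & Sandholm EC06]."""
    m, n = L.shape
    best_p, best_val = None, -np.inf
    for t in range(n):
        # maximize sum_s p_s L[s, t], subject to the follower preferring t:
        #   sum_s p_s (F[s, t'] - F[s, t]) <= 0 for every t'
        res = linprog(-L[:, t],
                      A_ub=(F - F[:, [t]]).T, b_ub=np.zeros(n),
                      A_eq=np.ones((1, m)), b_eq=[1.0],
                      bounds=[(0, 1)] * m)
        if res.success and -res.fun > best_val:
            best_p, best_val = res.x, -res.fun
    return best_p, best_val

# The example game: committing to (1/2, 1/2) yields 3.5.
L = np.array([[2., 4.], [1., 3.]])  # leader (row) payoffs
F = np.array([[1., 0.], [0., 1.]])  # follower payoffs
print(optimal_commitment(L, F))
```

On the example this recovers p_Up = p_Down = .5 with objective 3.5, matching the slide.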

40 Optimal computer player for “Liar’s Dice” games
(One variant of) Liar’s Dice:
– Player rolls some number of dice under a cup, peeks
– Makes a claim about the total score (can lie)
– Next player can accept or call bluff
– If the next player accepts, she has to claim a higher number on her turn
[The slide illustrates a sequence of claims — “5”, “7”, “9”, “10” — with “accept”/“bluff” responses, ending with the red player winning.]

41 Conclusions
To act strategically (according to game theory), we need algorithms for computing game-theoretic solutions.
In computer science, we often get to design the game, too
– Network protocols, e-commerce mechanisms, …
Many important computational questions here
– Even just running the mechanism can be hard
– Finding the optimal mechanism (given strategic behavior) is even harder
Thank you for your attention!

42 Other game theory topics
Computing optimal strategies under sequential commitment [C. & Sandholm ACM-EC06]
Learning in games
– First algorithm that converges to Nash equilibrium in self-play and to best-response against a fixed player in general games [C. & Sandholm ICML03a]
– Framework for assessing the cost of not knowing (initially) part of the game being played [C. & Sandholm ICML03b]
– Lower bounds on convergence time for learning algorithms [C. & Sandholm ICML04]
Computing solutions in cooperative game theory [C. & Sandholm AIJ06/IJCAI03b, AAAI04; Yokoo, C., Sandholm, Ohta, Iwasaki AAAI05]
Optimal computer player for Liar’s Dice

43 Other topics

44 Expressive negotiation/markets
Combinatorial auction: bidders are allowed to bid on bundles of multiple items
The winner determination problem is NP-hard [Rothkopf et al. 98] and inapproximable [Sandholm AIJ02]
Elicitation problem: do not force the agents to bid on too many bundles
If bidders’ valuation functions have special structure, these problems sometimes become easier [C., Derryberry, & Sandholm AAAI04; C., Sandholm, Santi AAAI05; Santi, C., Sandholm COLT04]
New types of expressive negotiation: negotiating over donations to multiple causes [C. & Sandholm ACM-EC04a], and negotiation in complex settings with externalities [C. & Sandholm AAAI05b]
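For intuition, winner determination can be stated in a few lines: pick a value-maximizing set of pairwise-disjoint bundle bids. A brute-force sketch (usable only for toy instances, precisely because the problem is NP-hard; the names and bid encoding are mine):

```python
import itertools

def winner_determination(bids):
    """Brute-force winner determination for a combinatorial auction.

    bids: list of (bundle, value) pairs, bundle a tuple of item names.
    Returns (best total value, winning bids); exhaustive over all
    subsets of bids, so exponential in the number of bids."""
    best_value, best_set = 0, []
    for r in range(1, len(bids) + 1):
        for combo in itertools.combinations(bids, r):
            items = [item for bundle, _ in combo for item in bundle]
            if len(items) == len(set(items)):      # bundles pairwise disjoint
                value = sum(v for _, v in combo)
                if value > best_value:
                    best_value, best_set = value, list(combo)
    return best_value, best_set

bids = [(("a", "b"), 5), (("c",), 3), (("a",), 2), (("b", "c"), 4)]
print(winner_determination(bids))  # best: {a,b} and {c}, total value 8
```

Practical solvers replace this enumeration with integer programming or specialized search, which is where the cited algorithmic work comes in.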

45 Social choice theory (voting)
Voting over alternatives is a general way of aggregating the preferences of multiple agents
Every agent/voter ranks all alternatives
– E.g., a > c > b > d
A voting rule decides the winning alternative based on the votes
Search-based algorithms for executing certain voting rules [C. 05; C., Davenport, Kalagnanam 05]
Voting rules that are computationally hard to manipulate by misreporting preferences [C. & Sandholm AAAI02a, TARK03, IJCAI03c]
Efficiently eliciting voters’ preferences [C. & Sandholm AAAI02b, ACM-EC05b]
Voting rules as maximum likelihood estimators of the “correct” outcome [C. & Sandholm UAI05]

46 Mechanism design
Mechanism design: design the game so that good outcomes will happen in spite of agents’ strategic behavior
General-purpose techniques from economics are not always satisfactory, especially in combinatorial auctions/exchanges [e.g., C. & Sandholm AAMAS06b]
Certain characterization results from economics do not hold when agents are computationally bounded [C. & Sandholm LOFT04]
Automated mechanism design: let the computer design the best possible game for your setting! [C. & Sandholm UAI02, ICEC03, ACMEC04, AAMAS04]
Incrementally modifying games to take care of strategic effects one at a time [C. & Sandholm draft 05]
Thank you for your attention!

47 Definition (2-player only)
Given, for both players i, subsets E_i ⊆ S_i of the player’s pure strategies, and a distinguished strategy e_r* ∈ E_r, we say that e_r* is not eliminable relative to these sets if: there exist partial probability distributions (summing to at most 1) p_i over the E_i (with p_r(e_r*) > 0) such that for both i, for any e_i ∈ E_i with p_i(e_i) > 0 and any mixed strategy d_i, there is some way of completing p_-i (placing the remaining probability on one strategy in S_-i − E_-i) such that e_i gives a higher utility against p_-i than d_i.
– (−i denotes the player other than i)

48 “Chicken”

          D        S
  D       0, 0     -1, 1
  S       1, -1    -5, -5

