Chapter 17: Making Complex Decisions
April 1, 2004
17.6 Decisions With Multiple Agents: Game Theory
Assume that agents make simultaneous moves.
Assume that the game is a single-move game.
Uses
Agent design (e.g. two-finger Morra)
Mechanism design
Game Components
Players
Actions
Payoff matrix
e.g. rock-paper-scissors
Terminology
Pure strategy: a deterministic policy
Mixed strategy: a randomized policy, [p: a; (1-p): b], which plays action a with probability p and action b otherwise
Outcome: the result of the game
Solution: a strategy profile in which each player adopts a rational strategy
Prisoner's Dilemma
                 B testifies        B refuses
A testifies      A = -5, B = -5     A = 0, B = -10
A refuses        A = -10, B = 0     A = -1, B = -1
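Terminology
(testify, testify) is a dominant strategy equilibrium: testify is a dominant strategy for each player
s strongly dominates s': s is better than s' for every choice of strategies by the other players
s weakly dominates s': s is better than s' for at least one choice of strategies by the other players and is at least as good for all the rest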
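The two dominance definitions can be checked mechanically. The sketch below is only an illustration: it encodes player A's payoffs from the Prisoner's Dilemma slide, and the names PAYOFF_A, strongly_dominates and weakly_dominates are hypothetical helpers, not part of the lecture.

```python
# Player A's payoffs in the Prisoner's Dilemma, indexed as
# PAYOFF_A[a_action][b_action]. Values are taken from the payoff matrix slide.
PAYOFF_A = {
    "testify": {"testify": -5, "refuse": 0},
    "refuse": {"testify": -10, "refuse": -1},
}


def strongly_dominates(payoff, s, s_prime):
    """True if s is strictly better than s_prime against every opponent action."""
    return all(payoff[s][o] > payoff[s_prime][o] for o in payoff[s])


def weakly_dominates(payoff, s, s_prime):
    """True if s is never worse than s_prime and strictly better at least once."""
    opponent_actions = list(payoff[s])
    never_worse = all(payoff[s][o] >= payoff[s_prime][o] for o in opponent_actions)
    sometimes_better = any(payoff[s][o] > payoff[s_prime][o] for o in opponent_actions)
    return never_worse and sometimes_better


if __name__ == "__main__":
    # Testifying beats refusing whatever B does (-5 > -10 and 0 > -1).
    print(strongly_dominates(PAYOFF_A, "testify", "refuse"))
```

By symmetry the same holds for B, which is why (testify, testify) arises from dominant strategies.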
Terminology
An outcome is Pareto optimal if there is no other outcome that all players would prefer
An equilibrium is a strategy profile where no player benefits by switching strategies, given that every other player keeps the same strategy
Nash showed that every finite game has at least one equilibrium (possibly in mixed strategies)
Example: the Prisoner's Dilemma, where the equilibrium (testify, testify) is not Pareto optimal
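As a rough check of the Pareto optimality definition, the sketch below tests each Prisoner's Dilemma outcome against all the others. OUTCOMES and is_pareto_optimal are illustrative names introduced here, and "prefer" is read as strictly higher payoff for every player, which is an assumption about the slide's wording.

```python
# Prisoner's Dilemma outcomes:
# (A's action, B's action) -> (A's payoff, B's payoff).
OUTCOMES = {
    ("testify", "testify"): (-5, -5),
    ("testify", "refuse"): (0, -10),
    ("refuse", "testify"): (-10, 0),
    ("refuse", "refuse"): (-1, -1),
}


def is_pareto_optimal(outcomes, profile):
    """True if no other outcome gives every player a strictly higher payoff."""
    current = outcomes[profile]
    for other_profile, other in outcomes.items():
        if other_profile == profile:
            continue
        if all(u > v for u, v in zip(other, current)):
            return False  # every player prefers the other outcome
    return True


if __name__ == "__main__":
    for profile in OUTCOMES:
        print(profile, is_pareto_optimal(OUTCOMES, profile))
```

Only (testify, testify) fails the test, because both players prefer (refuse, refuse).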
Example: Two Nash Equilibria
No dominant strategy!
            B: dvd             B: cd
A: dvd      A = 9, B = 9       A = -4, B = -1
A: cd       A = -1, B = -4     A = 5, B = 5
The two equilibria are (dvd, dvd) and (cd, cd).
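One way to confirm that this game has two pure-strategy equilibria is to test every cell of the matrix for a profitable unilateral switch. This is a minimal brute-force sketch; ACTIONS, PAYOFFS and pure_nash_equilibria are hypothetical names, and the approach only finds pure-strategy equilibria.

```python
from itertools import product

ACTIONS = ["dvd", "cd"]

# (A's action, B's action) -> (A's payoff, B's payoff), from the slide.
PAYOFFS = {
    ("dvd", "dvd"): (9, 9),
    ("dvd", "cd"): (-4, -1),
    ("cd", "dvd"): (-1, -4),
    ("cd", "cd"): (5, 5),
}


def pure_nash_equilibria(payoffs, actions):
    """Return every profile where neither player gains by switching alone."""
    equilibria = []
    for a, b in product(actions, repeat=2):
        payoff_a, payoff_b = payoffs[(a, b)]
        # A considers switching while B's action stays fixed.
        a_stays = all(payoffs[(alt, b)][0] <= payoff_a for alt in actions)
        # B considers switching while A's action stays fixed.
        b_stays = all(payoffs[(a, alt)][1] <= payoff_b for alt in actions)
        if a_stays and b_stays:
            equilibria.append((a, b))
    return equilibria


if __name__ == "__main__":
    print(pure_nash_equilibria(PAYOFFS, ACTIONS))
```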
Von Neumann's Maximin
Zero-sum game (two-finger Morra)
E: maximizer
O: minimizer
U(E = 1, O = 1) = 2
U(E = 1, O = 2) = -3
U(E = 2, O = 1) = -3
U(E = 2, O = 2) = 4
Maximin
E reveals strategy, moves first: [p: one; (1-p): two]
O chooses based on p
O plays one: 2p - 3(1-p)
O plays two: -3p + 4(1-p)
p = 7/12
U_{E,O} = -1/12
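The step from the two expressions to p = 7/12 can be written out as follows. O picks whichever reply gives E the lower payoff, so E is guaranteed only the minimum of the two lines; one line rises with p and the other falls, so the minimum is largest where they cross. The labels U_{O=one} and U_{O=two} are notation introduced here for the two expressions on the slide.

```latex
\begin{align*}
U_{O=\text{one}}(p) &= 2p - 3(1-p) = 5p - 3 \\
U_{O=\text{two}}(p) &= -3p + 4(1-p) = 4 - 7p \\
5p - 3 &= 4 - 7p \;\Longrightarrow\; 12p = 7 \;\Longrightarrow\; p = \tfrac{7}{12} \\
U_{E,O} &= 5 \cdot \tfrac{7}{12} - 3 = \tfrac{35}{12} - \tfrac{36}{12} = -\tfrac{1}{12}
\end{align*}
```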
Maximin
O reveals strategy, moves first: [q: one; (1-q): two]
E chooses based on q
E plays one: 2q - 3(1-q)
E plays two: -3q + 4(1-q)
q = 7/12
U_{O,E} = -1/12
Maximin
[7/12: one; 5/12: two], played by both players, is the maximin equilibrium, which is also a Nash equilibrium
A maximin equilibrium always exists for mixed strategies in a two-player zero-sum game!
The value is a maximin for both players.
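The same result can be reproduced numerically. The sketch below solves a 2x2 zero-sum game by making the opponent indifferent between its two replies, which is valid only when the solution is a genuinely mixed strategy (no saddle point), as it is for two-finger Morra. The function name indifference_mix is a hypothetical helper, not something from the lecture.

```python
from fractions import Fraction

# E's payoff U[e][o] in two-finger Morra, indexed by (fingers shown - 1).
U = [[2, -3], [-3, 4]]


def indifference_mix(u):
    """Return (p, value) for the row player of a 2x2 zero-sum game.

    p is the probability of the first row that makes both column replies
    give the same expected payoff. Assumes the game has no saddle point.
    """
    (a, b), (c, d) = u
    # Column 0 reply: a*p + c*(1-p); column 1 reply: b*p + d*(1-p).
    p = Fraction(d - c, a - b - c + d)
    value = a * p + c * (1 - p)
    return p, value


def transpose(u):
    """Swap the roles of the row and column player."""
    return [list(column) for column in zip(*u)]


if __name__ == "__main__":
    print(indifference_mix(U))             # E's mix p and the game value
    print(indifference_mix(transpose(U)))  # O's mix q and the game value
```

Exact fractions are used so that the output can be compared directly with 7/12 and -1/12 on the slides.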
Repeated Move Games
Application: packet collision in an Ethernet network
Prisoner's Dilemma, fixed number of rounds: no change!
Prisoner's Dilemma, variable number of rounds (e.g. 99% chance of meeting again):
  – perpetual punishment
  – tit for tat
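A small simulation can illustrate the two strategies named for the variable-length game. This is a sketch under stated assumptions: the payoffs are those of the Prisoner's Dilemma slide, the game continues after each round with probability 0.99, and the strategy functions, the always_testify baseline and the play helper are all illustrative names rather than anything defined in the lecture.

```python
import random

# (A's action, B's action) -> (A's payoff, B's payoff).
PAYOFF = {
    ("testify", "testify"): (-5, -5),
    ("testify", "refuse"): (0, -10),
    ("refuse", "testify"): (-10, 0),
    ("refuse", "refuse"): (-1, -1),
}


def tit_for_tat(my_history, their_history):
    """Start by refusing, then copy the opponent's previous move."""
    return their_history[-1] if their_history else "refuse"


def perpetual_punishment(my_history, their_history):
    """Refuse until the opponent testifies once, then testify forever."""
    return "testify" if "testify" in their_history else "refuse"


def always_testify(my_history, their_history):
    """Baseline: the single-round dominant strategy, played every round."""
    return "testify"


def play(strategy_a, strategy_b, continue_prob=0.99, seed=0):
    """Play until the game randomly ends; return (rounds, total_a, total_b)."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    history_a, history_b = [], []
    total_a = total_b = 0
    while True:
        a = strategy_a(history_a, history_b)
        b = strategy_b(history_b, history_a)
        payoff_a, payoff_b = PAYOFF[(a, b)]
        total_a += payoff_a
        total_b += payoff_b
        history_a.append(a)
        history_b.append(b)
        if rng.random() >= continue_prob:
            return len(history_a), total_a, total_b


if __name__ == "__main__":
    print(play(tit_for_tat, tit_for_tat))
    print(play(tit_for_tat, always_testify))
```

Two tit-for-tat players keep refusing and lose 1 per round each, while a tit-for-tat player facing always_testify is exploited only in the first round.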
Repeated Move Games
Partial information games: games that occur in a partially observable environment, such as blackjack
17.7 Mechanism Design
Given rational agents, what game should we design?
Tragedy of the Commons
Auctions
Single item
Bidder i has a utility v_i for the item
v_i is known only to bidder i
English auction
Sealed-bid auction
Sealed-bid second-price or Vickrey auction (no communication, no knowledge of others)
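The sealed-bid second-price rule is short enough to sketch directly: the highest bidder wins and pays the second-highest bid. The function names, the bidder labels and the tie-breaking rule (first listed bidder wins) are assumptions made for this illustration.

```python
def vickrey_auction(bids):
    """Return (winner, price) for a sealed-bid second-price auction.

    bids maps a bidder name to that bidder's sealed bid. Ties go to the
    bidder listed first, an arbitrary choice made for this sketch.
    """
    if len(bids) < 2:
        raise ValueError("need at least two bidders")
    ranked = sorted(bids.items(), key=lambda item: item[1], reverse=True)
    return ranked[0][0], ranked[1][1]


def utility(value, my_bid, other_bids):
    """Utility of a bidder called 'me' whose private value v_i is `value`."""
    winner, price = vickrey_auction({"me": my_bid, **other_bids})
    return value - price if winner == "me" else 0


if __name__ == "__main__":
    others = {"b": 7, "c": 3}
    print(vickrey_auction({"a": 10, **others}))
    # Bidding the true value 10, overbidding 20, and underbidding 6:
    print(utility(10, 10, others), utility(10, 20, others), utility(10, 6, others))
```

In this example, overbidding does not improve on bidding v_i and underbidding loses the item, which is the reason the mechanism needs no knowledge of the other bidders.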