An Intro to Game Theory 15-451 Avrim Blum 12/07/04.


Plan for Today
- 2-Player Zero-Sum Games (matrix games): minimax optimal strategies; the Minimax Theorem. [test material]
- General-Sum Games (bimatrix games): the notion of Nash Equilibrium; proof of existence of Nash Equilibria using Brouwer's fixed-point theorem. [not test material]

2-Player Zero-Sum Games
Two players, R and C. Zero-sum means that what's good for one is bad for the other. The game is defined by a matrix with a row for each of R's options and a column for each of C's options; the matrix tells who wins how much. An entry (x,y) means: x = payoff to the row player, y = payoff to the column player. "Zero-sum" means that y = -x.

E.g., matching pennies / penalty shot / hide-a-coin (rows = shooter, columns = goalie):

               Left      Right
    Left     (-1,1)     (1,-1)
    Right    (1,-1)     (-1,1)

When the goalie guesses the shooter's side, the shooter loses and the goalie wins.

Minimax-Optimal Strategies
A minimax optimal strategy is a (randomized) strategy that has the best guarantee on its expected gain [it maximizes the minimum]. I.e., it is the thing to play if your opponent knows you well. This is the same as our notion of a randomized strategy with a good worst-case bound. In the class on Linear Programming, we saw how to solve for this using LP: polynomial time in the size of the matrix, if we use a poly-time LP algorithm (like Ellipsoid).

Minimax-Optimal Strategies
E.g., penalty shot / hide-a-coin:

               Left      Right
    Left     (-1,1)     (1,-1)
    Right    (1,-1)     (-1,1)

Minimax optimal for both players is 50/50. This gives expected gain 0; any other strategy is worse.

Minimax-Optimal Strategies
E.g., penalty shot with a goalie who's weaker on the left:

               Left      Right
    Left     (0,0)      (1,-1)
    Right    (1,-1)     (-1,1)

Minimax optimal for both players is (2/3, 1/3). This gives expected gain 1/3; any other strategy is worse.
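The minimax values above can be checked numerically. A minimal sketch (mine, not from the lecture): grid-search over the shooter's probability p of shooting Left, maximizing the worst case over the goalie's two pure responses.

```python
# Row player's (shooter's) payoffs for the "goalie weaker on the left" game:
# rows = shooter (Left, Right), columns = goalie (Left, Right).
M = [[0, 1],
     [1, -1]]

def worst_case_gain(p):
    """Shooter's expected gain if she shoots Left w.p. p and the
    goalie best-responds with a pure strategy."""
    vs_left = p * M[0][0] + (1 - p) * M[1][0]    # goalie dives Left
    vs_right = p * M[0][1] + (1 - p) * M[1][1]   # goalie dives Right
    return min(vs_left, vs_right)

# Grid search for the minimax optimal p.
best_p, best_v = max(((p / 10000, worst_case_gain(p / 10000))
                      for p in range(10001)), key=lambda t: t[1])
print(best_p, best_v)   # p close to 2/3, value close to 1/3, as on the slide
```

The grid search stands in for the LP from class; for a 2x2 game the answer is where the two pure-response lines cross.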

Matrix Games and Algorithms
This gives a useful way of thinking about guarantees on algorithms: think of rows as different algorithms and columns as different possible inputs, with M(i,j) = cost of algorithm i on input j. Of course the matrix may be HUGE, but it is helpful conceptually.

A Simple Example
Sorting three items (A,B,C) by comparisons: compare two of them, then compare the 3rd to the larger of the first two. If we're lucky it's larger; else we need one more comparison. Payoff to the adversary = number of comparisons:

                  largest = C   largest = B   largest = A
    (A,B) first        2             3             3
    (A,C) first        3             2             3
    (B,C) first        3             3             2

A Simple Example
Minimax optimal strategy: pick the first pair at random. Then the expected cost is 2 2/3 against every column (whatever the input looks like), since each row of the matrix has one 2 and two 3s.
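The 2 2/3 claim can be verified by enumerating the cost matrix exactly. A minimal sketch (mine, not from the lecture):

```python
from fractions import Fraction

# Cost matrix for sorting {A,B,C}: rows = which pair is compared first,
# columns = which item is actually largest.  Comparing the third item to
# the larger of the first pair is "lucky" (2 comparisons total) exactly
# when the third item is the largest; otherwise we need one more (3 total).
pairs = [("A", "B"), ("A", "C"), ("B", "C")]
items = ["A", "B", "C"]
cost = {(pair, big): (2 if big not in pair else 3)
        for pair in pairs for big in items}

# Uniformly random first pair: the expected cost against EVERY column.
for big in items:
    exp = sum(Fraction(1, 3) * cost[(pair, big)] for pair in pairs)
    print(big, exp)   # 8/3 = 2 2/3 for each column
```

Exact rational arithmetic (`Fraction`) avoids any floating-point question about the 8/3.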

Matrix Games and Algorithms
[Rows = algorithm player, columns = adversary.]
- What is a deterministic alg with a good worst-case guarantee? A row that does well against all columns.
- What is a lower bound for deterministic algorithms? Showing that for each row i there exists a column j such that M(i,j) is bad.
- How to give a lower bound for randomized algs? Give a randomized strategy for the adversary that is bad for all i. It must then also be bad for all distributions over i.

E.g., Hashing
Rows are different hash functions; columns are different sets of n items to hash. M(i,j) = number of collisions incurred by hash function i on set j [the algorithm is trying to minimize this]. For any fixed row, one can reverse-engineer a bad column. Universal hashing is a randomized strategy for the row player.

Minimax Theorem (von Neumann 1928)
Every 2-player zero-sum game has a unique value V. The minimax optimal strategy for R guarantees R's expected gain is at least V; the minimax optimal strategy for C guarantees R's expected gain is at most V. Counterintuitive: this means it doesn't hurt to publish your strategy if both players are optimal. (Borel had proved this for symmetric 5x5 games but thought it was false for larger games.)

Nice Proof of the Minimax Theorem (sketch)
Suppose for contradiction it was false. This means some game G has V_C > V_R:
- If the column player commits first, there exists a row that gets at least V_C.
- But if the row player has to commit first, the column player can hold him to only V_R.
Scale the matrix so the payoffs to the row player are in [0,1], and say V_R = V_C(1-ε).

Proof Sketch, Contd
Consider repeatedly playing game G against some opponent [think of yourself as the row player]. Use a "picking a winner / expert advice" algorithm to do nearly as well as the best fixed row in hindsight.
- The algorithm gets ≥ (1-ε/2)·OPT - c·log(n)/ε > (1-ε)·OPT [if we play long enough].
- OPT ≥ V_C [the best row against the opponent's empirical distribution].
- But the algorithm gets ≤ V_R per step [each time, the opponent knows your randomized strategy].
Together these contradict the assumption V_R = V_C(1-ε).
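This repeated-play argument can be simulated. A minimal sketch (using the standard Hedge / multiplicative-weights update as the "expert advice" algorithm; the constants are my choices, not from the lecture): the row player updates weights, the column player best-responds each round, and the row player's average gain approaches the game value (1/3 for the weaker-goalie game above).

```python
import math

M = [[0, 1], [1, -1]]   # row payoffs, weaker-goalie penalty shot (value 1/3)
eta, T = 0.03, 5000     # learning rate and number of rounds (my choices)
w = [1.0, 1.0]          # one weight per row
total = 0.0

for _ in range(T):
    s = w[0] + w[1]
    p = [w[0] / s, w[1] / s]       # row's current mixed strategy
    # Adversary (column) best-responds, minimizing row's expected gain.
    col_gain = [p[0] * M[0][j] + p[1] * M[1][j] for j in range(2)]
    j = 0 if col_gain[0] <= col_gain[1] else 1
    total += col_gain[j]           # row's gain this round (at most the value)
    # Multiplicative-weights update: reward rows that did well vs. column j.
    w = [w[i] * math.exp(eta * M[i][j]) for i in range(2)]

avg = total / T
print(avg)   # approaches the game value 1/3 from below
```

Each round's gain is at most the value (the opponent knows p), while the regret bound forces the average up toward it, exactly as in the proof sketch.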

General-Sum Games
In general-sum (bimatrix) games we can get win-win and lose-lose situations. E.g., "what side of the road to drive on?" (rows = you, columns = the person driving towards you):

               Left       Right
    Left     (1,1)      (-1,-1)
    Right    (-1,-1)    (1,1)

General-Sum Games
In general-sum (bimatrix) games we can get win-win and lose-lose situations. E.g., "which movie should we go to?":

                  Spongebob   Oceans-12
    Spongebob     (8,2)       (0,0)
    Oceans-12     (0,0)       (2,8)

There is no longer a unique "value" to the game.

Nash Equilibrium
A Nash Equilibrium is a stable pair of strategies (which could be randomized). Stable means that neither player has an incentive to deviate on their own. E.g., "what side of the road to drive on":

               Left       Right
    Left     (1,1)      (-1,-1)
    Right    (-1,-1)    (1,1)

The NE are: both Left, both Right, or both 50/50.

Nash Equilibrium
A Nash Equilibrium is a stable pair of strategies (which could be randomized). Stable means that neither player has an incentive to deviate. E.g., "which movie to go to":

                  Spongebob   Oceans-12
    Spongebob     (8,2)       (0,0)
    Oceans-12     (0,0)       (2,8)

The NE are: both Spongebob, both Oceans-12, or the mixed pair (80/20, 20/80).
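The mixed equilibrium can be checked mechanically. A minimal sketch (mine, not from the lecture): verify that at (0.8, 0.2) vs (0.2, 0.8) no pure deviation helps either player.

```python
# Bimatrix payoffs for the movie game: R[i][j] and C[i][j] are the payoffs
# to the row and column player when row plays i and column plays j
# (0 = Spongebob, 1 = Oceans-12).
R = [[8, 0], [0, 2]]
C = [[2, 0], [0, 8]]
p, q = [0.8, 0.2], [0.2, 0.8]   # the claimed mixed Nash Equilibrium

def expected(M, p, q):
    """Expected payoff under mixed strategies p (rows) and q (columns)."""
    return sum(p[i] * M[i][j] * q[j] for i in range(2) for j in range(2))

row_eq = expected(R, p, q)
col_eq = expected(C, p, q)

# No pure deviation helps either player (up to floating-point slack).
for i in range(2):
    assert expected(R, [1 - i, i], q) <= row_eq + 1e-9   # row deviations
    assert expected(C, p, [1 - i, i]) <= col_eq + 1e-9   # column deviations
print(row_eq, col_eq)   # about 1.6 each: both players are exactly indifferent
```

Indifference between one's own pure strategies is precisely what makes the opponent's mixing an equilibrium, which is how the 80/20 numbers arise.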

Uses
Economists use games and equilibria as models of interaction. E.g., pollution / prisoner's dilemma (imagine pollution controls cost $4 but improve everyone's environment by $3):

                      don't pollute    pollute
    don't pollute     (2,2)            (-1,3)
    pollute           (3,-1)           (0,0)

We need to add extra incentives to get good overall behavior.

NE Can Do Strange Things
Braess paradox: a road network with traffic going from s to t, where an edge's travel time may be a function of the fraction x of traffic using it. There are two parallel two-edge routes from s to t, each consisting of one edge with travel time t(x) = x and one edge with travel time 1, independent of traffic. Fine: the NE is 50/50, and the travel time is 1.5.

NE Can Do Strange Things
Braess paradox, contd: now add a new superhighway with travel time 0 connecting the midpoints of the two routes. At the new NE, everyone uses the zig-zag path (x-edge, superhighway, x-edge), and the travel time is 2.
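The two equilibria amount to a few lines of arithmetic. A minimal sketch (the route names and the reading of the network picture are mine):

```python
# Before the superhighway: let a = fraction of traffic on the top route.
# Top route: x-edge (time = its traffic) then a constant-1 edge;
# bottom route: constant-1 edge then an x-edge.
def top(a):    return a + 1
def bottom(a): return 1 + (1 - a)

assert top(0.5) == bottom(0.5) == 1.5   # 50/50 equalizes costs: NE, time 1.5

# After a 0-cost superhighway joins the two midpoints, everyone can take
# the zig-zag route (x-edge, highway, x-edge).  With all traffic on it:
zigzag = 1 + 0 + 1
# A lone deviator to either old route still pays a full x-edge plus 1:
deviate = 1 + 1
assert zigzag == 2 and deviate == 2     # no one gains by switching: NE, time 2
print(zigzag)
```

The paradox: adding the free edge moved the equilibrium travel time from 1.5 up to 2.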

Existence of NE
Nash (1950) proved: any general-sum game must have at least one such equilibrium, though it might require randomized strategies (called "mixed strategies"). This also yields the minimax theorem as a corollary: pick some NE and let V = the value to the row player in that equilibrium. Since it's a NE, neither player can do better even knowing the (randomized) strategy their opponent is playing. So they're each playing minimax optimal.

Existence of NE
The proof will be non-constructive. Unlike the case of zero-sum games, we do not know any polynomial-time algorithm for finding Nash Equilibria in general-sum games [great open problem!]. Notation: assume an n×n matrix; use (p_1,...,p_n) to denote a mixed strategy for the row player, and (q_1,...,q_n) to denote a mixed strategy for the column player.

Proof
We'll start with Brouwer's fixed-point theorem: let S be a compact convex region in R^n and let f: S → S be a continuous function. Then there must exist x ∈ S such that f(x) = x; x is called a "fixed point" of f. (Simple case: S is the interval [0,1].) We will care about S = {(p,q): p,q are legal probability distributions on 1,...,n}, i.e., S = simplex_n × simplex_n.

Proof (contd)
S = {(p,q): p,q are mixed strategies}. We want to define f(p,q) = (p',q') such that:
- f is continuous. This means that changing p or q a little bit shouldn't cause p' or q' to change a lot.
- Any fixed point of f is a Nash Equilibrium.
Then Brouwer's theorem will imply the existence of an NE.

Try #1
What about f(p,q) = (p',q') where p' is the best response to q, and q' is the best response to p? Problem: not continuous. E.g., in the penalty-shot game:

               Left      Right
    Left     (-1,1)     (1,-1)
    Right    (1,-1)     (-1,1)

If p = (0.51, 0.49) then q' = (1,0), but if p = (0.49, 0.51) then q' = (0,1).
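The jump is easy to see numerically. A sketch (the helper function is mine, for illustration):

```python
# Column player's payoffs in matching pennies: she wants to match the row.
C = [[1, -1],
     [-1, 1]]

def best_response(p):
    """Column's pure best response to the row mixed strategy p."""
    gains = [p[0] * C[0][j] + p[1] * C[1][j] for j in range(2)]
    j = max(range(2), key=lambda j: gains[j])
    return [1.0 if k == j else 0.0 for k in range(2)]

print(best_response([0.51, 0.49]))   # [1.0, 0.0] -- column plays Left
print(best_response([0.49, 0.51]))   # [0.0, 1.0] -- a tiny change in p flips it
```

An arbitrarily small change in the input produces a jump in the output, so pure best response cannot be the continuous map Brouwer's theorem needs.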

Try #1
Same f(p,q) = (p',q') with p' the best response to q and q' the best response to p. Problem: also not necessarily well-defined. E.g., if p = (0.5, 0.5) then q' could be anything.

Instead We Will Use...
f(p,q) = (p',q') such that:
- q' maximizes [(expected gain wrt p) - ||q - q'||²]
- p' maximizes [(expected gain wrt q) - ||p - p'||²]
Note: quadratic + linear = quadratic, so each of these objectives has a unique maximizer.

Instead We Will Use... (contd)
With f(p,q) = (p',q') defined this way, f is well-defined and continuous, since the quadratic has a unique maximum and a small change to p,q only moves that maximum a little. Also, any fixed point is an NE: even if a player has only a tiny incentive to move, p' or q' will move a little bit, so a fixed point means neither player has any incentive to move at all. So, that's it!
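The map f can be written down concretely. A minimal sketch (this derivation is mine, not from the slides): maximizing [(expected gain) - squared distance] over the simplex is a Euclidean projection of p + Aq/2 onto the simplex, and a known NE is a fixed point.

```python
def project_simplex(v):
    """Euclidean projection of v onto {x : x >= 0, sum(x) = 1}
    (standard sort-based algorithm)."""
    u = sorted(v, reverse=True)
    cum, theta = 0.0, 0.0
    for j, uj in enumerate(u):
        cum += uj
        t = (cum - 1.0) / (j + 1)
        if uj - t > 0:
            theta = t
    return [max(x - theta, 0.0) for x in v]

def nash_map(p, q, R, Cm):
    """One application of the map f from the proof:
    p' maximizes p'.(R q) - ||p - p'||^2 over the simplex (and
    symmetrically for q'); the maximizer is the projection of
    p + (R q)/2 onto the simplex, since the objective is a concave
    quadratic centered there."""
    Rq = [sum(R[i][j] * q[j] for j in range(len(q))) for i in range(len(p))]
    pC = [sum(p[i] * Cm[i][j] for i in range(len(p))) for j in range(len(q))]
    p2 = project_simplex([p[i] + Rq[i] / 2 for i in range(len(p))])
    q2 = project_simplex([q[j] + pC[j] / 2 for j in range(len(q))])
    return p2, q2

# Matching pennies: at the NE (0.5,0.5),(0.5,0.5) the map stays put.
R = [[-1, 1], [1, -1]]    # row player's payoffs
Cm = [[1, -1], [-1, 1]]   # column player's payoffs
p2, q2 = nash_map([0.5, 0.5], [0.5, 0.5], R, Cm)
print(p2, q2)   # [0.5, 0.5] [0.5, 0.5] -- a fixed point, as the proof requires
```

At the equilibrium both gain vectors are zero, so the projection returns the point itself; away from equilibrium the map moves (p,q), which is exactly why Brouwer's fixed points coincide with the NE.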