1
The Prisoner’s Dilemma
Ulises Ruiz
2
Game Theory
Game theory deals with the mathematical modeling of scenarios that involve two or more players (individuals, companies, nations) who each want to maximize their payoff. The payoff of each player depends on their own strategy and on the strategies of the other players. The field dates to the late 1920’s, when John von Neumann proved his minimax theorem, which established that in zero-sum games with perfect information there exists a set of strategies that allows the players to minimize their maximum losses. Game theory is used extensively in fields such as economics, biology, and psychology. A main focus of current work is finding ways to promote cooperation in games or business.
3
The Prisoners’ Dilemma
The dilemma can be traced back to the early 1950’s, when Merrill Flood and Melvin Dresher, working at the RAND Corporation, studied it for possible applications to global nuclear strategy (the Cold War and, later, the Cuban Missile Crisis). It was later formalized by Albert W. Tucker, who framed it in terms of prison sentences and gave it the name “prisoner’s dilemma” to help him explain it to Stanford psychologists. It is a classic model of a two-person non-constant-sum game.
4
Prisoners’ Dilemma (cont.)
Scenario: Two criminal suspects are arrested and sent to prison. Each prisoner is placed in solitary confinement, with no way of talking or exchanging messages with the other. The police admit that they don’t have enough evidence to convict the pair on the higher charge, so they plan to sentence both of them to one year in prison on the lesser charge. The police then offer each prisoner a Faustian bargain: each may either confess or remain silent.
5
Possible Outcomes
If prisoner A and prisoner B both confess, both of them will serve 5 years in jail.
If prisoner A confesses and prisoner B remains silent, A will be set free and B will serve 10 years (and vice versa).
If both prisoners remain silent, both of them will serve only 1 year in prison (on the lesser charge).
6
Pareto Optimal
In game theory and economics, an outcome is Pareto Optimal (PO) if no other outcome makes at least one player better off without making another player worse off. In particular, an outcome is not Pareto Optimal if some other outcome is better for ALL players: both prisoners confessing (5 years each) is not Pareto Optimal, because both remaining silent (1 year each) is better for both. Pareto optimality should not be confused with Nash’s Equilibrium.
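The definition can be checked mechanically against the four outcomes of the dilemma. The following is a minimal sketch, not part of the original slides: it uses the prison sentences from the Possible Outcomes slide, and the names `YEARS`, `dominates`, and `pareto_optimal` are hypothetical, chosen only for illustration.

```python
# Prison sentences in years for (A's choice, B's choice); fewer years is better.
# The numbers are taken from the "Possible Outcomes" slide.
YEARS = {
    ("confess", "confess"): (5, 5),
    ("confess", "silent"): (0, 10),
    ("silent", "confess"): (10, 0),
    ("silent", "silent"): (1, 1),
}


def dominates(x, y):
    """True if x is no worse than y for both prisoners and strictly better for one."""
    no_worse = all(a <= b for a, b in zip(x, y))
    strictly_better = any(a < b for a, b in zip(x, y))
    return no_worse and strictly_better


def pareto_optimal(outcomes):
    """Return the choice pairs whose sentences no other outcome dominates."""
    return [
        choices
        for choices, years in outcomes.items()
        if not any(dominates(other, years) for other in outcomes.values())
    ]


print(pareto_optimal(YEARS))
# [('confess', 'silent'), ('silent', 'confess'), ('silent', 'silent')]
```

Under these numbers, mutual confession is the only outcome that is not Pareto Optimal, since mutual silence shortens both sentences.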
7
Nash’s Equilibrium
A Nash’s Equilibrium is an outcome in which no participant can gain by changing strategy as long as the other participant’s strategy stays the same; a game may or may not have such an outcome in pure strategies. (C,C), where both prisoners confess, is a Nash’s Equilibrium, since neither prisoner can gain by changing strategy if the other stays constant: a prisoner who switches from confessing to remaining silent while the other still confesses goes from 5 years to 10 years in prison.
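One way to see this is to test every pair of choices for a profitable one-sided switch. The sketch below is illustrative only: the sentences come from the Possible Outcomes slide, while `CHOICES`, `YEARS`, and `is_nash` are hypothetical names introduced for the example.

```python
from itertools import product

CHOICES = ("confess", "silent")

# Prison sentences in years for (A's choice, B's choice); fewer years is better.
YEARS = {
    ("confess", "confess"): (5, 5),
    ("confess", "silent"): (0, 10),
    ("silent", "confess"): (10, 0),
    ("silent", "silent"): (1, 1),
}


def is_nash(a, b):
    """True if neither prisoner can shorten their own sentence by switching alone."""
    a_years, b_years = YEARS[(a, b)]
    a_cannot_gain = all(YEARS[(alt, b)][0] >= a_years for alt in CHOICES)
    b_cannot_gain = all(YEARS[(a, alt)][1] >= b_years for alt in CHOICES)
    return a_cannot_gain and b_cannot_gain


nash = [(a, b) for a, b in product(CHOICES, repeat=2) if is_nash(a, b)]
print(nash)  # [('confess', 'confess')]
```

Mutual silence fails the test because either prisoner could switch to confessing and go free instead of serving 1 year.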
8
Iterated Prisoners’ Dilemma
The Iterated Prisoners’ Dilemma (IPD) is the repeated form of the Prisoners’ Dilemma. It is more realistic in that each player is assumed to interact with multiple players and to remember the history of all past interactions. Players are not allowed to know the number of iterations; if they knew which round was the last, defecting in it would carry no risk of retaliation, and the same reasoning would carry back to every earlier round.
9
Payoff Matrix for the IPD
R = Reward for mutual cooperation; each participant receives 3 points.
T = Temptation; if one player defects and the other cooperates, the one who defected receives 5 points.
S = Sucker; the player who cooperated with the defector receives no points.
P = Punishment; when both participants defect, each receives only 1 point.
10
What should you do?
Suppose this were the one-shot Prisoners’ Dilemma. If you think the other player will cooperate, you may want to cooperate and receive the Reward of 3 points for mutual cooperation. However, you can also decide to defect and receive the Temptation payoff of 5 points. If, on the other hand, you assume that the other player will defect, it is better for you to defect and get one point than to cooperate and receive the Sucker payoff of no points. If the game is to be played only once, there is no reason for either player to do anything but defect. If the game is to be played an indefinite number of times, cooperation can, under certain conditions, be seen as the best policy.
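The one-shot reasoning amounts to asking for the best reply to each possible move by the other player. A small sketch, assuming the point values from the payoff matrix slide (T = 5, R = 3, P = 1, S = 0); `PAYOFF` and `best_reply` are hypothetical names used only here.

```python
# Point values from the "Payoff Matrix for the IPD" slide.
T, R, P, S = 5, 3, 1, 0

# PAYOFF[(my_move, their_move)] -> my points; "C" = cooperate, "D" = defect.
PAYOFF = {
    ("C", "C"): R,
    ("C", "D"): S,
    ("D", "C"): T,
    ("D", "D"): P,
}


def best_reply(their_move):
    """Return the move that earns me the most points against their_move."""
    return max("CD", key=lambda mine: PAYOFF[(mine, their_move)])


for their_move in "CD":
    print(f"other plays {their_move}: best reply is {best_reply(their_move)}")
# other plays C: best reply is D
# other plays D: best reply is D
```

Defecting is the best reply in both cases, which is why a single round gives neither player a reason to cooperate.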
11
What should you do? (Cont.)
First of all, certain conditions need to be upheld. The order of the payoffs matters. The best a player can do is T. The worst a player can get is the S payoff. If the two players cooperate, the reward for mutual cooperation, R, should be better than the punishment for mutual defection, P. Therefore, the following must be true: T > R > P > S. Secondly, players should not be able to do as well by taking turns, being exploited half of the time and exploiting their opponent the other half. Therefore, R must be greater than the average of the payoffs T and S, so the following must also be true: R > (T + S)/2. Using these conditions, one must figure out a strategy that causes players to cooperate. If both players consistently cooperate over a series of games, their combined payoff is maximized, which is better than defecting and hoping the other player cooperates.
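Both conditions are easy to verify for any candidate set of payoffs. The following sketch is one possible check, with a hypothetical function name `is_prisoners_dilemma`; the second call uses made-up values (T = 8) purely to show a case where the averaging condition fails.

```python
def is_prisoners_dilemma(T, R, P, S):
    """Check the two payoff conditions for an iterated prisoners' dilemma."""
    ordering_holds = T > R > P > S
    # R > (T + S) / 2, written without division to stay in integers.
    alternating_does_not_pay = 2 * R > T + S
    return ordering_holds and alternating_does_not_pay


# Payoffs from the slides: 5 > 3 > 1 > 0 and 3 > (5 + 0) / 2 = 2.5.
print(is_prisoners_dilemma(5, 3, 1, 0))  # True

# Illustrative values only: the ordering holds, but 3 < (8 + 0) / 2 = 4,
# so taking turns exploiting each other would beat mutual cooperation.
print(is_prisoners_dilemma(8, 3, 1, 0))  # False
```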
12
Tit-for-tat
In 1979, Axelrod organized an IPD competition and invited theorists to submit their strategies. Fourteen strategies were entered and competed against each other; the winner was Tit-for-Tat (TFT), submitted by Anatol Rapoport. The strategy is to cooperate on the first move, then do whatever the opponent did on the previous move. Three properties underlie the Tit-for-Tat strategy:
Nice: the player cooperates on the first move.
Retaliate: the player defects if the opponent defected on the last turn.
Forgive: the player cooperates with a past defector that has now chosen to cooperate.
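To make the rule concrete, here is a minimal simulation sketch. It is not Axelrod's tournament code: the payoffs are the ones from the payoff matrix slide, the `always_defect` opponent and the fixed `rounds=10` are assumptions made for illustration, and all function names are hypothetical.

```python
# Point values from the "Payoff Matrix for the IPD" slide.
T, R, P, S = 5, 3, 1, 0

# PAYOFF[(move_a, move_b)] -> (points for A, points for B).
PAYOFF = {
    ("C", "C"): (R, R),
    ("C", "D"): (S, T),
    ("D", "C"): (T, S),
    ("D", "D"): (P, P),
}


def tit_for_tat(own_history, opp_history):
    """Cooperate first, then copy the opponent's previous move."""
    return "C" if not opp_history else opp_history[-1]


def always_defect(own_history, opp_history):
    """Illustrative opponent that defects on every move."""
    return "D"


def play(strategy_a, strategy_b, rounds=10):
    """Play a fixed number of rounds and return both total scores."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        points_a, points_b = PAYOFF[(move_a, move_b)]
        score_a += points_a
        score_b += points_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b


print(play(tit_for_tat, tit_for_tat))      # (30, 30)
print(play(tit_for_tat, always_defect))    # (9, 14)
print(play(always_defect, always_defect))  # (10, 10)
```

In this sketch TFT is exploited only once by the defector and then retaliates, while two TFT players earn far more together than two defectors do.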
13
Tit-for-tat (Cont.)
The strategy was entered again in a second competition in 1980, which drew 62 entries in total, and Tit-for-Tat was once again the winner. Over repeated play, the Tit-for-Tat strategy proved effective, and its success suggested that mutual cooperation was the best policy. Although it is not a perfect strategy for IPD competitions, TFT can be adapted to perform better in certain situations.
14
Real world application
In economics, advertising is a scenario where the PD can be applied. Back in the 1950's and 1960's, when cigarette advertising was legal in the United States, cigarette manufacturers had to decide how much money to spend on advertising. We assume that the effectiveness of company A's advertising is determined by the advertising of company B. If both companies decide to advertise, the number of customers remains constant and expenses rise due to the advertising. If both companies decide not to advertise, the number of customers also remains constant, but expenses remain constant as well. However, if company A decides to advertise and company B decides not to, A will benefit greatly from advertising.
15
Bibliography
Dixit, Avinash, and Barry Nalebuff. "Prisoners' Dilemma." The Concise Encyclopedia of Economics. Library of Economics and Liberty, Web. 29 July 2015.
Kendall, Graham, Xin Yao, and Siang Yew Chong. The Iterated Prisoners' Dilemma: 20 Years on. Singapore: World Scientific, Print.
Kolokolʹt︠s︡ov, V. N., and O. A. Malafeev. Understanding Game Theory: Introduction to the Analysis of Many Agent Systems with Competition and Cooperation. Singapore: World Scientific, Print.
Kuhn, Steven. "Prisoner's Dilemma." Stanford University. Stanford University, 04 Sept Web. 29 July 2015.
Neumann, John Von, and Oskar Morgenstern. Theory of Games and Economic Behavior. Princeton: Princeton UP, Print.
Ruby, Douglass A. "Game Theory." Game Theory. Digital Economist, Web. 29 July 2015.