tit-for-tat algorithm


Auxiliary document: tit-for-tat algorithm
Lectured by Chang-jin Suh, Soongsil University, Dept. of Computer Science
Tel: 820-0686, cjsuh @ ssu.ac.kr

0. Contents
1. Prisoners' Dilemma
2. Iterated PD (IPD)
3. Famous IPD solutions

1. Prisoners' Dilemma: description
Two very clever suspects are arrested by the police. The evidence is insufficient for a conviction. Having separated the two prisoners, a policeman visits each of them and offers the same plea deal. (Deal: see the table on the next slide.) Each prisoner must choose either to betray the other or to stay silent. Each is assured that the other will not learn of the betrayal until the sentence is pronounced. What would you choose if you were a prisoner?

1. Prisoners' Dilemma: payoff matrices

Plea deal matrix (A and B are the prisoners):

                  B stays silent               B betrays
A stays silent    A, B serve 0.5 year          A: 10 years, B: goes free
A betrays         A: goes free, B: 10 years    A, B serve 5 years

Simplified plea deal matrix, (x, y) = (A's sentence, B's sentence), where "y" means years:

                  B cooperates      B betrays
A cooperates      (-0.5y, -0.5y)    (-10y, 0)
A betrays         (0, -10y)         (-5y, -5y)

PD payoff matrix: the (negative) penalties are replaced by (positive) payoffs.

                  B cooperates   B betrays
A cooperates      (3, 3)         (0, 5)
A betrays         (5, 0)         (1, 1)

** Fair payoff: the payoff numbers are the same for each player.

1. Prisoners' Dilemma (PD): general PD problem

Given the PD matrix, what should a player do to maximize its own payoff?

General PD payoff matrix, with T > R > P > S and 2R > T + S:

             cooperate   betray
cooperate    (R, R)      (S, T)
betray       (T, S)      (P, P)

Peace-war game table (the same game with numbers):

             peace     war
peace        (3, 3)    (0, 5)
war          (5, 0)    (1, 1)

Solution of the PD problem: choose "betray" (= "war").

(proof) Whatever the other player's (unknown) decision is, I always gain by choosing "betray/war":
  if the other plays 'c': R(3) < T(5),
  if the other plays 'd': S(0) < P(1).
This holds even though my decision damages the other player:
  if the other plays 'c', the other's payoff drops from R to S: R(3) > S(0),
  if the other plays 'd', the other's payoff drops from T to P: T(5) > P(1).
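The proof above can be checked mechanically. The following is a minimal sketch, not part of the lecture: the function names are hypothetical, and the labels temptation/reward/punishment/sucker are the conventional readings of T, R, P, S rather than terms used on the slide.

```python
# Payoff symbols from the slide. The conventional readings (an addition here):
# T = temptation, R = reward, P = punishment, S = sucker's payoff.
T, R, P, S = 5, 3, 1, 0

# PAYOFF[(my_move, other_move)] -> my payoff; 'c' = cooperate, 'd' = betray.
PAYOFF = {("c", "c"): R, ("c", "d"): S, ("d", "c"): T, ("d", "d"): P}


def is_prisoners_dilemma(t, r, p, s):
    """Check the two slide conditions: T > R > P > S and 2R > T + S."""
    return t > r > p > s and 2 * r > t + s


def betray_dominates(payoff):
    """True if 'd' pays me strictly more than 'c' against every opponent move."""
    return all(payoff[("d", other)] > payoff[("c", other)] for other in "cd")


def betray_hurts_other(payoff):
    """True if my switch from 'c' to 'd' lowers the opponent's payoff.

    The opponent's payoff is payoff[(other_move, my_move)].
    """
    return all(payoff[(other, "d")] < payoff[(other, "c")] for other in "cd")


print(is_prisoners_dilemma(T, R, P, S))  # True
print(betray_dominates(PAYOFF))          # True
print(betray_hurts_other(PAYOFF))        # True
```

The first check is the definition of a PD matrix, the second is the "I always gain" half of the proof, and the third is the "my decision damages the other" half.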

1. Prisoners' Dilemma (PD): the dilemma
Both (very clever) prisoners know that they achieve the maximum joint payoff if both choose "cooperate" (= "peace"). But they cannot reach it, precisely because they are clever and both know the previous proof: each expects the other to betray.

2. Iterated PD (IPD): the iterated prisoner's dilemma problem
Repeat the PD problem without announcing the number of repetitions. (If the number is known, "war" in every round is the best solution.)
Players remember the results of the earlier rounds of the current IPD game, so a player can punish the opponent's "war" by choosing "war" in later rounds.
Goal of the game: maximize the accumulated payoff. Winning or losing the current IPD game does not count!
Greedy players (who prefer war) tend to win individual games, but they tend to accumulate less payoff.

2. Iterated PD (IPD): contest
The contest is called the "peace-war game" or "IPD tournament" and has been held once a year since 1975.
Game objective: maximize the payoff while playing IPD against many players.
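A small round-robin shows why the accumulated payoff, not the number of games won, is the objective. This is only a sketch under assumptions: the names play_ipd, round_robin, always_war and punisher are hypothetical, the field of three players is invented for illustration, and the round count is passed to the game loop only (the strategies never see it, matching the unannounced repetition number).

```python
from itertools import combinations

# Peace-war table: PAYOFF[(move_a, move_b)] -> (payoff_a, payoff_b).
PAYOFF = {("p", "p"): (3, 3), ("p", "w"): (0, 5),
          ("w", "p"): (5, 0), ("w", "w"): (1, 1)}


def always_war(opponent_moves):
    """Greedy player: chooses war in every round."""
    return "w"


def punisher(opponent_moves):
    """Starts with peace, then answers the opponent's last move in kind."""
    return opponent_moves[-1] if opponent_moves else "p"


def play_ipd(strategy_a, strategy_b, rounds):
    """Play one IPD game; each strategy sees only the opponent's past moves."""
    moves_a, moves_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        a = strategy_a(moves_b)
        b = strategy_b(moves_a)
        pay_a, pay_b = PAYOFF[(a, b)]
        score_a += pay_a
        score_b += pay_b
        moves_a.append(a)
        moves_b.append(b)
    return score_a, score_b


def round_robin(players, rounds):
    """Every player meets every other player once; payoffs are accumulated."""
    totals = {name: 0 for name in players}
    for name_a, name_b in combinations(players, 2):
        score_a, score_b = play_ipd(players[name_a], players[name_b], rounds)
        totals[name_a] += score_a
        totals[name_b] += score_b
    return totals


players = {"greedy": always_war, "punisher1": punisher, "punisher2": punisher}
print(round_robin(players, rounds=5))
# {'greedy': 18, 'punisher1': 19, 'punisher2': 19}
```

In this toy field the greedy player wins both of its games 9 to 4, yet finishes with the lowest accumulated payoff, because the two punishers earn 15 each from their mutual peace.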

2. Iterated PD (IPD): properties of well-known good IPD strategies
Nice (also called "optimistic")
  def: Do not "defect" before the opponent does.
Retaliating
  def: Do not practice "blind optimism" (always-nice).
  why? A "nasty" (not nice) strategy ruthlessly exploits an always-nice player.
Forgiving
  def: Do not practice "infinite retaliation".
  why? To cut short long runs of revenge and counter-revenge, and so to maximize payoff.
Non-envious
  def: Do not strive to win the game (to score more than the opponent).
  Nice players are always non-envious.

3. Famous IPD solutions: (very simple, original) tit-for-tat
Rule
  1st decision: 'p'
  n-th decision: the opponent's (n-1)-th decision (n = 2, 3, 4, ...)
Properties: nice, retaliating, non-envious
  Non-forgiving: if both players use this strategy, neither ever forgives the other.

Examples, using the peace-war table (p,p) = (3,3), (p,w) = (0,5), (w,p) = (5,0), (w,w) = (1,1):

Two original tit-for-tat players:

round           1  2  3  4  5  score
tit-for-tat1    p  p  p  p  p  15
tit-for-tat2    p  p  p  p  p  15

Pessimist vs. (original) tit-for-tat:

round           1  2  3  4  5  score
pessimist       w  w  w  w  w  9
tit-for-tat     p  w  w  w  w  4
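The original tit-for-tat rule fits in one line of code. The sketch below replays the two five-round example matches; the function names and the play helper are hypothetical, and the pessimist is assumed to choose war in every round, which is consistent with its score of 9.

```python
# Peace-war table: PAYOFF[(move_a, move_b)] -> (payoff_a, payoff_b).
PAYOFF = {("p", "p"): (3, 3), ("p", "w"): (0, 5),
          ("w", "p"): (5, 0), ("w", "w"): (1, 1)}


def tit_for_tat(opponent_moves):
    """1st decision 'p'; n-th decision is the opponent's (n-1)-th decision."""
    return "p" if not opponent_moves else opponent_moves[-1]


def pessimist(opponent_moves):
    """Assumed behaviour: war in every round."""
    return "w"


def play(strategy_a, strategy_b, rounds=5):
    """Return both move sequences and both accumulated scores."""
    moves_a, moves_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        a = strategy_a(moves_b)
        b = strategy_b(moves_a)
        moves_a.append(a)
        moves_b.append(b)
        pay_a, pay_b = PAYOFF[(a, b)]
        score_a += pay_a
        score_b += pay_b
    return "".join(moves_a), "".join(moves_b), score_a, score_b


print(play(tit_for_tat, tit_for_tat))  # ('ppppp', 'ppppp', 15, 15)
print(play(pessimist, tit_for_tat))    # ('wwwww', 'pwwww', 9, 4)
```

Tit-for-tat loses the match against the pessimist, but only by the single point of damage taken in round 1; after that it retaliates in every round.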

3. Famous IPD solutions (continued): death spiral example
Players A and B both use original tit-for-tat, but B plays "war" in the first round. Payoffs follow the peace-war table (p,p) = (3,3), (p,w) = (0,5), (w,p) = (5,0), (w,w) = (1,1).

round   1  2  3  4  5  score
A       p  w  p  w  p  10
B       w  p  w  p  w  15
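The death spiral can be reproduced by letting one tit-for-tat player open with war. This is a sketch under assumptions: tit_for_tat_from and play are hypothetical helper names, and the scores follow from the peace-war table with A's payoff listed first.

```python
# Peace-war table: PAYOFF[(move_a, move_b)] -> (payoff_a, payoff_b).
PAYOFF = {("p", "p"): (3, 3), ("p", "w"): (0, 5),
          ("w", "p"): (5, 0), ("w", "w"): (1, 1)}


def tit_for_tat_from(first_move):
    """Build an original tit-for-tat player with a chosen opening move."""
    def strategy(opponent_moves):
        return first_move if not opponent_moves else opponent_moves[-1]
    return strategy


def play(strategy_a, strategy_b, rounds=5):
    """Return both move sequences and both accumulated scores."""
    moves_a, moves_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        a = strategy_a(moves_b)
        b = strategy_b(moves_a)
        moves_a.append(a)
        moves_b.append(b)
        pay_a, pay_b = PAYOFF[(a, b)]
        score_a += pay_a
        score_b += pay_b
    return "".join(moves_a), "".join(moves_b), score_a, score_b


# A opens with peace, B opens with war; both then copy the opponent.
moves_a, moves_b, score_a, score_b = play(tit_for_tat_from("p"),
                                          tit_for_tat_from("w"))
print(moves_a, moves_b, score_a, score_b)  # pwpwp wpwpw 10 15
```

The single opening "war" echoes back and forth forever. Over the five rounds the two players collect 25 points between them, compared with 30 for uninterrupted mutual peace.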

3. Famous IPD solutions: tit-for-tat with forgiveness
This is what is generally called simply the "tit-for-tat" rule.
  Unless provoked, the agent always cooperates: nice
  If provoked, the agent retaliates: retaliating
  The agent is quick to forgive: forgiving
  The agent must have a good chance of competing against the opponent more than once. (?)
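The slide does not say how forgiveness is decided. One possible reading, shown here purely as an assumption, is that the agent answers a provocation with peace with some probability. The parameter forgive_prob, the seed argument, and all function names are hypothetical.

```python
import random

# Peace-war table: PAYOFF[(move_a, move_b)] -> (payoff_a, payoff_b).
PAYOFF = {("p", "p"): (3, 3), ("p", "w"): (0, 5),
          ("w", "p"): (5, 0), ("w", "w"): (1, 1)}


def make_forgiving_tit_for_tat(forgive_prob, seed=0):
    """Assumed mechanism: forgive a provocation with probability forgive_prob."""
    rng = random.Random(seed)  # private generator, so runs are repeatable

    def strategy(opponent_moves):
        if not opponent_moves or opponent_moves[-1] == "p":
            return "p"                                   # nice
        return "p" if rng.random() < forgive_prob else "w"  # forgive or retaliate

    return strategy


def tit_for_tat_opening_with_war(opponent_moves):
    """Original tit-for-tat, except that the first move is war."""
    return "w" if not opponent_moves else opponent_moves[-1]


def play(strategy_a, strategy_b, rounds=5):
    """Return both move sequences and both accumulated scores."""
    moves_a, moves_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        a = strategy_a(moves_b)
        b = strategy_b(moves_a)
        moves_a.append(a)
        moves_b.append(b)
        pay_a, pay_b = PAYOFF[(a, b)]
        score_a += pay_a
        score_b += pay_b
    return "".join(moves_a), "".join(moves_b), score_a, score_b


# forgive_prob = 0.0 behaves like original tit-for-tat: the spiral continues.
print(play(make_forgiving_tit_for_tat(0.0), tit_for_tat_opening_with_war))
# forgive_prob = 1.0 always forgives: peace is restored from round 2 on.
print(play(make_forgiving_tit_for_tat(1.0), tit_for_tat_opening_with_war))
```

The two extreme settings bracket the behaviour: with no forgiveness the revenge run never ends, while full forgiveness ends it at once but gives up retaliation altogether. Values in between trade the two off.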

3. Famous IPD solutions: tit-for-two-tats
Rule (only the part that differs from tit-for-tat is defined): the agent retaliates only if provoked twice in a row.
Property: nicer than tit-for-tat.
Usage: a variant of tit-for-two-tats is used in BitTorrent, which calls it "optimistic unchoking".
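A minimal sketch of the tit-for-two-tats rule follows, again with hypothetical function names and the peace-war payoffs. It models only the game-theoretic rule, not BitTorrent's actual choking algorithm.

```python
# Peace-war table: PAYOFF[(move_a, move_b)] -> (payoff_a, payoff_b).
PAYOFF = {("p", "p"): (3, 3), ("p", "w"): (0, 5),
          ("w", "p"): (5, 0), ("w", "w"): (1, 1)}


def tit_for_two_tats(opponent_moves):
    """Retaliate only after two consecutive 'w' moves by the opponent."""
    return "w" if opponent_moves[-2:] == ["w", "w"] else "p"


def tit_for_tat_opening_with_war(opponent_moves):
    """Original tit-for-tat, except that the first move is war."""
    return "w" if not opponent_moves else opponent_moves[-1]


def always_war(opponent_moves):
    """A nasty opponent that never cooperates."""
    return "w"


def play(strategy_a, strategy_b, rounds=5):
    """Return both move sequences and both accumulated scores."""
    moves_a, moves_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        a = strategy_a(moves_b)
        b = strategy_b(moves_a)
        moves_a.append(a)
        moves_b.append(b)
        pay_a, pay_b = PAYOFF[(a, b)]
        score_a += pay_a
        score_b += pay_b
    return "".join(moves_a), "".join(moves_b), score_a, score_b


# A single provocation is ignored, so no death spiral develops.
print(play(tit_for_two_tats, tit_for_tat_opening_with_war))
# ('ppppp', 'wpppp', 12, 17)

# Against a nasty opponent, retaliation starts one round later than tit-for-tat.
print(play(tit_for_two_tats, always_war))
# ('ppwww', 'wwwww', 3, 13)
```

The extra niceness has a price: against the always-war player it concedes two rounds of exploitation instead of one.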