Multi-Agent Strategic Modeling in a Robotic Soccer Domain
Andraz Bezek, Matjaz Gams, Department of Intelligent Systems, Jozef Stefan Institute
Ivan Bratko, Faculty of Computer and Information Science, University of Ljubljana

Talk Outline
- Overview of the Problem
- Multi-Agent Strategy Discovering Algorithm
- Results on the RoboCup Domain
- Results on the 3vs2 Keepaway Domain* (*not in the paper; latest results)

Schema of the Multi-Agent Strategy Discovering Algorithm (MASDA)
Input: multi-agent action sequence (e.g., a RoboCup game)
Input: basic domain knowledge (e.g., basic soccer and RoboCup domain knowledge)
Output: strategic concepts (e.g., describing a specific RoboCup game)

An example MAS problem: a RoboCup attack

Goal: a human-readable description of a strategic action concept, for example: the left forward dribbles from the left half of the middle third into the penalty box, the left forward makes a pass into the penalty box, and the center forward, in the center of the penalty box, successfully shoots into the right part of the goal.

Multi-Agent Strategy Discovering Algorithm (MASDA): increasing abstraction
Numeric data (~ ) → Symbolic data (~ ) → Action graph (~6,500) → Abstract action graph (~1,000) → Strategic action descriptions (~100) → Strategic concepts (~10)
Steps I.1–I.3 (data preprocessing) produce the symbolic data, steps II.1–II.3 (graphical description) the action graphs, and steps III.1–III.3 (symbolic description learning) the strategic concepts.

Step I. Data preprocessing: I.1. Detection of actions in raw data
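The slides do not spell out how actions are detected from raw data; as a minimal sketch, a threshold-based detector over consecutive positional snapshots might look like the following (all field names and thresholds are hypothetical, not MASDA's actual detector):

```python
import math

def detect_action(prev, curr, kick_radius=1.0):
    """Classify one time step of raw positional data as a low-level action.

    prev/curr are hypothetical snapshots: {'pos': (x, y), 'dir': degrees,
    'ball': (x, y)}. Thresholds are illustrative only.
    """
    moved = math.dist(prev["pos"], curr["pos"]) > 0.1
    turned = abs(curr["dir"] - prev["dir"]) > 5.0
    near_ball = math.dist(curr["pos"], curr["ball"]) < kick_radius
    ball_moved = math.dist(prev["ball"], curr["ball"]) > 0.5
    if near_ball and ball_moved:
        return "kick"   # agent was in kicking range and the ball accelerated
    if moved:
        return "dash"
    if turned:
        return "turn"
    return "idle"
```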

Step I. Data preprocessing: I.2. Action sequence generation

t | agent1 | agent2
--+--------+-------
0 | dash   | turn
1 | turn   | dash
2 | turn   | dash
3 | dash   | kick
4 | turn   | dash
5 | dash   | turn
…

Step I. Data preprocessing: I.3. Introduction of domain knowledge

t   | Left midfielder | Center midfielder
----+-----------------+------------------
0–2 | creating space  | dribble
3–4 | attack support  | pass to player
5   | dribble         | creating space
…
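For illustration, this domain-knowledge step might be expressed as rules over short windows of low-level actions; the conditions below are assumed examples in the spirit of the slide, not the actual MASDA knowledge base:

```python
def lift_to_soccer_action(window, has_ball, teammate_receives):
    """Map a window of low-level commands to a high-level soccer action.

    window: list of low-level actions, e.g. ['dash', 'dash', 'kick'].
    Hypothetical stand-ins for the domain knowledge that turns the
    dash/turn/kick rows of step I.2 into the dribble / pass to player /
    creating space entries of step I.3.
    """
    if has_ball and "kick" in window and teammate_receives:
        return "pass to player"
    if has_ball and window.count("dash") >= 2:
        return "dribble"
    if not has_ball:
        return "creating space"
    return "hold"
```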

Step II: Graphical description: II.1. Action graph creation
The action table from step I.3 becomes a graph: nodes are role-action pairs (L-MF: creating space, C-MF: dribble, L-MF: attack support, C-MF: pass to player, C-MF: creating space, L-MF: dribble), and edges link temporally consecutive actions.
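A minimal sketch of such a graph builder, assuming per-game sequences of (role, action) pairs and counting edge frequencies (the names and data structure are illustrative, not MASDA's internals):

```python
from collections import defaultdict

def build_action_graph(sequences):
    """Build a weighted action graph from per-game action sequences.

    Each sequence is a time-ordered list of (role, action) pairs; an
    edge weight counts how often one node directly followed another.
    """
    edges = defaultdict(int)
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):   # consecutive node pairs
            edges[(a, b)] += 1
    return edges

# Example using the left-midfielder chain from step I.3:
graph = build_action_graph([[("L-MF", "creating space"),
                             ("L-MF", "attack support"),
                             ("L-MF", "dribble")]])
```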


Step II: Graphical description: II.2. Abstraction process

Step II: Graphical description: II.3. Strategy selection

Step III: Symbolic description learning: III.1. Generation of action descriptions
LTeam.R-FW: Long dribble
LTeam.R-FW: Pass to space
LTeam.C-MF: Successful shoot
LTeam.MF: Pass to player

Step III: Symbolic description learning: III.2. Generation of learning examples

class                       | feature1 | feature2 | ...
----------------------------+----------+----------+--------
LTeam.MF:Pass_to_player     | T        | F        | ... F T
LTeam.R-FW:Long_dribble     | T        | F        | ... F T
LTeam.R-FW:Pass_to_space    | T        | F        | ... F T
LTeam.C-MF:Successful_shoot | T        | F        | ... F T

Step III: Symbolic description learning: III.3. Rule induction
- Each edge in a strategy represents one class.
- 2-class learning problem:
  - positive examples: action instances for the given edge
  - negative examples: all other action instances
- Induce rules for the positive class (i.e., the edge); see the sketch below.
- Repeat for all edges in the strategy.
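A minimal sketch of this per-edge setup, assuming boolean feature vectors from step III.2 and using scikit-learn's decision tree as a stand-in for the paper's rule inducer (the paper's actual induction method may differ):

```python
from sklearn.tree import DecisionTreeClassifier, export_text

def induce_edge_rules(examples, labels, edge):
    """2-class rule induction for one strategy edge.

    examples: boolean feature matrix from step III.2; labels: the edge
    each action instance belongs to. Instances of `edge` are positive,
    all others negative. The tree's paths read as if-then rules.
    """
    y = [1 if lbl == edge else 0 for lbl in labels]
    tree = DecisionTreeClassifier(max_depth=4).fit(examples, y)
    return export_text(tree)   # human-readable rules for inspection

def induce_strategy_rules(examples, labels):
    """Repeat the induction for every edge (class) in the strategy."""
    return {edge: induce_edge_rules(examples, labels, edge)
            for edge in set(labels)}
```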

Testing on the RoboCup Simulated League Domain
Input:
- 10 RoboCup games: a fixed team vs. various opponent teams
- Basic soccer knowledge (no knowledge about strategy, tactics, or the rules of the game):
  - soccer roles (e.g., left forward)
  - soccer actions (e.g., control dribble)
  - relations between players (e.g., behind)
  - playing-field areas (e.g., penalty box)
Output:
- strategic concepts (shown on the next slide)

RoboCup Domain: an example strategic concept
LTeam.FW: Pass to player: RTeam.R-FB:Immediate
LTeam.FW: Long dribble: RTeam.C-MF:Moving-away-slow ∧ RTeam.L-FB:Still ∧ RTeam.R-FB:Short-distance
LTeam.FW: Successful shoot: RTeam.C-FW:Moving-away ∧ LTeam.R-FW:Short-distance
LTeam.FW: Successful shoot (end): RTeam.RC-FB:Left ∧ RTeam.RC-FB:Moving-away-fast ∧ RTeam.R-FB:Long-distance

RoboCup Domain: testing methodology
- Create a reference strategic concept from all 10 RoboCup games.
- Use leave-one-out cross-validation to generate 10 learning tasks (learn: 9 games, test: 1 game):
  - positive examples: examples matching the reference strategic concept
  - negative examples: all other examples
- Generate strategic concepts on the 9 learning games and test on the remaining game.
- Measure accuracy, recall, and precision for a given strategy using:
  - only the action description
  - only the generated rules
  - both
- Vary the level of abstraction from 1 to 20.
A sketch of this evaluation loop follows below.
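A hedged sketch of the leave-one-game-out protocol; the function names, data shapes, and the learn/classify interface are hypothetical placeholders for MASDA's concept generation and matching:

```python
def loo_evaluation(games, learn, classify):
    """Leave-one-game-out evaluation, one fold per RoboCup game.

    games: list of (examples, labels) pairs, one per game, where a label
    marks whether an example matches the reference strategic concept.
    Returns (accuracy, precision, recall) per fold.
    """
    scores = []
    for i, (test_x, test_y) in enumerate(games):
        model = learn([g for j, g in enumerate(games) if j != i])
        pred = [classify(model, x) for x in test_x]
        tp = sum(1 for p, t in zip(pred, test_y) if p and t)
        acc = sum(1 for p, t in zip(pred, test_y) if p == t) / len(test_y)
        precision = tp / max(sum(pred), 1)
        recall = tp / max(sum(test_y), 1)
        scores.append((acc, precision, recall))
    return scores
```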

RoboCup Domain: analysis of 10 RoboCup games

3vs2 Keepaway Domain (Peter Stone et al.)
Motivation:
- RoboCup is too complex to play with learned concepts.
- In the 3vs2 Keepaway domain we are able to play with learned concepts.
Basic domain info: 5 agents, 3 high-level agent actions, 13 state variables.

3vs2 Keepaway Domain
- Measure average episode duration.
- Two handcoded reference strategies:
  - good strategy: hand (14 s): hold the ball until the nearest opponent is within 5 m, then pass to the most open player
  - random: rand (5.2 s): randomly choose among the possible actions
- Our task: learn rules for the reference strategies and play as similarly as possible.
- MASDA remains identical; only the domain knowledge is modified:
  - roles (K1, K2, K3, T1, T2)
  - actions (hold, passK2, passK3)
  - 13 domain attributes
A sketch of the two reference policies follows below.
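The two reference strategies are simple enough to sketch directly; the openness score and the function signatures below are assumptions, not the original keeper code:

```python
import random

def hand_policy(dist_nearest_taker, openness):
    """The 'hand' strategy as stated on the slide: hold until the
    nearest opponent is within 5 m, then pass to the most open player.

    openness is an assumed map of pass actions to pass-lane openness,
    e.g. {'passK2': 0.8, 'passK3': 0.3}.
    """
    if dist_nearest_taker > 5.0:
        return "hold"
    return max(openness, key=openness.get)   # most open receiver

def rand_policy():
    """The 'rand' baseline: choose uniformly among possible actions."""
    return random.choice(["hold", "passK2", "passK3"])
```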

Testing Methodology
Reference game with a known strategy → MASDA (rule induction) → learned rules are handcoded into the program → game with the learned strategy.
Compute the average episode duration of both games and compare.

Episode duration comparison of reference and learned game

Visual comparison of reference and learned games
- reference game: random (rand.avi)
- reference game: handcoded (hand.avi)
- learned random (rand-pass4.avi)
- learned handcoded (hand-holdpass2.avi)

Comparison of handcoded strategy and learned rules

Learned rules:
DistK1T1 ∈ [6, 16) ∧ DistK1T2 ∈ [6, 16) ∧ DistK1C ∈ [6, 12) ∧ MinAngK3K1T1T2 ∈ [0, 90) => hold
DistK1T1 ∈ [6, 12) ∧ DistK1T2 ∈ [6, 16) ∧ DistK1K3 ∈ [10, 14) ∧ DistK1K2 ∈ [8, 14) => hold
MinDistK2T1T2 ∈ [12, 16) ∧ DistK3C ∈ [8, 16) ∧ DistK1T2 ∈ [2, 10) ∧ DistK1T1 ∈ [0, 6) ∧ MinAngK2K1T1T2 ∈ [15, 135) => pass to K2
DistK1T1 ∈ [2, 6) ∧ MinDistK3T1T2 ∈ [10, 16) ∧ DistK1K2 ∈ [10, 16) ∧ DistK2C ∈ [4, 14) ∧ DistK1T2 ∈ [2, 8) ∧ MinAngK2K1T1T2 ∈ [0, 15) => pass to K3

Handcoded strategy:
if dist(K1, T1) > 5 m => hold
if dist(K1, T1) <= 5 m and player K2 is not free => pass to K3
if dist(K1, T1) <= 5 m and player K2 is free => pass to K2

A transcription of the learned rules as executable interval tests follows below.
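For rule inspection, the learned rules above can be transcribed one-to-one into interval tests; only the default action when no rule fires is an added assumption:

```python
def in_range(x, lo, hi):
    # the slide's intervals are half-open: [lo, hi)
    return lo <= x < hi

def learned_policy(f):
    """The learned Keepaway rules as executable interval tests.

    f maps the 13 state-variable names (DistK1T1, DistK1C, ...) to
    values. The final fallback to 'hold' is an assumption; the slide
    does not specify a default action.
    """
    if (in_range(f["DistK1T1"], 6, 16) and in_range(f["DistK1T2"], 6, 16)
            and in_range(f["DistK1C"], 6, 12)
            and in_range(f["MinAngK3K1T1T2"], 0, 90)):
        return "hold"
    if (in_range(f["DistK1T1"], 6, 12) and in_range(f["DistK1T2"], 6, 16)
            and in_range(f["DistK1K3"], 10, 14)
            and in_range(f["DistK1K2"], 8, 14)):
        return "hold"
    if (in_range(f["MinDistK2T1T2"], 12, 16) and in_range(f["DistK3C"], 8, 16)
            and in_range(f["DistK1T2"], 2, 10) and in_range(f["DistK1T1"], 0, 6)
            and in_range(f["MinAngK2K1T1T2"], 15, 135)):
        return "passK2"
    if (in_range(f["DistK1T1"], 2, 6) and in_range(f["MinDistK3T1T2"], 10, 16)
            and in_range(f["DistK1K2"], 10, 16) and in_range(f["DistK2C"], 4, 14)
            and in_range(f["DistK1T2"], 2, 8)
            and in_range(f["MinAngK2K1T1T2"], 0, 15)):
        return "passK3"
    return "hold"   # assumed fallback
```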

Conclusion
- We have designed a domain-independent strategy-learning algorithm (MASDA), which learns from an action trace and basic domain knowledge.
- Successfully applied to:
  - the RoboCup domain, evaluated by a human expert and by cross-validation
  - the 3vs2 Keepaway domain, evaluated against two reference strategies through episode duration, visual comparison, and rule inspection

Questions

RoboCup Domain: successful attack strategies
- R-FW: pass to player → FW: control dribble → FW: shoot
- R-FW: dribble → R-FW: pass to player → FW: shoot
- FW: pass to player → L-FW: control dribble → L-FW: shoot
- L-FW: long dribble → L-FW: pass → FW: shoot
- L-FW: pass to player → FW: dribble → FW: shoot
- C-FW: long dribble → C-FW: pass → FW: dribble → FW: shoot