UT^2: Human-like Behavior via Neuroevolution of Combat Behavior and Replay of Human Traces Jacob Schrum Igor V. Karpov

UT^2: Human-like Behavior via Neuroevolution of Combat Behavior and Replay of Human Traces Jacob Schrum Igor V. Karpov Risto Miikkulainen

Our Approach: UT^2
Human traces to get unstuck and navigate
–Filter data to get general-purpose traces
Evolve skilled combat behavior
–Restrictions/filters maintain humanness
Observe and judge like a human
–Necessary to account for the judging game

Bot Architecture

Human Trace Replay

Record and Index Human Games
Synthetic pose data
Indexed by nearest navpoint
Replay nearest trace when needed
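
A minimal sketch of how traces might be indexed by nearest navpoint and looked up for replay. The navpoint ids, the 2D coordinates, and the choice to key each trace by its first pose are assumptions made for illustration, not details from the slides.

```python
import math
from collections import defaultdict

# Hypothetical navpoints: id -> (x, y) position in the level.
NAVPOINTS = {"nav_a": (0.0, 0.0), "nav_b": (10.0, 0.0), "nav_c": (0.0, 10.0)}

def nearest_navpoint(pos, navpoints=NAVPOINTS):
    """Return the id of the navpoint closest to pos."""
    return min(navpoints, key=lambda n: math.dist(pos, navpoints[n]))

def build_index(traces, navpoints=NAVPOINTS):
    """Index each recorded trace by the navpoint nearest its first pose (assumption)."""
    index = defaultdict(list)
    for trace in traces:
        index[nearest_navpoint(trace[0], navpoints)].append(trace)
    return index

def trace_for(pos, index, navpoints=NAVPOINTS):
    """Return the stored trace that starts closest to pos, or None if none is indexed."""
    candidates = index.get(nearest_navpoint(pos, navpoints), [])
    if not candidates:
        return None
    return min(candidates, key=lambda t: math.dist(pos, t[0]))

# Two toy traces, each a list of recorded (x, y) poses.
traces = [[(1.0, 1.0), (2.0, 2.0)], [(9.0, 1.0), (8.0, 3.0)]]
index = build_index(traces)
```

Returning None when no trace is indexed near the bot leaves room for another module to take over.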

Unstuck Controller
Mix scripted responses and human traces
–Previous UT^2 used only human traces
Human traces also used after repeated failures
Stuck Condition       Response
Still                 Move Forward
Collide With Wall     Move Away
Frequent Collisions   Dodge Away
Under Elevator        Goto Nearest Item or Dodge
Bump Agent            Move Away
Same Navpoint         Human Traces
Off Navpoint Grid     Human Traces
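
The condition/response table can be sketched as a lookup with a fallback to human traces after repeated failures. The string labels and the failure threshold are hypothetical; the slides do not say how many failures trigger the fallback.

```python
# Stuck condition -> response, transcribed from the slide's table.
SCRIPTED = {
    "still": "move_forward",
    "collide_with_wall": "move_away",
    "frequent_collisions": "dodge_away",
    "under_elevator": "goto_nearest_item_or_dodge",
    "bump_agent": "move_away",
    "same_navpoint": "human_traces",
    "off_navpoint_grid": "human_traces",
}

MAX_FAILURES = 3  # hypothetical threshold, not given in the slides

def unstuck_response(condition, failures=0):
    """Pick a response; fall back to human traces after repeated failures."""
    if failures >= MAX_FAILURES:
        return "human_traces"
    return SCRIPTED[condition]
```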

Explorative Retrace
Explore the level like a human
Collisions allowed when using RETRACE
–Humans often bump walls with no problem
If RETRACE fails
–No trace available, or trace gets bot stuck
–Fall through to PATH module (Nav graph)
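
The fall-through from RETRACE to PATH can be sketched as a priority list of modules, where the first module that returns an action wins. The state dictionary and its keys (`trace`, `stuck`, `goal`) are hypothetical stand-ins for whatever the bot actually tracks.

```python
def retrace(state):
    """RETRACE: follow a human trace, or decline if none exists or it got the bot stuck."""
    trace = state.get("trace")
    if trace is None or state.get("stuck", False):
        return None
    return ("follow_trace", trace)

def path(state):
    """PATH: always available, follows the navigation graph toward a goal."""
    return ("follow_nav_graph", state["goal"])

def navigate(state, modules=(retrace, path)):
    """Try modules in priority order; the first non-None action is used."""
    for module in modules:
        action = module(state)
        if action is not None:
            return action
    raise RuntimeError("no module produced an action")
```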

Evolved Battle Controller

Battle Controller Outputs
6 movement outputs
–Advance
–Retreat
–Strafe left
–Strafe right
–Move to nearest item
–Stand still
Additional output
–Jump?
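
One plausible way to turn the seven network outputs into an action, shown as a sketch. It assumes the movement with the highest activation is chosen and that the jump output is thresholded; the slides do not state the actual decoding rule or threshold.

```python
MOVES = [
    "advance",
    "retreat",
    "strafe_left",
    "strafe_right",
    "move_to_nearest_item",
    "stand_still",
]

def decode_outputs(outputs, jump_threshold=0.5):
    """Map 6 movement activations plus 1 jump activation to (move, jump)."""
    if len(outputs) != len(MOVES) + 1:
        raise ValueError("expected 6 movement outputs and 1 jump output")
    # Assumption: winner-take-all over the movement outputs.
    best = max(range(len(MOVES)), key=lambda i: outputs[i])
    # Assumption: jump fires when its activation exceeds a fixed threshold.
    return MOVES[best], outputs[-1] > jump_threshold
```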

Battle Controller Inputs
Pie slice sensors for enemies
Ray traces for walls/level geometry
Other misc. sensors for current weapon properties, nearby item properties, etc.

Battle Controller Inputs
Opponent movement sensors
–Opponent performing movement action X?
–Opponents modeled as moving like bot
–Approximation used

Constructive Neuroevolution
Genetic Algorithms + Neural Networks
Build structure incrementally (complexification)
Good at generating control policies
Three basic mutations (no crossover used)
–Perturb Weight
–Add Connection
–Add Node
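
The three mutations can be sketched on a toy genome. The genome layout (a node count plus a list of `(src, dst, weight)` connections) and the NEAT-style node split are assumptions for illustration; the slides name the mutations but do not define them.

```python
import random

def perturb_weight(genome, rng, scale=0.5):
    """Add Gaussian noise to one randomly chosen connection weight."""
    conns = list(genome["conns"])
    i = rng.randrange(len(conns))
    src, dst, w = conns[i]
    conns[i] = (src, dst, w + rng.gauss(0.0, scale))
    return {"nodes": genome["nodes"], "conns": conns}

def add_connection(genome, rng):
    """Add a connection between two random nodes (may duplicate or self-loop)."""
    src = rng.randrange(genome["nodes"])
    dst = rng.randrange(genome["nodes"])
    conns = genome["conns"] + [(src, dst, rng.gauss(0.0, 1.0))]
    return {"nodes": genome["nodes"], "conns": conns}

def add_node(genome, rng):
    """Split a random connection src->dst into src->new->dst (assumed scheme)."""
    conns = list(genome["conns"])
    src, dst, w = conns.pop(rng.randrange(len(conns)))
    new = genome["nodes"]  # id of the freshly created node
    conns += [(src, new, 1.0), (new, dst, w)]
    return {"nodes": genome["nodes"] + 1, "conns": conns}

def mutate(genome, rng):
    """Apply one of the three mutations; no crossover is involved."""
    return rng.choice([perturb_weight, add_connection, add_node])(genome, rng)

rng = random.Random(0)
parent = {"nodes": 3, "conns": [(0, 2, 0.5), (1, 2, -0.3)]}
child = add_node(parent, rng)
```

Each mutation returns a new genome and leaves the parent untouched, so parents and children can sit in the same population.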

Evolving Battle Controller
Used NSGA-II* with 3 objectives
–Damage dealt
–Damage received (negative)
–Geometry collisions (negative)
Evolved in DM-1on1-Albatross
–Small level to encourage combat
–One native bot opponent
High score favored in selection of final network
Final combat behavior highly constrained
*K. Deb et al. A Fast and Elitist Multiobjective Genetic Algorithm: NSGA-II. Evol. Comp. 2002

Action Filtering
Network choice not always used
–Forced to stand still sometimes
  Sniping, not threatened, high ground
–Prevented from jumping while still
–Prevented from jumping near walls/opponents
–Prevented from going to unwanted items
–Prevented from strafing/retreating into walls
–Etc…
Forced lower accuracy
Forced delays to simulate human response time
Evolution constrained to look human
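
A sketch of how such filters might wrap the network's choice. The state keys are hypothetical, and replacing a blocked movement with standing still is a placeholder assumption; the slides list what is prevented but not what the bot does instead.

```python
def filter_action(move, jump, state):
    """Override the network's (move, jump) choice where it would look bot-like."""
    # Prevented from strafing/retreating into walls.
    if move in ("strafe_left", "strafe_right", "retreat") and state.get("wall_in_direction", False):
        move = "stand_still"  # placeholder replacement (assumption)
    # Prevented from going to unwanted items.
    if move == "move_to_nearest_item" and not state.get("item_wanted", True):
        move = "stand_still"  # placeholder replacement (assumption)
    # Forced to stand still sometimes (e.g. while sniping); flag is hypothetical.
    if state.get("force_still", False):
        move = "stand_still"
    # Prevented from jumping while still or near walls/opponents.
    if move == "stand_still" or state.get("near_wall", False) or state.get("near_opponent", False):
        jump = False
    return move, jump
```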

Importance of Observing
Humans don’t just want max score
Human goal is to judge correctly
–Requires observation w/o fighting
Observe module
–Bot hasn’t judged opponent
–Avoids crowds
Judging module
–Lengthy observation leads to judging

Observation Behavior
[Diagram labels: Approach, Still, Retreat, Use Battle Controller]

Human Subject Evaluation
BotPrize tests humanness without saying what is human-like vs. bot-like
Idea: BotPrize style experiment in which players are extensively interviewed
IRB Human Subject Study w/cash prizes
Performed at UT:
–6 human volunteers
–3 human interviewers
–4 versions of UT^2
–Native bots

Justify Judgments
Record each match and replay to human
Human explains rationale for judgments
Downsides:
–Humans forget
–Humans make things up
–Humans change their minds
Still, many common themes emerged:

Humans Aren’t Killing Machines
Accuracy affected by movement/distraction
Pause before responding to surprises
Humans don’t fire non-stop
–Waiting for opportune shot
–Saving ammo
Few weapon switches
Pause to observe

Humans Aren’t Stupid
Humans rapidly correct mistakes
–Get unstuck quickly
–Move/dodge when fired upon
–Don’t stare at walls
Humans know their limitations
–Prefer weapons requiring less accuracy
–Don’t fight with a weak weapon

Complex Human Movements
Do
–Chase opponents tenaciously
–Retreat while firing on opponent
–Move in and out from cover
Don’t
–Perform many rapid movements too quickly
–Turn around too quickly

Cognitive Issues
Theory of Mind
–Behavior transitions
  A chasing human expects to fight
  Humans expect to be chased (traps)
–Communication via judging
  Human knows that its action will be recognized as human-like by humans
–Emotion
  Revenge on humans more satisfying
  Fear of dangerous opponents

Conclusion
Human trace replay provides human style exploration and gets bot unstuck
Multiobjective neuroevolution provides combat behavior
Simulated observation makes bot seem more human-like
Future work: Incorporate Theory of Mind

Questions? Jacob Schrum Igor V. Karpov Risto Miikkulainen

Auxiliary Slides

Multiobjective Optimization
Game with two objectives:
–Damage Dealt
–Remaining Health
A dominates B iff A is strictly better in at least one objective and at least as good in all others
Population of points not dominated are best: Pareto Front
Weighted-sum provably incapable of capturing non-convex front
[Figure labels: "Dealt lot of damage, but lost lots of health"; "Tradeoff between objectives"; "High health but did not deal much damage"]
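
The dominance definition can be written directly as code. This sketch assumes every objective is maximized and uses made-up (damage dealt, remaining health) scores.

```python
def dominates(a, b):
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(points):
    """Keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical (damage dealt, remaining health) scores.
scores = [(90, 10), (60, 60), (10, 95), (50, 50), (5, 5)]
front = pareto_front(scores)
```

Here `(50, 50)` is dominated by `(60, 60)` and `(5, 5)` by every other point, so the front holds the three remaining trade-offs.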

NSGA-II
Evolution: natural approach for finding optimal population
Non-Dominated Sorting Genetic Algorithm II*
–Population P with size N; Evaluate P
–Use mutation to get P´ size N; Evaluate P´
–Calculate non-dominated fronts of {P ∪ P´} size 2N
–New population size N from highest fronts of {P ∪ P´}
*K. Deb et al. A Fast and Elitist Multiobjective Genetic Algorithm: NSGA-II. Evol. Comp. 2002
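
The four steps above can be sketched as one generation of selection. This is a simplified illustration, not the full algorithm: it assumes all objectives are maximized, and it fills the last partially accepted front in index order instead of using NSGA-II's crowding-distance tie-break. The `evaluate` and `mutate` callables are supplied by the caller.

```python
def dominates(a, b):
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def non_dominated_fronts(scores):
    """Peel off successive fronts of indices; front 0 is the Pareto front."""
    remaining = set(range(len(scores)))
    fronts = []
    while remaining:
        front = sorted(
            i for i in remaining
            if not any(dominates(scores[j], scores[i]) for j in remaining)
        )
        fronts.append(front)
        remaining -= set(front)
    return fronts

def next_generation(parents, evaluate, mutate, n):
    """Mutate P into P', then keep the n best of P plus P' by front rank."""
    children = [mutate(p) for p in parents]
    combined = parents + children
    scores = [evaluate(x) for x in combined]
    selected = []
    for front in non_dominated_fronts(scores):
        for i in front:
            # Simplification: index order stands in for crowding distance.
            if len(selected) < n:
                selected.append(combined[i])
    return selected
```

Because parents and children compete in the same pool of size 2N, a good parent survives unless enough children outrank it, which is the elitism in the algorithm's name.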