EVOLVING MULTI-MODAL BEHAVIOR IN NPCS
Riza Erdem, Jappie Klooster, Dirk Meulenbelt

PAPER
- Authors: Jacob Schrum, Risto Miikkulainen
- YouTube channel: https://www.youtube.com/channel/UCCKhH1p0tj1frvcD70tEyDg
- Times cited: 18

GOAL
- NPCs
- Multi-modal behavior
- Downsides
- So: the authors propose a method that encourages the development of this multi-modal behavior.

NEUROEVOLUTION
- Neuroevolution?
- Why neuroevolution?
- Improving the neuroevolution method

FIGHT OR FLIGHT GAME
- Player, NPC team, health points, and a bat
- Fight game
- Flight game
- Example footage: http://nn.cs.utexas.edu/?multimodal09
- What would be beneficial behavior (four objectives)?
  - Fight: maximize damage dealt
  - Fight: minimize damage received
  - Fight: maximize time alive
  - Flight: maximize damage dealt
- What's useful about this game?
- MOEAs (multi-objective evolutionary algorithms)

APPLICATION OF NEUROEVOLUTION
- Solutions consist of neural networks
- FS-NEAT (in this paper)
- Copy existing solutions
- Mutate the copies
- Let them compete
- Apply selection
- Rinse and repeat
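The copy, mutate, compete, select loop above can be sketched as follows. This is a minimal illustration under loud assumptions: genomes are plain weight lists rather than FS-NEAT networks, `toy_fitness` and `gaussian_mutate` are hypothetical stand-ins, and selection is simple single-objective truncation instead of the multi-objective selection the paper uses.

```python
import random

def evolve(population, evaluate, mutate, generations, rng):
    """Copy, mutate, compete, select, repeat (single-objective stand-in)."""
    for _ in range(generations):
        # Copy every parent and mutate the copy; parents stay untouched.
        children = [mutate(list(parent), rng) for parent in population]
        # Competition: parents and children are ranked together.
        pool = sorted(population + children, key=evaluate, reverse=True)
        # Selection: keep the best half, so the population size is constant.
        population = pool[:len(population)]
    return population

def gaussian_mutate(weights, rng):
    """Hypothetical mutation: nudge one randomly chosen weight."""
    i = rng.randrange(len(weights))
    weights[i] += rng.gauss(0.0, 0.1)
    return weights

def toy_fitness(weights):
    """Hypothetical fitness: higher is better, best at all-zero weights."""
    return -sum(w * w for w in weights)

rng = random.Random(0)
start = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(10)]
final = evolve(start, toy_fitness, gaussian_mutate, generations=20, rng=rng)
```

Because parents compete against their own children, the best score in the population can never get worse from one generation to the next.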

NEURAL NETWORKS
[Diagram: neural network with input nodes and output nodes]

FS-NEAT
[Diagram: network with input nodes and output nodes]

ADD NODE MUTATION
[Diagrams: two networks, each with input nodes and output nodes]

ADD LINK MUTATION
[Diagrams: two networks, each with input nodes and output nodes]

UNIQUE: MERGE MUTATION
[Diagrams: two networks, each with input nodes and output nodes]

UNIQUE: TWO TYPES OF OUTPUT NODES
[Diagram: neural network with input nodes and output nodes; the output nodes are split into policy nodes and a preference node]
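One way to picture how policy nodes and a preference node could work together is sketched below. The slide does not spell out the selection rule, so two things here are assumptions: each mode is laid out as its policy outputs followed by one preference output, and the mode with the highest preference value controls the NPC. The function name `select_mode` is hypothetical.

```python
def select_mode(outputs, policy_size=2):
    """Split raw outputs into modes and return (mode index, policy values).

    Assumed layout per mode: [policy_1, ..., policy_n, preference].
    Assumed rule: the mode with the highest preference output acts.
    """
    mode_size = policy_size + 1
    if not outputs or len(outputs) % mode_size != 0:
        raise ValueError("outputs must hold a whole number of modes")
    modes = [outputs[i:i + mode_size] for i in range(0, len(outputs), mode_size)]
    best = max(range(len(modes)), key=lambda m: modes[m][-1])
    return best, modes[best][:policy_size]

# Two modes: preferences are 0.3 and 0.7, so the second mode acts.
mode, policy = select_mode([0.1, 0.2, 0.3, 0.9, -0.5, 0.7])
```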

UNIQUE: ADD OUTPUT NODES MUTATION!
[Diagrams: two networks, each with input nodes and output nodes]
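A rough sketch of what a mutation that adds output nodes might look like. The `Genome` class, the `add_mode` name, and the way the new nodes are wired (one random incoming link each, from a random input) are all assumptions made for illustration; the slide only states that the mutation adds nodes to the output layer.

```python
import random
from dataclasses import dataclass, field

@dataclass
class Genome:
    """Hypothetical genome: links map (input index, output index) to a weight."""
    num_inputs: int
    policy_size: int = 2
    links: dict = field(default_factory=dict)
    num_outputs: int = 3  # one mode: two policy nodes plus one preference node

def add_mode(genome, rng):
    """Return a copy of the genome with one extra output mode appended."""
    mode_size = genome.policy_size + 1
    first_new = genome.num_outputs
    links = dict(genome.links)
    for out in range(first_new, first_new + mode_size):
        # Assumption: each new output node gets one random incoming link.
        src = rng.randrange(genome.num_inputs)
        links[(src, out)] = rng.uniform(-1.0, 1.0)
    return Genome(genome.num_inputs, genome.policy_size, links, first_new + mode_size)

parent = Genome(num_inputs=4)
child = add_mode(parent, random.Random(1))
```

The parent is left unchanged, which matches the copy-then-mutate step of the evolutionary loop.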

MULTI-OBJECTIVE EVOLUTION (SELECTION)
- Domination
- Pareto front
- NSGA-II
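Domination and the Pareto front can be illustrated with a few lines of code. This sketch assumes every objective is to be maximized (a quantity to be minimized, such as damage received, is stored as a negative number); the example scores are made up.

```python
def dominates(a, b):
    """a dominates b if a is at least as good everywhere and better somewhere."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))

def pareto_front(scores):
    """Keep the score vectors that no other vector dominates."""
    return [s for s in scores if not any(dominates(o, s) for o in scores)]

# Made-up scores: (damage dealt, negated damage received).
scores = [(50, -20), (40, -10), (30, -30)]
front = pareto_front(scores)
```

Here `(30, -30)` is dominated by `(50, -20)`, while `(50, -20)` and `(40, -10)` each win on one objective, so both stay on the front.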

NSGA-II
[Diagram: filling the next generation]

NSGA-II
[Diagram: next generation]
PARETO FRONT DOESN'T FIT

DROP OBJECTIVE / PARETO DIMENSION
NEW PARETO FRONT

TRY AGAIN
[Diagram: next generation]
PARETO FRONT DOES FIT!

PRIORITIES FOR GOAL DROPPING
- Fight-DamageDealt
- Flight-DamageDealt
- Fight-DamageReceived
- Fight-TimeAlive
- (In ascending order)
- Goal disabling

SELECTION OVERVIEW
- Non-dominated solutions are inserted into the next generation and removed from consideration.
- Repeat until the next generation is full.
- A cutoff is often reached: the current front does not fit into the remaining slots.
- NSGA-II uses "crowding distance" to choose within that front.
- This paper instead re-sorts while dropping objectives.
- Goals are sometimes temporarily disabled when all members of the population have reached them, and re-enabled when the population starts performing poorly on them.
- When only the final dimension is under consideration and scores are equal, members are selected at random.
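The selection steps above could be sketched roughly as follows. This is one reading of the slides, not a confirmed reproduction of the paper's procedure: it assumes all objectives are maximized, that a dropped objective stays dropped for the rest of the fill, and that only the overfull front is re-sorted. The names `select_next_generation` and `drop_order` are hypothetical, and goal disabling is left out.

```python
import random

def dominates(a, b, active):
    """a dominates b when judged only on the active objective indices."""
    return (all(a[i] >= b[i] for i in active)
            and any(a[i] > b[i] for i in active))

def pareto_front(candidates, scores, active):
    return [i for i in candidates
            if not any(dominates(scores[j], scores[i], active) for j in candidates)]

def select_next_generation(scores, size, drop_order, rng):
    """Return indices of the individuals chosen for the next generation."""
    if size > len(scores):
        raise ValueError("cannot select more individuals than exist")
    remaining = list(range(len(scores)))
    active = list(range(len(scores[0])))
    drops = list(drop_order)  # objective indices, first to be dropped first
    chosen = []
    while len(chosen) < size:
        front = pareto_front(remaining, scores, active)
        if len(chosen) + len(front) <= size:
            # The front fits: insert it and remove it from consideration.
            chosen.extend(front)
            remaining = [i for i in remaining if i not in front]
        elif len(active) > 1:
            # The front does not fit: drop an objective and re-sort the front.
            active.remove(drops.pop(0))
            remaining = front
        else:
            # Final dimension and still tied: pick at random.
            chosen.extend(rng.sample(front, size - len(chosen)))
    return chosen

scores = [(3, 1), (1, 3), (2, 2), (0, 0)]
picked = select_next_generation(scores, size=2, drop_order=[1], rng=random.Random(0))
```

In the example, the first front holds three individuals but only two slots are free, so objective 1 is dropped and the two individuals with the highest remaining score are taken.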

LET'S SEE HOW IT ALL WORKS OUT
Experimental approach parameters:
- Evolved in homogeneous teams:
  - Predictable teammates
  - Fitness is scored for the group as a whole, which encourages teamwork, such as a sacrifice tactic in which one NPC dies but the team score increases.
- Neuroevolution combined with NSGA-II, as outlined before, is used to evolve the NPCs on the goals.
- A single parent population of 50 neural networks
- Evaluation: 5 Fights and 5 Flights, because evaluation on a single run is noisy. The average score is taken. This number (5/5) is a decent trade-off between evaluation noise and computation time.
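The averaging over 5 Fights and 5 Flights might look like the sketch below. The callables `run_fight` and `run_flight` are hypothetical stand-ins for the game simulation; it is assumed that a Fight returns (damage dealt, negated damage received, fraction of time alive) and a Flight returns damage dealt. The numbers in the example are made up.

```python
from statistics import mean

def evaluate_team(run_fight, run_flight, repeats=5):
    """Average the four objectives over repeated noisy evaluations."""
    fights = [run_fight() for _ in range(repeats)]
    flights = [run_flight() for _ in range(repeats)]
    return (
        mean(f[0] for f in fights),  # Fight: damage dealt
        mean(f[1] for f in fights),  # Fight: negated damage received
        mean(f[2] for f in fights),  # Fight: fraction of time alive
        mean(flights),               # Flight: damage dealt
    )

# Made-up results standing in for ten simulated games.
fight_results = iter([(40, -10, 0.6), (60, -30, 1.0), (50, -20, 0.8),
                      (50, -20, 0.8), (50, -20, 0.8)])
flight_results = iter([90, 110, 100, 100, 100])
team_scores = evaluate_team(lambda: next(fight_results), lambda: next(flight_results))
```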

BOTS
- Experimental trials are fought against another NPC, henceforth called a BOT, because playing the trials by hand would be far too tedious for a human.
- These bots are armed with two clever strategies for the tasks:
  - Fight: go right out swinging hard against the closest enemy. This works.
  - Flight: back away while facing the closest attacker, so the bot can see it and easily keep moving away when attacked.
- These tactics are pretty tough for the evolving NPCs to beat, so the bot starts at 0% speed (only able to turn), increasing to 40%, 80%, and 100%.

GOALS
- For each objective, the NPCs are measured against the previously outlined goals. The targets are as follows:
  - Fight
    1. Maximise damage dealt: 50
    2. Minimise damage taken: -20
    3. Maximise time alive: 80% of the trial
  - Flight
    1. Maximise damage dealt: 100
- These goals are represented as numbers, and the values must be attained by the average NPC in the team. So, teamwork can help.
- Once a network achieves all goals, the bot speed increases.
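The goal check and the speed increase can be sketched as below, using the targets from this slide and the speed steps from the BOTS slide. It assumes that time alive is stored as a fraction of the trial, that damage taken is stored as a negative number (so "at least -20" means at most 20 damage), and that the speed moves up one step at a time; `goals_met` and `next_speed` are hypothetical names.

```python
# Targets in the order: Fight dealt, Fight taken (negated), Fight time alive, Flight dealt.
GOALS = (50.0, -20.0, 0.8, 100.0)
SPEEDS = (0, 40, 80, 100)  # bot speed in percent

def goals_met(scores, goals=GOALS):
    """True when every averaged objective reaches its target."""
    return all(s >= g for s, g in zip(scores, goals))

def next_speed(speed, population_scores):
    """Raise the bot speed one step once any network achieves all goals."""
    if speed != SPEEDS[-1] and any(goals_met(s) for s in population_scores):
        return SPEEDS[SPEEDS.index(speed) + 1]
    return speed

speed = next_speed(0, [(55.0, -10.0, 0.9, 120.0), (10.0, -40.0, 0.2, 30.0)])
```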

TWO CONDITIONS
1MODE versus ModeMutation
- 1MODE: neural networks with a single output mode containing two nodes and no preference node (because there is just one mode).
- ModeMutation: starts with a single output mode but can gain more through mutation. The initial single-mode network has three output nodes (two policy nodes and one preference node).
- Each condition is evaluated in 10 separate trials for 300 generations or until all goals are achieved at 100% bot speed, whichever comes first. The bot speed starts at 0% and goes up from there.

RESULTS
- ModeMutation performs twice as well as 1MODE.
- ModeMutation beats the bot on all fronts at 100% speed in 4 out of 10 trials, whereas 1MODE manages only 2.
- ModeMutation is also more successful at finding good strategies for each task (Fight or Flight). 1MODE finds very good strategies on some tasks but very bad ones on others.
- Example of a good Fight strategy:
  - The closest NPC moves away from the bot so that the bot cannot catch up, and the others follow. Then the closest NPC turns slightly to the left, causing the bot to turn to hit it. That NPC takes some damage, but the others catch up and deal heavy damage to the bot.
- Example for Flight:
  - Keep knocking the bot towards the center so that it stays surrounded while taking damage.
- ModeMutation found both of these strategies!

DISCUSSION
- ModeMutation develops good strategies across the board, on all goals. These strategies aren't necessarily clear from the evaluations in general, because those are mostly averages.
- 1MODE proved more likely to develop strong strategies on one front, but it suffers for it on others.
- So: we find that ModeMutation is better at finding multi-modal strategies (which is what we were looking for), whereas 1MODE is more of a narrow specialist.
- This has an interesting implication for games: we don't want NPCs that are extremely good on some occasions but behave aimlessly on others. This makes multi-modal behaviour much more interesting.

CONCLUSION
- A mutation operator that adds new modes by adding nodes to the output layer of a neural network in neuroevolution is a promising way to develop multi-modal behaviour in NPCs.
- Thanks for watching. Questions?
