Nicholas Mifsud
Behaviour Trees (BTs) were proposed as an improvement over Finite State Machines (FSMs)
BTs are simple to design and implement, easily scalable, modular and reusable
BTs provide a hierarchical way of organising behaviours in descending order of complexity (broad tasks on top, sub-tasks at the bottom)
Each tree has one defined high-level behaviour; multiple trees are used to define an AI-bot

Nodes are divided into two categories –
◦ Control Nodes – drive execution flow through the tree, deciding which nodes to execute next
  Sequence – execute children left to right until one fails (logical AND)
  Selector – execute children left to right until one succeeds (logical OR)
◦ Leaf Nodes
  Conditions – query the game state (leftmost)
  Actions – carry out specific tasks (a sequence of actions follows the conditions)

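The node categories above can be sketched in a few lines of Python. This is a minimal illustration, not code from either paper: the class names, the dictionary-based game state and the example tree (`enemy_visible`, `attack`, `patrol`) are all hypothetical, and each node simply returns True for success and False for failure.

```python
from typing import Callable, Dict, List

State = Dict[str, object]  # hypothetical game-state container


class Node:
    def tick(self, state: State) -> bool:
        raise NotImplementedError


class Condition(Node):
    """Leaf node: queries the game state."""
    def __init__(self, predicate: Callable[[State], bool]):
        self.predicate = predicate

    def tick(self, state: State) -> bool:
        return self.predicate(state)


class Action(Node):
    """Leaf node: carries out a task (here it only logs its name)."""
    def __init__(self, name: str):
        self.name = name

    def tick(self, state: State) -> bool:
        state.setdefault("log", []).append(self.name)
        return True


class Sequence(Node):
    """Control node: runs children left to right, stops at the first failure (AND)."""
    def __init__(self, children: List[Node]):
        self.children = children

    def tick(self, state: State) -> bool:
        return all(child.tick(state) for child in self.children)


class Selector(Node):
    """Control node: runs children left to right, stops at the first success (OR)."""
    def __init__(self, children: List[Node]):
        self.children = children

    def tick(self, state: State) -> bool:
        return any(child.tick(state) for child in self.children)


# Hypothetical example: attack if an enemy is visible, otherwise patrol.
tree = Selector([
    Sequence([Condition(lambda s: bool(s.get("enemy_visible"))), Action("attack")]),
    Action("patrol"),
])
```

Ticking `tree` with `{"enemy_visible": True}` runs only the attack branch; with `{"enemy_visible": False}` the sequence fails at its condition and the selector falls through to patrol.
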
Lim et al. investigate the feasibility of applying evolutionary techniques to develop BTs for competitive AI-bots
Authors give a brief outline of
◦ DEFCON
◦ Relevance of BTs
◦ Four different fitness functions
◦ Experiments and Results
◦ Future Work

Multiplayer Real-Time Strategy game
Six world territories
Players have a set of units and structures at their disposal
Default AI-bot uses a deterministic FSM consisting of 5 states and follows them in sequence
AI-bot was given all functionalities to play the game (makes use of the DEFCON API)
Decisions performed randomly

Original set of trees was hand-defined in order to cater for all possible actions
Handcrafted trees were then used as a basis to produce more complex behaviours
AI-bot behaviour was developed by following an evolutionary algorithm (genetic operators + fitness functions)
Behaviour trees were evolved for individual behaviours, and the best-performing trees were combined

Tree structures naturally allow genetic operators to modify the behaviours
Crossover on branches
Mutation on nodes
◦ Random mutation
◦ Incremental mutation – add a new branch to a behaviour
◦ Point mutation – change the spawn point of a structure/unit

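A rough sketch of two of the operators above, assuming a tree is stored as labelled nodes with child lists. The `TreeNode` type, the helper names and the example labels are hypothetical; the papers do not specify a representation. Only branch crossover and incremental mutation are shown.

```python
import copy
import random
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class TreeNode:
    label: str
    children: List["TreeNode"] = field(default_factory=list)


def all_nodes(root: TreeNode) -> List[TreeNode]:
    """Return the nodes of the tree in pre-order (root first)."""
    nodes = [root]
    for child in root.children:
        nodes.extend(all_nodes(child))
    return nodes


def size(root: TreeNode) -> int:
    return len(all_nodes(root))


def crossover(a: TreeNode, b: TreeNode, rng: random.Random) -> Tuple[TreeNode, TreeNode]:
    """Swap one randomly chosen branch between copies of the two parents.

    Assumes both parents have at least one node below the root.
    """
    a, b = copy.deepcopy(a), copy.deepcopy(b)
    branch_a = rng.choice(all_nodes(a)[1:])  # skip the root
    branch_b = rng.choice(all_nodes(b)[1:])
    # Exchanging label and children in place swaps the two branches.
    branch_a.label, branch_b.label = branch_b.label, branch_a.label
    branch_a.children, branch_b.children = branch_b.children, branch_a.children
    return a, b


def incremental_mutation(tree: TreeNode, new_branch: TreeNode, rng: random.Random) -> TreeNode:
    """Add a new branch under a randomly chosen control node of a copy of the tree."""
    tree = copy.deepcopy(tree)
    controls = [n for n in all_nodes(tree) if n.label in ("Sequence", "Selector")]
    rng.choice(controls).children.append(copy.deepcopy(new_branch))
    return tree


# Hypothetical parents; leaf labels are placeholders, not DEFCON API names.
parent_a = TreeNode("Selector", [
    TreeNode("Sequence", [TreeNode("enemy_visible"), TreeNode("attack")]),
    TreeNode("patrol"),
])
parent_b = TreeNode("Selector", [
    TreeNode("Sequence", [TreeNode("silo_ready"), TreeNode("launch")]),
    TreeNode("Sequence", [TreeNode("radar_gap"), TreeNode("place_radar")]),
])
```

Both operators work on copies, so the parents survive unchanged, and crossover conserves the total number of nodes across the two children.
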
Defence potential – total number of air units destroyed in a given game
Uncovered enemy units – number of enemy sea units uncovered
Fleet actions – number of enemy buildings uncovered, buildings destroyed and units destroyed
Timing of attacks – difference between final end scores as an overall tally (a large difference would mean a convincing win)

Four different experiments were run, each evolving behaviours for a different fitness function described above
Each population contained 100 individuals
Each experiment ran for between 80 and 250 generations
Mutation rate set to 5%

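To show where the population size and mutation rate enter, here is a generic generational loop. It is only a sketch under assumptions: the slides do not state the selection scheme, so binary tournament selection is swapped in, and a toy bit-string problem stands in for the behaviour trees and game-based fitness. All function names are hypothetical.

```python
import random
from statistics import mean


def evolve(fitness, new_individual, crossover, mutate, rng,
           pop_size=100, generations=80, mutation_rate=0.05):
    """Generic generational loop; returns the final population and the
    mean fitness recorded at the start of each generation."""
    population = [new_individual(rng) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        scores = [fitness(ind) for ind in population]
        history.append(mean(scores))

        def pick():
            # Binary tournament (an assumption, not taken from the paper).
            i, j = rng.randrange(pop_size), rng.randrange(pop_size)
            return population[i] if scores[i] >= scores[j] else population[j]

        next_population = []
        while len(next_population) < pop_size:
            child = crossover(pick(), pick(), rng)
            if rng.random() < mutation_rate:  # 5% of children are mutated
                child = mutate(child, rng)
            next_population.append(child)
        population = next_population
    return population, history


# Toy stand-in problem: maximise the number of 1s in a 20-bit list.
def new_bits(rng):
    return [rng.randint(0, 1) for _ in range(20)]


def one_point(a, b, rng):
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def flip_one(bits, rng):
    i = rng.randrange(len(bits))
    return bits[:i] + [1 - bits[i]] + bits[i + 1:]


final_population, mean_history = evolve(
    sum, new_bits, one_point, flip_one, random.Random(0), generations=30)
```

`mean_history` plays the role of the mean-fitness curves reported for each experiment; in the real setting each fitness evaluation would require playing games of DEFCON.
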
All results indicate that the mean of each fitness function increased as more generations were produced
Higher percentage win rate

AI-bot constructed with a controller that used the best trees evolved for each of the four behaviours
Played 200 games against the default AI-bot
Won 55% of the time
Games that it lost were lost by a very low margin
Before evolving the behaviour trees it won only 3% of the time

By using evolutionary computing and BTs, the authors were able to defeat a handcrafted AI-bot
Mean fitness reached a plateau, indicating that more generations may be redundant
Raises the question of whether other techniques could be coupled with this process
Not all possibilities of the game were explored (only two territories and only 4 tasks were considered)

Perez et al. also investigate the applicability of Genetic Programming to evolve BTs
Authors make reference to the DEFCON paper discussed above and state that this earlier work may be extended by using Grammar-Based Genetic Programming systems such as Grammatical Evolution
Develop an AI agent for the Mario AI Benchmark following Grammatical Evolution

Specify the syntax of possible solutions through a context-free grammar
Variable-length integer strings are evolved by a genetic algorithm; the integers then choose production rules from the grammar until all symbols are mapped

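The genotype-to-phenotype mapping can be illustrated with the usual Grammatical Evolution rule, where each integer (codon) selects a production by taking it modulo the number of choices for the leftmost non-terminal. The grammar below is a tiny hypothetical one, not the grammar from the paper, and the wrapping limit is an assumed parameter.

```python
from typing import Dict, List, Optional

# Hypothetical toy grammar: non-terminal -> list of productions.
GRAMMAR: Dict[str, List[List[str]]] = {
    "<bt>": [["selector(", "<blocks>", ")"]],
    "<blocks>": [["<block>"], ["<block>", ", ", "<blocks>"]],
    "<block>": [["sequence(", "<cond>", ", ", "<act>", ")"]],
    "<cond>": [["enemy_ahead"], ["gap_ahead"]],
    "<act>": [["jump"], ["fire"], ["run_right"]],
}


def ge_map(genome: List[int], grammar: Dict[str, List[List[str]]],
           start: str, max_wraps: int = 2) -> Optional[List[str]]:
    """Expand the leftmost non-terminal until only terminals remain.

    Returns None if the genome (reused up to max_wraps extra times)
    runs out before every symbol is mapped.
    """
    pending = [start]
    output: List[str] = []
    used = 0
    limit = len(genome) * (max_wraps + 1)
    while pending:
        symbol = pending.pop(0)
        if symbol not in grammar:
            output.append(symbol)  # terminal: emit as-is
            continue
        choices = grammar[symbol]
        if len(choices) == 1:
            production = choices[0]  # no codon consumed for a single choice
        else:
            if used >= limit:
                return None
            codon = genome[used % len(genome)]  # wrap around the genome
            used += 1
            production = choices[codon % len(choices)]
        pending = list(production) + pending
    return output
```

For example, `"".join(ge_map([0, 1, 2], GRAMMAR, "<bt>"))` yields `selector(sequence(gap_ahead, run_right))`, while the genome `[1]` keeps choosing the recursive `<blocks>` rule and is rejected once the wrapping limit is reached.
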
Open source software based on the Mario World
All information given in two matrices (21x21)
Data about the geometry of the level and the enemies that populate it is given
Information about the current state of the game is also given (position, status, mode, etc.)
Mario has 6 effectors (up, down, left, right, jump and fire/run)

At every cycle a button needs to be pressed in order to move Mario – this impacts execution of the BT
Control Nodes
◦ Sequence – logical AND
◦ Selector – logical OR
◦ Filters – added to provide loops
Leaf Nodes
◦ Conditions – use level info to perform checks
◦ Actions – Mario’s possible movements
◦ Sub-Trees – designed to solve specific problems (e.g. long jumps)

The BT syntax was coded into the grammar
◦ 30 conditions
◦ 8 actions
◦ 19 sub-trees
◦ 4 filters
Evolution combines these as long as the syntax is respected
Certain rule combinations had to be limited through the grammar, since some trees were impossible to read and too demanding to execute

Trees can have variable length but follow an and-or structure
Trees consist of a root node and a number of sub-trees (behaviour blocks)
Each block consists of one or more conditions followed by a sequence of actions
A tree has a default sub-tree that is selected if no conditions are satisfied
Without a default sub-tree, Mario may not move at all

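The and-or structure described above can be read as: OR over behaviour blocks, AND over the conditions inside each block. The sketch below assumes a dictionary game state and hypothetical condition and action names; it only selects which action sequence would run and does not model the per-cycle button presses.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

State = Dict[str, bool]  # hypothetical view of the level information


@dataclass
class BehaviourBlock:
    conditions: List[Callable[[State], bool]]  # one or more checks
    actions: List[str]                         # sequence of actions to run


def select_actions(blocks: List[BehaviourBlock],
                   default: Optional[List[str]],
                   state: State) -> List[str]:
    """Return the actions of the first block whose conditions all hold."""
    for block in blocks:                                   # OR over blocks
        if all(cond(state) for cond in block.conditions):  # AND over conditions
            return block.actions
    # No block fired: fall back to the default sub-tree, if there is one.
    return default if default is not None else []


# Hypothetical blocks; names are placeholders, not the paper's grammar symbols.
blocks = [
    BehaviourBlock([lambda s: s.get("enemy_ahead", False)], ["fire"]),
    BehaviourBlock([lambda s: s.get("gap_ahead", False)], ["run_right", "jump"]),
]
```

With no default sub-tree and no satisfied block the result is an empty action list, which corresponds to the case where Mario does not move.
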
Each behaviour block is self-contained, which allows individuals to exchange blocks in order to produce different behaviours
Two-point crossover is used
A sub-tree swap operation is also allowed (internal crossover)

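Because blocks are self-contained, an individual can be treated as a list of blocks and crossover can exchange whole segments. The following sketch assumes variable-length lists with two cut points chosen independently in each parent; the paper may place its cut points differently, and the block names are placeholders.

```python
import random
from typing import List, Tuple


def two_point_crossover(a: List[str], b: List[str],
                        rng: random.Random) -> Tuple[List[str], List[str]]:
    """Exchange the segment between two cut points of each parent.

    Each parent must contain at least one block. Cut points fall between
    blocks, so no block is ever split.
    """
    i1, j1 = sorted(rng.sample(range(len(a) + 1), 2))
    i2, j2 = sorted(rng.sample(range(len(b) + 1), 2))
    child_a = a[:i1] + b[i2:j2] + a[j1:]
    child_b = b[:i2] + a[i1:j1] + b[j2:]
    return child_a, child_b


# Hypothetical individuals: each string stands for one behaviour block.
mum = ["avoid_enemy", "jump_gap", "collect_coin"]
dad = ["shoot_enemy", "climb_stairs"]
kid_1, kid_2 = two_point_crossover(mum, dad, random.Random(3))
```

Every block of the parents ends up in exactly one child, and both children contain at least one block because the two cut points in each parent are distinct.
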
Each individual evaluated on 18 levels
Elitism ensures a percentage of the population is kept from one generation to the next
At the end of all runs, the best individuals were evaluated on 600 unseen maps

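Elitism can be sketched as copying the top-scoring share of the population unchanged and filling the rest with new children. The share kept is not stated above, so `elite_fraction` is left as a required argument; `make_child` is a hypothetical callback standing in for selection, crossover and mutation.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def next_generation(population: Sequence[T], scores: Sequence[float],
                    make_child: Callable[[List[T]], T],
                    elite_fraction: float) -> List[T]:
    """Keep the best individuals as-is and breed the remainder."""
    ranked = [ind for _, ind in sorted(zip(scores, population),
                                       key=lambda pair: pair[0], reverse=True)]
    n_elite = max(1, int(len(population) * elite_fraction))
    elites = ranked[:n_elite]
    children = [make_child(ranked) for _ in range(len(population) - n_elite)]
    return elites + children
```

For instance, with population `["a", "b", "c", "d"]`, scores `[1, 4, 2, 3]` and `elite_fraction=0.5`, the individuals `"b"` and `"d"` are carried over and two children are bred.
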
The best BT generated comprised four behaviour blocks
BT sent to the Mario AI competition and finished in fourth place, demonstrating the validity of BTs in the field of AI (relatively high kill count)
First and third place used A* variants
Second place used a neural network

Use of a grammar simplifies the task of encoding the syntax of the BTs
Remarkable reactive behaviour capabilities, but does not excel at planning
Hybrid approach under construction to aid in path planning

DEFCON starts with a set of handcrafted trees, encoding feasible behaviours for each of the game's four parts
Separate genetic programs were run for each part, creating new predefined behaviours from the original set
Main difference is that the BTs used in the Mario agent were encoded into a context-free grammar in order to reduce complexity and create the behaviour blocks
Mario BTs also have loops defined in their structure