Jorge Munoz, German Gutierrez, Araceli Sanchis Presented by: Itay Bittan

Outline: 1. Introduction, 2. Related work, 3. TORCS competition, 4. Controllers, 5. Controller learning by imitation, 6. Results, 7. Conclusions, 8. Future works

This is an initial approach to creating a controller for TORCS by learning how another controller or a human plays the game. The data was obtained from three sources, one human player and two controllers:
- The winner of the WCCI 2008 Simulated Car Racing competition
- A hand-coded controller that performs a complete lap on all tracks

First, each kind of controller is imitated separately; then a mix of the data is used to create new controllers. The imitation is performed by training a feed-forward artificial neural network (ANN) on the data, using the backpropagation algorithm for learning.

Human players realize they are not playing against another human and find a way to beat the NPC (non-player character). NPCs sometimes cheat in order to beat humans. Another option is to play on the Internet against other human players. Either way, an NPC that cheats a lot or an experienced human opponent makes you lose every game, and that is boring!

- Create opponents as intelligent as a human player.
- The AI must be able to adapt its behavior to the opponent (play at the same level as the human).
In this way the AI will provide better entertainment for the player!

Create competitive NPCs that imitate human behavior. The controller can play as well as its opponent, at the same level. The NPC adapts its behavior when the human improves his/her playing skills, so it remains competitive.

- A realistic game where a human plays against one or more NPCs.
- Results can be compared with those of other researchers.
- It allows analyzing behaviors that take place in a short period of time.

In all the experiments the controller created is a feed-forward ANN (artificial neural network) trained with data generated by the controllers. The learning algorithm for the ANN was backpropagation.

Outline: 1. Introduction, 2. Related work, 3. TORCS competition, 4. Controllers, 5. Controller learning by imitation, 6. Results, 7. Conclusions, 8. Future works

A wide area of research is creating computational intelligence in games with ANNs. NEAT (NeuroEvolution of Augmenting Topologies) is an effective method for creating ANNs: it alters both the connection weights and the structure of the networks. It starts with a small population of random ANNs (with only input and output layers) that evolves toward a solution to the problem. A toy sketch of the structural mutation follows.
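
As a rough illustration of the "augmenting topologies" idea, the sketch below applies NEAT's add-node mutation to a minimal network: an enabled connection is split in two, the new input-side weight is set to 1, and the output-side weight inherits the old value. This is only a fragment of NEAT (no speciation, crossover, or innovation numbers), and all names here are illustrative.

```python
import random

# A network is a list of connections: (src, dst, weight, enabled).
# NEAT starts minimal: inputs wired directly to outputs.
def minimal_net(n_in, n_out):
    return [(i, n_in + o, random.uniform(-1, 1), True)
            for i in range(n_in) for o in range(n_out)]

def add_node_mutation(conns, next_node_id):
    """NEAT's add-node mutation: split one enabled connection in two.
    The incoming new weight is 1.0, the outgoing keeps the old weight,
    and the original connection is disabled."""
    idx = random.choice([i for i, c in enumerate(conns) if c[3]])
    src, dst, w, _ = conns[idx]
    conns[idx] = (src, dst, w, False)             # disable the old link
    conns.append((src, next_node_id, 1.0, True))  # in-link, weight 1
    conns.append((next_node_id, dst, w, True))    # out-link, old weight
    return next_node_id + 1

net = minimal_net(n_in=2, n_out=1)   # nodes 0,1 are inputs, node 2 is output
next_id = add_node_mutation(net, next_node_id=3)
for conn in net:
    print(conn)
```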

There are also approaches that adapt the game's AI to the player. Examples:
- Rapidly adaptive game AI: a method that continuously applies small adaptations to the AI, based on observation and evaluation of the user's actions.
- Dynamic scripting: based on a set of rules used by the game, whose selection weights are modified through a machine learning algorithm (a simplified sketch follows).
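
A simplified sketch of the dynamic-scripting idea: rules are drawn with probability proportional to their weights, and the weights of the rules that were actually used are adjusted by a reward signal. The rule names, reward, and learning rate are invented for illustration; the full algorithm also redistributes weight so the total stays constant.

```python
import random

rules = ["attack", "defend", "flee", "heal"]   # hypothetical rulebase
weights = {r: 1.0 for r in rules}
W_MIN, W_MAX, LR = 0.1, 10.0, 0.5

def select_script(k):
    """Pick k distinct rules, with probability proportional to weight."""
    pool, script = dict(weights), []
    for _ in range(k):
        pick = random.uniform(0, sum(pool.values()))
        acc = 0.0
        for r, w in pool.items():
            acc += w
            if pick <= acc:
                script.append(r)
                del pool[r]
                break
    return script

def update(script, reward):
    """Reinforce (or punish) the rules that were actually used."""
    for r in script:
        weights[r] = min(W_MAX, max(W_MIN, weights[r] + LR * reward))

script = select_script(2)
update(script, reward=+1.0)   # the script won the encounter
print(script, weights)
```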

One researcher cloned the behavior of a RoboCup player using case-based reasoning (solving new problems based on the solutions of similar past problems). RoboCup is a simulated soccer league.

Other researchers programmed robosoccer (RoboCup) agents by modelling human behaviors, with successful results.

Outline: 1. Introduction, 2. Related work, 3. TORCS competition, 4. Controllers, 5. Controller learning by imitation, 6. Results, 7. Conclusions, 8. Future works

- A very realistic simulator with a sophisticated physics engine that takes into account many aspects of racing, such as fuel consumption, collisions, and traction.
- Provides many tracks and cars with different features.
- TORCS is open software, which allows researchers to modify the game and adapt it to their requirements.

Information provided:
- The lap (current lap time, best lap time, distance raced, race position)
- The car status (damage, fuel, current gear, speed, lateral speed, and RPM)
- Distance between the car and the track edges
- More...
Actions sent:
- Acceleration level
- Brake level
- Gear
- Steering of the wheel
A minimal sketch of this sensor/action interface follows.
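
A minimal sketch of what a TORCS client controller loop can look like, using only the sensors and actions listed above. The field and method names are assumptions for illustration, not the actual competition API.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class CarState:
    """Sensors listed on the slide (names are illustrative)."""
    speed: float = 0.0          # forward speed
    lateral_speed: float = 0.0
    rpm: float = 0.0
    gear: int = 1
    angle: float = 0.0          # angle to the track axis
    track: List[float] = field(default_factory=lambda: [200.0] * 19)
    wheel_spin: List[float] = field(default_factory=lambda: [0.0] * 4)

@dataclass
class Action:
    accel: float = 0.0   # [0, 1]
    brake: float = 0.0   # [0, 1]
    gear: int = 1
    steer: float = 0.0   # [-1, 1]

def drive(state: CarState) -> Action:
    """Trivial placeholder policy: full throttle, steer toward the axis."""
    return Action(accel=1.0, brake=0.0, gear=max(state.gear, 1),
                  steer=-0.5 * state.angle)

print(drive(CarState(angle=0.2)))
```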

Outline: 1. Introduction, 2. Related work, 3. TORCS competition, 4. Controllers, 5. Controller learning by imitation, 6. Results, 7. Conclusions, 8. Future works

In the experiments we used the data obtained from three different controllers:
- A human player
- The winner of the WCCI 2008 competition
- A hand-coded controller

- The information that the human gets from watching the game monitor is much richer.
- He drove the car trying to go through the middle of the road, with soft accelerations and braking, braking well before the next curve started, without fast or sharp turns. The human tried to drive the car as a programmed controller would, but with the mistakes that a human makes.

Created by Matt Simmerson, this controller was the winner of the WCCI 2008 simulated car racing competition. As inputs of the ANN created by NEAT he selected: the current speed, the angle to the track axis, the track position with respect to the left and right edges, the current gear, the four wheel-spin sensors, the current RPM, and 19 track sensors. All these inputs were scaled to the range [0,1].

The outputs of the ANN were: the power (accelerate and brake), the gear change, and the steering. The first two are in the range [0,1] and the last one is in the range [-1,1]. The fitness function used to evaluate the ANNs in NEAT took into account the distance raced, the RPM, the maximum speed reached, and a value measuring how much the car stayed on the track (a hedged sketch follows).
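
The slide names the fitness terms but not their weights, so the sketch below just combines the four quantities with made-up coefficients; it shows the shape of such a fitness function, not Simmerson's actual one.

```python
def fitness(dist_raced, mean_rpm, max_speed, on_track_ratio,
            w_dist=1.0, w_rpm=0.001, w_speed=0.1, w_track=100.0):
    """Combine the four quantities from the slide into one score.
    All weights are invented for illustration."""
    return (w_dist * dist_raced
            + w_rpm * mean_rpm
            + w_speed * max_speed
            + w_track * on_track_ratio)

# Example: a lap of 2000 m, mostly on track.
print(fitness(dist_raced=2000, mean_rpm=6000, max_speed=70,
              on_track_ratio=0.95))
```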

The idea of creating another controller arose because the human controller sometimes makes mistakes, and Simmerson's controller does not complete a lap on all tracks and sometimes drives off the track. Thus, the requirements are:
- Deterministic: the same outputs for the same inputs (without mistakes)
- Performs a complete lap without getting out of the track

To calculate the values for the acceleration and the brake, we first calculate the speed the car should have (the estimated speed), where sum_sensors is the sum of the three front sensors (which give the distance between the car and the edge of the track) and alpha and beta are predefined parameters. With this estimated speed we calculate the difference from the actual speed. [Formulas shown on slide.]

The acceleration and brake values are proportional to the absolute value of the difference between the actual speed and the estimated speed, where again there are adjustment parameters. [Formula shown on slide.]

The steering value calculation: first, we check whether the car is on a straight or in a curve. We assume the car is on a straight when any of the 3 front sensors reads its maximum value, and in a curve otherwise; a different equation is used in each case. [Straight and curve equations shown on slide.]

Finally the gear is calculated from the current speed of the car, where lambda is an adjustment parameter. (The controller does not allow changing the gear twice in less than a second.) [Formula shown on slide.] A hedged reconstruction of the whole hand-coded controller follows.
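
The actual formulas were shown as images on the slides, so the sketch below is only a plausible reconstruction under assumptions: a linear estimated speed in sum_sensors, acceleration/brake proportional to the speed difference, steering from the angle and track position on straights and toward the longest front sensor in curves, and a gear proportional to speed with a one-second hold. All constants are invented.

```python
SENSOR_MAX = 200.0           # assumed maximum reading of a track sensor
ALPHA, BETA = 10.0, 0.15     # estimated-speed parameters (invented)
K_ACC, K_BRAKE = 0.05, 0.05  # proportionality constants (invented)
LAMBDA = 40.0                # speed units per gear (invented)

def clamp(x, lo, hi):
    return max(lo, min(hi, x))

def drive(speed, angle, track_pos, front3, gear, now, last_shift):
    """front3: the three front track sensors; track_pos in [-1, 1]."""
    # Estimated speed grows with the free distance ahead (assumed linear).
    est_speed = ALPHA + BETA * sum(front3)
    diff = est_speed - speed
    accel = clamp(K_ACC * diff, 0.0, 1.0) if diff > 0 else 0.0
    brake = clamp(-K_BRAKE * diff, 0.0, 1.0) if diff < 0 else 0.0

    if max(front3) >= SENSOR_MAX:    # straight, per the slide's rule
        steer = clamp(0.5 * angle - 0.3 * track_pos, -1.0, 1.0)
    else:                            # curve: aim at the longest sensor
        steer = clamp(0.5 * (front3.index(max(front3)) - 1), -1.0, 1.0)

    new_gear = clamp(1 + int(speed / LAMBDA), 1, 6)
    if now - last_shift < 1.0:       # no two gear changes within 1 s
        new_gear = gear
    elif new_gear != gear:
        last_shift = now
    return accel, brake, steer, new_gear, last_shift

print(drive(speed=50, angle=0.05, track_pos=0.1,
            front3=[80.0, 120.0, 60.0], gear=3, now=12.0, last_shift=11.5))
```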

Outline: 1. Introduction, 2. Related work, 3. TORCS competition, 4. Controllers, 5. Controller learning by imitation, 6. Results, 7. Conclusions, 8. Future works

To learn the behavior of a controller we used an ANN. First, we obtain the data from the controller we want to imitate, then the ANN is trained with the backpropagation algorithm, and finally the new controller is tested on the tracks.

Inputs:
- Current speed of the car
- Angle of the car with the track axis
- The current gear
- The lateral speed of the car
- The RPM
- The 4 wheel-spin sensors
- 19 sensors: distance between the car and the edges of the track
Outputs:
- The accelerate/brake value
- The gear
- The steering

For all the experiments the ANN has 3 hidden layers of 28 neurons each, was trained for 1000 cycles, and the learning rate starts at 0.9 and decreases as training progresses. The data was taken from 17 road tracks (next slide), but only if the controller completed almost 3 laps. [Figure: network diagram with input layer, three hidden layers, and output layer.] A training sketch follows.
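
A minimal NumPy sketch of the training setup described above: a feed-forward network with the slide's 28 inputs, three hidden layers of 28 units, 3 outputs, and backpropagation with a learning rate that decays from 0.9 (the final value is missing from the slide, so 0.1 is assumed). The random data stands in for the logged driving samples.

```python
import numpy as np

rng = np.random.default_rng(0)
sizes = [28, 28, 28, 28, 3]          # inputs, 3 hidden layers, outputs
Ws = [rng.normal(0, 0.1, (a, b)) for a, b in zip(sizes[:-1], sizes[1:])]
bs = [np.zeros(b) for b in sizes[1:]]

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

X = rng.random((500, 28))            # stand-in for logged sensor vectors
Y = rng.random((500, 3))             # stand-in for logged actions

EPOCHS, LR0, LR1 = 1000, 0.9, 0.1    # final learning rate is an assumption
for epoch in range(EPOCHS):
    lr = LR0 + (LR1 - LR0) * epoch / (EPOCHS - 1)   # linear decay
    # Forward pass, keeping activations for backprop.
    acts = [X]
    for W, b in zip(Ws, bs):
        acts.append(sigmoid(acts[-1] @ W + b))
    # Backward pass: mean squared error; sigmoid derivative is a*(1-a).
    delta = (acts[-1] - Y) * acts[-1] * (1 - acts[-1])
    for i in reversed(range(len(Ws))):
        grad_W = acts[i].T @ delta / len(X)
        grad_b = delta.mean(axis=0)
        if i > 0:   # propagate before updating this layer's weights
            delta = (delta @ Ws[i].T) * acts[i] * (1 - acts[i])
        Ws[i] -= lr * grad_W
        bs[i] -= lr * grad_b

print("final MSE:", float(((acts[-1] - Y) ** 2).mean()))
```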

The data that has been obtained for each controller: [table of tracks and collected data shown on slide].

Outline: 1. Introduction, 2. Related work, 3. TORCS competition, 4. Controllers, 5. Controller learning by imitation, 6. Results, 7. Conclusions, 8. Future works

For the controllers learned by imitation, we used the data of all tracks to train the controller's ANN, and then tested it on each track.

The lap times obtained by the controllers described before, on each track: [table shown on slide].

The results of the learned controllers for all three source controllers: [tables shown on slide].

A controller trained with the combined data of the human, Simmerson's, and hand-coded controllers was not created, because the mixed configurations did not produce good results, as shown in the last 3 tables.

Outline: 1. Introduction, 2. Related work, 3. TORCS competition, 4. Controllers, 5. Controller learning by imitation, 6. Results, 7. Conclusions, 8. Future works

It is very complicated to learn human behavior in a video game:
- The human performs different actions in the same situations.
- He does not perform all actions in the proper way.
- He makes a lot of mistakes that have to be solved.
This sort of data makes it practically impossible for an ANN to learn something useful.

The gear problem: the gear change was not learned, even though it is probably the easiest output to learn. Maybe this is because of the large amount of data, and because the gear is also an input of the ANN.

Mixing data from two types of controllers: the results show that those controllers do not work. The learned controller has mixed features, but none of them is learned properly, so the car goes off the track easily and the simulation ends.

Outline: 1. Introduction, 2. Related work, 3. TORCS competition, 4. Controllers, 5. Controller learning by imitation, 6. Results, 7. Conclusions, 8. Future works

Pre-process the data before using it to train the neural network. Two ideas (see the sketch below):
- Decrease the amount of data / remove duplicates.
- Remove data patterns that have the same input but different outputs.
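
A small sketch of the two pre-processing ideas, under the assumption that samples are (input, output) pairs and that inputs must match exactly to count as duplicates or conflicts; in practice, continuous sensor values would need to be discretized first.

```python
def preprocess(samples):
    """samples: list of (input_tuple, output_tuple) pairs."""
    # Idea 1: drop exact duplicate (input, output) pairs.
    unique = list(dict.fromkeys(samples))
    # Idea 2: drop inputs that appear with more than one output.
    outputs_by_input = {}
    for x, y in unique:
        outputs_by_input.setdefault(x, set()).add(y)
    return [(x, y) for x, y in unique if len(outputs_by_input[x]) == 1]

data = [((1, 0), (0.5,)), ((1, 0), (0.5,)),   # duplicate
        ((0, 1), (0.2,)), ((0, 1), (0.9,)),   # conflicting outputs
        ((2, 2), (0.1,))]
print(preprocess(data))   # [((1, 0), (0.5,)), ((2, 2), (0.1,))]
```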

Train the controllers with selected data from each controller: for example, the straights from a controller that drives straights well, and the curves from another controller that takes the turns well (as sketched below).
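
A sketch of that idea, reusing the straight/curve test from the hand-coded controller (any of the three front sensors at its maximum reading means a straight). SENSOR_MAX and the sample layout are assumptions.

```python
SENSOR_MAX = 200.0   # assumed maximum track-sensor reading

def is_straight(sample):
    """sample: (sensor_vector, action); the first 3 values are the front sensors."""
    front3 = sample[0][:3]
    return max(front3) >= SENSOR_MAX

def mix_datasets(data_a, data_b):
    """Straights from controller A, curves from controller B."""
    straights = [s for s in data_a if is_straight(s)]
    curves = [s for s in data_b if not is_straight(s)]
    return straights + curves

a = [((200.0, 150.0, 90.0), (1.0, 0.0)), ((80.0, 60.0, 40.0), (0.4, 0.2))]
b = [((200.0, 180.0, 70.0), (0.9, 0.0)), ((90.0, 50.0, 30.0), (0.3, 0.4))]
print(mix_datasets(a, b))
```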

If we want to imitate human behavior:
- We have to increase the information used to train the ANN. There is a lack of information, because the human player senses more information from the domain than the other controllers use.
- The human also remembers and improves his/her behavior in each lap.
- The human also does not perform the same action under the same circumstances.

The ANN: there is no context information; the controller does not remember its past actions and cannot make decisions with that information. Maybe recurrent neural networks need to be used (a minimal sketch follows), and maybe different learning algorithms have to be tried.
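
A minimal sketch of the recurrent idea: an Elman-style cell whose hidden state carries information from previous time steps, so the same input can yield different outputs depending on the history. Sizes and weights are illustrative; training such a network would use backpropagation through time rather than the plain backpropagation used above.

```python
import numpy as np

rng = np.random.default_rng(1)
n_in, n_hidden, n_out = 28, 28, 3
W_xh = rng.normal(0, 0.1, (n_in, n_hidden))
W_hh = rng.normal(0, 0.1, (n_hidden, n_hidden))   # the recurrent "memory"
W_hy = rng.normal(0, 0.1, (n_hidden, n_out))

def step(x, h):
    """One Elman step: the new hidden state mixes input and previous state."""
    h_new = np.tanh(x @ W_xh + h @ W_hh)
    return h_new @ W_hy, h_new

h = np.zeros(n_hidden)
x = rng.random(n_in)
y1, h = step(x, h)
y2, h = step(x, h)           # same input, different output: context matters
print(np.allclose(y1, y2))   # False
```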