Selection of Behavioral Parameters: Integration of Case-Based Reasoning with Learning Momentum
Brian Lee, Maxim Likhachev, and Ronald C. Arkin
Mobile Robot Laboratory, Georgia Tech, Atlanta, GA
This research was funded under the DARPA MARS program.

Integrated Multi-layered Learning
THE LEARNING CONTINUUM: Deliberative (pre-mission) → Behavioral switching → Reactive (online adaptation)
CBR Wizardry
–Guide the operator
Probabilistic Planning
–Manage complexity for the operator
RL for Behavioral Assemblage Selection
–Learn what works for the robot
CBR for Behavior Transitions
–Adapt to situations the robot can recognize
Learning Momentum
–Vary robot parameters in real time

Motivation
It's hard to manually derive behavioral controller parameters.
–The parameter space increases exponentially with the number of parameters.
You don't always have a priori knowledge of the environment.
–Without prior knowledge, a user can't confidently derive appropriate parameter values, so it becomes necessary for the robot to adapt on its own to what it finds.
Obstacle densities and layout in the environment may be heterogeneous.
–Parameters that work well for one type of environment may not work well with another type.
A solution is to provide adaptability to the system while remaining fully reactive.

Context for Case-Based Reasoning (CBR)
Spatial and temporal features are used to select stored cases from a case library.
Cases contain parameters for a behavior-based reactive controller.
Selected parameters are adapted for the current situation.
The controller is updated with new parameters that should be more appropriate to the current environment.

CBR Module [block diagram]: Sensors → Feature Identification → Spatial Feature Matching → Temporal Feature Matching → Random Selection Process → Case Switching Decision → Case Adaptation → Case Application, drawing candidate cases from the Case Library.
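
The block diagram itself is not reproduced in the transcript. As a rough illustration of the flow it names, here is a minimal Python sketch of matching a case by spatial and temporal features and adapting it before application; the Case structure, similarity measure, and adaptation rule are assumptions for illustration, not the authors' implementation.

```python
import random
from dataclasses import dataclass

@dataclass
class Case:
    """One entry in the case library (hypothetical structure)."""
    spatial_features: list    # e.g. local obstacle density and clutter layout
    temporal_features: list   # e.g. short-term and long-term progress measures
    parameters: dict          # behavioral parameters this case recommends

def similarity(a, b):
    """Crude inverse-squared-distance similarity between feature vectors."""
    return 1.0 / (1.0 + sum((x - y) ** 2 for x, y in zip(a, b)))

def select_case(library, spatial, temporal, top_k=3):
    """Rank cases by spatial and temporal feature similarity, then pick
    randomly among the best matches (the 'random selection process' box)."""
    ranked = sorted(
        library,
        key=lambda c: similarity(c.spatial_features, spatial)
                      + similarity(c.temporal_features, temporal),
        reverse=True,
    )
    return random.choice(ranked[:top_k])

def adapt_case(case, obstacle_density):
    """Fine-tune the stored parameters for the current situation
    (illustrative rule: back off the goal gain in denser clutter)."""
    params = dict(case.parameters)
    params["move_to_goal_gain"] = (params.get("move_to_goal_gain", 1.0)
                                   * max(0.5, 1.0 - obstacle_density))
    return params
```

A controller would invoke select_case only when the case-switching decision fires, then hand the adapted parameters to the behavior-based controller.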

Context for Learning Momentum (LM)
A crude form of reinforcement learning.
–If the robot is doing well, try doing what it's doing a little more; otherwise try something different.
Behavior parameters are continually changed in response to progress and obstacles.
Static rules for pre-defined situations are used to update behavior parameters.
Different sets of rules for parameter changes can be used (ballooning versus squeezing).

LM Strategies
Ballooning
–Alter parameters so the robot reacts to obstacles at larger distances than normal, to push it out of box canyon situations.
Squeezing
–Alter parameters so the robot reacts to obstacles only at shorter distances than normal, so it can move between closely spaced obstacles.
Example ballooning rule:
if ( situation == NO_PROGRESS_WITH_OBSTACLES )
    obstacle_sphere_of_influence += 0.5 meters
else
    obstacle_sphere_of_influence -= 0.5 meters
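
A minimal sketch of how such rules could be tabulated and applied each step, assuming a small set of situation labels: the 0.5 m step comes from the slide, while the gain deltas, the situation set, and the bounds are illustrative values only.

```python
# Hypothetical situation labels; the real rule set distinguishes more cases.
NO_PROGRESS_WITH_OBSTACLES = "no_progress_with_obstacles"
PROGRESS = "progress"

# Per-situation parameter deltas for each strategy.
BALLOONING = {
    NO_PROGRESS_WITH_OBSTACLES: {"obstacle_sphere_of_influence": +0.5,
                                 "avoid_obstacle_gain": +0.1},
    PROGRESS:                   {"obstacle_sphere_of_influence": -0.5,
                                 "avoid_obstacle_gain": -0.1},
}
SQUEEZING = {
    NO_PROGRESS_WITH_OBSTACLES: {"obstacle_sphere_of_influence": -0.5,
                                 "avoid_obstacle_gain": -0.1},
    PROGRESS:                   {"obstacle_sphere_of_influence": 0.0,
                                 "avoid_obstacle_gain": 0.0},
}

def lm_step(params, situation, strategy, bounds):
    """Nudge each parameter by its delta for the current situation,
    clipped to the allowed bounds."""
    for name, delta in strategy.get(situation, {}).items():
        lo, hi = bounds[name]
        params[name] = min(hi, max(lo, params[name] + delta))
    return params

# Example usage with illustrative starting values and bounds.
params = {"obstacle_sphere_of_influence": 2.0, "avoid_obstacle_gain": 1.0}
bounds = {"obstacle_sphere_of_influence": (0.5, 10.0),
          "avoid_obstacle_gain": (0.1, 5.0)}
params = lm_step(params, NO_PROGRESS_WITH_OBSTACLES, BALLOONING, bounds)
```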

LM Module [block diagram]: Sensors → Short Sensor History → Situation Matching → Parameter Deltas → Parameter Adaptation, which takes the old behavioral parameters and outputs the adapted parameters.
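
A sketch of what the "short sensor history" and "situation matching" stages might look like; the window length, progress threshold, and situation names are assumptions rather than the values used in the experiments.

```python
import math
from collections import deque

PROGRESS = "progress"
NO_PROGRESS_WITH_OBSTACLES = "no_progress_with_obstacles"
NO_PROGRESS_NO_OBSTACLES = "no_progress_no_obstacles"

class SituationMatcher:
    """Classify the robot's recent experience from a short sensor history."""

    def __init__(self, window=20, progress_threshold=0.3):
        self.history = deque(maxlen=window)           # (position, obstacles_nearby)
        self.progress_threshold = progress_threshold  # meters per window (assumed)

    def update(self, position, goal, obstacles_nearby):
        self.history.append((position, obstacles_nearby))
        if len(self.history) < self.history.maxlen:
            return PROGRESS  # not enough history yet; assume all is well

        # Net movement toward the goal over the window.
        (x0, y0), _ = self.history[0]
        (x1, y1), _ = self.history[-1]
        d_start = math.hypot(goal[0] - x0, goal[1] - y0)
        d_now = math.hypot(goal[0] - x1, goal[1] - y1)
        if d_start - d_now > self.progress_threshold:
            return PROGRESS

        obstacles_seen = any(obs for _, obs in self.history)
        return NO_PROGRESS_WITH_OBSTACLES if obstacles_seen else NO_PROGRESS_NO_OBSTACLES
```

The returned label would then index a delta table like the one sketched under the previous slide.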

Effects of CBR and LM When Used Separately
Reported in ICRA 2001
Effects of CBR
–Distances traversed were shorter
–Time taken was shorter
Effects of LM
–Completion rates were much higher for dense obstacles
–Completion times were higher than those for successful non-adaptive robots

Why Integrate?
Want discontinuous switching + continuous searching in the parameter space.
CBR is not continuous
–Parameter changes are triggered by environment changes or case time-outs.
–The case library is manually built to provide only ballpark solutions for different environment types.
LM does not make large, discontinuous changes
–LM may take a while to adapt to large environmental changes.
LM cannot change strategies at run time
–The LM strategies of ballooning and squeezing are tuned for different environments.

Currently Used Behaviors
Move to Goal
–Always returns a vector pointing toward the goal position.
Avoid Obstacles
–Returns a sum of weighted vectors pointing away from obstacles.
Wander
–Returns vectors pointing in random directions.
Bias Move
–Returns a vector biasing the robot's movement in a certain direction (e.g., away from high obstacle densities), and is set by the CBR module.
–Only used when CBR is present.
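
A minimal sketch of these behaviors as vector-producing functions blended by a weighted sum; the function names, 2-D tuple vectors, and weighting details are illustrative assumptions, not the authors' code.

```python
import math
import random

def move_to_goal(robot, goal):
    """Unit vector from the robot toward the goal."""
    dx, dy = goal[0] - robot[0], goal[1] - robot[1]
    d = math.hypot(dx, dy) or 1.0
    return (dx / d, dy / d)

def avoid_obstacles(robot, obstacles, sphere_of_influence):
    """Sum of repulsive vectors from obstacles within the sphere of influence."""
    vx = vy = 0.0
    for ox, oy in obstacles:
        dx, dy = robot[0] - ox, robot[1] - oy
        d = math.hypot(dx, dy)
        if 0.0 < d < sphere_of_influence:
            weight = (sphere_of_influence - d) / sphere_of_influence
            vx += weight * dx / d
            vy += weight * dy / d
    return (vx, vy)

def wander(step, persistence, _state={"dir": (1.0, 0.0)}):
    """Random direction, held for `persistence` consecutive steps
    (mutable-default trick used only to keep the sketch short)."""
    if step % persistence == 0:
        angle = random.uniform(0.0, 2.0 * math.pi)
        _state["dir"] = (math.cos(angle), math.sin(angle))
    return _state["dir"]

def blend(vectors_and_gains):
    """Weighted sum of behavior vectors -> commanded motion vector."""
    vx = sum(g * v[0] for v, g in vectors_and_gains)
    vy = sum(g * v[1] for v, g in vectors_and_gains)
    return (vx, vy)

# e.g. one step with the bias move present (CBR running):
command = blend([
    (move_to_goal((0, 0), (10, 5)), 1.0),
    (avoid_obstacles((0, 0), [(2, 1)], 2.0), 1.0),
    (wander(0, 10), 0.3),
    ((0.5, -0.5), 0.2),   # bias move vector, set by the CBR module
])
```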

Adjustable Behavioral Parameters
Move to Goal vector gain
Avoid Obstacle vector gain
Avoid Obstacle sphere of influence
–Radius around the robot inside of which obstacles are reacted to
Wander vector gain
Wander persistence
–The number of consecutive steps the wander vector points in the same direction
Bias Move vector gain
Bias Move X, Bias Move Y
–The components of the vector returned by Bias Move
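
These parameters can be kept in a single record that the CBR module replaces wholesale when it switches cases and the LM module nudges every step; the default values below are placeholders, not the values used in the experiments.

```python
from dataclasses import dataclass

@dataclass
class BehavioralParameters:
    """Adjustable parameter set from this slide (placeholder defaults)."""
    move_to_goal_gain: float = 1.0
    avoid_obstacle_gain: float = 1.0
    avoid_obstacle_sphere: float = 2.0   # meters; obstacles inside are reacted to
    wander_gain: float = 0.3
    wander_persistence: int = 10         # steps the wander vector is held
    bias_move_gain: float = 0.0          # only used when CBR is present
    bias_move_x: float = 0.0
    bias_move_y: float = 0.0
```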

Integration: Base System [block diagram]. Sensors feed the Core Behavior-Based Controller, which is configured by Behavioral Parameters and drives the Actuators.

Integration: Addition of CBR Module [block diagram]. The CBR Module reads the Sensors and writes Updated Parameters into the controller's Behavioral Parameters.

Integration: Addition of LM Module [block diagram]. The CBR Module supplies Updated Deltas and Parameter Bounds to the LM Module, which writes Updated Parameters into the controller's Behavioral Parameters.
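
Putting the three diagrams together, one control tick of the integrated system might look like the sketch below. The module interfaces (should_switch, select_and_adapt, match_situation, adjust, motor_vector) are hypothetical glue, but the division of labor follows the slides: CBR makes the occasional discontinuous jump and hands LM its deltas and bounds, while LM fine-tunes continuously within those bounds.

```python
def control_step(cbr, lm, controller, sensors, actuators):
    """One tick of the integrated CBR + LM + behavior-based controller."""
    readings = sensors.read()

    # CBR: switch cases only on environment changes or case time-outs.
    if cbr.should_switch(readings):
        case = cbr.select_and_adapt(readings)
        controller.parameters = case.parameters   # discontinuous jump
        lm.deltas = case.lm_deltas                 # per-situation step sizes
        lm.bounds = case.parameter_bounds          # limits on LM's search

    # LM: small, continuous parameter adjustment on every step.
    situation = lm.match_situation(readings)
    controller.parameters = lm.adjust(controller.parameters, situation)

    # Reactive controller: blend the behavior vectors and command the actuators.
    actuators.command(controller.motor_vector(readings))
```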

Simulation Setup
Heterogeneous environments
–Varying obstacle density, order, and size
–350 x 350 meters
Homogeneous environments
–Even obstacle distribution
–Random obstacle placement and size
–Two environments with 15% density and two environments with 20% density
–150 x 150 meters
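
For the homogeneous worlds, one way to realize "even obstacle distribution, random obstacle placement and size" at a target coverage of 15% or 20% is sketched below; the circular obstacles, radius range, and overlap-ignoring area bookkeeping are assumptions for illustration, not the simulator's actual method.

```python
import math
import random

def generate_homogeneous_world(size=150.0, density=0.15,
                               r_min=0.5, r_max=2.0, seed=None):
    """Scatter circular obstacles of random radius at uniformly random
    positions until roughly `density` of the size x size area is covered."""
    rng = random.Random(seed)
    target_area = density * size * size
    covered = 0.0
    obstacles = []
    while covered < target_area:
        r = rng.uniform(r_min, r_max)
        x = rng.uniform(r, size - r)
        y = rng.uniform(r, size - r)
        obstacles.append((x, y, r))
        covered += math.pi * r * r   # ignores overlap; fine for a sketch
    return obstacles

# e.g. one of the 150 x 150 m worlds at 20% obstacle density
world = generate_homogeneous_world(density=0.20, seed=1)
```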

CBR-LM in Simulation

Simulation Results for a Heterogeneous Environment [results charts]

Simulation Results for a Homogeneous Environment [results charts]

Simulation Observations
Beneficial attributes of CBR are preserved.
–We see quick, radical changes in behavior.
–Time taken is about the same as CBR only.
Beneficial attributes of LM are not always apparent.
–Results can probably be attributed to a well-tuned case library.
–If the case library is good enough, LM should not be needed.

Physical Robot Experiments
–RWI ATRV-Jr robot
–Forward and rear SICK LMS laser scanners
–Odometry, compass, and gyroscope for localization
–Straight-line start-to-goal distance of about 46 meters
–Outdoor environment with trees and man-made obstacles
–CBR-LM, CBR, LM, and non-adaptive systems were compared
–The squeezing strategy was used in the LM-only experiments
–Data was averaged over 10 runs per adaptation algorithm

Outdoor Run

Physical Experiments Results
All valid runs were able to reach the goal.
Both CBR and LM beat the non-adaptive system.
The CBR-LM integrated system gave the best performance.

Difference From Simulation
CBR-LM outperformed CBR by a larger margin on the physical robot than in simulation.
–The case library for the real robot may not have been as well tuned as the simulation library.

Conclusions
A performance increase is not guaranteed.
–For a well-tuned case library, there may be little for LM to do.
Integration of CBR and LM can result in a performance increase.
–An improvement of up to 29% in steps over CBR alone was observed.
Benefits of LM are more likely to be apparent when the CBR case library is not well tuned (which is likely to be the case for real robots).
LM could be used to dynamically update the case library with better sets of parameters.